Trie(pronounced as “try”): Trie(also known as the digital tree or prefix tree) is a sorted and efficient tree-based special data structure that is used to store and retrieve keys in a dataset of strings It is based on the prefix of a string. It can be visualized as a graph consisting of nodes and edges. Each node of a trie can have as many as 26 pointers/references.
Applications of Trie data structure:
It has a wide variety of applications in data compression, computational biology, longest prefix matching algorithm used for routing tables for IP addresses, implementation of the dictionary, pattern searching, storing/querying XML documents, etc.
Real-time applications of Trie data structure:
1. Browser History: Web browsers keep track of the history of websites visited by the user So when the prefix of a previously visited URL is written in the address bar the user would be given suggestions of the website to visit.
Trie is used by storing the number of visits to a website as the key value and organizing this history on the Trie data structure
2. AutoComplete: It is one of the most important applications of trie data structure. This feature speeds up interactions between a user and the application and greatly enhances the user experience. Auto Complete feature is used by web browsers, email, search engines, code editors, command-line interpreters(CLI), and word processors.
Trie provides the alphabetical ordering of data by keys. Trie is used because it is the fastest for auto-complete suggestions, even in the worst case, it is O(n) (where n is the string length ) times faster than the alternate imperfect hash table algorithm and also overcomes the problem of key collisions in imperfect hash tables.
3. Spell Checkers/Auto-correct: It is a 3-step process that includes :
- Checking for the word in the data dictionary.
- Generating potential suggestions.
- Sorting the suggestions with higher priority on top.
Trie stores the data dictionary and makes it easier to build an algorithm for searching the word from the dictionary and provides the list of valid words for the suggestion.
4. Longest Prefix Matching Algorithm(Maximum Prefix Length Match): This algorithm is used in networking by the routing devices in IP networking. Optimization of network routes requires contiguous masking that bound the complexity of lookup a time to O(n), where n is the length of the URL address in bits.
To speed up the lookup process, Multiple Bit trie schemes were developed that perform the lookups of multiple bits faster.
Advantages of Trie data structure:
- Trie allows us to input and finds strings in O(l) time, where l is the length of a single word. It is faster as compared to both hash tables and binary search trees.
- It provides alphabetical filtering of entries by the key of the node and hence makes it easier to print all words in alphabetical order.
- Trie takes less space when compared to BST because the keys are not explicitly saved instead each key requires just an amortized fixed amount of space to be stored.
- Prefix search/Longest prefix matching can be efficiently done with the help of trie data structure.
- Since trie doesn’t need any hash function for its implementation so they are generally faster than hash tables for small keys like integers and pointers.
- Tries support ordered iteration whereas iteration in a hash table will result in pseudorandom order given by the hash function which is usually more cumbersome.
- Deletion is also a straightforward algorithm with O(l) as its time complexity, where l is the length of the word to be deleted.
Disadvantages of Trie data structure:
- The main disadvantage of the trie is that it takes a lot of memory to store all the strings. For each node, we have too many node pointers which are equal to the no of characters in the worst case.
- An efficiently constructed hash table(i.e. a good hash function and a reasonable load factor) has O(1) as lookup time which is way faster than O(l) in the case of a trie, where l is the length of the string.