Implement a trie with insert
, search
, and startsWith
methods.
Note:
You may assume that all inputs are consist of lowercase letters a-z
.
This article is for intermediate level users. It introduces the following ideas:\nThe data structure Trie (Prefix tree) and most common operations with it.
\nTrie (we pronounce "try") or prefix tree is a tree data structure, which is used for retrieval of a key in a dataset of strings.\nThere are various applications of this very efficient data structure such as :
\nFigure 1. Google Suggest in action.
\nFigure 2. A spell checker used in word processor.
\nFigure 3. Longest prefix matching algorithm uses Tries in Internet Protocol (IP) routing to select an entry from a forwarding table.
\nFigure 4. T9 which stands for Text on 9 keys, was used on phones to input texts during the late 1990s.
\nFigure 5. Tries is used to solve Boggle efficiently by pruning the search space.
\nThere are several other data structures, like balanced trees and hash tables, which give us the possibility to search for a word in a dataset of strings. Then why do we need trie?\nAlthough hash table has time complexity for looking for a key, it is not efficient in the following operations :
\nAnother reason why trie outperforms hash table, is that as hash table increases in size, there are lots of hash collisions and the search time complexity could deteriorate to , where is the number of keys inserted.\nTrie could use less space compared to Hash Table when storing many keys with the same prefix.\nIn this case using trie has only time complexity, where is the key length.\nSearching for a key in a balanced tree costs time complexity.
\nTrie is a rooted tree. Its nodes have the following fields:
\nFigure 6. Representation of a key "leet" in trie.
\nJava
\nclass TrieNode {\n\n // R links to node children\n private TrieNode[] links;\n\n private final int R = 26;\n\n private boolean isEnd;\n\n public TrieNode() {\n links = new TrieNode[R];\n }\n\n public boolean containsKey(char ch) {\n return links[ch -\'a\'] != null;\n }\n public TrieNode get(char ch) {\n return links[ch -\'a\'];\n }\n public void put(char ch, TrieNode node) {\n links[ch -\'a\'] = node;\n }\n public void setEnd() {\n isEnd = true;\n }\n public boolean isEnd() {\n return isEnd;\n }\n}\n
Two of the most common operations in a trie are insertion of a key and search for a key.
\nWe insert a key by searching into the trie. We start from the root and search a link, which corresponds to the first key character. There are two cases :
\nFigure 7. Insertion of keys into a trie.
\nJava
\nclass Trie {\n private TrieNode root;\n\n public Trie() {\n root = new TrieNode();\n }\n\n // Inserts a word into the trie.\n public void insert(String word) {\n TrieNode node = root;\n for (int i = 0; i < word.length(); i++) {\n char currentChar = word.charAt(i);\n if (!node.containsKey(currentChar)) {\n node.put(currentChar, new TrieNode());\n }\n node = node.get(currentChar);\n }\n node.setEnd();\n }\n}\n
Complexity Analysis
\nIn each iteration of the algorithm, we either examine or create a node in the trie till we reach the end of the key. This takes only operations.
\nIn the worst case newly inserted key doesn\'t share a prefix with the the keys already inserted in the trie. We have to add \nnew nodes, which takes us space.
\nEach key is represented in the trie as a path from the root to the internal node or leaf.\nWe start from the root with the first key character. We examine the current node for a link corresponding to the key character. There are two cases :
\nA link does not exist. If there are no available key characters and current node is marked as isEnd
we return true. Otherwise there are possible two cases in each of them we return false :
isEnd
. Therefore the search key is only a prefix of another key in the trie.Figure 8. Search for a key in a trie.
\nJava
\nclass Trie {\n ...\n\n // search a prefix or whole key in trie and\n // returns the node where search ends\n private TrieNode searchPrefix(String word) {\n TrieNode node = root;\n for (int i = 0; i < word.length(); i++) {\n char curLetter = word.charAt(i);\n if (node.containsKey(curLetter)) {\n node = node.get(curLetter);\n } else {\n return null;\n }\n }\n return node;\n }\n\n // Returns if the word is in the trie.\n public boolean search(String word) {\n TrieNode node = searchPrefix(word);\n return node != null && node.isEnd();\n }\n}\n
Complexity Analysis
\nTime complexity : \nIn each step of the algorithm we search for the next key character. In the worst case the algorithm performs operations.
\nSpace complexity : \n
\nThe approach is very similar to the one we used for searching a key in a trie. We traverse the trie from the root, till there are no characters left in key prefix or it is impossible to continue the path in the trie with the current key character. The only difference with the mentioned above search for a key
algorithm is that when we come to an end of the key prefix, we always return true. We don\'t need to consider the isEnd
mark of the current trie node, because we are searching for a prefix of a key, not for a whole key.
Figure 9. Search for a key prefix in a trie.
\nJava
\nclass Trie {\n ...\n\n // Returns if there is any word in the trie\n // that starts with the given prefix.\n public boolean startsWith(String prefix) {\n TrieNode node = searchPrefix(prefix);\n return node != null;\n }\n}\n
Complexity Analysis
\nTime complexity : \n
\nSpace complexity : \n
\nHere are some wonderful problems for you to practice which uses the Trie data structure.
\nAnalysis written by: @elmirap.
\n