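Problem description:
Design a data structure that supports the following two operations:
void addWord(word)
bool search(word)
search(word) can search for a literal word or a regular-expression string that contains only the letters a-z or the character '.'. A '.' can stand for any single letter.
Example: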
addWord("bad")
addWord("dad")
addWord("mad")
search("pad") -> false
search("bad") -> true
search(".ad") -> true
search("b..") -> true
Note:
You may assume that all words consist only of lowercase letters a-z.
Source: LeetCode (力扣)
Link: https://leetcode-cn.com/problems/add-and-search-word-data-structure-design
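To pin down what search is expected to return before any trie is involved, here is a minimal brute-force sketch. The class name BruteForceWordDictionary and its words_by_len attribute are hypothetical names chosen for illustration; this is only a baseline for checking the semantics of '.', not the approach this post develops.

```python
class BruteForceWordDictionary:
    """Hypothetical baseline: stores words grouped by length, compares char by char."""

    def __init__(self):
        # Maps word length -> set of stored words of that length.
        self.words_by_len = {}

    def addWord(self, word):
        self.words_by_len.setdefault(len(word), set()).add(word)

    def search(self, pattern):
        # Only words of the same length can match, since '.' stands for exactly one letter.
        candidates = self.words_by_len.get(len(pattern), ())
        return any(
            all(p == '.' or p == c for p, c in zip(pattern, word))
            for word in candidates
        )


d = BruteForceWordDictionary()
for w in ("bad", "dad", "mad"):
    d.addWord(w)
print(d.search("pad"), d.search("bad"), d.search(".ad"), d.search("b.."))
# False True True True
```

Each search here scans every stored word of the matching length, which is the cost the trie-based versions aim to reduce.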
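Approach:
Trie:
A trie, also called a prefix tree or word lookup tree, is a tree-shaped data structure, described in the source cited below as a variant of a hash tree. It is typically used to count, sort, and store large numbers of strings (though not only strings), so search engines often use it for word-frequency statistics over text. Its advantage is that it uses the common prefixes of strings to reduce lookup time and avoids unnecessary string comparisons as far as possible; according to the same source, this makes lookups more efficient than with a hash tree.
Note: this definition comes from Baidu Baike.
It has 3 basic properties:
- The root node contains no character; every node other than the root contains exactly one character.
- Concatenating the characters along the path from the root to a node gives the string that node represents.
- All children of a given node contain different characters.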
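The three properties above can be illustrated with a small sketch. It stores the trie as nested dicts, which is an illustrative representation and not the array-based nodes used in the solutions in this post; the names END, build_trie and list_words are hypothetical.

```python
END = "$"  # hypothetical end-of-word marker key


def build_trie(words):
    root = {}  # property 1: the root itself holds no character
    for word in words:
        node = root
        for ch in word:
            # property 3: dict keys are unique, so siblings never share a character
            node = node.setdefault(ch, {})
        node[END] = True  # mark that a complete word ends here
    return root


def list_words(node, prefix=""):
    """Property 2: joining the characters along a path gives the node's string."""
    words = []
    for key in sorted(node):
        if key == END:
            words.append(prefix)
        else:
            words.extend(list_words(node[key], prefix + key))
    return words


trie = build_trie(["bad", "dad", "mad", "ban"])
print(sorted(trie))      # ['b', 'd', 'm']
print(list_words(trie))  # ['bad', 'ban', 'dad', 'mad']
```

Note that "bad" and "ban" share the path b -> a and only branch at the last character, which is the prefix sharing the definition refers to.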
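Python version:
This version does not pass all test cases. The reason is that it does not handle the wildcard by trying each of the 26 branches in turn; to do that, the matching code in search should be changed to a recursive match.
The code for this first version follows; the corrected version comes after it: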
class WordDictionary(object):
    class TrieNode(object):
        def __init__(self):
            self.isword = False
            self.nodes = [None] * 26

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = WordDictionary.TrieNode()

    def addWord(self, word):
        """
        Adds a word into the data structure.
        :type word: str
        :rtype: None
        """
        p = self.root
        for s in word:
            if p.nodes[ord(s)-ord('a')] is None:
                p.nodes[ord(s)-ord('a')] = WordDictionary.TrieNode()
            p = p.nodes[ord(s)-ord('a')]
        p.isword = True

    def search(self, word):
        """
        Returns if the word is in the data structure. A word could contain the dot character '.' to represent any one letter.
        :type word: str
        :rtype: bool
        """
        p = self.root
        for s in word:
            if s == ".":
                # Flaw in this version: it reports a match at the first '.'
                # without checking the remaining characters.
                return True
            elif p.nodes[ord(s)-ord('a')] is None:
                return False
            p = p.nodes[ord(s)-ord('a')]
        return p.isword

# Your WordDictionary object will be instantiated and called as such:
# obj = WordDictionary()
# obj.addWord(word)
# param_2 = obj.search(word)
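To make the failure concrete, the sketch below reproduces the search logic of the first version as standalone functions and runs it on a few patterns. The names Node, add_word and naive_search are hypothetical and exist only for this demonstration.

```python
class Node:
    def __init__(self):
        self.isword = False
        self.nodes = [None] * 26


def add_word(root, word):
    p = root
    for ch in word:
        i = ord(ch) - ord('a')
        if p.nodes[i] is None:
            p.nodes[i] = Node()
        p = p.nodes[i]
    p.isword = True


def naive_search(root, word):
    """Same logic as the first version: stops at the first '.' and reports a match."""
    p = root
    for ch in word:
        if ch == '.':
            return True
        i = ord(ch) - ord('a')
        if p.nodes[i] is None:
            return False
        p = p.nodes[i]
    return p.isword


root = Node()
add_word(root, "bad")
print(naive_search(root, "b.x"))    # True, but no stored word matches "b.x"
print(naive_search(root, ".adzz"))  # True, although the pattern is longer than any stored word
print(naive_search(root, "bad"))    # True (plain words are still handled correctly)
```

Any pattern whose prefix before the first '.' exists in the trie is accepted, regardless of what comes after the '.'.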
Accepted Python version:
class WordDictionary(object):
    class TrieNode(object):
        def __init__(self):
            self.isword = False
            self.nodes = [None] * 26

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.root = WordDictionary.TrieNode()

    def addWord(self, word):
        """
        Adds a word into the data structure.
        :type word: str
        :rtype: None
        """
        p = self.root
        for s in word:
            if p.nodes[ord(s)-ord('a')] is None:
                p.nodes[ord(s)-ord('a')] = WordDictionary.TrieNode()
            p = p.nodes[ord(s)-ord('a')]
        p.isword = True

    def search(self, word):
        """
        Returns if the word is in the data structure. A word could contain the dot character '.' to represent any one letter.
        :type word: str
        :rtype: bool
        """
        return self.match(word, self.root, 0)

    def match(self, word, node, start):
        # All characters consumed: a match only if a word ends at this node.
        if start == len(word):
            return node.isword
        t = word[start]
        # Note: if the current character is ".", every branch must be tried.
        if t == '.':
            for i in range(26):
                if node.nodes[i] and self.match(word, node.nodes[i], start + 1):
                    return True
            return False
        else:
            if not node.nodes[ord(t)-ord('a')]:
                return False
            return self.match(word, node.nodes[ord(t) - ord('a')], start + 1)

# Your WordDictionary object will be instantiated and called as such:
# obj = WordDictionary()
# obj.addWord(word)
# param_2 = obj.search(word)
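As an alternative to recursion, the same wildcard handling can be sketched iteratively by keeping a list of all trie nodes that are still possible after each character. This is a different technique from the recursive match above, shown only for comparison; the class name IterativeWordDictionary and the variable frontier are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.isword = False
        self.nodes = [None] * 26


class IterativeWordDictionary:
    """Hypothetical variant: same trie, but search tracks a list of live nodes."""

    def __init__(self):
        self.root = TrieNode()

    def addWord(self, word):
        p = self.root
        for ch in word:
            i = ord(ch) - ord('a')
            if p.nodes[i] is None:
                p.nodes[i] = TrieNode()
            p = p.nodes[i]
        p.isword = True

    def search(self, word):
        frontier = [self.root]  # nodes reachable after the characters read so far
        for ch in word:
            next_frontier = []
            for node in frontier:
                if ch == '.':
                    # Wildcard: follow every existing child.
                    next_frontier.extend(c for c in node.nodes if c is not None)
                else:
                    child = node.nodes[ord(ch) - ord('a')]
                    if child is not None:
                        next_frontier.append(child)
            if not next_frontier:
                return False  # no path survives this character
            frontier = next_frontier
        return any(node.isword for node in frontier)


d = IterativeWordDictionary()
for w in ("bad", "dad", "mad"):
    d.addWord(w)
print(d.search("pad"), d.search("bad"), d.search(".ad"), d.search("b.."))
# False True True True
```

The recursive version can return as soon as one branch matches, while this sketch advances all candidate branches in lockstep; which is preferable depends on the data and is not something the original post measures.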
Appendix:
Python code for building a general trie:
class TrieNode(object):
    def __init__(self):
        # Whether the path to this node forms a complete word
        self.is_word = False
        self.children = [None] * 26


class Trie(object):
    def __init__(self):
        self.root = TrieNode()

    def add(self, s):
        """Add a string to this trie."""
        p = self.root
        for c in s:
            idx = ord(c) - ord('a')
            if p.children[idx] is None:
                p.children[idx] = TrieNode()
            p = p.children[idx]
        # Mark the node reached by the last character as the end of a word.
        p.is_word = True

    def search(self, s):
        """Judge whether s is in this trie."""
        p = self.root
        for c in s:
            p = p.children[ord(c) - ord('a')]
            if p is None:
                return False
        return p.is_word


if __name__ == '__main__':
    trie = Trie()
    trie.add('str')
    print(trie.search('acb'))