Description
You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.
Example
Input:
s = "barfoothefoobarman",
words = ["foo","bar"]
Output: [0,9]
Explanation: Substrings starting at index 0 and 9 are "barfoo" and "foobar" respectively.
The output order does not matter, returning [9,0] is fine too.
分析
采用搜索的方法, 具体做法如下
0 (预处理), 将 words 中中的所有单词用 26 进制转换为 数字
1. 先统计 words 中所有单词中的 字母 个数, 然后 iter 输入字符串 ss。 找到一个窗口,其中所有的字符个数和 words 中一致。
2. 窗口中的字符以 26 进制转换为数字, 和 words 中的数字表示进行比较。
code
class Solution(object):
def findSubstring(self, ss, words):
"""
:type s: str
:type words: List[str]
:rtype: List[int]
"""
N, M, A = len(ss), len(words), ord('a')
if N*M == 0:
return []
L = M * len(words[0])
def build(word):
s = 0
for i in word:
s = s * 26 + ord(i) - A
return s
def build2(str):
i = 0
lWordN = []
step = len(words[0])
for i in range(0, len(str), step):
lWordN.append(build(str[i:i+step]))
return sorted(lWordN)
wordsNum = sorted([build(v) for v in words])
if M*len(words[0]) > N:
return []
dp = [0 for i in range(26)]
for v in words:
for k in v:
dp[ord(k) - A] += 1
for i, v in enumerate(dp):
if v == 0:
dp[i] = -1
s, e, ret, flag = 0, 0, [], False
while N > e:
if dp[ord(ss[e])-A]> - 1:
dp[ord(ss[e])-A] -= 1
s, e = e, e+1
break
e += 1
if e == N:
if build2(ss[s:e+1]) == wordsNum:
ret.append(s)
return ret
while N > e:
chn = ord(ss[e]) - A
if dp[chn] == -1:
e += 1
flag = True
continue
if flag:
while e > s:
s_key = ord(ss[s]) - A
s += 1
if dp[s_key] > -1:
dp[s_key] += 1
flag = False
# can be more fast if we can just reset array dp
if dp[chn] == 0:
while e > s:
s_key = ord(ss[s]) - A
s += 1
if s_key == chn:
break
if dp[s_key] > -1:
dp[s_key] += 1
dp[chn] += 1
dp[chn] -= 1
if e - s +1 == L:
if build2(ss[s:e+1]) == wordsNum:
ret.append(s)
e += 1
return ret
Runtime: 936 ms, faster than 36.72% of Python online submissions for Substring with Concatenation of All Words.
速度上并不是很快,需要思考一波