滑动窗口2:串联所有单词的子串

串联所有单词的子串

给定一个字符串 s 和一些长度相同的单词 words。找出 s 中恰好可以由 words 中所有单词串联形成的子串的起始位置。

注意子串要与 words 中的单词完全匹配，中间不能有其他字符，但不需要考虑 words 中单词串联的顺序。

 

示例 1：

输入：
  s = "barfoothefoobarman",
  words = ["foo","bar"]
输出：[0,9]
解释：
从索引 0 和 9 开始的子串分别是 "barfoo" 和 "foobar" 。
输出的顺序不重要, [9,0] 也是有效答案。
示例 2：

输入：
  s = "wordgoodgoodgoodbestword",
  words = ["word","good","best","word"]
输出：[]

来源：力扣（LeetCode）
链接：https://leetcode-cn.com/problems/substring-with-concatenation-of-all-words
著作权归领扣网络所有。商业转载请联系官方授权，非商业转载请注明出处。

解答

# -*- coding:utf-8 -*-

class Solution(object):
    def findSubstring(self, s, words):
        """
        :type s: str
        :type words: List[str]
        :rtype: List[int]
        """
        if not s or not words:
            return []
        from collections import Counter
        word_num = len(words)
        word_len = len(words[0])  # 题目中单词长度都相同
        window_len = word_num * word_len  # 滑动窗口的长度
        window_left = 0  # 滑动窗口的左边索引
        result_start_index_list = []  # 输出结果：所有符合条件的滑动窗口的左边索引
        target_counter = Counter(words)  # 因为可以乱序，因此通过统计单词出现次数是否相同来比较。
        while window_left + window_len <= len(s):
            cur_word_list = []
            i = window_left
            # 将窗口内按照单词长度切割字符串，添加到cur_word_list中
            flag = True
            for j in range(word_num):
                cur_word = s[i:i+word_len]
                # 优化: 当某个单词不在words无需再判断后面
                if cur_word not in words: 
                    flag = False
                    break
                    
                cur_word_list.append(cur_word)
                i += word_len
                
            # 比较结果
            if flag and Counter(cur_word_list) == target_counter:
                result_start_index_list.append(window_left)

            # 滑动窗口往左移动一位
            window_left += 1

        return result_start_index_list

if __name__ == '__main__':
    s = "barfoothefoobarman"
    words = ["foo", "bar"]

    obj = Solution()
    result = obj.findSubstring(s, words)
    print('result:', result)

思路

该题可以使用滑动窗口求解。窗口长度为words的总长度，窗口从左到右移动一位，按照单词长度将窗口分割成单词，从而比较结果。

步骤

从左向右每个字符每个字符遍历字符串s，按照单词长度切割，最后和哈希（words）进行比较
每遍历一次，记录相应复合条件的位置
输出位置集合

注意点

因为可以乱序拼接，因此这里使用collections的Counter来统计单词出现的次数, 来判断是否满足条件。
优化: 当某个单词不在words中则无需再判断

滑动窗口2:串联所有单词的子串

串联所有单词的子串

解答

友情链接更多精彩内容