30. Substring with Concatenation of All Words

LeetCode/Top Interview 150

30. Substring with Concatenation of All Words

hyunkookim 2024. 11. 29. 17:57

https://youtu.be/taYRJf-M25I?si=rzSfYWmdG3Vfgum9

from collections import defaultdict
from typing import List

class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        if not s or not words:  # 예외 처리: 빈 입력
            return []

        wordLen = len(words[0])  # 모든 단어의 길이
        totalLen = wordLen * len(words)  # 총 단어 길이 합

        # words 배열의 단어 빈도수를 defaultdict로 계산
        wordCount = defaultdict(int)
        for word in words:
            wordCount[word] += 1

        res = []  # 결과를 저장할 리스트

        # s의 각 위치를 시작점으로 검사
        for i in range(len(s) - totalLen + 1):
            # 현재 윈도우에서 본 단어 빈도를 저장할 defaultdict
            seen = defaultdict(int)  
            for j in range(0, totalLen, wordLen):
                word = s[i + j:i + j + wordLen]  # 현재 단어 추출
                if word in wordCount:  # words에 포함된 단어인지 확인
                    seen[word] += 1
                    # 현재 단어의 빈도가 wordCount를 초과하면 중단
                    if seen[word] > wordCount[word]:
                        break
                else:
                    break  # 단어가 words에 포함되지 않으면 중단
            else:
                # 모든 단어가 조건을 만족하면 시작 인덱스를 결과에 추가
                res.append(i)

        return res

이건 속도도 빠른 솔루션

from collections import Counter, defaultdict
from typing import List

class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        # 입력 문자열 s와 단어 리스트 words 중 하나라도 비어있으면 결과는 빈 리스트
        if not s or not words:
            return []

        word_length = len(words[0])  # 모든 단어의 길이는 동일
        num_words = len(words)  # words 리스트에 있는 단어의 개수
        total_length = word_length * num_words  # words의 모든 단어를 연결한 문자열의 총 길이
        word_count = Counter(words)  # words 리스트에서 각 단어의 빈도수 저장
        result_indices = []  # 결과로 반환할 인덱스 리스트

        # 시작 위치를 `word_length`로 나눈 각각의 경우에 대해 탐색
        for i in range(word_length):
            left = i  # 윈도우의 왼쪽 포인터
            right = i  # 윈도우의 오른쪽 포인터
            seen_words = defaultdict(int)  # 현재 윈도우에서 본 단어의 빈도 저장
            count = 0  # 현재 윈도우에 포함된 단어의 개수

            # 오른쪽 포인터가 문자열 s를 벗어나지 않는 동안 반복
            while right + word_length <= len(s):
                word = s[right:right + word_length]  # 현재 윈도우의 단어 추출
                right += word_length  # 오른쪽 포인터를 단어 길이만큼 이동

                if word in word_count:  # 단어가 words에 포함되어 있다면
                    seen_words[word] += 1  # 해당 단어의 빈도를 증가
                    count += 1  # 현재 윈도우에 포함된 단어 개수 증가

                    # 현재 단어의 빈도가 words에 있는 단어의 빈도를 초과한 경우
                    while seen_words[word] > word_count[word]:
                        left_word = s[left:left + word_length]  # 왼쪽 끝 단어 추출
                        seen_words[left_word] -= 1  # 왼쪽 끝 단어의 빈도를 감소
                        count -= 1  # 현재 윈도우에 포함된 단어 개수 감소
                        left += word_length  # 왼쪽 포인터를 단어 길이만큼 이동

                    # 현재 윈도우에 모든 단어가 정확히 포함되었을 경우
                    if count == num_words:
                        result_indices.append(left)  # 왼쪽 포인터 위치를 결과 리스트에 추가
                else:
                    # 단어가 words에 포함되지 않는 경우, 윈도우를 초기화
                    seen_words.clear()  # 현재 본 단어 빈도를 초기화
                    count = 0  # 단어 개수 초기화
                    left = right  # 왼쪽 포인터를 오른쪽 포인터 위치로 이동

        return result_indices

코드 설명

입력 확인:
- 문자열 s나 단어 리스트 words가 비어있다면 결과는 빈 리스트를 반환.
변수 초기화:
- word_length: 단어의 길이(모든 단어의 길이가 동일하므로 words[0] 사용).
- num_words: words 리스트에 있는 단어의 총 개수.
- total_length: words 리스트의 모든 단어를 연결한 문자열의 길이.
- word_count: words 리스트의 각 단어와 빈도를 저장한 Counter 객체.
윈도우 탐색:
- 시작 위치를 word_length로 나눈 각 경우에 대해 탐색.
- 예: word_length = 3이면, 시작 위치는 0, 1, 2가 될 수 있음.
슬라이딩 윈도우 탐색:
- 오른쪽 포인터(right)를 word_length만큼 이동하며 문자열을 탐색.
- word = s[right:right + word_length]로 현재 단어를 추출.
- 단어가 words에 포함된 경우:
  - seen_words[word] 증가, count 증가.
  - 현재 단어 빈도가 word_count[word]를 초과하면, 왼쪽 포인터(left)를 이동하며 빈도를 조정.
- 단어가 words에 포함되지 않는 경우:
  - seen_words와 count를 초기화하고, 왼쪽 포인터를 오른쪽 포인터 위치로 이동.
결과 저장:
- count == num_words이면 현재 윈도우가 모든 단어를 정확히 포함하므로, 왼쪽 포인터 위치(left)를 결과 리스트에 추가.
최종 결과 반환:
- 모든 탐색이 끝난 후, result_indices를 반환.

시간 복잡도

윈도우 탐색: O(n) (n은 문자열 s의 길이)
단어 검증: O(k) (k는 words의 단어 개수)

총 시간 복잡도:
O(n * k)

공간 복잡도

O(k): word_count와 seen_words에 저장된 단어 빈도.

주요 포인트

단어 길이만큼 나눠서 시작점을 탐색하므로 효율적.
defaultdict로 빈도를 관리해 코드가 간결.

'LeetCode > Top Interview 150' 카테고리의 다른 글

54. Spiral Matrix (0)	2024.11.30
76. Minimum Window Substring (0)	2024.11.30
15. 3Sum ★ (0)	2024.11.29
68. Text Justification (0)	2024.11.28
28. Find the Index of the First Occurrence in a String (0)	2024.11.28

현재글30. Substring with Concatenation of All Words

hyunkookim 님의 블로그

hyunkookim 님의 블로그 입니다.

Coding Test, tree, dp, 티스토리챌린지, top150, 누적합, neetcode150, leetcode75, Two pointers, dfs, LeetCode, heap, 오블완, array / string, BFS, 우선순위큐, 슬라이딩윈도우, Backtracking, BST, neetcode,

Today :
Yesterday :

일	월	화	수	목	금	토
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

hyunkookim 님의 블로그

30. Substring with Concatenation of All Words

코드 설명

시간 복잡도

공간 복잡도

주요 포인트

'LeetCode > Top Interview 150' 카테고리의 다른 글

'LeetCode/Top Interview 150'의 다른글

티스토리툴바

30. Substring with Concatenation of All Words

코드 설명

시간 복잡도

공간 복잡도

주요 포인트

'LeetCode > Top Interview 150' 카테고리의 다른 글

'LeetCode/Top Interview 150'의 다른글

관련글

티스토리툴바