Motivation

Review the sliding window technique.

Problem

The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.

  • For example, "ACGAATTCCG" is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

 

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]

Example 2:

Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]

 

Constraints:

  • 1 <= s.length <= 10^5
  • s[i] is either 'A', 'C', 'G', or 'T'.

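Solution

Count the occurrences of every 10-letter substring with a hash map, then return the substrings that were seen more than once.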

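from collections import defaultdict
from typing import List


class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        # Map each 10-letter window to the number of times it occurs.
        counts = defaultdict(int)
        # Slide a fixed-size window over s; the range is empty when len(s) < 10.
        for start in range(len(s) - 9):
            counts[s[start:start + 10]] += 1
        # Keep only the windows that were seen more than once.
        return [seq for seq, n in counts.items() if n > 1]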
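
Since the point of this note is to review the sliding window, here is a possible variant that keeps a rolling integer encoding of the window instead of slicing a new string at every step. It is only a sketch under a few assumptions: the function name find_repeated_dna_rolling, the parameter k, and the 2-bit mapping A=0, C=1, G=2, T=3 are my own choices for illustration and are not part of the problem statement.

```python
from typing import List


def find_repeated_dna_rolling(s: str, k: int = 10) -> List[str]:
    """Return every length-k substring of s that occurs more than once."""
    if len(s) <= k:
        return []  # at most one window, so nothing can repeat
    # Assumed encoding: 2 bits per nucleotide.
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2*k bits
    seen = set()      # encoded windows seen at least once
    reported = set()  # encoded windows already added to the result
    result = []
    window = 0
    for i, ch in enumerate(s):
        # Shift in the new nucleotide; the mask drops the one that left.
        window = ((window << 2) | code[ch]) & mask
        if i < k - 1:
            continue  # the first full window ends at index k - 1
        if window in seen:
            if window not in reported:
                reported.add(window)
                result.append(s[i - k + 1:i + 1])
        else:
            seen.add(window)
    return result
```

Each step updates the window in constant time, and the sets store small integers (20 bits when k = 10) rather than 10-character strings. The counting solution above is simpler and should be fast enough for the stated constraints, so this variant is mainly useful as sliding window practice.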