[Leetcode] 187. Repeated DNA Sequences

  1. Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",Return:["AAAAACCCCC", "CCCCCAAAAA"].

Subscribe to see which companies asked this question

1)直接在哈希表中存入10个字符恶毒字符串。
RE?
超过限额内存?

2)用两个Bit保存一个核苷酸,20Bit保存十个字符,用一个int即可。

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        Map<Character,Integer> charMap = new HashMap<>();
        charMap.put('A',0);
        charMap.put('C',1);
        charMap.put('G',2);
        charMap.put('T',3);
        Map<Integer,Integer> seqMap = new HashMap<>();
        List<String> result = new ArrayList<String>();
        int sum = 0;  
        for(int i=0; i<s.length(); i++) {  
            sum = ((sum << 2) + charMap.get(s.charAt(i))) ;  
            sum = sum & 0xFFFFF;
            if(i < 9) continue;  
            Integer v = seqMap.get(sum);
            if(v!=null && v==1){
                result.add(s.substring(i-9,i+1));
            }
            seqMap.put(sum, v!=null?v+1:1);
        }  
        return result;
    }
}
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容