After sorting with Picard, marking duplicates failed with the following error:
Exception in thread "main" htsjdk.samtools.SAMException: Value was put into PairInfoMap more than once. 1: null:E00511:470:H5C3GCCX2:1:2104:13991:26782
at htsjdk.samtools.CoordinateSortedPairInfoMap.ensureSequenceLoaded(CoordinateSortedPairInfoMap.java:132)
at htsjdk.samtools.CoordinateSortedPairInfoMap.remove(CoordinateSortedPairInfoMap.java:86)
at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.remove(DiskBasedReadEndsForMarkDuplicatesMap.java:61)
at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:285)
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:114)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:187)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:89)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:99)
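The exception names a specific read (E00511:470:H5C3GCCX2:1:2104:13991:26782). One way to investigate, offered here as a sketch and not something from the original notes, is to list every record with that name and compare the flags and mate fields. The helper `show_read` is hypothetical; it reads SAM text on stdin, so it would typically be fed by `samtools view`, which is assumed to be installed.

```shell
# show_read NAME: hypothetical helper, reads SAM text on stdin.
# Prints QNAME, FLAG, RNAME, POS, RNEXT, PNEXT (tab-separated) for every
# record whose read name equals NAME. Header lines (starting with @) are skipped.
show_read() {
  awk -F'\t' -v OFS='\t' -v name="$1" '
    /^@/ { next }
    $1 == name { print $1, $2, $3, $4, $7, $8 }
  '
}
```

Possible usage: `samtools view LDN-D2-1.sort.bam | show_read E00511:470:H5C3GCCX2:1:2104:13991:26782`. More than two primary records for the same name, or mate fields that disagree with where the mate actually sits, would be worth a closer look.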
Attempted fixes
1. java -jar /public/home/nieyg/biosoft/package/picard-tools-1.124/picard.jar FixMateInformation I=LDN-D2-1.sort.bam O=LDN-D2-1.fix.bam
FixMateInformation verifies mate-pair information between mates and fixes it if needed.
After running it, marking duplicates on the fixed BAM completed without errors.
2. See also the answers in https://gatkforums.broadinstitute.org/gatk/discussion/7431/markduplicates-error-value-was-put-into-pairinfomap-more-than-once
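The two steps of the first fix can be chained in a small wrapper. This is a minimal sketch: the function names `run` and `fix_then_markdup`, the `DRY_RUN` switch, and the output names `*.markdup.bam` and `*.markdup.metrics.txt` are assumptions for illustration. Only the FixMateInformation command comes from the notes above; the MarkDuplicates call uses the standard I=, O= and M= (metrics file) arguments.

```shell
# Path to picard.jar; defaults to the location used in the notes above.
PICARD_JAR="${PICARD_JAR:-/public/home/nieyg/biosoft/package/picard-tools-1.124/picard.jar}"

# run: execute a command, or only print it when DRY_RUN is set (hypothetical switch).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# fix_then_markdup SAMPLE: expects SAMPLE.sort.bam in the current directory.
# Output file names are assumptions; adjust them to your own layout.
fix_then_markdup() {
  sample="$1"
  # Step 1: repair mate-pair information.
  run java -jar "$PICARD_JAR" FixMateInformation \
      I="${sample}.sort.bam" O="${sample}.fix.bam" || return 1
  # Step 2: mark duplicates on the fixed BAM.
  run java -jar "$PICARD_JAR" MarkDuplicates \
      I="${sample}.fix.bam" O="${sample}.markdup.bam" M="${sample}.markdup.metrics.txt"
}
```

Possible usage: `fix_then_markdup LDN-D2-1`, or `DRY_RUN=1 fix_then_markdup LDN-D2-1` to print the two commands without running them.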