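Preface
Yes, I'm going to keep complaining:
--keep is not as simple as the command below makes it look!
After the file preparation in the previous two posts, we finally have the intermediate “sampleID.txt” file!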
$ /Users/seedson/Downloads/plink_mac_20190617/plink --file /Volumes/Seagate\ Backup\ Plus\ Drive/70389_LungSomke/CGEMS/GENEVA_LungCancer/phs000093v2/p2/genotype/phg000206v1/phg000206.v1.GENEVA_LungCancer.genotype-imputed-data.c1.CADM/chr2 --keep /Users/seedson/Desktop/file/SCfam.txt --make-bed --out sclschr220191101
In plain language:
# Tell the computer where plink is:
$ /Users/seedson/Downloads/plink_mac_20190617/plink
# Tell the computer where the chr2 files to match against are:
--file /Volumes/Seagate\ Backup\ Plus\ Drive/70389_LungSomke/CGEMS/GENEVA_LungCancer/phs000093v2/p2/genotype/phg000206v1/phg000206.v1.GENEVA_LungCancer.genotype-imputed-data.c1.CADM/chr2
# Keep only the samples we need, as listed in the “phenotype data” file. Yes, the very file that took two posts to prepare!
--keep /Users/seedson/Desktop/file/SCfam.txt
# Write the output as binary files (.bed/.bim/.fam)
--make-bed
# Set the output file name (prefix)
--out sclschr220191101
With that, we have the “subgroup” data ready for analysis.
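Before moving on, a quick sanity check does not hurt. In PLINK 1.9, --keep reads a text file with the family ID in the first column and the within-family ID in the second, and --make-bed writes a .fam file with one line per retained sample. The sketch below is only my own rough check, not part of PLINK: the function names count_short_lines and count_lines and the variables KEEP and FAM are made up for illustration, and the default file names simply follow the command above.

```shell
# Sanity checks around --keep (a sketch; adjust the file names to your own).
KEEP="${KEEP:-SCfam.txt}"             # list passed to --keep
FAM="${FAM:-sclschr220191101.fam}"    # .fam written by --make-bed for the --out prefix above

# Number of lines with fewer than two columns (blank lines count too).
# --keep needs both a family ID and a within-family ID on each line.
count_short_lines() {
  awk 'NF < 2 {n++} END {print n+0}' "$1"
}

# Number of lines in a file; a .fam file has one line per sample.
count_lines() {
  awk 'END {print NR}' "$1"
}

if [ -f "$KEEP" ] && [ -f "$FAM" ]; then
  echo "keep-file lines with fewer than 2 columns: $(count_short_lines "$KEEP")"
  echo "samples listed in keep file: $(count_lines "$KEEP")"
  echo "samples in output .fam:      $(count_lines "$FAM")"
else
  echo "skipping checks: $KEEP or $FAM not found"
fi
```

If the output .fam has fewer samples than the keep file lists, some IDs in the list probably did not match any sample in the genotype data, so the ID columns are the first thing to compare.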
After that, it is the same thing for every chromosome: change the names and extract each one (repetitive work~)
Chromosome 3:
/Users/seedson/Downloads/plink_mac_20190617/plink --file /Volumes/Seagate\ Backup\ Plus\ Drive/70389_LungSomke/CGEMS/GENEVA_LungCancer/phs000093v2/p2/genotype/phg000206v1/phg000206.v1.GENEVA_LungCancer.genotype-imputed-data.c1.CADM/chr3 --keep /Users/seedson/Desktop/file/SCfam.txt --make-bed --out sclcch3
Chromosome 5:
/Users/seedson/Downloads/plink_mac_20190617/plink --file /Volumes/Seagate\ Backup\ Plus\ Drive/70389_LungSomke/CGEMS/GENEVA_LungCancer/phs000093v2/p2/genotype/phg000206v1/phg000206.v1.GENEVA_LungCancer.genotype-imputed-data.c1.CADM/chr5 --keep /Users/seedson/Desktop/file/SCfam.txt --make-bed --out sclcch5
And so on...
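The repetition can also be scripted. Here is a minimal sketch of a loop, assuming the per-chromosome file sets all sit in one directory and are named chr2, chr3 and so on, as in the commands above. PLINK, GENO_DIR, KEEP, OUT_PREFIX and DRY_RUN are variable names I made up for this sketch, not anything PLINK defines, and /path/to/genotype-dir is a placeholder for the real directory. The output prefix follows the sclcch3 / sclcch5 naming used above.

```shell
# Run the same --keep extraction for several chromosomes (a sketch).
PLINK="${PLINK:-plink}"                         # path to the plink executable
GENO_DIR="${GENO_DIR:-/path/to/genotype-dir}"   # placeholder: directory holding chr2, chr3, ...
KEEP="${KEEP:-SCfam.txt}"                       # sample list passed to --keep
OUT_PREFIX="${OUT_PREFIX:-sclcch}"              # output becomes sclcch3, sclcch5, ...
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands, 0 = run them

extract_chr() {
  chr="$1"
  if [ "$DRY_RUN" = "1" ]; then
    # Show the command that would be run for this chromosome.
    echo "$PLINK --file $GENO_DIR/chr$chr --keep $KEEP --make-bed --out $OUT_PREFIX$chr"
  else
    # Quoting keeps paths with spaces (such as an external drive name) intact.
    "$PLINK" --file "$GENO_DIR/chr$chr" --keep "$KEEP" --make-bed --out "$OUT_PREFIX$chr"
  fi
}

# List whichever chromosomes you need here.
for chr in 2 3 5; do
  extract_chr "$chr"
done
```

By default this only prints the commands, so you can check the paths first; set DRY_RUN=0 to actually run them.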
Afterword
--keep really is too damn simple!
If you enjoyed this, give it a like and leave a tip~