主成分分析

数据过滤质控之后

PCA分析（gcta）

###make germ
nohup /home/software/gcta_1.92.3beta3/gcta64 --bfile ld.QC.75_noinclude0-502502-geno02-maf03 --make-grm --autosome-num 26 --out ld.QC.75_noinclude0-502502-geno02-maf03.gcta &

              # autosome表示只选出常染色体来运行

###pca
nohup /home/software/gcta_1.92.3beta3/gcta64 --grm ld.QC.75_noinclude0-502502-geno02-maf03.gcta --pca 5 --out ld.QC.75_noinclude0-502502-geno02-maf03.gcta.out &

               ########输入的文件就是.gcta,这个文件不是上一步的生成文件，pca后面跟的是要做几个主成分的比较

利用R绘图

####
##在excel中将pca.eigenvec文件的第一列改为品种ID即可
a=read.table("pca.eigenvec",header=F)

a$V1 <- factor(a$V1,levels=c("Angus","Holstein","Simmental","Shorthorn",
                             "Charolais","Mishima-Ushi","Hanwoo","JPBC","Kazakh",
                             "Mongolian","Yanbian","Wenling","Brahman"))  ## 修改图例顺序
Breed=b[,1]

p = ggplot(data  = a , 
           aes(x = a[,3],
               y = a[,4],   ############x = a[,3], y = a[,4]代表第3列和第4列比较也就是pc1与pc2比较）
               group = Breed,
               shape = Breed,
               stroke = 1.5))+
  geom_point(aes(color=Breed),size=3) +
  scale_shape_manual(values = seq(1,15)) + ##设置形状
  # scale_shape_manual(values = rep(12,times = 13)) + 
  guides(color=guide_legend(override.aes = list(size=4, stroke = 1.5))) ##改变图例大小

  p + labs(x = "PC1(12.44%)", y = "PC2(5.87%)") + ############修改x轴和y轴的坐标名称
    
  scale_color_manual(values=c("#66a61e", "#66a61e","#66a61e","#66a61e","#66a61e",
                                "#7470b3", "#e7288a","#e728e7","#e7288a","#e7288a","#e7288a",
                                "#b35107","#b35107")) +   
    theme_classic()+ # 去除灰色背景及网格线
    theme(panel.border = element_rect(fill=NA,color="black", size=0.5, linetype ="solid")) #添加边框
#####图就出来啦

主成分分析

推荐阅读更多精彩内容