模拟产生QTL数据

通常,要产生的snp位点直接并不是独立的,他们之间是条件后的二项分布。
比如P(B|A)=(P(B)*P(A)+D)/P(A)
这样,我们产生模拟数据的时候就要注意位点之间的连锁关系:

for(h in 1:N)  #for each individual
{
  gMat = matrix(0, nrow=length(freq), ncol=2)  #initialize a matrix to store geno for 2 allele on each site
  for(i in 1:length(freq)) #for each site
  {
    for(j in 1:2)#for each allele
    {
      idx = ifelse(runif(1, 0, 1) < freq[i], 0, 1) #a random value between 0 and 1, if smaller than the freq the idx is 0
      if(i == 1) # for the first site
      {
        gMat[i,j] = idx #the value for this site is idx
      }
      else # for not first site
      {
        d = runif(1, 0, 1) #random values for following comparison as a threshold for P(A|B)
        a = gMat[i-1,j]
        f1 = ifelse(a == 0, freq[i-1], 1-freq[i-1]) #get the allele percentage for the previous site
        f2 = ifelse(a == 0, freq[i], 1-freq[i]) # get the allele percentage for present site
        gMat[i,j] = ifelse(d < (f1 * f2 +ld[i-1])/f1, gMat[i-1,j], 1-gMat[i-1,j]) 
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。