如何利用SNP信息计算亲缘关系G矩阵

    xiaoxiao2021-03-25  219

    生成SNP文件信息

    # create marker data for 9 SNPs and 10 homozygous individuals snp9 <- matrix(c( "AA", "AA", "AA", "BB", "AA", "AA", "AA", "AA", NA, "AA", "AA", "BB", "BB", "AA", "AA", "BB", "AA", NA, "AA", "AA", "AB", "BB", "AB", "AA", "AA", "BB", NA, "AA", "AA", "BB", "BB", "AA", "AA", "AA", "AA", NA, "AA", "AA", "BB", "AB", "AA", "BB", "BB", "BB", "AB", "AA", "AA", "BB", "BB", "AA", NA, "BB", "AA", NA, "AB", "AA", "BB", "BB", "BB", "AA", "BB", "BB", NA, "AA", "AA", NA, "BB", NA, "AA", "AA", "AA", "AA", "AA", NA, NA, "BB", "BB", "BB", "BB", "BB", "AA", "AA", NA, "AA", "BB", "BB", "BB", "AA", "AA", NA), ncol=9,byrow=TRUE) # set names for markers and individuals colnames(snp9) <- paste("SNP",1:9,sep="") rownames(snp9) <- paste("ID",1:10+100,sep="") str(snp9) chr [1:10, 1:9] "AA" "AA" "AA" "AA" "AA" "AA" "AB" "AA" ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:10] "ID101" "ID102" "ID103" "ID104" ... ..$ : chr [1:9] "SNP1" "SNP2" "SNP3" "SNP4" ...

    将SNP原始文件,转化为0,1,2的格式

    利用最小等位基因频率,将major为0,杂合为1,minor为2。 将缺失值随机补全。

    library(synbreed) gp <- create.gpData(geno=snp9) gp.coded <- codeGeno(gp,impute=TRUE,impute.type="random") geno <- gp.coded$geno geno[1:5,1:5] Summary of imputation total number of missing values : 13 number of random imputations : 13 SNP1SNP2SNP3SNP4SNP5ID10100200ID10200000ID10300101ID10400000ID10500010

    1,利用软件包计算亲缘关系矩阵

    Gmatrix <-kin(gp.coded, ret="realized") Gmatrix[1:5,1:5] ID101ID102ID103ID104ID105ID101 1.66197183 0.04225352 0.25352113 0.74647887-1.22535211ID102 0.04225352 1.23943662-0.66197183 0.53521127-0.02816901ID103 0.25352113-0.66197183 1.30985915 0.04225352-0.16901408ID104 0.74647887 0.53521127 0.04225352 1.23943662-0.73239437ID105-1.22535211-0.02816901-0.16901408-0.73239437 2.22535211

    2,利用编程手动计算亲缘关系矩阵

    M1=geno M= M1[,1:ncol(M1)]-1 p1=round((apply(M,2,sum)+nrow(M))/(nrow(M)*2),3) p=2*(p1-.5) P = matrix(p,byrow=T,nrow=nrow(M),ncol=ncol(M)) Z = as.matrix(M-P) b=1-p1 c=p1*b d=2*(sum(c)) ZZt = Z %*% t(Z) G = (ZZt/d) G[1:5,1:5] ID101ID102ID103ID104ID105ID101 1.66197183 0.04225352 0.25352113 0.74647887-1.22535211ID102 0.04225352 1.23943662-0.66197183 0.53521127-0.02816901ID103 0.25352113-0.66197183 1.30985915 0.04225352-0.16901408ID104 0.74647887 0.53521127 0.04225352 1.23943662-0.73239437ID105-1.22535211-0.02816901-0.16901408-0.73239437 2.22535211

    由此可以看出来,两者结果是一致的

    转载请注明原文地址: https://ju.6miu.com/read-290.html

    最新回复(0)