Imaginings of a Livestock Geneticist

Algorithm for proven and young animals (APY)

As outlined in the previous section (i.e. Recursive Method to Create $G^{-1}$), computational issues started to appear as the number of genotyped animals continued to increase. For example, generating and storing the inverse of a genomic relationship matrix for just 100,000 animals is computationally very demanding. In some genetic lines and breed associations the current number of genotyped animals greatly exceeds this number (e.g. United States Holstein, in the millions). As a result, a method referred to as the "Algorithm for proven and young animals" (APY) was developed by Misztal et al. (2014). The algorithm splits the genotyped animals into two categories, referred to as "proven" and "young". Only the genomic relationship matrix among the "proven" animals is inverted directly, which greatly reduces the computational load. As a result, the APY-based genomic inverse is sparse: the nonzero elements form an L shape made up of the rows and columns of the proven animals, and within the young-by-young block only the diagonal elements are nonzero. The $G^{-1}$ matrix based on the APY algorithm is outlined below; subscripts with a p refer to proven animals and subscripts with a y refer to young animals.
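To put the 100,000-animal figure in perspective, the short R calculation below counts the entries in a dense inverse and compares them with the L-shaped APY inverse. The split of 10,000 proven animals is a hypothetical value chosen only for illustration, and the calculation assumes 8-byte double-precision storage.

```r
# Storage for a dense inverse versus the L-shaped APY inverse.
n_animals <- 1e5                       # number of genotyped animals (from the text)
dense_entries <- n_animals^2           # every element is stored
dense_gib <- dense_entries * 8 / 1024^3   # 8 bytes per double; about 74.5 GiB

n_proven <- 1e4                        # hypothetical number of proven animals
n_young <- n_animals - n_proven
# Proven-by-proven block, the two proven-by-young blocks,
# and only the diagonal of the young-by-young block.
apy_entries <- n_proven^2 + 2 * n_proven * n_young + n_young
apy_gib <- apy_entries * 8 / 1024^3

cat("Dense inverse:", round(dense_gib, 1), "GiB\n")
cat("APY inverse:  ", round(apy_gib, 1), "GiB\n")
```

The saving depends entirely on how many animals are placed in the proven group, since the proven-by-proven and proven-by-young blocks remain dense.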

$$ G^{-1} = \begin{bmatrix} G_{pp}^{-1} + G_{pp}^{-1}G_{py}M_{g}^{-1}G_{yp}G_{pp}^{-1} & -G_{pp}^{-1}G_{py}M_{g}^{-1} \\ -M_{g}^{-1}G_{yp}G_{pp}^{-1} & M_{g}^{-1} \end{bmatrix} $$
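The matrix $M_{g}$ is not defined in the text above. As the algorithm is usually stated, it is a diagonal matrix with one element per young animal; the element notation $m_{g,i}$, $g_{ii}$, $g_{ip}$ and $g_{pi}$ is introduced here for illustration:

$$ M_{g} = \operatorname{diag}(m_{g,i}), \qquad m_{g,i} = g_{ii} - g_{ip} G_{pp}^{-1} g_{pi} $$

Here $g_{ii}$ is the diagonal element of G for young animal i, $g_{ip}$ is the row of $G_{yp}$ holding its relationships with the proven animals, and $g_{pi}$ is the transpose of that row. Because $M_{g}$ is diagonal, $M_{g}^{-1}$ is obtained by taking the reciprocal of each element.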

Outlined below is the function that generates $G^{-1}$ using the APY algorithm. The function takes as input a genomic relationship matrix (described in the section on how to generate different G matrices) along with a vector that declares which group (i.e. proven or young) each animal belongs to. The function first sorts the animals within G so that all proven animals come before the young animals. After ordering the animals, the inverse of the proven-animal block ($G_{pp}^{-1}$) is constructed as described in the previous section, Recursive Method to Create G Inverse. Once $G_{pp}^{-1}$ is generated, each block of the matrix outlined above is created.
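The steps described above can be sketched in base R as follows. This is a minimal illustration, not the original function: the name `apy_inverse`, its arguments, and the toy matrix are assumptions, and $G_{pp}^{-1}$ is obtained with `solve()` rather than with the recursive method from the previous section.

```r
# Minimal APY inverse sketch (illustrative names; not the original function).
# G:     symmetric genomic relationship matrix
# group: character vector, "proven" or "young", one entry per row of G
apy_inverse <- function(G, group) {
  stopifnot(is.matrix(G), nrow(G) == ncol(G), length(group) == nrow(G))
  stopifnot(all(group %in% c("proven", "young")))
  np <- sum(group == "proven")
  ny <- sum(group == "young")
  stopifnot(np >= 1, ny >= 1)

  # Sort so that all proven animals come before the young animals.
  ord <- c(which(group == "proven"), which(group == "young"))
  G <- G[ord, ord, drop = FALSE]
  p <- seq_len(np)
  y <- np + seq_len(ny)

  # Direct inverse of the proven block (assumption: solve() instead of recursion).
  Gpp_inv <- solve(G[p, p, drop = FALSE])
  Gpy <- G[p, y, drop = FALSE]
  A <- Gpp_inv %*% Gpy                      # Gpp^{-1} Gpy

  # Diagonal of M_g: g_ii - g_ip Gpp^{-1} g_pi for each young animal i.
  m <- diag(G)[y] - colSums(Gpy * A)
  A_Minv <- sweep(A, 2, m, "/")             # Gpp^{-1} Gpy M_g^{-1}

  # Assemble the four blocks; rows and columns are in the sorted order.
  Ginv <- matrix(0, nrow(G), ncol(G), dimnames = dimnames(G))
  Ginv[p, p] <- Gpp_inv + A_Minv %*% t(A)
  Ginv[p, y] <- -A_Minv
  Ginv[y, p] <- -t(A_Minv)
  Ginv[y, y] <- diag(1 / m, nrow = ny)
  Ginv
}

# Toy example with made-up values (not the Misztal et al. data).
G_toy <- matrix(c(1.0, 0.3, 0.2, 0.1,
                  0.3, 1.1, 0.4, 0.2,
                  0.2, 0.4, 1.0, 0.3,
                  0.1, 0.2, 0.3, 1.0), nrow = 4, byrow = TRUE)
grp_toy <- c("proven", "young", "proven", "young")
Ginv_toy <- apy_inverse(G_toy, grp_toy)
```

The returned matrix follows the sorted order (proven animals first), so in the toy example its rows correspond to animals 1, 3, 2 and 4. With a single young animal the result coincides with the ordinary inverse of G; with more than one, the off-diagonal elements between young animals are set to zero, which is where the approximation comes from.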

The following genomic relationship file from Misztal et al. (2014) can be used with the R code above. Outlined below is what the original $G^{-1}$ and the $G^{-1}$ built with APY look like. Both matrices are symmetric, so only the upper triangle is shown.

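$G^{-1}$ Original

|   | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 12.2393 | 14.7386 | 1.7054 | -2.1235 | -12.2360 | -12.9130 | 2.1161 |
| 2 | - | 23.2271 | 2.2710 | -4.8817 | -17.4429 | -19.8912 | 3.9989 |
| 3 | - | - | 1.1919 | -0.2002 | -1.8185 | -1.8360 | 0.4263 |
| 4 | - | - | - | 3.2018 | 3.9332 | 4.2116 | -0.5308 |
| 5 | - | - | - | - | 14.7872 | 15.5668 | -2.7443 |
| 6 | - | - | - | - | - | 18.3944 | -3.2281 |
| 7 | - | - | - | - | - | - | 1.7870 |

$G^{-1}$ APY

|   | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 9.7518 | 9.9407 | 1.1878 | -1.5205 | -8.9847 | -9.0904 | 0.1500 |
| 2 | - | 14.4901 | 1.3603 | -3.6071 | -11.3069 | -12.6675 | 0.5082 |
| 3 | - | - | 1.0991 | -0.0558 | -1.1649 | -1.0658 | 0.1041 |
| 4 | - | - | - | 3.0797 | 3.1159 | 3.2528 | 0.2083 |
| 5 | - | - | - | - | 10.5728 | 10.6095 | -0.0125 |
| 6 | - | - | - | - | - | 12.5633 | 0.0000 |
| 7 | - | - | - | - | - | - | 1.2205 |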

References