As outlined in a previous section, Recursive Method to Create G-1, computational issues arise as the number of genotyped animals increases. One way to address this is the algorithm for proven and young animals (APY), also outlined in a previous section. Using this algorithm requires two decisions: how many animals to include in the proven group, and whether a given animal is placed in the proven or the young group. A large amount of research has been generated on this topic, and helpful articles are listed in the references section. This section focuses mainly on how to derive an appropriate number of proven animals without any loss in accuracy of the associated breeding values. In general, it has been found that the proven animals can be chosen at random, as long as the proven set contains enough animals to capture the majority of the variance explained by the full genomic relationship matrix.
To determine the number of animals to include, a singular value decomposition is performed on the matrix of allele content centered for allele frequencies, referred to as Z in this section. The Z matrix enters the VanRaden (2008) type genomic relationship matrix as follows: $$ G = \frac{ZZ'}{2\sum_j p_j(1-p_j)}, $$ where p_j refers to the allele frequency for SNP j. An alternative way to derive the number is an eigenvalue decomposition of G, which is equivalent to the singular value decomposition of Z because the eigenvalues of G are the squared singular values of Z divided by the same scaling factor. Once the decomposition is finished, the number of proven animals is chosen as the number of largest eigenvalues needed to explain a certain percent of the variance in G.
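The procedure described above can be sketched in Python with NumPy. This is a minimal illustration and not the implementation of any particular software. The function names (`centered_allele_content`, `n_for_variance`), the simulated genotypes, and the 98% threshold are all assumptions made for the example. Allele frequencies are taken from the simulated data itself.

```python
import numpy as np


def centered_allele_content(M):
    """Center 0/1/2 allele counts by twice the observed allele frequency.

    M is an (animals x SNPs) array; returns Z and the frequencies p.
    """
    p = M.mean(axis=0) / 2.0
    return M - 2.0 * p, p


def n_for_variance(Z, p, pct=0.98):
    """Number of largest eigenvalues of G needed to explain `pct` of its variance.

    Uses the singular values of Z, so G itself is never formed.
    """
    scale = 2.0 * np.sum(p * (1.0 - p))
    s = np.linalg.svd(Z, compute_uv=False)      # singular values, descending
    eigvals = s ** 2 / scale                     # eigenvalues of G = ZZ' / scale
    cum = np.cumsum(eigvals) / eigvals.sum()     # cumulative proportion of variance
    n_keep = int(np.searchsorted(cum, pct)) + 1  # first position where cum >= pct
    return min(n_keep, eigvals.size), eigvals


# Hypothetical data: 200 animals, 1000 SNPs (sizes chosen only to run quickly).
rng = np.random.default_rng(1)
n_animals, n_snps = 200, 1000
true_p = rng.uniform(0.1, 0.9, size=n_snps)
M = rng.binomial(2, true_p, size=(n_animals, n_snps)).astype(float)

Z, p = centered_allele_content(M)
n_core, eig_svd = n_for_variance(Z, p, pct=0.98)  # 0.98 is an assumed threshold

# Cross-check against the eigenvalue decomposition of G.
G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))
eig_G = np.sort(np.linalg.eigvalsh(G))[::-1]

print("animals needed:", n_core)
print("SVD and eigen routes agree:", np.allclose(eig_svd, eig_G, atol=1e-8))
```

Working from the singular values of Z avoids building G explicitly, which may matter when the number of genotyped animals is large. The cross-check forms G only to show that both routes give the same eigenvalues.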