When calculating pedigree-based relationships or inbreeding coefficients, it is assumed that the pedigree is ordered so that parents
come before their progeny. Furthermore, for most of the methods, locating an animal in the associated matrix or vector is much easier
when the animals are numbered from 1 to the total number of animals in the pedigree. All of the methods I discuss assume that the
pedigree is already sorted and renumbered. A pedigree as you would normally receive it is not sorted, and its IDs are not numbered
this way. The simplest way to order a pedigree is to sort animals by birthdate. However, birthdates can be recorded or entered
incorrectly, or may not be available for some individuals, so an approach that does not rely on birthdates is preferable. The
algorithm to sort a pedigree is outlined below and is adapted from Dr. Larry Schaeffer's notes. In his approach, every animal is
assigned a generation number, and the pedigree is then traversed repeatedly, raising the generation numbers of the sire and dam until
each is at least one greater than the generation number of its offspring. This algorithm can be very fast if you use hash tables to
look up the sire and dam and their generation numbers, as outlined in the C++ code.
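The hash-table idea can be sketched in C++ roughly as follows. This is a minimal illustration, not the C++ code referred to above: the function names `assignGenerations` and `oldestFirstOrder` are hypothetical, IDs are assumed to be unique strings, and an unknown parent is assumed to be any ID (such as "0") that never appears as an animal. Unlike the R function below, it raises a parent straight to the offspring's generation plus one instead of adding 1 per pass, and it stops with an error if the numbers never settle, which would indicate a loop in the pedigree.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>
#include <numeric>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: give every parent a generation number that is
// strictly larger than that of each of its offspring.
std::vector<int> assignGenerations(const std::vector<std::string>& animal,
                                   const std::vector<std::string>& sire,
                                   const std::vector<std::string>& dam)
{
    const std::size_t n = animal.size();
    if (sire.size() != n || dam.size() != n)
        throw std::invalid_argument("animal, sire and dam must have equal length");

    // Hash table from animal ID to row index, so a parent is found in O(1)
    // instead of scanning the whole pedigree.
    std::unordered_map<std::string, std::size_t> row;
    row.reserve(n);
    for (std::size_t i = 0; i < n; ++i) row[animal[i]] = i;

    std::vector<int> gen(n, 1);   // everyone starts in generation 1
    bool changed = true;
    std::size_t passes = 0;
    while (changed) {
        // An acyclic pedigree settles within n passes; more means a loop.
        if (++passes > n + 1)
            throw std::runtime_error("generation numbers never settle: loop in pedigree?");
        changed = false;
        for (std::size_t i = 0; i < n; ++i) {
            for (const std::string* parent : {&sire[i], &dam[i]}) {
                auto it = row.find(*parent);
                if (it == row.end()) continue;        // unknown parent (assumption)
                if (gen[it->second] <= gen[i]) {      // parent not yet older
                    gen[it->second] = gen[i] + 1;
                    changed = true;
                }
            }
        }
    }
    return gen;
}

// Hypothetical helper: row indices ordered by descending generation number,
// so parents come before their progeny.
std::vector<std::size_t> oldestFirstOrder(const std::vector<int>& gen)
{
    std::vector<std::size_t> order(gen.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&gen](std::size_t a, std::size_t b) { return gen[a] > gen[b]; });
    return order;
}
```

Each pass costs time proportional to the number of animals, because finding a parent no longer requires an inner loop over the whole pedigree.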
SortPedigree <- function(animal, sire, dam)
{
  if(!is.vector(animal)){stop("The animal input variable needs to be a vector!!")}
  if(!is.vector(sire)){stop("The sire input variable needs to be a vector!!")}
  if(!is.vector(dam)){stop("The dam input variable needs to be a vector!!")}
  # Unknown parents need a code (e.g. 0) that never appears in animal; NA makes the comparisons below fail #
  # Initialize every generation number to 1 #
  gen <- rep(1, length(animal))
  iter <- 0; done <- FALSE
  while(!done)
  {
    iter <- iter + 1
    numberchanged <- 0
    ## Loop through each animal and find its sire and dam ##
    for(i in seq_along(animal))
    {
      for(j in seq_along(animal))
      {
        if(sire[i] == animal[j]) ## If sire generation <= animal generation, add 1 ##
        {
          if(gen[j] <= gen[i])
          {
            gen[j] <- gen[j] + 1; numberchanged <- numberchanged + 1
          }
        }
        if(dam[i] == animal[j]) ## If dam generation <= animal generation, add 1 ##
        {
          if(gen[j] <= gen[i])
          {
            gen[j] <- gen[j] + 1; numberchanged <- numberchanged + 1
          }
        }
      }
    }
    ## Stop once a full pass changes nothing ##
    if(numberchanged == 0){done <- TRUE}
    cat(iter, " (Changed: ", numberchanged, ")\n")
  }
  # Sort the pedigree by descending generation number (parents first) #
  # cbind() yields character columns when the IDs are character, so gen is converted back to numeric #
  pedigree <- data.frame(cbind(animal, sire, dam, gen), stringsAsFactors = FALSE)
  pedigree$gen <- as.numeric(pedigree$gen)
  pedigree <- pedigree[order(-pedigree$gen), ]
  # The result has 4 columns; drop gen before passing it to RenumberPedigree() #
  return(pedigree)
}
RenumberPedigree <- function(pedigree)
{
  if(ncol(pedigree) != 3){stop("The input pedigree needs to have 3 columns!!")}
  ## Assumes the pedigree is already sorted (parents before progeny)!! ##
  ## Output columns: new animal ID, new sire ID, new dam ID, original animal ID ##
  RenumPed <- matrix(data = NA, nrow = nrow(pedigree), ncol = 4)
  ## The animal in row 'i' becomes ID 'i' wherever it appears as an animal, sire or dam ##
  for(i in seq_len(nrow(pedigree)))
  {
    RenumPed[i,1] <- i
    RenumPed[,2] <- ifelse(pedigree[,2] == pedigree[i,1], i, RenumPed[,2])
    RenumPed[,3] <- ifelse(pedigree[,3] == pedigree[i,1], i, RenumPed[,3])
  }
  RenumPed[,4] <- pedigree[,1]
  ## Parents that never matched an animal are unknown and are coded as 0 ##
  RenumPed[is.na(RenumPed)] <- 0
  return(RenumPed)
}
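The renumbering step can also use a hash table, so that each sire and dam is translated with a single lookup instead of a scan over the whole pedigree. The following is a minimal C++ sketch under the same assumptions as the R function above (the pedigree is already sorted, and any parent ID that never appears as an animal is treated as unknown and coded 0); the names `RenumberedRow` and `renumberSorted` are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// One renumbered pedigree row: new IDs plus the original animal ID.
struct RenumberedRow {
    int animal;
    int sire;
    int dam;
    std::string originalId;
};

// Hypothetical helper: renumber an already sorted pedigree from 1 to n.
std::vector<RenumberedRow> renumberSorted(const std::vector<std::string>& animal,
                                          const std::vector<std::string>& sire,
                                          const std::vector<std::string>& dam)
{
    const std::size_t n = animal.size();
    if (sire.size() != n || dam.size() != n)
        throw std::invalid_argument("animal, sire and dam must have equal length");

    // Hash table from original ID to new ID (row position, starting at 1).
    std::unordered_map<std::string, int> newId;
    newId.reserve(n);
    for (std::size_t i = 0; i < n; ++i) newId[animal[i]] = static_cast<int>(i) + 1;

    // Parents that are not in the table are unknown and become 0.
    auto lookup = [&newId](const std::string& id) {
        auto it = newId.find(id);
        return it == newId.end() ? 0 : it->second;
    };

    std::vector<RenumberedRow> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        out.push_back({static_cast<int>(i) + 1, lookup(sire[i]), lookup(dam[i]), animal[i]});
    return out;
}
```

As in the R version, the original ID is kept alongside the new numbers so that results can be mapped back to the animals as they were received.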
The following pedigree file from Dr. Schaeffer's notes can be used with the
R code above. The columns are animal, sire, and dam. Outlined below is the generation number assigned to each animal after each pass
of the while loop. It takes only 4 passes to determine the correct order for all animals.