A review of genomic selection: Implications for the South African beef and dairy cattle industries

The major advancements in molecular technology over the past decades led to the discovery of DNA markers, sequencing and genome mapping of farm animal species. New avenues were created for identifying major genes, genetic defects, quantitative trait loci (QTL) and ultimately applying genomic selection (GS) in livestock. The identification of specific regions of interest that affect quantitative traits aimed to incorporate markers linked to QTL into breeding programs by using marker assisted selection (MAS). Most QTL, however, explained only a small proportion of the genetic variation for a trait and had limited impact on genetic improvement. Single nucleotide polymorphism (SNP) markers created the possibility to genotype cattle in a single assay with hundreds of thousands of SNPs, providing sufficient genomic information to incorporate into breeding value estimation. Genomic selection is based on the principle of associating many genetic markers with phenotypic performance. A large database of genotyped animals with relevant phenotypes pertinent to a production system is therefore required. South Africa has a long history of animal recording for dairy and beef cattle. The challenge for implementation of GS would be the establishment of breed-specific training populations. Training populations should be genotyped using a high density SNP panel, and the most appropriate genomic prediction algorithm determined. The suitability of commercially available genotyping platforms to South African populations should be established. The aim of this review is to provide an overview of the developments that occurred over the past two decades to lay the foundation for genomic selection with special reference to application in the South African beef and dairy cattle industry.


Introduction
The past three decades were characterized by a number of major discoveries and technological developments in the field of molecular genetics. On a technological level the development of polymerase chain reaction (PCR) technology by Mullis (Fore et al., 2006) was a major advancement for molecular research, followed by automated sequencing during the late nineteen nineties. The discovery of the hypervariable region in the human genome (Tautz, 1989) paved the way for the discovery and mapping of different DNA markers (Dodgson et al., 1997; Beuzen et al., 2000), and these have been applied widely in several livestock species (Dekkers & Hospital, 2002; Dekkers, 2004; Pollak, 2005; Womack, 2005; Jeon et al., 2006). The human genome was the first genome to be sequenced and contributed to improving sequencing and high throughput technologies (Adams, 2008; Eggen, 2012). Since the completion of the Human Genome Project most farm animal species have been sequenced and mapped (Fan et al., 2010), creating new opportunities for genetic improvement in livestock that were previously beyond researchers' reach.
Using DNA technology to perform genome mapping and sequencing, a number of useful single genes were mapped in small livestock (Montgomery & Kinghorn, 1997; Davis, 2004; Van der Werf, 2007) and in beef and dairy cattle (Anderson, 2001; Dekkers & Hospital, 2002). Diagnostic tests for genetic diseases such as bovine leukocyte adhesion deficiency (BLAD), deficiency of uridine monophosphate synthase (DUMPS) and complex vertebral malformation (CVM) in dairy cattle have been applied to ensure that seed stock bulls can be identified as non-carriers or carriers of these autosomal recessive mutations (Robinson et al., 1984; Schuster et al., 1992; Thomsen et al., 2006). In sheep, the Prp gene associated with scrapie has also been identified for individual testing (Belt et al., 1995). Major genes affecting quantitative traits have been commercialized for application in the livestock industry, for example the CAST and CAPN1 genes for meat tenderness and the GDF8 gene for double muscling in beef cattle, as well as the RYR and PRKAG3 genes affecting pork quality (Jeon et al., 2006).
The identification of specific genomic regions of interest that affect economically important traits in farm animals held great interest for the livestock industry, as the aim was to incorporate genomic markers linked to quantitative trait loci (QTL) into breeding programs by making use of marker assisted selection (MAS) (Anderson, 2001). In the search for QTL and major genes, different approaches have been investigated, including genome wide scans using relatively high density panels of microsatellites or single nucleotide polymorphism (SNP) markers across the genome together with genome wide association studies or candidate gene approaches (Ron & Weller, 2007; Hayes & Goddard, 2010). However, it was found that most of the identified QTL explained only a small proportion of the genetic variation for a quantitative trait and were not expected to result in a significant increase in the rate of genetic improvement (Nicholas, 2006). It was postulated that DNA markers and genomic information held the most potential for increasing genetic progress in quantitative traits with low heritability, traits that are difficult and/or costly to measure, and sex-limited traits (Dodds et al., 2007; Hayes & Goddard, 2010).
Currently the possibility exists to genotype cattle in a single assay with up to 777 962 SNP markers with marker intervals of less than 3 kb, providing sufficient genomic information to be incorporated in breeding value estimation procedures as well as providing useful information for performing genome-wide association analyses. The aim of this review is to provide an overview of the developments that took place over the past two decades to lay the foundation for genomic selection with special reference to application in the South African beef and dairy cattle industry.

The bovine genome and the search for quantitative trait loci
To appreciate the molecular information currently available in cattle, it is first necessary to briefly summarise the first efforts to compile DNA marker maps of the bovine genome. The first genetic maps for cattle were constructed with 746 and 1250 microsatellite markers by Barendse et al. (1997) and Kappes et al. (1997), respectively. The map compiled by Kappes et al. (1997) had a genome coverage of 2990 cM with an average marker interval of 3.0 cM. This map was further improved by adding more than a thousand markers with improved coverage and decreased intervals between microsatellites (on average 1.4 cM) that resulted in a high density map with 3802 microsatellite markers (Ihara et al., 2004). From this effort 880 000 genotypes were generated from the USDA MARC cattle reference families. The development of these high density linkage maps was essential for studying and fine mapping QTL and regions of interest.
In combination with molecular technology, software has been developed to incorporate large and complex pedigrees in the statistical analyses for identification of QTL (Seaton et al., 2002; 2006). QTL identification studies have resulted in the identification of causative mutations such as DGAT1 for milk fat content in dairy cattle (Grisart et al., 2002) and MSTN for double muscling in beef cattle (Charlier et al., 1995). Genome scans were conducted for QTL associated with milk production, health and conformation traits in dairy cattle using granddaughter designs and outbred families (Zhang et al., 1998; Heyen et al., 1999; Schrooten et al., 2000).
Potential advantages from incorporating QTL information in dairy breeding programs varied between studies. The French dairy industry genotyped more than 70 000 bulls for 14 chromosomal regions over a seven year period and the impact was sufficient to reduce the number of bulls for progeny testing by 15% (Boichard et al., 2006). The putative QTL could, however, explain only a relatively small proportion of the genetic variance (0.1% to 13.5%) for most of the traits in the study. This limitation was even more obvious in a genome-wide association study (GWAS) performed in dairy cattle, in which approximately 43 000 SNP were tested for association with feeding level and response in milk production to heat stress (Hayes & Goddard, 2010); the associated SNP explained only 1.5% and 2.0% of the genetic variance for the two traits. This phenomenon of unexplained genetic variation, termed the "missing heritability" (Maher, 2008), is not confined to livestock, but is also present in human GWAS (Visscher, 2008).
Despite the efforts of performing GWAS, the effects of the individual putative QTL identified for the desired traits were small and MAS in livestock was much less effective than was initially expected (Hayes & Goddard, 2010). Furthermore, experiments with sufficient statistical power to detect QTL required several hundred or even thousands of individuals with both genotypic and phenotypic data (Van der Werf et al., 2007). It was thought that the uptake of this technology would increase if the costs could be shared by other applications (e.g. parentage testing, traceability). Nevertheless, it became clear that QTL identification would not be a feasible option for explaining a sizeable proportion of the genetic variance in complex traits. An alternative approach was therefore needed to exploit the information from a large number of genetic markers simultaneously.
The bovine genome sequence, completed in 2004, was the first genome with high SNP coverage of the Cetartiodactyla mammals (family Bovidae) to be sequenced. Single nucleotide polymorphisms are abundant, bi-allelic, single-locus markers located at approximately 3 kb intervals in the Bos taurus genome, with an estimated total of nearly 40 million SNP identified during sequencing (Seidel, 2010). The frequencies of more than 37 000 SNP markers were analyzed in 497 cattle by the Bovine HapMap Consortium and this set the scene for further expansion of SNP discoveries (Eck et al., 2009; Seidel, 2010). Van Tassel et al. (2008) added a further 23 000 SNPs to the bovine collection, studying 66 cattle that included breeds such as the Holstein, Angus, Red Angus, Gelbvieh, Hereford, Limousin and Simmental. This resulted in the compilation of a commercial 54 001 SNP array that provided an additional tool for animal breeders that could be applied in genomic selection (Eck et al., 2009; Matukumali et al., 2009). The newly developed BovineHD beadchip from Illumina (http://www.illumina.com) features 777 962 relatively evenly spaced SNP over the entire bovine genome including the mitochondrial DNA.

Genomic selection
The use of genome-wide genetic marker information in animal breeding was originally proposed by Meuwissen et al. (2001). The principles of traditional MAS were to include a relatively small number of genetic markers in genetic evaluations (Fernando & Grossman, 1989). These markers originated from information generated in research studies, primarily meta-analyses of controlled experiments (Georges et al., 1995). Modern-day genomic selection, which is essentially a larger scale version of MAS, includes a considerably larger number of genetic markers; the "effects" of each marker are simultaneously estimated in the genomic selection process. The number of markers included in the genomic evaluations is dependent on the methodology used (discussed later) but the original set of genetic markers considered for inclusion is in the order of many thousands. Genomic selection generally assumes that all the genetic variation for a trait should be explained by markers. Polygenic effects are, however, sometimes included in the model to account for genetic variation that is unexplained by the genetic markers (Hayes et al., 2009). Genomic selection should ultimately lead to using genotypes defined by a set of polymorphisms to select for preferred phenotypes (Seidel, 2010).
Implementation of genomic selection in any population requires: 1) genotypes of a large population of animals that have, 2) pertinent phenotypes for the system of production where the genetic/genomic evaluations will be used, and 3) statistical methodologies for implementing efficient and accurate genomic predictions. This is based on the assumption that the breeding program in place is already optimal and includes a) an accurate genetic evaluation system based on access to relevant and heritable phenotypes, b) a pertinent breeding objective encompassing, as far as possible, all relevant traits optimally weighted within a breeding goal, and c) a breeding scheme to ensure long-term sustainable genetic gain.
Monomorphic SNPs do not contribute any information to genomic predictions and may therefore be discarded to reduce future computational requirements, as can SNPs with low minor allele frequency (i.e. <2% [Wiggans et al., 2010; Berry & Kearney, 2011] to <2.5% [Hayes et al., 2009]). Restrictions on the extent of deviations of SNP genotype frequencies from Hardy-Weinberg equilibrium can also be imposed (Hayes et al., 2009; Wiggans et al., 2010; Berry & Kearney, 2011); care should, however, be taken if imposing these criteria in multi-breed or multi-strain populations. Even within breed, strict restrictions on the Hardy-Weinberg equilibrium statistic should be applied with caution since selection can cause departures from Hardy-Weinberg equilibrium, for example, at loci harbouring lethal recessive genetic defects (Berry & Kearney, 2011).
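The edits described above can be illustrated with a minimal Python sketch. It assumes complete genotypes coded as 0, 1 or 2 copies of one allele; the function name `snp_qc`, the chi-square form of the Hardy-Weinberg statistic and the threshold used in the example are illustrative assumptions and are not taken from the cited studies. The Hardy-Weinberg filter is off by default, in line with the caution expressed above.

```python
import numpy as np

def snp_qc(genotypes, maf_min=0.02, hwe_chi2_max=None):
    """Return a boolean mask of SNPs to keep and the minor allele frequencies.

    genotypes: animals x SNPs array coded 0/1/2 with no missing values.
    maf_min: SNPs with minor allele frequency below this value are dropped
             (monomorphic SNPs have a frequency of zero and are dropped too).
    hwe_chi2_max: optional upper bound on a chi-square statistic for
             deviation from Hardy-Weinberg proportions (hypothetical value).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    n_animals = genotypes.shape[0]
    p = genotypes.mean(axis=0) / 2.0          # frequency of the counted allele
    q = 1.0 - p
    maf = np.minimum(p, q)
    keep = maf >= maf_min

    if hwe_chi2_max is not None:
        # Observed and expected counts of the three genotype classes.
        observed = np.stack([(genotypes == k).sum(axis=0) for k in (0, 1, 2)])
        expected = n_animals * np.stack([q ** 2, 2.0 * p * q, p ** 2])
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(expected > 0,
                             (observed - expected) ** 2 / expected, 0.0)
        keep &= terms.sum(axis=0) <= hwe_chi2_max
    return keep, maf

# Four SNPs in 100 animals: monomorphic, rare allele, balanced, no heterozygotes.
example = np.column_stack([
    np.zeros(100),
    np.r_[1.0, np.zeros(99)],
    np.r_[np.zeros(25), np.ones(50), np.full(25, 2.0)],
    np.r_[np.zeros(50), np.full(50, 2.0)],
])
keep_maf_only, maf = snp_qc(example)
keep_with_hwe, _ = snp_qc(example, hwe_chi2_max=10.0)
```

In this toy example the first two SNPs fail the frequency edit, and the fourth SNP is removed only when the Hardy-Weinberg edit is switched on.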
The cost of generating the genotype of an animal has reduced considerably in recent years with advances in technologies. Improvement in accuracy of genomic predictions can be achieved with increased marker density (VanRaden et al., 2009), especially for across-breed evaluations (De Roos et al., 2008). The number of SNP necessary to achieve accurate genomic predictions depends on the extent of linkage disequilibrium in the species or breed, the length of the genome ($L$), and the effective population size ($N_e$) (Hayes & Goddard, 2010). It has been shown that approximately 50 000 SNPs are sufficient to obtain accurate genomic predictions (assuming a relatively large training population of genotyped and phenotyped animals) within the Holstein (Berry et al., 2009; VanRaden et al., 2009), but at least 300 000 SNP are required for reliable prediction in Jersey cattle when using a Holstein reference population of genotyped and phenotyped animals (De Roos et al., 2008). Improved accuracy of prediction with greater marker density is expected since the success of genomic selection is based on exploiting genetic markers in linkage disequilibrium with the causative mutations; the greater the marker density, the greater the likelihood of tight linkage disequilibrium between the functional mutation and the genotyped markers. Nonetheless, improvements in the accuracy of genomic prediction from greater marker density are likely to be dependent on the algorithm used in the genomic predictions.
In addition to the ever-reducing cost of genotyping, the lower the number of genetic markers genotyped, the lower the likely cost. Imputation is a method of exploiting linkage analysis and/or linkage disequilibrium by deducing a higher density genotype from a lower density (and therefore lower cost) genotype. Current algorithms for imputation exploit linkage (findhap; VanRaden et al., 2011), population wide linkage disequilibrium (Beagle; Browning & Browning, 2007; 2009) or combined linkage analysis and linkage disequilibrium (FImpute; Sargolzaei et al., 2011). Berry & Kearney (2011) reported an average accuracy of imputation from 2 909 SNPs to 54 001 SNPs of 98% in Holstein-Friesian dairy cattle; similar imputation accuracies were reported elsewhere (Weigel et al., 2010). Therefore lower density genotyping, coupled with imputation, can be used to reduce the cost of genomic selection. Imputation approaches may also be used to impute the genotype of an ungenotyped ancestor with several genotyped progeny.
Several international initiatives in dairy cattle are underway to share genotypes, thereby reducing the cost of genomic selection for each country (Cromie et al., 2010; Jorjani et al., 2011). The success of these initiatives in dairy cattle is due to the international nature of semen trade, and aided by international genetic evaluations undertaken by INTERBULL. Increased accuracy of prediction is achieved when the genomic selection reference population is closely related to the selection candidates (Habier et al., 2007); therefore providing genotyped back-pedigree (with phenotypes from INTERBULL) to the importing country can be beneficial to the exporting country.

Phenotypic information
Genomic selection is based on the principle of relating genetic markers to phenotypic performance. A large database is therefore required of genotyped animals with all relevant phenotypes pertinent to the system of production where the genomic predictions will be applied. This large database of phenotyped and genotyped animals is generally referred to as the reference population or training population and is used to estimate the genetic marker effects. Within breeds, the improvement in the accuracy of genomic predictions with increasing size of the reference population is non-linear and is dependent on how accurately the phenotypic measures reflect the true breeding value (i.e. heritability) of the animals (Daetwyler et al., 2008; Goddard, 2008) and the relatedness of the reference population of animals to the animals where the genomic prediction equations will be applied (Habier et al., 2007). The lower the accuracy of the phenotypes, as is, on average, the case for lower heritable traits such as fertility (Veerkamp & Beerda, 2007) and health (Berry et al., 2011), the lower the accuracy of genomic predictions for the same reference population size (Daetwyler et al., 2008; Goddard, 2009). For the same reference population size, the greater the relatedness of the reference population to the population where the prediction equations will be applied, the greater will be the accuracy of the genomic predictions (Habier et al., 2007).
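The non-linear dependence on reference population size and heritability can be made concrete with a short sketch. It uses one commonly quoted deterministic approximation of the kind discussed in the literature cited above, accuracy = sqrt(N h² / (N h² + Me)), where N is the number of animals in the reference population, h² the heritability and Me the number of independent chromosome segments. The value of Me and the heritabilities below are assumptions chosen purely for illustration, and relatedness between reference and target animals is ignored.

```python
import math

def expected_accuracy(n_reference, heritability, n_segments):
    """Approximate accuracy of genomic prediction (illustrative formula).

    n_reference: number of genotyped and phenotyped animals.
    heritability: heritability (or reliability) of the phenotype used.
    n_segments: assumed number of independent chromosome segments (Me).
    """
    signal = n_reference * heritability
    return math.sqrt(signal / (signal + n_segments))

ASSUMED_SEGMENTS = 1000  # hypothetical value, depends on Ne and genome length

sizes = [1000, 5000, 20000]
high_h2 = [expected_accuracy(n, 0.30, ASSUMED_SEGMENTS) for n in sizes]
low_h2 = [expected_accuracy(n, 0.05, ASSUMED_SEGMENTS) for n in sizes]

for n, hi, lo in zip(sizes, high_h2, low_h2):
    print(f"N = {n:>6}: accuracy {hi:.2f} (h2 = 0.30) vs {lo:.2f} (h2 = 0.05)")
```

Under these assumptions each additional block of animals adds less accuracy than the previous one, and a low heritability trait needs a far larger reference population to reach the same accuracy as a high heritability trait.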
Artificial insemination (AI) bulls constitute the majority of international dairy cattle training populations. This is because they generally have more accurate predictions of genetic merit and therefore fewer animals need to be genotyped to achieve the same accuracy of genomic predictions (Daetwyler et al., 2008; Goddard, 2009). The disadvantage of using only AI sires is that the traits included in the genomic predictions are limited to those available nationally and used to estimate the sire breeding values. Facilitated by multiple-trait across-country genetic evaluations (MACE; Schaeffer, 1994) undertaken by INTERBULL for dairy cattle, predictions of genetic merit of all international male AI animals on the scale of each member country are obtainable. Therefore, even though a male animal may have no progeny in a given country, its MACE evaluation may be used as a phenotype for inclusion in genomic predictions (Berry & Kearney, 2011), weighted by a function of the MACE reliability. Nonetheless, the number of AI sires is limited and therefore natural mating bulls or cows must also be considered for inclusion in genomic selection reference populations. Cows are currently included in the genomic selection reference population in the United States (Wiggans et al., 2011) while natural mating bulls are included in the dairy cattle genomic selection reference population in Ireland (Andrew Cromie, personal communication); the latter is particularly relevant for populations (e.g. some beef populations) where AI is used less frequently.
The phenotypes included in all genomic evaluations of dairy cattle are either daughter yield deviations or deregressed estimated breeding values (Berry & Kearney, 2011) that remove pedigree contributions and reverse the effect of shrinkage during the BLUP evaluations. For some populations and traits (e.g. feed intake), however, a sufficiently large reference population may not be available for accurate genomic prediction. Intergenomics (Jorjani et al., 2011) is an international collaboration to undertake international genomic evaluations in Brown Swiss cattle by pooling of international phenotypes (and genotypes). Veerkamp et al. (2012) combined data on Holstein-Friesian dairy cows from research herds in four countries to generate genomic predictions; a similar initiative was undertaken for genomic predictions for residual feed intake in growing dairy heifers in Australasia (Pryce et al., 2012).

Genomic prediction algorithms
The main challenge in genomic predictions is to estimate genetic marker effects because the number of genetic markers available exceeds the number of phenotypes available several-fold, although this phenomenon is changing in some countries (Wiggans et al., 2011). Genetic markers may include SNPs, copy number variants (CNVs) and indels, but may also include haplotypes. Using simulations, Calus et al. (2007), however, documented that the advantage of using haplotypes over SNPs decreased as the linkage disequilibrium between adjacent SNPs increased, with an $r^2$ (i.e. measure of linkage disequilibrium) of 0.215 being where both gave similar accuracy. Within Holstein-Friesian cattle the average $r^2$ between adjacent SNPs with approximately 40 000 SNPs is expected to be >0.19 (Khatkar et al., 2008), thus suggesting that with the Illumina Bovine50 beadchip, using SNPs rather than haplotypes is most sensible. SNPs are generally the marker of choice in national genomic predictions (Berry et al., 2009; Hayes et al., 2009; VanRaden et al., 2009). Irrespective of the genomic prediction algorithm, the general model used to estimate SNP effects is: $Y_i = \mu + \sum_{j} X_j g_{ij} + e_i$, where $Y_i$ is the phenotypic value of animal $i$ for the trait under investigation, $\mu$ is the mean effect for the trait under investigation, $X_j$ is the effect of locus $j$, $g_{ij}$ is the genotype of animal $i$ at locus $j$, and $e_i$ is the residual term for animal $i$; a polygenic effect may also be included in the model (Calus, 2010).
Different approaches exist to estimate the SNP effects. Due to the number of parameters usually being considerably larger than the number of phenotypic records, simple multiple regression methods cannot be used, mainly because of a lack of degrees of freedom, but also because of other concerns such as high levels of multi-collinearity among SNPs. Several algorithms have been proposed for genomic selection and a summary of the assumptions for the different algorithms as described by Hayes & Goddard (2010) is shown in Table 1. One approach for genomic predictions is to use least squares regression but on a considerably reduced number of SNPs that have passed certain criteria, such as being very strongly associated with the trait under investigation in a series of univariate analyses; machine learning techniques as well as many other approaches may also be used to select a subset of informative SNPs. The final set of SNPs included in the multiple regression model may be obtained using a stepwise algorithm. Dimension reduction techniques have also been proposed such as 1) principal component analyses of the SNPs and subsequent inclusion of the main principal components in a regression analysis, or 2) partial least squares analysis which is similar to principal component analysis but where the latent variables generated take cognizance of their ability to also capture variation in the dependent variable. Other dimension reduction techniques also exist.
Rather than include SNPs individually as fixed effects, SNPs may also be included as random effects. This approach was suggested by the original genomic selection paper (Meuwissen et al., 2001) and is now commonly referred to as GBLUP. The GBLUP model assumes that each SNP contributes equally to the additive genetic variance of the phenotype under investigation and a normal distribution of the SNP effects is usually assumed. Treating each SNP simultaneously as a random effect is equivalent to constructing a genomic relationship matrix from the genotypes (VanRaden, 2008) and replacing the traditional numerator relationship matrix in the mixed model equations for genetic evaluations with the genomic relationship matrix. It is the latter description of GBLUP that is the commonly used GBLUP approach in national genomic evaluations (Berry & Kearney, 2011). However, the contribution of each SNP to the genomic relationship matrix can be weighted differently to account for different variances per SNP and differences in surrounding marker density (VanRaden, 2008).
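The equivalence between fitting all SNPs as random effects and using a genomic relationship matrix can be checked numerically. The sketch below is a simplified illustration on simulated genotypes, not the procedure of any national evaluation: the mean is handled by simply centring the phenotypes, the variance ratio `lam` is an arbitrary assumed value, and the relationship matrix follows one common construction (centred genotypes scaled by 2Σp(1−p)). All function names are hypothetical.

```python
import numpy as np

def genomic_relationship(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (animals x SNPs)."""
    p = genotypes.mean(axis=0) / 2.0            # allele frequencies
    z = genotypes - 2.0 * p                     # centred genotypes
    scale = 2.0 * np.sum(p * (1.0 - p))
    return z @ z.T / scale, z, scale

def snp_blup(z, y, lam):
    """Random regression on all SNPs; lam = residual / per-SNP variance."""
    y_centred = y - y.mean()
    n = z.shape[0]
    # Dual form: solves an n x n system although there are far more SNPs.
    weights = np.linalg.solve(z @ z.T + lam * np.eye(n), y_centred)
    snp_effects = z.T @ weights
    return snp_effects, z @ snp_effects         # effects and genomic values

def gblup(g, y, lam_g):
    """Breeding values using G; lam_g = residual / genetic variance."""
    y_centred = y - y.mean()
    n = g.shape[0]
    return g @ np.linalg.solve(g + lam_g * np.eye(n), y_centred)

rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(20, 200)).astype(float)  # 20 animals
phenotypes = rng.normal(size=20)

g, z, scale = genomic_relationship(genotypes)
lam = 50.0                                      # assumed variance ratio
snp_effects, gebv_from_snps = snp_blup(z, phenotypes, lam)
gebv_from_g = gblup(g, phenotypes, lam / scale)
```

With 200 SNPs and only 20 records, ordinary least squares has no unique solution, whereas both formulations above are solvable and return the same genomic breeding values (up to rounding) once the variance ratio is rescaled by the same constant used to scale the relationship matrix.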
Two Bayesian approaches were originally proposed by Meuwissen et al. (2001) which they termed BayesA and BayesB, although Gianola et al. (2006) questioned whether they were truly Bayesian approaches. In the BayesA approach of Meuwissen et al. (2001) the prior distribution of the variance of the SNP effects was chosen to represent a few SNPs with large effects and many SNPs with small effects (i.e. an inverted chi-square distribution). The SNP effects are sampled from a normal or t-distribution. BayesB, as proposed by Meuwissen et al. (2001), was largely similar to BayesA with the exception that, a priori, the proportion of SNPs not associated with the phenotype of interest was set and the effects of SNPs not entering the model within a Gibbs sampling chain were set to zero. Other Bayesian approaches have since been described (for example Habier et al., 2011). BayesCπ (Habier et al., 2011), as well as other modifications compared to BayesB (e.g. use of a multivariate t-distribution to describe the SNP effects), samples the proportion of SNPs that are associated with the phenotype, rather than this statistic having to be decided on prior to the analysis. Such mixture models, combining two or more distributions of SNP effects, can accommodate SNPs with a distribution of large effects and a distribution of small effects (Calus et al., 2008). This approach is similar to BayesB but avoids the requirement for the Metropolis-Hastings step, thereby reducing computational requirements, but still facilitates the inclusion of many SNPs each with small effects (unlike BayesB where such effects were set to zero), thereby capturing any remaining unexplained genetic variance (Calus, 2010). Non-parametric kernel methods have also been proposed for genomic predictions (for review, see Calus, 2010).
Several studies have compared the efficiencies and accuracies of the different algorithms using either real data (Hayes et al., 2009; VanRaden et al., 2009) or simulated data (Calus et al., 2008; Gredler et al., 2009; Meuwissen & Goddard, 2010a; b; Meuwissen et al., 2001; Pszczola et al., 2011). Although not always consistent across studies, the Bayesian approaches described are generally more accurate for traits where large QTL exist (Hayes et al., 2009; VanRaden et al., 2009; Meuwissen & Goddard, 2010a). Genomic predictions of milk fat and protein composition generally perform better with a Bayesian approach assuming a t-distribution of SNP effects (Gredler et al., 2009; Hayes et al., 2009) and this is likely due to the large impact of the DGAT1 gene on milk fat and protein composition (Berry et al., 2010).

Breeding scheme
The ability to accurately estimate the genetic merit of an individual from its DNA through genomic selection is causing a paradigm shift in dairy cattle breeding programs. Genomic selection is now implemented in many dairy cattle national genetic evaluations in the United States (VanRaden et al., 2009), Europe (Berry et al., 2009) and Australasia (Harris & Johnson, 2010). Devising and implementing breeding schemes to maximize the potential of this technology is required. Schaeffer (2006) compared a selection strategy using genomic selection to a traditional progeny testing scheme similar to that operated in Canada. He reported a two-fold increase in genetic gain using genomic selection with a 92% reduction in the cost of proving the bull. Other simulation studies suggest at least a 50% increase in annual genetic gain from implementation of genomic selection breeding programs compared to traditional breeding programs (Pryce et al., 2010; Lillehammer et al., 2011; McHugh et al., 2011).
The fundamental way in which genomic selection will influence breeding programs is that genomic predictions can be obtained at a very young age, well before sexual maturity, thereby reducing the generation interval considerably. Therefore, research into the optimal use of reproductive technologies in breeding programs is likely to intensify, in particular the genotyping of embryos (Humblot et al., 2010). Furthermore, two of the selection pathways in genetic gain originate with the dam (i.e. dams to produce sires and dams to produce dams). For low heritability traits in particular, accurate estimation of the genetic merit of the dams is difficult. Genomic selection will improve the accuracy of selection of the dams and increase genetic gain further. McHugh et al. (2011) documented a large increase in genetic gain achievable when females were also genotyped; not only were the candidate dams more accurately identified, but the contribution of the genotyped and phenotyped dams to the genomic selection training population increased the accuracy of selection even more.
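The effect of a shorter generation interval can be illustrated with the standard expression for annual genetic gain: selection intensity times accuracy times genetic standard deviation, divided by the generation interval. The sketch below uses a single selection pathway and purely hypothetical input values; they are not the figures used by Schaeffer (2006) or by the other simulation studies cited.

```python
def annual_genetic_gain(intensity, accuracy, genetic_sd, generation_interval):
    """Expected genetic gain per year for a single selection pathway."""
    return intensity * accuracy * genetic_sd / generation_interval

# Hypothetical scenarios, expressed in genetic standard deviation units.
progeny_test = annual_genetic_gain(
    intensity=1.5, accuracy=0.9, genetic_sd=1.0, generation_interval=6.0)
genomic = annual_genetic_gain(
    intensity=1.5, accuracy=0.7, genetic_sd=1.0, generation_interval=2.5)

print(f"progeny testing: {progeny_test:.3f} SD per year")
print(f"genomic selection: {genomic:.3f} SD per year")
print(f"ratio: {genomic / progeny_test:.2f}")
```

Under these assumed values the lower accuracy of a young genomically tested bull is more than offset by the shorter generation interval, which is the mechanism behind the gains reported in the studies above.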

Concerns
As with most new technologies, misuse of the technology, intentionally or not, can have unfavourable consequences. Genomic selection is no exception. Genetic gain is expected to increase by at least 50% with successful implementation of genomic selection (Pryce et al., 2010; Lillehammer et al., 2011; McHugh et al., 2011). Rapid increases in genetic gain can reduce the ability to purge unfavourable consequences of selection, including inbreeding depression (McParland et al., 2009). One option to minimize (but not eliminate) this concern is to use bulls sparingly in their first year. In this way any congenital defects or calving difficulties may be identified earlier and the bull could be culled or used in appropriate matings (i.e. avoidance of carrier females if the bull is a carrier of a lethal recessive allele). Of course, if the bull is a carrier of a rare lethal recessive allele then this may not be observable in a small number of (unrelated) animals. Nonetheless, young bulls cannot produce sufficient semen to generate large progeny group sizes and therefore small first-crop progeny groups will be generated. Furthermore, an extensive phenotyping strategy must be implemented to rapidly identify any unfavourable consequences of selection. Sentinel herds could be used to compare national average genetic merit to elite genetic merit animals under contrasting systems of production (Wickham et al., 2012).
Because the increase in accuracy of genomic predictions, and therefore genetic gain, is a function of the number of phenotyped animals and the accuracy of their phenotypes, genetic gain will tend to be greatest in high heritability traits where ample phenotypic data are available (Amer, 2012). This will place greater selection pressure on traits such as milk production which are known to be antagonistically correlated with fertility (Berry et al., 2012) and health (Berry et al., 2011), thereby possibly leading to a reduction in genetic gain for fertility and health traits or even reverting to a deterioration in genetic merit for the latter traits in populations where it is currently improving.
Genomic selection has intensified global competition in the international trade of dairy cattle germplasm and this can have serious consequences for inbreeding, and its associated deleterious effects (McParland et al., 2007). Several simulations (Daetwyler et al., 2007) have reported a lower accumulation of inbreeding per generation with genomic selection compared to breeding programs based on traditional BLUP genetic evaluations; however, because the generation interval is shortened in genomic selection breeding programs compared to traditional progeny test breeding programs, inbreeding per annum may be greater with genomic selection. Knowledge of the DNA of individual animals can, nevertheless, be used to reduce the accumulation of inbreeding by identifying least related animals at the genomic level. Stochastic simulations also show that the use of sexed semen in a genomic selection breeding program can help manage rates of inbreeding (Pedersen et al., 2012). Moreover, genomic selection can be used to screen a much larger population of potential candidate animals, thereby widening the genetic diversity and, if unrelated or lowly related candidates are genetically elite, helping to manage the rates of inbreeding (Hayes & Goddard, 2010).
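The distinction between inbreeding per generation and inbreeding per annum is simple arithmetic, sketched below. The annual rate is approximated as the rate per generation divided by the generation interval; the rates and intervals used are hypothetical values chosen only to show how the ranking of two schemes can reverse, and they are not results from the cited simulations.

```python
def inbreeding_per_year(rate_per_generation, generation_interval):
    """Approximate annual rate of inbreeding from the per-generation rate."""
    return rate_per_generation / generation_interval

# Hypothetical schemes: (rate of inbreeding per generation, interval in years).
traditional_rate, traditional_interval = 0.010, 6.0
genomic_rate, genomic_interval = 0.006, 2.5

traditional_annual = inbreeding_per_year(traditional_rate, traditional_interval)
genomic_annual = inbreeding_per_year(genomic_rate, genomic_interval)

print(f"traditional: {traditional_rate:.3f} per generation, "
      f"{traditional_annual:.4f} per year")
print(f"genomic:     {genomic_rate:.3f} per generation, "
      f"{genomic_annual:.4f} per year")
```

In this illustration the genomic scheme accumulates less inbreeding per generation but more per year, because generations turn over more than twice as fast.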

Genotypes
South African dairy and beef cattle breeders are already exploiting DNA technology through DNA-based parentage verification and diagnostic testing (Van Marle-Köster & Nel, 2003). Dairy bulls are routinely screened for known recessive disorders such as BLAD, DUMPS and CVM. Beef cattle can be tested for meat tenderness with various diagnostic kits as well as for a few genetic disorders, including dwarfism and certain translocations. This is, however, the only (limited) genotypic information available for cattle breeds and the quantity of information differs greatly among breeds. No official system for the collection and storage of biological samples of cattle currently exists in South Africa. A livestock identification system where breeders stored hair samples on a voluntary basis for use in the event of livestock theft is the only collective repository for biological samples (ARC, 2007). Other biological material (including semen, blood and hair samples) is dispersed among AI companies, research institutions, laboratory facilities, etc.
The South African dairy industry has an advantage over the beef industry owing to stronger national and international genetic linkage with (internationally) superior AI sires. South Africa is a member of INTERBULL, which is the body responsible for international genetic evaluations of dairy cattle (Mostert, 2007). In the beef industry there are, however, particular challenges regarding the implementation of genomic selection, of which the main one is the limited recording of performance traits. Often the specific traits that could gain the most from genomic selection, such as fertility, longevity and carcass quality, have not been recorded in South Africa. There is also less genetic linkage between herds due to the limited use of AI by beef breeders and also lower availability of animals with accurate EBVs. These limitations are nonetheless not unique to South Africa and were also documented by Garrick (2011) for the US cattle industry. The structure of the beef industry in South Africa is also more complex, with a relatively small seed stock sector where genetic selection takes place and which provides the genetic material (bulls) for commercial production. Also, there is a large number of beef breeds in South Africa compared to dairy breeds, which has implications for the identification of suitable reference/training populations needed for the implementation of genomic selection in the entire beef sector.
The challenge for implementation of genomic selection therefore is to firstly collate all available biological samples (hair, semen or live animals for blood) and quantify whether they are of sufficient quality and quantity for the establishment of a breed-specific training population. DNA of animals that have made a large genetic contribution to the modern-day population should also be sourced. All animals should have accurate phenotypic information for use in genomic predictions but genotypes on prominent ancestors, even without phenotypic information, can also be useful to generate population haplotypes for more accurate imputation from lower to higher density genotypes. Once the training population has been established, all animals should be genotyped using a SNP panel, and the most appropriate genomic prediction algorithm determined using the appropriate validation procedures. The suitability of commercially available genotyping platforms to South African populations should be established.

Phenotypes
Beef and dairy cattle in South Africa have a relatively long history of performance recording, with official performance testing for beef cattle taking place since 1959 and milk recording since 1917 (Bergh, 2010). Participation in national performance recording schemes and private recording schemes differs, however, between breeds as performance testing has not been enforced by all breed societies. Participation in the recording scheme for beef cattle varies from as low as 32% to 100% (Scholtz, 2010), while participation in official milk recording schemes is estimated at only 24% (Scholtz & Grobler, 2009). The available recording schemes enable breeders to submit data on fitness (reproduction), production and quality traits. The average number of cows participating in milk recording in South Africa is summarized in Table 2 for the three major dairy breeds in South Africa. South Africa has approximately 30 different beef cattle breeds including British, European, composite and indigenous breeds. In Table 3 a summary is provided of the beef cattle breeds that have performance information on at least 1000 registered cows and bulls (older than 2 years).
Many of the South African dairy populations participate in INTERBULL and therefore genetic evaluation of all internationally available bulls exists on the South African scale for the majority of traits. These estimates of genetic merit (i.e. phenotypes) are freely available for use in national genomic evaluations. Much of the germplasm used in South African dairy herds originates from outside South Africa (Dürr & Jackobsen, 2009), either directly or indirectly. Therefore, these international estimates of genetic merit, which include performance information from South Africa itself, can be extremely beneficial for implementing genomic selection in a breed. Genotypes for many international AI bulls are currently available through international collaboration (Cromie et al., 2010). Therefore, it is possible to at least undertake research on the potential to implement genomic selection in (some of) these populations.
Beef cattle breeds that may take advantage of genomic selection will need adequate numbers of phenotyped animals. Lower heritability traits such as fertility and health (Veerkamp & Beerda, 2007; Berry et al., 2011), which are the type of traits that can generally benefit most from exploiting genomic information, require even larger populations of animals to achieve noticeable improvements in the accuracy of genomic predictions. The number of breeding animals of the Bonsmara (a composite breed developed in South Africa) with records for maternal, reproduction and growth efficiency traits is summarized in Table 4. The Bonsmara is the largest beef cattle breed in South Africa, with in excess of 81 000 cows. Participation in the beef cattle improvement scheme is compulsory for all Bonsmara breeders, and phenotypic recording is thus enforced. Assuming that sufficient biological material or DNA of appropriate quality is available on a large sample of these animals, the Bonsmara should have sufficient resources to implement genomic selection for at least a selection of performance traits. Some other beef breeds in South Africa (e.g. SA Angus and Beefmaster) should also have sufficiently large databases for the possible establishment of sufficiently sized training populations. For these breeds, the resources (e.g. genotypes and phenotypes) available can also be improved through international collaboration. The first challenge for genomic selection in South Africa is to evaluate which breeds have sufficient biological samples and phenotypic data to be able to establish a sufficiently sized training population to generate accurate genomic predictions. The suitability of the commercially available genotyping platforms for genomic evaluations also needs to be confirmed in these populations. This breed-specific training population can then be used to obtain a prediction equation with an ideally high correlation between the genotypic information and the "true" breeding values for the different traits. A number of proven bulls with a large number of recorded progeny (with breeding value predictions with a very high reliability) should be genotyped to validate the accuracy of prediction. Once high predictive ability is achieved, information obtained from low density genotyping of animals can be imputed to calculate direct genomic values. The steps to be taken to achieve this are shown in Figure 1. The use of genomic selection should result in shorter generation intervals and an increased rate of genetic improvement for the South African cattle industry. Across-breed genomic evaluations may increase the accuracy of genomic predictions further, especially for breeds with smaller population sizes, but the potential to achieve high accuracy of genomic predictions using a multi-breed cattle population and the current genotyping platforms is currently unknown.

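The training and validation steps described above can be sketched with simulated data. The example below is a minimal illustration, not a recommended evaluation pipeline: it assumes a ridge regression model on SNP genotypes (often called SNP-BLUP) as the prediction algorithm, uses simulated true breeding values in place of the high-reliability breeding values of proven bulls, treats the heritability as known, and omits imputation. All population sizes, SNP numbers and function names are hypothetical.

```python
import numpy as np


def simulate_population(n_animals, n_snps, h2, rng):
    """Simulate SNP genotypes (0/1/2), true breeding values and phenotypes."""
    allele_freqs = rng.uniform(0.1, 0.9, size=n_snps)
    genotypes = rng.binomial(2, allele_freqs, size=(n_animals, n_snps)).astype(float)
    true_effects = rng.normal(0.0, 1.0, size=n_snps)
    tbv = (genotypes - 2.0 * allele_freqs) @ true_effects
    # Scale the residual so that the simulated trait has heritability h2.
    residual_sd = np.sqrt(tbv.var() * (1.0 - h2) / h2)
    phenotypes = tbv + rng.normal(0.0, residual_sd, size=n_animals)
    return genotypes, tbv, phenotypes


def fit_snp_blup(genotypes, phenotypes, h2):
    """Estimate SNP effects by ridge regression for an assumed heritability."""
    allele_freqs = genotypes.mean(axis=0) / 2.0
    centred = genotypes - 2.0 * allele_freqs
    # Shrinkage = residual variance / per-SNP variance, assuming equal variance per SNP.
    shrinkage = (1.0 - h2) / h2 * np.sum(2.0 * allele_freqs * (1.0 - allele_freqs))
    lhs = centred.T @ centred + shrinkage * np.eye(genotypes.shape[1])
    rhs = centred.T @ (phenotypes - phenotypes.mean())
    return allele_freqs, np.linalg.solve(lhs, rhs)


def direct_genomic_values(genotypes, allele_freqs, snp_effects):
    """Apply the prediction equation to the genotypes of selection candidates."""
    return (genotypes - 2.0 * allele_freqs) @ snp_effects


rng = np.random.default_rng(2012)
H2 = 0.5  # assumed heritability of the simulated trait
genotypes, tbv, phenotypes = simulate_population(600, 300, H2, rng)

# First 500 animals form the training population, the last 100 the validation set.
train, valid = slice(0, 500), slice(500, 600)
allele_freqs, snp_effects = fit_snp_blup(genotypes[train], phenotypes[train], H2)
dgv = direct_genomic_values(genotypes[valid], allele_freqs, snp_effects)

# Accuracy = correlation between predictions and the (simulated) true breeding values.
accuracy = np.corrcoef(dgv, tbv[valid])[0, 1]
print(f"Validation accuracy: {accuracy:.2f}")
```

In practice the validation animals would be genotyped proven bulls, and alternative prediction algorithms would be compared on the same validation set before one is chosen for routine evaluations.
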
Conclusion
Application of genomic selection globally in both the dairy and beef industries is underway, and smaller populations with fewer resources will have to collaborate and carefully plan breeding programs to remain internationally competitive. It is also important to note that the same principles of genomic selection discussed here for cattle also apply to sheep and goats. South Africa, with a large small stock population, should therefore also investigate the potential of this technology, especially for sheep, where an internationally comparable animal recording system exists. It is envisaged that genomic selection will be important for South African dairy and beef cattle breeders and in the long term also for small stock.

Figure 1
Steps in the process to implement genomic selection.

Table 1
Summary of assumptions for methods based on SNP markers for genomic EBV estimations (adapted from Hayes & Goddard, 2010). * Stochastic search variable selection.

Table 2
Number of dairy cows and bulls in animal recording, and traits recorded in South Africa (2012, R.R. van der Westhuizen, pers. comm., bobbie@studbook.co.za)

Table 3
Major beef cattle breeds in South Africa and traits recorded (SA Studbook, May 2012)

Table 4
Traits recorded and number of records per trait for the Bonsmara breed (April 2012). Milk: maternal WW; AFC: age at 1st calving; ICP: inter-calving period; ADG: average daily gain; FCR: feed conversion ratio; EMA: eye muscle area.