Evaluation of the OvineSNP50 chip for use in four South African sheep breeds

Relatively rapid and cost-effective genotyping using the OvineSNP50 chip holds great promise for the South African sheep industry and research partners. However, SNP ascertainment bias may influence inferences from the genotyping results of South African sheep breeds. Therefore, samples from Dorper, Namaqua Afrikaner (NA), South African Merino (SA Merino) and South African Mutton Merino (SAMM) were genotyped to determine the utility of the OvineSNP50 chip for these important South African sheep breeds. After quality control measures had been implemented, 85 SA Merino, 20 Dorper, 20 NA and 19 SAMM samples remained, with an average call rate of 99.72%. A total of 49 517 (91.30%) SNPs on the chip met quality control measures and were included in downstream analyses. The NA had the fewest polymorphic loci (69.20%), while the SAMM, Dorper and SA Merino had between 81.16% and 86.85% polymorphic loci. Most loci of the SA Merino, Dorper and SAMM had a minor allele frequency (MAF) greater than or equal to 0.3. In contrast, the NA exhibited a large number of rare alleles (MAF < 0.1) and a uniform distribution of other loci across the MAF range (0.1 < MAF ≤ 0.5). The NA exhibited the least genetic diversity and had the greatest inbreeding coefficient among the four breeds. The results of the Dorper, SA Merino and SAMM compare favourably with those of international breeds and thus demonstrate the utility of the OvineSNP50 chip for these breeds. Effects of SNP ascertainment bias, however, could be seen in the number of non-polymorphic loci and MAF distribution of the three commercial breeds in comparison with those of the NA. The implementation of methods to reduce the effect of SNP ascertainment bias and to ensure unbiased interpretation of genotype results should therefore be considered for future studies using OvineSNP50 chip genotype results.

The OvineSNP50 chip was developed by Illumina in collaboration with the International Sheep Genomics Consortium (ISGC) and became commercially available in 2009. This microarray-based system is designed to determine the genotype of approximately 54 000 single nucleotide polymorphisms (SNPs) spaced evenly across the ovine genome. The mean genomic distance between SNPs included in the chip is approximately 51 kb, with a median distance of 42 kb (International Sheep Genomics Consortium et al., 2010; Kijas et al., 2012). Approximately 500 SNPs discovered through BAC-end Sanger sequencing of nine animals, each representing a different breed, are incorporated in the 50K chip. Roche 454 whole-genome shotgun sequencing of six individuals of six different breeds provided approximately 33 000 additional SNPs that are also included in the chip. A total of 15 000 SNPs were identified by reduced representation sequencing (RRS), which was carried out on 60 individuals from 15 breeds (Kijas et al., 2012). The SNP chip has been validated in more than 75 international sheep breeds and has been used in a wide range of applications, which include the elucidation of quantitative trait loci and selection strategies (Miller et al., 2011; Kijas et al., 2012; Demars et al., 2013; Liu et al., 2013; Våge et al., 2013; Gutiérrez-Gil et al., 2014; Phua et al., 2014).
SNP panels developed from a limited number of individuals or breeds are not necessarily representative of the polymorphisms and frequency distribution of alleles in the genome of all members of the species, a phenomenon referred to as ascertainment bias (Nielsen & Signorovitch, 2003; Morin et al., 2004). Ascertainment bias may influence parameters estimated from genotyping results and could therefore bias conclusions drawn from such data (Kijas et al., 2009; Albrechtsen et al., 2010). Ascertainment bias could also influence assumptions about a test population as well as the association between certain traits and molecular markers in that population (Heslot et al., 2013). It is therefore vital to recognise the effect of possible ascertainment bias of a marker panel on inferences relating to allele frequency distribution, linkage disequilibrium (LD), population genetic structure, association studies and selection strategies (Clark et al., 2005; Heslot et al., 2013).
The ovine SNP chip could be a valuable genotyping tool for use in South African sheep research and for commercial selection strategies. However, the success of whole-genome SNP studies relies on sufficient numbers of polymorphic SNPs spaced evenly throughout the genome of the test population (Kijas et al., 2009). It is therefore necessary to determine whether South African sheep breeds are polymorphic for a sufficient number of SNPs included in the OvineSNP50 chip. The potential ascertainment bias of markers included in the OvineSNP50 chip and the potentially unique SNP profile of relatively rare African sheep breeds and subtypes support evaluating the use of the ovine chip in South African sheep. Here the authors report genotyping results for the dominant wool (SA Merino), dual-purpose (SAMM) and meat (Dorper) breeds in South Africa (Cloete & Olivier, 2010). Genotyping results of the Namaqua Afrikaner (NA) were included as this breed represents the indigenous fat-tailed sheep of South Africa (Qwabe et al., 2013).
The Dorper (n = 20), NA (n = 20) and SAMM (n = 20) samples were obtained from animals maintained as a resource flock on Nortier Research Farm, near Lambert's Bay in the Western Cape, South Africa. The SA Merino samples were from resource flocks maintained at Cradock (n = 50) and Grootfontein (n = 50) in the Eastern Cape (Schoeman et al., 2010). Genotyping was done with the OvineSNP50 chip (Illumina) at GeneSeek Inc. (Lincoln, Nebraska, USA) from samples applied to blood cards. GenomeStudio Software v1.0 (Genotyping Module, Illumina) was utilised to call genotypes from SNP intensity data and to ensure the stringency of quality-control parameters. Loci and samples that met the following quality-control thresholds were included in further analyses: GenCall score >0.25; GenTrain score >0.5; minor allele frequency (MAF) >0.01; locus call rate >0.95; and sample call rate >0.95. Genotype data that met the quality-control criteria were used to determine the number of polymorphic loci and MAF distribution of each breed. The observed heterozygosity and inbreeding coefficient (FIS) were calculated using PLINK v1.07 (Purcell et al., 2007).
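The call-rate and MAF thresholds listed above can be illustrated with a minimal Python sketch. It is not the GenomeStudio workflow used in the study: the genotype coding (0, 1 or 2 copies of one allele, `None` for a failed call), the function names, the toy data and the order of filtering (samples first, then loci) are all assumptions made for illustration, and the GenCall and GenTrain scores are omitted because they are derived from intensity data.

```python
MISSING = None  # assumed marker for a genotype that could not be called


def call_rate(calls):
    """Fraction of non-missing calls in a list of genotype calls."""
    if not calls:
        return 0.0
    return sum(c is not MISSING for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """MAF from genotypes coded as counts (0, 1, 2) of one allele."""
    called = [c for c in calls if c is not MISSING]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def quality_control(genotypes, min_sample_call=0.95,
                    min_locus_call=0.95, min_maf=0.01):
    """Apply sample and locus filters to {sample_id: [call per locus]}.

    Samples are filtered first, then loci are evaluated on the
    remaining samples (an assumed order, not taken from the source).
    Returns (kept_sample_ids, kept_locus_indices).
    """
    kept_samples = [s for s, calls in genotypes.items()
                    if call_rate(calls) > min_sample_call]
    n_loci = len(next(iter(genotypes.values())))
    kept_loci = []
    for j in range(n_loci):
        column = [genotypes[s][j] for s in kept_samples]
        if (call_rate(column) > min_locus_call
                and minor_allele_frequency(column) > min_maf):
            kept_loci.append(j)
    return kept_samples, kept_loci


# Hypothetical toy data: four samples genotyped at four loci.
toy = {
    "s1": [0, 1, 2, 1],
    "s2": [0, 1, 1, 0],
    "s3": [0, 2, 1, MISSING],  # sample call rate 0.75, fails the filter
    "s4": [0, 0, 2, 1],
}
samples, loci = quality_control(toy)
print(samples, loci)
```

In this toy example, sample `s3` is dropped for its low call rate and the first locus is dropped because it is monomorphic in the remaining samples.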
One SAMM sample and 15 SA Merino samples were excluded during quality control owing to low sample call rates. The remaining samples had an average call rate of 99.72%. A total of 49 517 SNPs, 91.30% of SNPs on the chip, met the quality-control measures and had a MAF >0.01 across all samples. These results are similar to those of other ovine studies, which reported >98% and >90% for sample and SNP loci call rates, respectively (Kijas et al., 2012; Shariflou et al., 2012; Våge et al., 2013; Phua et al., 2014).
The number of polymorphic loci of the SAMM, Dorper and SA Merino ranged between 81.16% and 86.85% (of the total number of SNPs on the chip) and the observed heterozygosity values were between 0.33 and 0.35 (Table 1). These results compare favourably with the number of polymorphic loci and heterozygosity estimates reported in the literature for other commercial sheep breeds (Kijas et al., 2009; 2012). Although NA samples from five individuals were included in SNP discovery for the ovine SNP chip (Kijas et al., 2009; 2012), this breed exhibited the fewest polymorphic loci (69.20%), and the lowest mean MAF (0.19) and heterozygosity (0.28). The NA also exhibited the greatest inbreeding coefficient (0.25) among these breeds (Table 1). Similar low levels of genetic diversity have been reported by microsatellite-based studies of locally sourced NA samples (Qwabe et al., 2013), and based on OvineSNP50 genotype information of international NA samples (Kijas et al., 2009; 2012).
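Per-breed summaries of this kind can be computed from a genotype matrix along the following lines. This is a sketch under stated assumptions, not a reproduction of the PLINK calculations: genotypes are coded as allele counts (0, 1, 2) with `None` for missing calls, a locus is treated as polymorphic when its MAF exceeds a hypothetical `min_maf` parameter, the MAF classes are 0.1 wide, and the inbreeding coefficient is taken as the simple population-level estimate F = 1 − Ho/He, which is not necessarily the estimator PLINK implements.

```python
def breed_summary(loci, min_maf=0.0):
    """Summarise diversity for one breed.

    loci: list of per-locus genotype lists (0/1/2 allele counts,
    None for a missing call). Every locus is assumed to have at
    least one called genotype.
    """
    mafs, obs_het, exp_het = [], [], []
    for calls in loci:
        called = [c for c in calls if c is not None]
        p = sum(called) / (2 * len(called))      # allele frequency
        mafs.append(min(p, 1 - p))
        obs_het.append(sum(c == 1 for c in called) / len(called))
        exp_het.append(2 * p * (1 - p))          # Hardy-Weinberg expectation

    n = len(loci)
    ho = sum(obs_het) / n
    he = sum(exp_het) / n

    # Count loci per MAF class: [0, 0.1), [0.1, 0.2), ..., [0.4, 0.5].
    # Monomorphic loci (MAF = 0) fall into the first class.
    bins = [0] * 5
    for m in mafs:
        bins[min(int(m * 10), 4)] += 1

    return {
        "polymorphic_fraction": sum(m > min_maf for m in mafs) / n,
        "mean_maf": sum(mafs) / n,
        "observed_het": ho,
        "expected_het": he,
        "f_is": 1 - ho / he if he > 0 else float("nan"),
        "maf_bins": bins,
    }


# Hypothetical toy data: four loci genotyped in four animals.
toy_loci = [
    [0, 0, 0, 0],  # monomorphic
    [0, 1, 1, 2],
    [0, 0, 2, 2],  # no heterozygotes despite p = 0.5
    [0, 0, 1, 1],
]
summary = breed_summary(toy_loci)
for key, value in summary.items():
    print(key, value)
```

A deficit of heterozygotes relative to the Hardy-Weinberg expectation, as at the third toy locus, pushes the F estimate above zero.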
The MAF distribution patterns of the SA Merino, Dorper and SAMM were similar to those of most European and American breeds (Kijas et al., 2012), with most loci exhibiting a MAF of 0.3 or greater (Figure 1). In contrast, the NA samples exhibited a large number of rare alleles (MAF < 0.1) and a uniform distribution of polymorphic loci across the MAF range (0.1 < MAF ≤ 0.5). Similar results, characterised by a large number of monomorphic loci with many additional loci exhibiting rare alleles, have been observed in BovineSNP50 (Illumina) genotype results of African and indicine cattle breeds (Matukumalli et al., 2009). The limited number of African and indicine individuals and breeds utilised during bovine SNP discovery, resulting in SNP ascertainment bias, has been suggested as underlying these genotype results. Matukumalli et al. (2009) therefore report the BovineSNP50 chip's utility in European and other commercial breeds, but warn of reduced power in African and indicine breeds.
Whole-genome SNP genotyping results have indicated that the Merino breed is polymorphic for a large number of SNPs included on the OvineSNP50 chip as well as being one of the most genetically diverse livestock breeds (Kijas et al., 2009; 2012; 2014). It is therefore not surprising that the SA Merino samples included in the current study exhibited the highest percentage of polymorphic loci, highest mean MAF and high levels of genetic diversity across the four breeds. Although the Dorper and SAMM were not included during SNP discovery, they have genetic links to other breeds, such as the Merino and Poll Dorset (Milne, 2000), which were included. The percentage of polymorphic markers, mean MAF and heterozygosity of the Dorper and SAMM are somewhat lower than those of the SA Merino. These results, however, still compare well with other commercial breeds (Kijas et al., 2012).
The indigenous fat-tailed NA exhibited a large number of non-polymorphic loci and a distinct MAF distribution that is most likely the result of SNP ascertainment bias. Nevertheless, the OvineSNP50 chip provides an opportunity to gain a substantial quantity of genomic information in a relatively rapid and inexpensive manner for all four breeds. However, it is important to recognise the effect ascertainment bias may have on inferences relating to population genetic estimates and genomic selection strategies. A linkage disequilibrium pruning technique, which compensates for SNP ascertainment bias and ensures the unbiased interpretation of genotype data, would therefore be useful to future studies on the local ovine genetic resource (López Herráez et al., 2009; Kijas et al., 2012).
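The general idea of LD-based pruning can be sketched as a greedy filter that walks the loci in map order and discards any locus that is strongly correlated with a locus already retained nearby. The version below is illustrative only and is not necessarily the procedure used in the cited studies: it assumes complete genotype data coded as allele counts (0, 1, 2), uses the squared Pearson correlation of genotype counts as the r² measure, and its `window` and `max_r2` values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic locus carries no LD information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_prune(loci, window=50, max_r2=0.5):
    """Return indices of loci kept after greedy pairwise pruning.

    loci: list of genotype vectors in map order. A locus is kept only
    if its r^2 with every already-kept locus among the preceding
    `window` loci does not exceed `max_r2`.
    """
    kept = []
    for j, calls in enumerate(loci):
        nearby = [k for k in kept if j - k <= window]
        if all(r_squared(calls, loci[k]) <= max_r2 for k in nearby):
            kept.append(j)
    return kept


# Hypothetical toy data: six animals genotyped at four loci.
toy_loci = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],  # identical to locus 0 (r^2 = 1)
    [2, 1, 0, 1, 2, 0],  # mirror image of locus 0 (r^2 = 1)
    [0, 2, 1, 0, 1, 2],  # weakly correlated with locus 0 (r^2 = 0.25)
]
print(ld_prune(toy_loci))
```

Here the second and third loci are removed because they are in complete LD with the first, leaving a reduced marker set in which neighbouring loci are only weakly correlated.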
The utility of the OvineSNP50 chip for genotyping the local Dorper, SA Merino and SAMM breeds has been demonstrated. However, studies that utilise the chip to genotype indigenous breeds or breeds that are not included extensively in SNP discovery might be compromised by the effects of SNP ascertainment bias. There is scope for further studies to examine the use of the ovine SNP chip in the other breeds that constitute the South African ovine genetic resource and to characterise the breed diversity of South African sheep using whole-genome SNP data.

Figure 1 Minor allele frequency distribution of SNP loci included on the OvineSNP50 chip for the South African Mutton Merino (SAMM), Dorper, Namaqua Afrikaner (NA) and South African Merino (SA Merino) breeds.

Table 1 Number of polymorphic loci, mean minor allele frequency (MAF), observed heterozygosity (He) and inbreeding coefficient (FIS) of the breeds tested
NA: Namaqua Afrikaner; SAMM: South African Mutton Merino; n: number of samples.