Assessment of OvineSNP 50 in Nigerian and Kenyan sheep populations

Deciphering genomic information requires markers that are polymorphic and numerous enough to capture the genetic variation present in a genome. Polymorphic loci can differ greatly between breeds of the same species, and the exclusion of Nigerian and some other African sheep breeds during the development of the OvineSNP50 chip necessitated validation of the SNPs on the chip before it can be used for genomic applications in the excluded breeds. A total of sixty sheep samples were genotyped (10 each of the Balami, Uda, West African Dwarf and Yankasa breeds from Nigeria, and the Dorper and Red Maasai breeds from Kenya, East Africa) using the Ovine 50k Illumina SNP bead chip. Results revealed that 33,994 (97.47%) of the 34,876 called SNPs were validated for downstream analysis. Mean heterozygosity values of 0.154 and 0.153 were obtained for polymorphic SNPs on the sex and autosomal chromosomes, respectively, while mean identity-by-state (IBS) values of 0.662 and 0.054 were obtained on the sex and autosomal chromosomes, respectively. Six and three individuals violated the per-individual call rate (Per.ID) and identity-by-state (IBS) thresholds, respectively. The Ovine 50k Illumina SNP bead chip was informative in the Nigerian and East African sheep studied and should be useful in examining their underlying genetic variation.
*Author for Correspondence: iloribm@funaab.edu.ng

SNPs are powerful molecular markers based on DNA sequence polymorphisms; to be informative for population-based studies, they must be sufficiently numerous and evenly spaced throughout the genome (Kijas et al., 2009). SNP genotyping is useful for the analysis of genetic biodiversity, population structure and linkage disequilibrium, for mapping quantitative trait loci and, more recently, for genomic selection. Indeed, DNA sequences determine the diversity of organisms, and techniques that evaluate DNA polymorphisms therefore measure genetic diversity directly. With limited ovine genomic sequence available, the immediate objective of the International Sheep Genomics Consortium was to sequence the ovine genome in order to identify SNPs for a 50k SNP chip. The Ovine SNP50 Bead Chip is a comprehensive genome-wide genotyping array for the ovine genome, providing power to interrogate genetic variation across many breeds. Data generated from the Ovine SNP50 Bead Chip have been used in several ways, including the development of Molecular Breeding Values and the identification of genomic regions that explain part or all of the variation in monogenic traits or a large amount of the variation in polygenic traits. A number of publications already report the identification of genes causing monogenic disorders such as arthrogryposis, achondroplasia and progressive muscular dystrophy through selective genotyping of case and control animals using the Ovine SNP50 Bead Chip (Christensen et al., 2013).
Knowledge of the genetic relationships among neighbouring sheep populations is crucial for conservation efforts (Mukhongo et al., 2014), while genetic diversity allows populations to adapt to changing environments. With more variation, it is more likely that some individuals in a population will possess alleles suited to the environment (Fu et al., 2017). The conservation of farm animal genetic resources is important for coping with future breeding needs and for facilitating the sustainable use of marginal areas.
Nigerian sheep breeds were excluded during the development of the Ovine 50k SNP bead chip, the SNPs on the panel are subject to ascertainment bias, and a Nigerian sheep genome study is lacking. It is therefore necessary to evaluate Nigerian and neighbouring sheep genetic resources with the Ovine 50k SNP bead chip to determine whether they are polymorphic for a sufficient number of the SNPs on the chip, and to assess the chip's applicability and suitability for characterising these resources, which could pave the way for their conservation and better breeding management.

Materials & Methods
The four main sheep breeds in Nigeria, namely Balami, Uda, Yankasa and West African Dwarf (Adu and Ngere, 1979), together with two breeds from Kenya, Dorper and Red Maasai, were included in this study to assess the possible effect of ascertainment bias. Such bias is possible because the chip was developed from a limited number of breeds and samples, which are not truly representative of the polymorphism and allele frequencies in the genome of the species (Nielsen and Signorovitch, 2003).

Results
Of the 60 samples sent for genotyping, the Balami and Yankasa samples did not amplify and were therefore excluded from the analysis. A total of 34,876 SNPs were mapped: 865 on the X chromosome and 34,011 on the 26 autosomes. The two sets were analysed separately. The distribution of genotyped markers across the chromosomes is presented in Table 1. The number of SNPs genotyped per chromosome ranged from 421 on chromosome 24 to 3,815 on chromosome 1.

SNP Genotyping
Blood samples were collected from the jugular vein of each animal into heparinised tubes and stored at -2 °C, using EDTA (ethylenediamine-tetra-acetic acid) as anticoagulant. Approximately 2 ml of blood was collected from each animal into EDTA bottles and then transferred to the Animal Breeding and Genetics laboratory (FUNAAB) for analysis. DNA was extracted using the NORGEN Genomic DNA isolation kit (Norgen Biotek Corporation, Ontario, Canada) according to the manufacturer's protocol. The concentration and purity of the extracted DNA were ascertained using gel electrophoresis and NanoDrop. Genotyping of the DNA samples was outsourced to a private sequencing company (GeneSeek, Lincoln, NE, USA) and carried out according to standard Illumina protocols on the Illumina Infinium array, which consists of nearly 50,000 SNPs derived mainly from ovine breeds. GenomeStudio Software v1.0 (Genotyping Module, Illumina) was used for genotype calling from SNP intensity data and to ensure stringency of quality control parameters.

SNP Quality Control and downstream analysis
The SNP data generated were subjected to quality control using the following thresholds to determine the number of polymorphic loci: minor allele frequency (MAF), 1%; SNP call rate, 90%; per-individual (sample) call rate, 90%; and identity-by-state (IBS), 90%. The quality control measures were applied using the check.marker function of the GenABEL package implemented in R (Aulchenko et al., 2007). MAF and within-breed genetic variability (observed and expected heterozygosity) were estimated using the descriptives.marker and perid.summary functions of the GenABEL package.
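The filtering criteria above can be illustrated with a minimal sketch. It is not the GenABEL implementation used in the study: it is plain Python operating on a hypothetical genotype matrix coded as 0, 1 or 2 copies of one allele (None for a failed call). The function names, the order of the filters (samples first, then SNPs) and the use of "greater than or equal to" at each threshold are assumptions made for illustration, and the IBS filter is left out.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 copies of one allele; None = no call


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of genotypes that were successfully called."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Frequency of the rarer allele among the called genotypes."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # two alleles per animal
    return min(freq, 1.0 - freq)


def quality_control(
    genotypes: List[List[Genotype]],
    min_maf: float = 0.01,
    min_snp_call: float = 0.90,
    min_sample_call: float = 0.90,
) -> Tuple[List[int], List[int]]:
    """Return indices of the samples and SNPs that pass the thresholds.

    genotypes[i][j] is the call for sample i at SNP j.
    """
    # Hypothetical ordering: drop poorly genotyped animals first.
    kept_samples = [
        i for i, row in enumerate(genotypes) if call_rate(row) >= min_sample_call
    ]
    kept_snps = []
    for j in range(len(genotypes[0])):
        column = [genotypes[i][j] for i in kept_samples]
        if (call_rate(column) >= min_snp_call
                and minor_allele_frequency(column) >= min_maf):
            kept_snps.append(j)
    return kept_samples, kept_snps


# Toy data: 4 animals x 3 SNPs (hypothetical values).
toy = [
    [0, 1, 2],
    [0, 1, None],
    [0, 2, 1],
    [None, None, None],
]
print(quality_control(toy))
```

On the toy matrix, the animals at index 1 and 3 fall below the 90% sample call rate, and the SNP at index 0 is monomorphic among the remaining animals, so only the SNPs at index 1 and 2 are retained.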

SNP Quality Control
The result of the sample-based quality control (Table 2) showed that, of the 865 SNPs on the sex chromosome, 1 marker violated the call rate threshold of 90%, while 46 markers were removed for violating the minor allele frequency threshold of 1%. Six individuals did not meet the 90% threshold for per-individual call rate (Per.ID) and another three individuals violated the 90% identity-by-state (IBS) threshold; the mean IBS was 0.662 and the mean heterozygosity was 0.154. In total, nine individuals and 47 markers were removed for violating quality control, leaving 818 markers on the sex chromosome.
On the autosomes, 33 markers were removed due to low call rate and 802 markers were removed for violating the minor allele frequency threshold. Also, six individuals violated the 90% threshold for per-individual call rate (Per.ID) and another three violated the 90% identity-by-state (IBS) threshold; the mean IBS was 0.054 and the mean heterozygosity was 0.153. In total, 9 individuals (the same individuals as for the sex chromosome) and 835 markers were removed for violating quality control, leaving 31 individuals and 33,176 markers.
Table 2: Quality control of SNPs
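The marker counts reported for the sex chromosome and the autosomes can be cross-checked with simple arithmetic. The short sketch below only re-adds the figures quoted in the text; the variable names are illustrative.

```python
# Marker counts as reported in the text.
x_total, x_low_call, x_low_maf = 865, 1, 46
auto_total, auto_low_call, auto_low_maf = 34_011, 33, 802

x_kept = x_total - (x_low_call + x_low_maf)
auto_kept = auto_total - (auto_low_call + auto_low_maf)

total_called = x_total + auto_total
total_kept = x_kept + auto_kept
percent_kept = 100 * total_kept / total_called

print(x_kept, auto_kept, total_called, total_kept, round(percent_kept, 2))
# prints: 818 33176 34876 33994 97.47
```

The totals agree with the 818 and 33,176 markers retained and with the 33,994 of 34,876 SNPs (97.47%) validated for downstream analysis.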

Distribution of minor allele frequency of markers within the populations
For SNPs with a MAF of less than or equal to 1%, the proportion of SNPs ranged from 0.073 in Dorper to 0.216 in WAD (Table 3). It was also observed that with a MAF greater than 1% and less than or equal to 5%, only WAD has a proportion of 0.094. Uda has the highest proportion of 0.179, compared to WAD's 0.088, with a MAF greater than 5% and less than or equal to 10%. With a MAF greater than 10% and less than or equal to 20%, the SNP proportion ranged from 0.131 in Red Maasai to 0.204 in Dorper. Finally, the SNP proportion ranged from 0.437 in WAD to 0.631 in Dorper with a MAF greater than 20%.
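The MAF classes used above (up to 1%, above 1% to 5%, above 5% to 10%, above 10% to 20%, and above 20%) can be sketched as a simple binning step. This is an illustrative Python sketch, not the procedure used to build Table 3; the function names and the toy MAF values are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Upper bound (inclusive) and label of each MAF class, in ascending order.
BINS = [(0.01, "<=1%"), (0.05, ">1-5%"), (0.10, ">5-10%"), (0.20, ">10-20%")]
LABELS = [label for _, label in BINS] + [">20%"]


def maf_class(maf: float) -> str:
    """Return the label of the MAF class a single SNP falls into."""
    for upper, label in BINS:
        if maf <= upper:
            return label
    return ">20%"


def maf_proportions(mafs: List[float]) -> Dict[str, float]:
    """Proportion of SNPs in each MAF class for one population."""
    counts = Counter(maf_class(m) for m in mafs)
    return {label: counts[label] / len(mafs) for label in LABELS}


# Hypothetical per-SNP MAF values for one breed.
toy_mafs = [0.0, 0.01, 0.03, 0.08, 0.15, 0.25, 0.40, 0.50, 0.30, 0.45]
print(maf_proportions(toy_mafs))
```

Applied per breed, the resulting proportions sum to one, which is the layout the per-class comparisons in the text assume.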

MAF distribution across the populations on the sex chromosome
On the sex chromosome, with a MAF of less than or equal to 1%, the proportion of SNPs ranged from 0.164 in Dorper to 0.248 in Uda (Table 4). It was also observed that with the MAF of greater than 1%

Discussion
The OvineSNP50, a microarray-based system, was designed to determine the genotype of approximately 54,000 single nucleotide polymorphisms spaced evenly across the ovine genome, yet its development excluded the Nigerian and other African sheep populations (Sandenbergh et al., 2016). Sample-based filtering relates to the call rate, which is the fraction of called SNPs per sample over the total number of SNPs in the dataset and can be influenced by factors such as DNA quality and concentration (Anderson et al., 2010). The call rate of 99.90% obtained in this study is comparable with call rates of above 99.5% (Kijas et al., 2013) and 99.9% (Tosser-Klopp et al., 2014) reported in related studies. It also compares favourably with the call rate of 99.6% found in South African Angora goats (Lashmar et al., 2015). This indicates the high quality and informativeness of the Nigerian sheep populations' DNA.
The successful application of SNP arrays depends largely on the degree of polymorphism in the various breeds within each species (Fan et al., 2010). After filtering out SNPs with low minor allele frequency (MAF), which is the frequency at which the second most common allele occurs in a population, 97.47% proved to be polymorphic for these sheep breeds, which is comparable to the proportion of polymorphic loci found in South African populations (81.16-86.85%; Sandenbergh et al., 2016) and to the number found in goat breeds (Jinlan: 45,648; Skopelos: 50,908) that were not included in the SNP discovery process (Tosser-Klopp et al., 2014). The number of polymorphic loci compares favourably with the values of 28,869 and 35,084 SNPs obtained in the African N'Dama and Sheko breeds, respectively, during Illumina's bovine SNP50 content validation study (Matukumalli et al., 2009). The result of this study indicates a high level of polymorphism, which is attributed to the informativeness of the African breeds genotyped.
The per-individual call rate (Per.I.D) is a quality control measure that bears on the precision of downstream analyses such as genomic prediction, and the Per.I.D call rate of 85% for the population points towards an informative population. When two or more individuals have identical nucleotide sequences in a DNA segment, the segment is said to be identical by state (IBS). For any given pair of individuals with genotype information, identity-by-state (IBS) can be observed at a given locus with three possible outcomes: the individuals share no alleles, or they share one or two alleles. Two individuals who share 1 or 2 alleles IBS at a given locus may have inherited the shared allele(s) from a recent common ancestor, in which case these allele(s) are identical-by-descent (IBD) (Lashmar et al., 2015). After filtering out individuals that violated the IBS threshold, 92.5% of the individuals were validated for further analysis.
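The three per-locus outcomes described above (0, 1 or 2 alleles shared) can be turned into a pairwise IBS score. The sketch below uses one common convention, the mean proportion of alleles shared across loci called in both animals; it is an assumption for illustration and is not claimed to reproduce the exact IBS statistic computed by GenABEL in this study. Genotypes are assumed to be coded as 0, 1 or 2 copies of one allele, with None for a missing call.

```python
from typing import List, Optional

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 copies of one allele; None = no call


def shared_alleles(g1: int, g2: int) -> int:
    """Number of alleles (0, 1 or 2) two genotypes share identical by state."""
    return 2 - abs(g1 - g2)


def mean_ibs(ind_a: List[Genotype], ind_b: List[Genotype]) -> float:
    """Mean proportion of alleles shared IBS over loci called in both animals."""
    pairs = [(a, b) for a, b in zip(ind_a, ind_b) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no loci called in both individuals")
    return sum(shared_alleles(a, b) for a, b in pairs) / (2 * len(pairs))


# Two hypothetical animals genotyped at five loci.
animal_a = [0, 1, 2, 2, None]
animal_b = [0, 2, 0, 2, 1]
print(mean_ibs(animal_a, animal_b))  # 0.625
```

Under this convention, a pair whose score exceeds the chosen threshold (90% in this study) would be flagged as too similar, for example as a possible duplicate sample.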
Heterozygosity is the condition in which an organism carries two different alleles of a given gene, and it reflects genetic variability within the population. The mean heterozygosity, which is the average proportion of heterozygous SNPs, is high in this study and indicates a high level of genetic variation within the population with a low level of inbreeding. The heterozygosity value obtained in the present study compares favourably with that of South African sheep (0.33 to 0.35) (Sandenbergh et al., 2016) and with the 0.365 obtained for Angora goats (Lashmar et al., 2015).
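As a minimal sketch of the quantities involved, observed heterozygosity at a SNP is the fraction of called genotypes that are heterozygous, and expected heterozygosity under Hardy-Weinberg proportions is 2pq, where p and q are the two allele frequencies. The Python below assumes genotypes coded as 0, 1 or 2 copies of one allele (None for a missing call); the function names and toy values are hypothetical and do not reproduce the GenABEL output.

```python
from typing import List, Optional

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 copies of one allele; None = no call


def observed_heterozygosity(calls: List[Genotype]) -> float:
    """Fraction of called genotypes at one SNP that are heterozygous."""
    called = [g for g in calls if g is not None]
    return sum(g == 1 for g in called) / len(called)


def expected_heterozygosity(calls: List[Genotype]) -> float:
    """Expected heterozygosity 2pq under Hardy-Weinberg proportions."""
    called = [g for g in calls if g is not None]
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return 2 * p * (1 - p)


def mean_heterozygosity(snps: List[List[Genotype]]) -> float:
    """Average observed heterozygosity across SNPs."""
    return sum(observed_heterozygosity(s) for s in snps) / len(snps)


# Two hypothetical SNPs genotyped in a handful of animals.
snp_1 = [0, 1, 1, 2, None]
snp_2 = [0, 0, 0, 1]
print(observed_heterozygosity(snp_1), expected_heterozygosity(snp_1))
print(observed_heterozygosity(snp_2), expected_heterozygosity(snp_2))
print(mean_heterozygosity([snp_1, snp_2]))
```

An observed value well below the expected value at many loci is the usual signal of inbreeding, which is the comparison the paragraph above relies on.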
The MAF distribution patterns of all breeds were similar to those of South African, European and American breeds (Kijas et al., 2012; Sandenbergh et al., 2016), with most loci exhibiting MAFs of 20% or greater. However, West African Dwarf sheep exhibited the lowest MAF distribution of the sheep breeds used in this study. This result might be explained by the low number of individuals and breeds of African origin used in the development of the chip panel, resulting in SNP ascertainment bias, and by the possibility of rare alleles in the WAD and other sheep populations used in the study.
From our whole-genome genotyping result, it can be concluded that Nigerian and Kenyan sheep populations are polymorphic for the majority of the SNPs genotyped on the SNP chip, despite the fact that none of the breeds was included in its development. The WAD and Uda sheep breeds of Nigeria had relatively low MAF, heterozygosity and numbers of polymorphic markers compared with the Kenyan sheep breeds used in this study. This is most likely a result of SNP ascertainment bias and of the fact that these sheep breeds have not been improved for any known traits. Improper management might also have resulted in inbreeding within the herd and a resultant increase in non-polymorphic loci. Although effective population size and rate of evolution differ between the autosomes and the X chromosome, this marker panel gives access to a large amount of genomic information, albeit with ascertainment bias in Nigerian sheep populations, which could be ameliorated using linkage pruning. It is recommended that the panel be used in all Nigerian sheep populations for characterization, population structure analysis and marker-assisted selection for traits of economic importance.

by Roche 454 whole-genome shotgun sequencing. Fifteen thousand (15,000) SNPs identified reduced representational sequencing

and less than or equal to 5%, only WAD has a proportion of 0.104. Red Maasai has the highest proportion of 0.149, compared to Dorper's 0.044, with a MAF greater than 5% and less than or equal to 10%. With a MAF greater than 10% and less than or equal to 20%, the SNP proportion ranged from 0.123 in Red Maasai to 0.233 in Dorper. Finally, the SNP proportion ranged from 0.423 in WAD to 0.559 in Dorper with a MAF greater than 20%.

Table 3: Minor allele frequency distribution across the populations' autosomal chromosomes. Prop = proportion
Table 4: Minor allele frequency distribution across the four populations' sex chromosome. Prop = proportion

Ilori et al. / Nig. J. Biotech. Vol. 35 (2): 176-183 (Dec. 2018)