Vector backbone integration in transgenic cassava is significantly correlated to T-DNA copy number

Multiple T-DNA integrations often occur in transgenic technology, resulting in complex integration patterns and transgene silencing. This study investigates the correlation between T-DNA copy number and vector backbone (VBB) integration in transgenic cassava using Dot blot and PCR analysis. Thirty-nine, fifty-one and thirty-eight transgenic cassava plant lines recovered from transformations of cassava friable embryogenic callus with A. tumefaciens strain LBA4404 independently carrying p8016, p8052 and p900, respectively, were randomly selected and evaluated for VBB integration and T-DNA copy number. The occurrences of events with low (1-2) and high (≥ 3) T-DNA copy numbers were correlated with the presence and absence of VBB integration. Seventy-two to ninety-eight percent of VBB-free events were low copy number events, while 2 to 28% were high copy number events. Correlation analysis revealed that the number of VBB-free events showed a significant positive correlation (r = 0.821, n = 9, p = 0.01) with the number of low T-DNA copy number events and a significant negative correlation (r = -0.739, n = 9, p = 0.02) with the number of high copy number events. This shows that the recovery of events with low T-DNA copy number increases the chances of recovering VBB-free events, thereby enhancing the production of quality transgenic events.


Introduction
Agrobacterium-mediated transformation is preferred over direct DNA transfer methods such as microparticle bombardment, polyethylene glycol and electroporation due to its simplicity and tendency to generate events with low T-DNA copy number (Shou et al., 2004). However, Agrobacterium-mediated transformation often results in unwanted multiple T-DNA integrations. Multiple T-DNA integrations in transgenic plants have been linked to gene silencing and unpredictable inheritance of the desired traits (Fu et al., 2009; Oltmanns et al., 2010). The complex integration patterns observed at transgenic loci harboring multiple transgene insertions include tandem and inverted repeats determined by sequence homology (Tenea et al., 2006). These structures have been associated with low levels of transgene expression as a result of transcriptional gene silencing (Hobbs et al., 1990; Oltmanns et al., 2010).
Another limitation associated with Agrobacterium-mediated transformation is the tendency to integrate non-T-DNA vector sequences into the genome of transgenic plants (Abdal-Aziz et al., 2006). These non-T-DNA sequences, often referred to as vector backbone (VBB) sequences, consist of bacterial genes coding for antibiotic resistance and Agrobacterium virulence, and the bacterial origin of replication (De Buck et al., 2000). These sequences are undesirable and complicate regulatory processes in the release of genetically modified crops intended for product development. Evidence of VBB integration in transgenic plants produced via Agrobacterium-mediated transformation has been reported in numerous plants including Arabidopsis, tobacco, strawberry and maize (Oltmanns et al., 2010; Kononov et al., 1997; De Buck et al., 2000; Huang et al., 2004). Twenty to fifty percent of recovered plants have been reported to carry VBB sequences in transgenic Arabidopsis and tobacco (De Buck et al., 2000), while Kononov et al. (1997) recorded up to 70% VBB integration in transgenic tobacco. In transgenic strawberry, the frequency of vector backbone integration was shown to be as high as 90%, depending on the type of plasmid used (Abdal-Aziz et al., 2006). Integration of non-T-DNA vector sequences has been attributed to inconsistency in the function of both the left (LB) and right (RB) border sequences in the initiation and termination of T-strand (transfer T-DNA strand) synthesis (Abdal-Aziz et al., 2006; Huang et al., 2004).
Genes of interest (GOI) required to confer desired traits in target plants are usually cloned in the T-DNA region between the LB and RB repeats. The function of the LB and RB sequences is to ensure that only the genes within the T-DNA region are transferred into the genome of target cells (Abdal-Aziz et al., 2006).
Originally, T-strand synthesis was believed to initiate at the RB and terminate at the LB sequence. However, evidence that both the RB and LB sequences can initiate and terminate T-strand synthesis was detected in transgenic maize by Huang et al. (2004), and this may be the basis for increased frequencies of VBB integration in transgenic plants.
Complex integration patterns associated with multiple T-DNA integration could be linked with the integration of non-T-DNA vector sequences. Evidence of concatemers of the entire binary vector has been recorded, showing multiple insertions of T-DNA and non-T-DNA vector sequences (Wenck et al., 1997; Gelvin, 2003). Understanding of the mechanism of T-DNA transfer, together with proposed models, is aiding the development of technologies that could avert multiple T-DNA integration and/or suppress the transfer of non-T-DNA vector sequences (Oltmanns et al., 2010; Kohli et al., 2010; Kuraya, 2004). Oltmanns et al. (2010) showed that launching T-DNA from the picA locus of the A. tumefaciens C58 chromosome reduced integrated transgene copy number and almost eliminated VBB integration in transgenic maize and Arabidopsis. Another proposed technology is the use of site-specific recombination to simplify transgene integration by reducing the number of integrated copies (Kohli et al., 2010). Methods to suppress and/or eliminate vector backbone integration in transgenic rice have also been reported (Kuraya, 2004). The "PureMlb® technology" developed by Japan Tobacco Inc. is based on modification of the left border repeats, with two or more additional left border sequences cloned proximal to the original LB sequence (Kuraya, 2004). The modification is aimed at preventing read-through at the left border region, thereby suppressing the transfer of VBB sequences. Kuraya et al. (2004) reported a 93% reduction in VBB integration in immature rice embryos transformed with constructs having multiple left borders (MLB). Xudong et al. (2007) have also developed vectors and methods for Agrobacterium-mediated plant transformation that promote the production of events with reduced VBB integration and a high frequency of low T-DNA copy number. This was achieved by incorporating replication origin elements that maintain the DNA construct at low copy number in the bacterium used for plant cell transformation.
In the present study, the relationship between multiple T-DNA insertion and VBB integration was investigated to determine whether the two occurrences are correlated. The aim was to ascertain whether recovery of VBB-free events increases the chances of recovering events with low T-DNA copy number. This will determine whether methods that limit or suppress VBB integration, or the early elimination of VBB-integrated events from genetic transformation systems, could reduce the time and cost required to produce quality transgenic events that are free of undesirable VBB sequences and carry a low transgene copy number.
Transgenic cassava plants were obtained after Agrobacterium-mediated transformation of cassava friable embryogenic callus (FEC) with the three gene constructs p8016, p8052 and p900. The p8016 construct was designed using the MLB approach (Kuraya, 2004). We also incorporated a red fluorescent protein (DsRed) (Wenck et al., 2003) in the VBB region to aid in the identification of transgenic events carrying integrated VBB sequences (Okwuonu et al., 2015). DsRed has been shown to be a superior fluorescent protein for monitoring gene expression in chlorophyll-containing tissues (Zhang et al., 2015). The p8052 construct is similar to p8016 in architecture, with the exception of the multiple LB repeats, while p900 is a binary plasmid carrying DsRed in the T-DNA region. The presence of VBB sequences past the LB and RB repeats was investigated at the visual and molecular levels. Occurrences of low (1-2) versus high (≥ 3) T-DNA copy number insertions were evaluated, with statistical analysis showing a significant positive correlation between multiple T-DNA insertion and VBB integration, and a significant negative correlation with VBB-free events.

Gene constructs
The gene constructs employed in this study, p8016, p8052 and p900, were developed at the International Institute for Crop Improvement ( (Murashige & Skoog, 1962) supplemented with 2% w/v sucrose (Taylor et al., 2012). The FEC target tissue was transformed with A. tumefaciens strain LBA4404 independently carrying the p8016, p8052 and p900 constructs as described by Okwuonu et al. (2015).

Dot-blot analysis of transgenic plants
DNA was extracted from leaf samples of randomly selected plant lines recovered from p8016, p8052 and p900 transformations using the Qiagen DNeasy Plant Mini Kit (CA, USA) following the manufacturer's instructions. DNA samples were quantified using a NanoDrop 2000 UV-Vis Spectrophotometer (Thermo Scientific, Pittsburgh PA, United States). One hundred nanograms of DNA was blotted in triplicate onto Hybond-N+ nylon membrane (GE Healthcare, Piscataway, NJ) and probed with specific probes for the detection of VBB sequences. The primers used for probe synthesis and their descriptions are listed in Table 1. Probes were synthesized with the DIG-probe synthesis kit as specified by the manufacturer (Roche Applied Science, Indianapolis, IN, USA). DNA-probe hybridization was visualized by exposing the membrane to X-ray film. The developed films were scanned and saved as TIFF files. The presence or absence of signal on the triplicate dots was indicative of the presence or absence of VBB integration. A VBB-integrated event confirmed by Southern blot analysis was used as the reference for the presence of VBB integration, and a non-transgenic 60444 DNA sample was used as the negative control.
For copy number evaluation, a hybridization probe was synthesized using sense and antisense primers derived from the nptII gene (Table 1). Scanned and saved X-ray films were analyzed using the open-source ImageJ software version 1.36b as described by Taylor et al. (2012). The ImageJ elliptical selection tool was used to measure the intensity of the signal, and data were automatically exported as an Excel file. The copy number of each sample was extrapolated from the average of the triplicate dots. Events previously confirmed by Southern blot analysis as carrying one, two and three copies were used as references for low and high T-DNA copy number (Taylor et al., 2012).
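The extrapolation step can be sketched as follows. This is a minimal illustration only: the source does not specify the exact calculation, so the sketch assumes a linear relationship between mean dot intensity and copy number, fitted through the Southern-confirmed one-, two- and three-copy references. The intensity values, the function names and the rounding rule are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def estimate_copy_number(sample_dots, reference_dots):
    """Estimate copy number from triplicate dot intensities.

    reference_dots maps a known copy number to its triplicate intensities.
    Assumption: intensity scales linearly with copy number.
    """
    copies = sorted(reference_dots)
    ref_means = [mean(reference_dots[c]) for c in copies]
    slope, intercept = fit_line(copies, ref_means)
    estimate = (mean(sample_dots) - intercept) / slope
    # Assumption: a confirmed transgenic event carries at least one copy.
    return max(1, round(estimate))

def copy_class(copy_number):
    """Classify as low (1-2 copies) or high (3 or more), as in the text."""
    return "low" if copy_number <= 2 else "high"

# Hypothetical intensities (arbitrary units) for the reference events.
references = {
    1: [1010, 980, 1005],
    2: [1990, 2030, 2010],
    3: [3020, 2960, 3010],
}
sample = [2890, 3050, 2975]  # hypothetical triplicate for one plant line

n_copies = estimate_copy_number(sample, references)
print(n_copies, copy_class(n_copies))
```

Averaging the triplicate before interpolation mirrors the text; a calibration that is not linear over the reference range would need a different fit.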

Multiplex PCR analysis of transgenic plants
Specific LB and RB primers shown in Table 1 were used in multiplex PCR to confirm the results obtained with the Dot blot hybridization assay. Each 20 µl reaction contained 10 µl of 2× Phusion High-Fidelity master mix (New England Biolabs, USA) and 1 µl each of primers 63, 64, 81 and 382 described in Table 1. DMSO (0.6 µl) was added to each reaction mix to optimize the reaction, and the volume was made up to 20 µl with 4.4 µl of nuclease-free water. PCR conditions consisted of an initial denaturation for 30 s at 98 °C, followed by denaturation for 10 s at 98 °C, annealing at 70 °C for 30 s and extension at 72 °C for 45 s, with a final extension at 72 °C for 10 min. The completed reaction was held at 4 °C. Plasmid DNA and confirmed VBB-integrated events were used as positive controls. DNA isolated from the non-transgenic cultivar 60444 was used as the negative control.
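As a quick arithmetic check on the reaction set-up, the stated component volumes can be summed and scaled into a master mix. This is a sketch under stated assumptions: the components listed in the text total 19 µl, and treating the remaining 1 µl as template DNA is an inference, not something the source states. The 10% pipetting overage and the function names are hypothetical.

```python
# Per-reaction volumes in microlitres, as stated in the text.
stated_components = {
    "2x Phusion master mix": 10.0,
    "primer 63": 1.0,
    "primer 64": 1.0,
    "primer 81": 1.0,
    "primer 382": 1.0,
    "DMSO": 0.6,
    "nuclease-free water": 4.4,
}
FINAL_VOLUME_UL = 20.0

def remaining_volume(components, final_volume):
    """Volume not accounted for by the listed components."""
    return round(final_volume - sum(components.values()), 2)

def scale_master_mix(components, n_reactions, overage=0.1):
    """Scale per-reaction volumes to n reactions plus a pipetting overage.

    The 10% default overage is a hypothetical bench convention.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in components.items()}

# Assumption: the remainder is the volume available for template DNA.
template_ul = remaining_volume(stated_components, FINAL_VOLUME_UL)
print(f"Volume left per reaction: {template_ul} ul")

for name, vol in scale_master_mix(stated_components, n_reactions=24).items():
    print(f"{name}: {vol} ul")
```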

Statistical analysis
The relationship between low/high T-DNA copy number integrations and VBB-integrated/VBB-free events was determined by correlation coefficient analysis. Data were analyzed using the Data Analysis tool in Microsoft Excel, and the p-value was calculated with a p-value calculator.
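For readers who wish to reproduce this kind of analysis outside Excel, the calculation can be sketched in plain Python. This is a minimal sketch, not the authors' procedure: it assumes a Pearson correlation coefficient and a two-tailed Student's t-test with n - 2 degrees of freedom, whereas the source does not state which convention its p-value calculator used. The paired counts and all function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_pdf(t, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(r, n, steps=10_000):
    """Two-tailed p-value for r with n pairs, via t = r*sqrt(df/(1-r^2))."""
    if abs(r) >= 1.0:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and t.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

# Hypothetical paired counts (nine observations), for illustration only.
low_copy_events = [12, 9, 15, 7, 11, 14, 6, 10, 13]
vbb_free_events = [10, 8, 13, 5, 10, 11, 6, 7, 12]

r = pearson_r(low_copy_events, vbb_free_events)
print(f"r = {r:.3f}, p = {two_tailed_p(r, len(low_copy_events)):.4f}")
```

The numerical integration avoids a dependency on a statistics library; with one available, a library routine for the t distribution would be the more usual choice.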

Analysis of VBB integration and T-DNA copy number
Three constructs were employed in this study to compare the frequencies of VBB integration in relation to T-DNA copy number: p8016, designed with the MLB approach and carrying the DsRed marker gene in the VBB region; p8052, a single-LB construct also harboring the DsRed marker in the backbone; and p900, with the DsRed marker in the T-DNA region. Thirty-nine, fifty-one and thirty-eight transgenic plant lines recovered from FECs transformed with constructs p8016, p8052 and p900, respectively, were randomly selected. Genomic DNA was extracted from leaf samples and evaluated for the presence or absence of VBB sequences by Dot blot hybridization using specific LB and RB probes synthesized with the primer sets shown in Table 1. Samples B3 and F2 in Figure 2 were positive controls derived from transgenic events previously confirmed by Southern blot analysis as having integrations of sequences past the LB and RB repeats, while sample H1 served as a negative control from a non-transgenic cassava line of cultivar 60444. Samples C1, G3 and G4 were confirmed positive for VBB integration by giving a positive signal with the LB and RB probes.
Results obtained with Dot blot hybridization were confirmed by PCR using primers DsRed-F and Pri-63 for detection of sequences within the VBB past the LB sequence, and primers Pri-81 and Pri-382 for the detection of sequences past the RB sequence (Table 1). Multiplex PCR amplification of 721 bp and 1165 bp products confirmed the integration of VBB sequences past the RB and LB sequences (Fig. 3).

Relationship between T-DNA copy number and VBB integration
Integration of VBB sequences and its correlation with the T-DNA copy number present in the plant genome were investigated. Dot blot hybridization (Fig. 4) was used to estimate the T-DNA copy number of plants derived from the three constructs p8016, p8052 and p900, as explained in the previous section. The number of VBB-integrated events having low and high T-DNA copy number, as well as the percentage occurrences of these events, are shown in Table 3, while Table 4 shows the number of VBB-free events with low and high T-DNA copy number and their percentage occurrences.
A correlation analysis was carried out to confirm the relationship between VBB integration and multiple T-DNA integration. The number of VBB-integrated events showed a negative correlation with low T-DNA copy number events that did not reach significance at the 0.05 level (r = -0.427, n = 9, p = 0.12) and a significant positive correlation (r = 0.821, n = 9, p = 0.01) with high T-DNA copy number events (Table 5). The number of VBB-free events showed a significant positive correlation (r = 0.821, n = 9, p = 0.01) with low T-DNA copy number events and a significant negative correlation (r = -0.739, n = 9, p = 0.02) with high T-DNA copy number events (Table 6). The positive correlation between low copy number and VBB-free events means that the lower the copy number of T-DNA integrated into the plant genome, the higher the number of VBB-free events recovered, and vice versa.

Discussion
Despite the preference for Agrobacterium-mediated transformation over direct DNA transfer methods, integration of multiple T-DNA copies and VBB sequences in plants produced via Agrobacterium-mediated transformation remains a concern in the development of genetically modified (GM) crops (Oltmanns et al., 2010). The highest priority in agricultural biotechnology is to produce transgenic plants with single-copy T-DNA insertions, to circumvent problems associated with gene silencing, and plants free of VBB integration, to avert complicated passage through subsequent regulatory processes (Gelvin et al., 2012). Methods that enhance the recovery of single-copy events free of integrated VBB sequences are essential for reducing the cost associated with molecular analysis of recovered transgenic events.
In this study, we evaluated randomly selected transgenic events derived from three constructs p8016, p8052 and p900 for occurrence of VBB integration and T-DNA copy insertions using Dot blot hybridization and PCR analysis. The aim was to ascertain the relationship between T-DNA copy number and VBB integration and thus determine whether methods enhancing the recovery of VBB-free events will invariably enrich the recovery of events with low T-DNA copy number.
Results obtained from molecular analysis showed high occurrences of low copy number events (72 to 98%) amongst VBB-free events, in contrast to VBB-integrated events (22 to 55%). This was supported by correlation analysis, which within the VBB-integrated events showed a negative correlation for low T-DNA copy number (not significant, p = 0.12) and a significant positive correlation for high T-DNA copy number, and within VBB-free events showed a significant positive and a significant negative correlation for low and high T-DNA copy number integrations, respectively. This indicates that the probability of recovering a large population of VBB-free events is high when the population of low T-DNA copy number events is large, and low when the population of high T-DNA copy number events is large. This result corresponds to the findings of Huang et al. (2004), who showed that 50% of the transgenic maize plant lines evaluated were single copy number events without integration of the Ori V segment in the VBB region. Forsbach et al. (2003) and Meza et al. (2002) also independently reported that 6% (9/99) and 13% (5/37) of single-copy Arabidopsis transgenic lines harbored VBB sequences, suggesting that single-copy T-DNA transformants are negatively correlated with the occurrence of VBB integration (Ziemienowicz et al., 2008). This correlation could be attributed to complex integration patterns occurring at the transgenic locus, which could involve tandem integrations as either direct or inverted repeats (determined by sequence homology), with the entire VBB joining two or more T-DNA insertions. This might be a result of bleed-through, or failure of the LB to terminate T-DNA transfer, thus allowing repeated integration of the T-DNA. Evidence of concatemers of the entire binary vector has been recorded, showing multiple insertions of T-DNA and non-T-DNA vector sequences (Wenck et al., 1997; Gelvin, 2003). In order to verify this hypothesis, a detailed and extensive Southern blot analysis is needed to assess whether the integration patterns involve a single multi-copy integration or multiple diverse integrations.
The implication of this study is that molecular characterization of transgenic cassava events could be streamlined by screening for either low T-DNA copy number or VBB-free events. It is therefore imperative to evaluate technologies such as launching T-DNA from the picA locus of the A. tumefaciens C58 chromosome (Oltmanns et al., 2010), the use of vectors with a multiple LB system (Kuraya, 2004) and the use of site-specific recombination systems such as Cre/loxP (Kohli et al., 2010; De Paepe et al., 2009) to avert multiple T-DNA integration and/or suppress VBB integration.