Detection of aneuploidies by paralogous sequence quantification
S Deutsch1, U Choudhury1, G Merla1,*, C Howald1, A Sylvan2, S E Antonarakis1

1 Department of Genetic Medicine and Development, University of Geneva Medical School, GE 1211, Geneva, Switzerland
2 Biotage AB, Kungsgatan 76, Uppsala, Sweden

Correspondence to: Professor Stylianos E Antonarakis, Department of Genetic Medicine and Development, University of Geneva Medical School, GE 1211, Geneva, Switzerland; Stylianos.antonarakis@medecine.unige.ch

Abstract

Background: Chromosomal aneuploidies are a common cause of congenital disorders associated with cognitive impairment and multiple dysmorphic features. Pre-natal diagnosis of aneuploidies is most commonly performed by the karyotyping of fetal cells obtained by amniocentesis or chorionic villus sampling, but this method is labour intensive and requires about 14 days to complete.

Methods: We have developed a PCR based method for the detection of targeted chromosome number abnormalities termed paralogous sequence quantification (PSQ), based on the use of paralogous genes. Paralogous sequences have a high degree of sequence identity, but accumulate nucleotide substitutions in a locus specific manner. These sequence differences, which we term paralogous sequence mismatches (PSMs), can be quantified using pyrosequencing technology, to estimate the relative dosage between different chromosomes. We designed 10 assays for the detection of trisomies of chromosomes 13, 18, and 21 and sex chromosome aneuploidies.

Results: We evaluated the performance of this method on 175 DNAs, highly enriched for abnormal samples. A correct and unambiguous diagnosis was given for 119 out of 120 aneuploid samples as well as for all the controls. One sample which gave an intermediate value for the chromosome 13 assays could not be diagnosed.

Conclusions: Our data suggest that PSQ is a robust, easy to interpret, and easy to set up method for the diagnosis of common aneuploidies, and can be performed in less than 48 h, representing a competitive alternative for widespread use in diagnostic laboratories.

  • DS, Down syndrome
  • FISH, fluorescence in situ hybridisation
  • PSMs, paralogous sequence mismatches
  • PSQ, paralogous sequence quantification
  • QF-PCR, quantitative fluorescence polymerase chain reaction
  • aneuploidies
  • diagnosis
  • paralogous
  • pyrosequencing
  • trisomy

Chromosome number abnormalities or aneuploidies were first recognised as a cause of human disease in 1959, with the detection of an extra copy of chromosome 21 in children affected with Down syndrome (DS).1,2 Since then many numerical chromosome abnormalities have been characterised, and their overall frequency in all human populations is estimated to be around 1/200 live births. Autosomal trisomies of chromosomes 13 (Patau syndrome), 18 (Edwards syndrome), and 21 (DS), and sex chromosome numerical abnormalities (45,X, 47,XXY, 47,XYY, and 47,XXX) account for the vast majority of aneuploidies encountered. DS is the most common chromosomal disorder, resulting in severe mental retardation and multiple dysmorphic features.3 It affects approximately 1/750 live births in all ethnic groups, but high risk pregnancies can be determined through analysis of serum markers and ultrasonographic screening, with maternal age as a risk factor.4

Since the early 1970s prenatal diagnosis for chromosomal disorders has been offered to women with high risk pregnancies, by performing karyotype analysis on fetal cells obtained by amniocentesis or chorionic villus sampling. Even though karyotyping remains the gold standard for chromosome analysis, it has considerable disadvantages, since it is labour intensive, expensive, and takes on average 14 days for the results to be reported.4 These drawbacks have encouraged the development of faster and more efficient techniques for the diagnosis of targeted chromosomal anomalies, such as interphase fluorescence in situ hybridisation (FISH)5 and quantitative fluorescence polymerase chain reaction (QF-PCR).6,7 Both of these techniques are now routinely performed in many diagnostic laboratories in order to provide a rapid (48 h on average) preliminary diagnosis.4 Interphase FISH is also labour intensive, since it requires counting a considerable number of nuclei (50–100) in order to be reliable. QF-PCR, which is based on the amplification of polymorphic microsatellite repeats, is less costly and has the advantage that many samples can be treated in parallel.8,9 However, since individuals are not heterozygous at all polymorphic sites, it requires the analysis of multiple markers (usually four or five) per chromosome in order to obtain at least two informative markers for each individual. This involves setting up and optimising multiplex PCR reactions, which can be a lengthy and complex process.

The completion of the human genome sequence10,11 has provided an extensive catalogue of sequence features that can be exploited for the design of new diagnostic strategies. In this paper we propose and validate a new PCR based diagnostic approach based on the use of paralogous sequences located on different chromosomes. Paralogous sequences have a high degree of sequence identity, but they accumulate nucleotide substitutions over time in a locus specific manner. The principle of the method is based on designing a single pair of primers to co-amplify paralogous sequences located on different chromosomes. The resulting PCR products (of identical size) will contain a number of internal sequence differences (paralogous sequence mismatches or PSMs) that are specific to each locus and are not polymorphic. Quantification of the PSM position can be used to determine the relative dosage of the chromosomes in which the paralogous sequences are located and thus detect the presence of chromosome number abnormalities.

We applied this method, which we term paralogous sequence quantification (PSQ), to 175 DNAs, of which 120 contained a common aneuploidy. We show that it is a reliable, simple, and high throughput alternative for the diagnosis of targeted aneuploidies.

METHODS

Samples

DNA samples from 50 trisomy 21 individuals that had been previously collected with informed consent in our laboratory were used for this study. Specific authorisation was requested from the ethics committee of the Geneva University Hospitals for use of the DNA samples in this particular project. Fifteen fibroblast cell cultures from individuals with various chromosomal abnormalities were purchased from the Coriell Cell Repositories (GM03330, GM02948, GM00526, GM03538, GM02732, GM01359, GM00734, GM00143, GM03102, GM01250, GM09326, GM11337, GM00857, GM01176, GM10179). Sixty DNA samples of individuals carrying trisomies of chromosomes 13 and 18, and various sex chromosome abnormalities, were provided by Genzyme (Cambridge, MA, USA). Finally, 50 normal individuals from the CEPH collection were used as additional controls.

Genomic DNA was prepared with either the PUREGENE whole blood kit (Gentra Systems, Minneapolis, MN, USA) or the QIAamp kit (Qiagen, Hilden, Germany).

Paralogous sequence quantification (PSQ)

PCR reactions with the selected primer pairs (table 1) were set up in a total volume of 25 μl containing 20 ng of genomic DNA, 5 pmol of each primer, and 200 μmol/l of dNTPs. We used 1.25 U of a standard Taq polymerase (Amersham Biosciences, Little Chalfont, Buckinghamshire, UK) or alternatively a ready made 2×PCR mastermix containing dUTP and N-uracil glycosylase (Eurogentec, Seraing, Belgium) with varying levels of MgCl2 and DMSO depending on the assay (table 1).

Table 1

 PCR primers and conditions

PCRs were carried out on a T gradient thermocycler (Biometra, Göttingen, Germany), and cycling conditions consisted of a 2 min step at 50°C, and 10 min denaturation at 94°C. This was followed by 10 cycles of “touchdown PCR” with a 20 s denaturation step at 94°C, a 20 s annealing step starting at 57°C and decreasing by 0.5°C per cycle, and an extension step at 72°C for 20 s. The final 30 cycles were as before, but with a constant annealing temperature of 52°C, followed by a final elongation step of 72°C for 5 min.
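The annealing profile described above can be written out cycle by cycle. The following is a minimal sketch; the function name and its parameters are illustrative and not part of the original protocol. It assumes the first touchdown cycle anneals at 57°C, so that the tenth anneals at 52.5°C, one step above the constant phase at 52°C.

```python
def touchdown_annealing_schedule(start_c=57.0, step_c=0.5, touchdown_cycles=10,
                                 constant_c=52.0, constant_cycles=30):
    """Return the annealing temperature (in °C) for every PCR cycle.

    Assumption: the first touchdown cycle uses start_c and each later
    touchdown cycle is step_c lower than the one before it.
    """
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    constant = [constant_c] * constant_cycles
    return touchdown + constant


schedule = touchdown_annealing_schedule()
print(len(schedule))   # total number of amplification cycles
print(schedule[:10])   # the touchdown phase
```

Under this reading the protocol runs 40 amplification cycles in total, and the touchdown phase ends half a degree above the constant annealing temperature.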

PCR products were purified, and annealed to an internal sequencing primer close to the PSM site to be quantified. The purification and pyrosequencing steps were performed following the instructions of the manufacturer (Biotage AB, Uppsala, Sweden).

Data analysis

The Pyrosequencing software (PSQ 96 MA software; www.biotage.com) directly outputs a quantitative value for the proportion of each PSM present in the PCR product. We used the percent of the “query” chromosome as our statistic for all calculations. To determine the range of values that could be confidently diagnosed for every assay we calculated the 99% confidence intervals for the distributions of control and affected individuals (which together form a bimodal distribution). Any sample with a value outside these limits was considered uncertain. Uncertain samples were treated either as false positives or as false negatives according to the known karyotypes, and this was used to estimate the sensitivity and specificity of each test using standard approaches.12
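The classification rule can be sketched as follows. The text does not state how the 99% limits were computed, so this example assumes each group is approximately normal and uses mean ± 2.576 SD; that model, the function names, and the replicate values are all illustrative assumptions rather than the authors' procedure or data.

```python
from statistics import mean, stdev

Z_99 = 2.576  # two-sided 99% quantile of the standard normal (assumed model)


def limits_99(values):
    """Lower and upper 99% limits for one group, assuming normality."""
    m, s = mean(values), stdev(values)
    return (m - Z_99 * s, m + Z_99 * s)


def classify(value, control_limits, affected_limits):
    """Call a sample 'control', 'affected', or 'uncertain'."""
    in_control = control_limits[0] <= value <= control_limits[1]
    in_affected = affected_limits[0] <= value <= affected_limits[1]
    if in_control and not in_affected:
        return "control"
    if in_affected and not in_control:
        return "affected"
    return "uncertain"


def sensitivity_specificity(calls):
    """calls: list of (known_status, call) pairs.

    An uncertain affected sample counts as a false negative and an
    uncertain control as a false positive, as described in the text.
    """
    tp = sum(1 for known, call in calls if known == "affected" and call == "affected")
    fn = sum(1 for known, call in calls if known == "affected" and call != "affected")
    tn = sum(1 for known, call in calls if known == "control" and call == "control")
    fp = sum(1 for known, call in calls if known == "control" and call != "control")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up percent-of-query values, for illustration only.
controls = [49.0, 50.5, 50.0, 51.0, 49.5]
affected = [59.0, 60.5, 60.0, 61.0, 59.5]
c_lim, a_lim = limits_99(controls), limits_99(affected)
print(classify(50.2, c_lim, a_lim), classify(55.0, c_lim, a_lim))
```

A value lying between the two intervals is reported as uncertain rather than forced into either group, which is why uncertain samples lower the estimated sensitivity or specificity without producing a wrong call.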

In order to combine the two assays for each type of aneuploidy, we first normalised each distribution so that the average percent of the query chromosome for the control individuals was 50 (the expected outcome). We then calculated the mean of the two assays for each sample.
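One way to read this normalisation is sketched below. The text does not say whether values were rescaled or shifted; this example assumes a multiplicative rescaling so that the control mean becomes 50, and an additive shift would be an equally plausible reading. The function names and the numbers are illustrative only.

```python
from statistics import mean


def normalise(values, control_values, target=50.0):
    """Rescale values so that the control group mean equals target.

    Assumption: multiplicative scaling; the source does not specify
    the exact normalisation used.
    """
    factor = target / mean(control_values)
    return [v * factor for v in values]


def combine(assay_a, assay_b):
    """Per-sample mean of two normalised assays (same sample order)."""
    return [(a + b) / 2 for a, b in zip(assay_a, assay_b)]


# Made-up values: assay B reads systematically lower than assay A.
controls_a = [50.0, 49.0, 51.0]
controls_b = [45.0, 44.0, 46.0]
samples_a = normalise([60.0], controls_a)
samples_b = normalise([54.0], controls_b)
print(combine(samples_a, samples_b))
```

Normalising before averaging keeps an assay with a systematic offset from pulling the combined value of every sample in the same direction.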

In order to determine the reproducibility of our assays, we randomly selected a control and an affected sample for each autosomal aneuploidy, and a male and a female sample for the X v Y and X v autosomal (A) assays. We performed 12 replicates for each sample for each assay: four on the same run with the same PCR mix, four on a second day with the same PCR mix as the first day, and four on a third day with a different batch of PCR mix and performed by a different operator. We calculated the coefficient of variation for same day, same PCR batch measurements (CV1), different day, same PCR batch measurements (CV2), and different day, different PCR batch measurements (CV3).
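The coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below computes it for one group of replicates; how replicates from different days were pooled for CV2 and CV3 is not spelled out in the text, so the grouping is left to the caller. The function name and the replicate values are illustrative.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Made-up percent-of-query values for four same-day, same-batch replicates.
day1 = [50.0, 51.0, 49.0, 50.0]
print(round(coefficient_of_variation(day1), 2))
```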

RESULTS

Assay design

In order to design PSQ assays, paralogous sequences located on different chromosomes must first be identified. One of the sequences must map to the chromosome of interest (or query chromosome, for example chromosome 21) and the second to any other autosomal chromosome (the reference chromosome).

To identify such paralogous sequences, all the known exons of chromosomes 13, 18, 21, and X (http://www.ensembl.org/) were batch blasted against the human genome. We selected matches with high scores (usually >350) and very low E values (<10^−40) where only two hits were observed: one to the query chromosome and the second elsewhere on another autosomal chromosome (fig 1A).
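The selection rule can be sketched as a simple filter over parsed BLAST hits. The hit representation (a dictionary with `chrom`, `score`, and `evalue` keys) and the function name are hypothetical, and the sketch assumes that "only two hits" means exactly two hits passing both thresholds. It covers the autosomal reference case only, not the X v Y assays.

```python
AUTOSOMES = {str(n) for n in range(1, 23)}


def is_candidate(hits, query_chrom, min_score=350, max_evalue=1e-40):
    """True if an exon has exactly two strong hits: one on the query
    chromosome and one on a different, autosomal chromosome.

    hits: list of dicts with 'chrom', 'score', and 'evalue' keys
    (a hypothetical parsed form of the BLAST output).
    """
    strong = [h for h in hits
              if h["score"] > min_score and h["evalue"] < max_evalue]
    if len(strong) != 2:
        return False
    chroms = [h["chrom"] for h in strong]
    if chroms.count(query_chrom) != 1:
        return False
    other = chroms[0] if chroms[1] == query_chrom else chroms[1]
    return other in AUTOSOMES


# Made-up hits for one chromosome 21 exon.
exon_hits = [
    {"chrom": "21", "score": 400, "evalue": 1e-50},
    {"chrom": "5", "score": 380, "evalue": 1e-45},
]
print(is_candidate(exon_hits, "21"))
```

Requiring exactly two strong hits excludes multi-copy gene families, for which a single PSM could not be assigned to one reference chromosome.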

Figure 1

 (A) Ideogram of human chromosomes. The black horizontal bars highlighted by circles show the positions of paralogous sequences in the human genome. Only sequences that were present only twice with a high degree of homology were used. (B) Typical alignment between paralogous sequences used for designing PSQ assays. Dotted boxes indicate the position of primers, and the encircled position shows the paralogous sequence mismatch used for quantification. (C) Principle of the method. If a cell contains two copies of chromosome 5 and two copies of chromosome 21, one expects to see a ratio of 1:1 at the PSM position. When three copies of chromosome 21 are present this ratio should be 1.5:1.

The second step of the method involves the quantification of single nucleotide differences between the paralogous sequences at PSM positions. For this we chose the Pyrosequencing13 method which has been previously shown to be highly quantitative.14–17

To design pyrosequencing assays, we took the selected BLAST alignments for each of the query chromosomes (fig 1B) and manually built a consensus sequence, which was entered into Oligo 3 software (Molecular Biology Insights, West Cascade, CO, USA) to obtain a suitable pair of primers perfectly matching both chromosomes (to minimise differences in the efficiencies of amplification) and spanning at least one PSM. Quantification of the PSM position by pyrosequencing can be used to determine the relative dosage of the query and reference chromosomes (fig 1C).
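Locating candidate PSMs in an alignment amounts to listing the columns where the two paralogues differ. The sketch below does this for a gapped pairwise alignment and also checks that a proposed primer window is identical in both sequences. It is not the Oligo software workflow; the function names and the toy sequences are illustrative assumptions.

```python
def find_psms(query_aln, reference_aln):
    """Return (column, query_base, reference_base) for each mismatch.

    Both arguments are rows of the same pairwise alignment, so they
    must have equal length. Columns containing a gap ('-') are skipped
    because they are indels rather than substitutions.
    """
    if len(query_aln) != len(reference_aln):
        raise ValueError("aligned sequences must have the same length")
    return [(i, q, r)
            for i, (q, r) in enumerate(zip(query_aln, reference_aln))
            if q != r and "-" not in (q, r)]


def primer_site_identical(query_aln, reference_aln, start, end):
    """True if columns start..end-1 match exactly and contain no gaps."""
    q, r = query_aln[start:end], reference_aln[start:end]
    return q == r and "-" not in q


# Toy alignment with a single substitution at column 4 (0-based).
query = "ACGTTGCA"
reference = "ACGTCGCA"
print(find_psms(query, reference))
print(primer_site_identical(query, reference, 0, 4))
```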

For the detection of sex chromosome abnormalities, we designed two types of assays: first, X v Y assays to quantify the ratio between the X and the Y chromosomes (using a paralogous sequence present in the X and Y chromosomes), and second, X v autosomal assays to obtain the ratio between the X and any autosomal chromosome. The theoretically expected values (table 2) show that this strategy allows the identification of all common aneuploidies.
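The theoretical values follow from copy numbers alone: if both paralogues amplify equally, the percent of the query chromosome is its copy number divided by the total copies of both loci. The sketch below derives these figures for the common karyotypes; it is a worked illustration, not a reproduction of table 2, and the function name and karyotype dictionary are assumptions.

```python
def expected_percent_query(query_copies, reference_copies):
    """Expected percent of signal from the query locus, assuming both
    paralogous loci amplify with equal efficiency."""
    return 100.0 * query_copies / (query_copies + reference_copies)


# Autosomal assay: two reference copies against two or three query copies.
print(expected_percent_query(2, 2))   # disomic control
print(expected_percent_query(3, 2))   # trisomy of the query chromosome

# Sex chromosome assays: (X copies, Y copies) per karyotype.
KARYOTYPES = {
    "46,XY": (1, 1), "46,XX": (2, 0), "45,X": (1, 0),
    "47,XXX": (3, 0), "47,XXY": (2, 1), "47,XYY": (1, 2),
}
for name, (x, y) in KARYOTYPES.items():
    x_vs_y = expected_percent_query(x, y)   # X against Y
    x_vs_a = expected_percent_query(x, 2)   # X against a disomic autosome
    print(name, round(x_vs_y, 1), round(x_vs_a, 1))
```

Neither assay type separates every karyotype on its own: 46,XX, 45,X, and 47,XXX all give 100% in the X v Y assay, and 46,XY, 45,X, and 47,XYY all give about 33% in the X v A assay. Reading the two values together resolves each of the six karyotypes listed.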

Table 2

 Expected values

Assay selection

We originally designed four to five assays per chromosomal abnormality that were pre-screened with a panel of eight control and eight aneuploid samples. Each assay was tested with a number of PCR conditions (varying concentrations of MgCl2 and DMSO, and two types of buffer as described in the Methods section). From this, we selected assays for each chromosomal abnormality based on the following criteria: (a) the PSM quantification in control individuals should be close to 50%, indicating that both alleles amplify with equal efficiency; (b) there should be a clear, non-overlapping discrimination between control and aneuploid samples; and (c) there should be the least possible deviation from the mean.

Only a subset of the assays fulfilled these conditions, and most of the assays were sensitive to the PCR condition used (data not shown). Ultimately we selected the best two assays for each chromosomal abnormality for further validation.

Assay results

We analysed the performance of the 10 independent tests designed to detect trisomies of chromosomes 13, 18, and 21 as well as sex chromosome aneuploidies. The means (we used percent of query chromosome as our statistic) and standard deviations for all of the assays are shown in table 3.

Table 3

 Summary results for each assay

Typical results of normal and affected samples for each assay are shown in fig 2. In eight out of the 10 assays the observed average values corresponded or were very close to the theoretically expected values (tables 2 and 3), and for the two remaining assays (Hsa 13b and Hsa 21b) there was an approximate 10% downwards shift for both the control and affected group, which did not affect the performance of the tests. The sensitivity and specificity were similar across all the assays (table 3), with no false positive or false negative calls, but with on average 7% of samples falling outside the set confidence thresholds, thus precluding a diagnosis.

Figure 2

 Typical results (“pyrograms”) of control and affected individuals for all assays (for X v Y and X v A assays males v females are shown). The name of each assay is given on the left and the karyotypes on the top right corner of each panel. The PSM position is indicated by the numbers above the peaks in each graph, which correspond to the chromosomes in which the paralogous sequence is located.

Since we had two independent assays for each aneuploidy, we integrated the results of both tests for each sample to generate a combined distribution. This resulted in a significant improvement in the separation between control and affected individuals, as seen by the greater sensitivities and specificities across all the tests (table 4 and fig 3) and 99% of the samples being unambiguously diagnosed.

Table 4

 Specificity and sensitivity of combined assays

Figure 3

 (A) Combined distributions of the autosomal assays. (B) Distributions of the X v Y and X v A assays. The x axes represent the percent of the query chromosome and the y axes the frequency of each class.

Throughout the study, 12 DNA samples repeatedly failed to amplify for at least one of the assays, and so these samples were not considered further.

Assays for autosomal aneuploidies

For trisomies of chromosomes 18 and 21, we tested 89 and 105 DNAs, respectively, and obtained a correct and unambiguous diagnosis in all cases (table 4). We thus correctly identified all 29 trisomy 18 samples and 47 trisomy 21 samples present in the cohort. Concerning the assays for trisomy 13, 91 DNAs were analysed, and out of these an unambiguous diagnosis was obtained for 90 samples. The status of one sample remained uncertain, since its combined value was outside the 99% confidence intervals. We repeated the two trisomy 13 assays for this DNA, which again resulted in an ambiguous result, and thus the sample could not be diagnosed. It is possible that this DNA originates from an individual mosaic for trisomy 13, but since DNAs had been fully anonymised prior to the study, we could not re-analyse the original karyotype.

Assays for sex chromosome aneuploidies

We analysed 93 DNAs for combined X v Y assays and obtained a very clear separation between the four groups defined by the ratio between the X and Y chromosomes (fig 3B). In particular, the separation between the male group and the group containing the females (46,XX, 45,X, and 47,XXX, all of which have 100% of chromosome X) was very large, but this was expected and reflects the theoretical outcomes (table 2). Nevertheless, since very few XXY and XYY individuals were present in the study, additional samples are required in order to establish the precise performance of these tests.

For the X v A combined assays, we analysed 91 samples out of which two samples gave intermediate values that could not be diagnosed. However, since these tests are partially redundant with the X v Y assays, only one sample could not be fully resolved. One of the samples that had given a value of 41% in the X v A assay (hence an intermediate value between one and two X chromosomes), gave a value of 52% in the X v Y assay and thus was unambiguously diagnosed as a normal male. The second sample with an inconclusive diagnosis (X v A combined value of 43%) had given a value of 89% for the X v Y assay, and therefore it was not possible to discriminate between a 46,XX and a 45,X diagnosis. We thus repeated the two X v A tests and obtained a combined value of 48%, showing that the individual is 46,XX in concordance with the karyotype.

Reproducibility

To estimate the reproducibility of individual measurements, we selected a control and an affected sample for each aneuploidy (for the X v Y and X v A assays we picked individuals of different gender) and performed 12 replicates as detailed in the Methods section. The results shown in table 5 demonstrate a high reproducibility for all of the assays, with a low coefficient of variation between same day and same batch replicates (0.7–4.3% of the mean), and for some assays a larger variation for inter batch replicates (up to 6.2%). These results indicate that some of the tests are sensitive to precise PCR conditions and thus, to improve the reliability of the tests, it might be advisable to work with frozen aliquots of a previously validated PCR mix containing the primers, buffer, and dNTPs.

Table 5

 Reproducibility of assays

DISCUSSION

In this study we present the PSQ method as an alternative approach for the rapid and efficient detection of targeted aneuploidies. Ten different assays, designed for the identification of autosomal trisomies of chromosomes 13, 18, and 21 and sex chromosome number abnormalities, were tested. We performed a retrospective study on 175 DNAs that were selected to include a relatively large number of aneuploid samples in order to evaluate the sensitivity and specificity of the tests.

The performance of single assays was characterised by no false negative or false positive calls, but a certain number of samples (7% on average) fell outside the 99% confidence intervals, and for these an unambiguous diagnosis could not be established. When combining the two tests for each chromosomal disorder, there was a significant improvement in the separation between control and affected samples, resulting in increased sensitivities and specificities across all tests, and the correct identification of 118 out of 120 abnormal samples present in the study. The diagnoses in the remaining two samples were inconclusive after the first run and were subsequently re-tested. This allowed an unambiguous diagnosis for one of the two, while the status of the second sample remained uncertain. It is possible this DNA originated from an individual with trisomy 13 mosaicism, but this could not be confirmed.

Eight out of the 10 assays gave average values that were very close to the theoretically expected value. This shows that our strategy of using co-amplification of paralogous sequences with a single pair of primers that match perfectly at both loci resulted in almost identical amplification efficiencies, and importantly, that end point measurement using the Pyrosequencing method is a quantitative and reliable technique consistent with previously published results.14–17 Selected samples for each assay were measured 12 times in order to evaluate the reproducibility of the tests. The intra and inter run variation between measurements was low when the PCR mixes were from the same batch. Inter batch variances were higher for some assays, suggesting that even small differences in the PCR mix resulting from inaccurate pipetting can have an effect. Our results suggest that in order to optimise the reliability of the procedure it might be necessary to make batches of PCR mix that can be tested and stored prior to use.

The first generation design of this test requires 10 separate PCRs per sample, which significantly reduces the sample throughput and increases the probability of handling errors. However, since the Pyrosequencing technology allows for a certain degree of multiplexing, subsequent improved assays should consist of no more than three or four PCRs per sample. Even with the current protocol, a single operator can handle at least 30–40 samples a day and report results in less than 48 h, which should cover the needs of most diagnostic laboratories.

Alternative molecular methods for the diagnosis of aneuploidies have been recently developed.4,18 PCR based methods such as QF-PCR,6–9 multiple amplifiable probe hybridisation,19 multiplex probe ligation assay,20,21 and PSQ (present study) all have the advantage of being inexpensive and efficient in terms of labour and high throughput. QF-PCR, which is based on the use of polymorphic markers, is by far the most established of all the PCR based techniques; however, it has a number of shortcomings, since some individuals can be homozygous at all sites and the informativeness of markers can vary across different populations. Despite these problems, QF-PCR has been successfully implemented in several diagnostic laboratories8,22 and protocols using single nucleotide polymorphisms are currently being developed. Multiple amplifiable probe hybridisation and multiplex probe ligation assay (both based on size specific probe design, co-amplification, and size separation by capillary electrophoresis) do not make use of polymorphic markers and in principle work on all individuals. These two approaches have the advantage of allowing the simultaneous analysis of up to 40 loci using size specific probes that can be efficiently resolved by capillary electrophoresis, but initial results have shown that up to eight probes per chromosome are needed to obtain reliable results.20

The major drawback of all PCR based tests is that they are targeted to specific regions of the genome, hence rare chromosomal abnormalities and balanced translocations cannot be detected. In addition, low level mosaicism, which can have significant clinical consequences, is difficult to detect with any DNA based rather than cell based method.

Non-PCR based technologies such as comparative genome hybridisation have recently shown encouraging results,23,24 and high resolution BAC arrays, once developed, will surely become a powerful tool for the molecular diagnosis of DNA copy number abnormalities. However, current protocols are considerably labour intensive and costly, and hence their application in routine diagnostic protocols is not yet feasible.

The important debate of whether molecular tests should be used as stand alone tests (thus replacing karyotyping altogether) is a complex issue and has been discussed at length elsewhere.4 However, an emerging consensus is that molecular tests might be appropriate as stand alone tests for the group of women that are tested solely on the basis of maternal age or personal choice (this group constitutes the large majority of cases) and for which trisomies of chromosomes 13, 18, and 21 and XY aneuploidies account for up to 99.9% of the disease associated abnormalities.

No one single molecular method seems to be obviously superior to the rest, since all have advantages and disadvantages. Our data suggest that PSQ is a robust, easy to interpret, and easy to set up method for the diagnosis of common aneuploidies, which should represent a very competitive alternative for widespread use in routine diagnostic laboratories.

Acknowledgments

We thank Dr A Reymond for discussions and critical advice concerning the manuscript, and Dr E Dermitzakis and Dr C Stella for advice on statistical treatment of the data. DNAs from trisomy 13, 18, and sex chromosome aneuploidies were kindly provided by Genzyme Genetics.

Footnotes

  • * Present address: Servizio di Genetica Medica, IRCCS Casa Sollievo della Sofferenza, 71016 - San Giovanni Rotondo (FG), Italy.

  • This study was funded by grants from the Swiss National Science Foundation, the NCCR Frontiers in Genetics, the Child Care Foundation (to SEA) and a research grant from Pyrosequencing AB (to SD and SEA).

  • Conflict of interest: none declared.
