Whole genome microarray for comparative genomics

Even within one species of bacteria, genetic content can vary by as much as 25 per cent between individual strains.

These differences can determine how virulent a particular strain is, or which organisms it can colonise. To make significant claims about which genes play a role in which biological processes, researchers must compare large numbers of bacterial genomes and see which genes are associated with a particular trait.

Lixin Zhang and colleagues from the University of Michigan have made this sort of experiment much easier by developing the 'Library on a Slide' technique (BMC Microbiology, 22 March 2004, http://www.biomedcentral.com/1471-2180/4/12/abstract).

The scientists predict that their high-throughput method will be 'an efficient and cost effective way for sharing and utilising large strain collections in various comparative genomics studies'.

Multiple copies of single gene

The Library on a Slide is based on current microarray technology: the printing and probing of the slide are carried out largely as usual. The most significant difference is that each spot on the slide contains the genomic DNA of one bacterial strain, rather than containing multiple copies of a single gene.

Because a total bacterial genomic preparation already contains a heterogeneous mix of DNA fragments, the scientists used highly purified DNA to ensure successful printing in the Library on a Slide technique.

Various DNA purification methods were tested, including both organic extraction and non-organic extraction based on membrane or resin. All of these methods yielded high quality DNA suitable for array printing, but bead beating lysis followed by a commercial DNA purification column worked well most consistently.

For high throughput, they adapted a 96 well format DNA isolation kit from MO BIO Laboratories to accommodate a large number of strains. This system combines bead beating lysis with a vacuum based membrane column. Its column, however, can be easily clogged by precipitated debris and proteins, which are difficult to avoid during multichannel pipetting. So the scientists added a step to remove these particles using a 96 well MultiScreen lysate clearing plate before loading the column. To concentrate eluted DNA, they used a MultiScreen PCR plate in a 96 well format.

When purified DNA was directly used for printing the array, they observed very weak hybridisation signals due to inefficient binding of long DNA molecules. To decrease the viscosity of the DNA solution and to improve the spread and binding of genomic DNA to the glass slides, the bacterial genomic DNA was fragmented. For high throughput operation, DNAs were fragmented to about 2 kb on average by sonication within wells of a 96 well microplate on a microplate horn.

Zhang and co-workers Usha Srinivasan, Carl F Marrs, Debashis Ghosh, Janet R Gilsdorf and Betsy Foxman then tested techniques for identifying individual target genes at high sensitivity using fluorescently labelled probes.

Eventually they prepared a ssDNA dendrimer probe using a ssDNA fragment generated by exonuclease treatment. The single stranded dendrimer probe eliminated probe self hybridisation and enhanced probe and target hybridisation kinetics, generating better and more consistent hybridisation results on the Library on a Slide.

Testing the array

As a proof of principle, a test Library on a Slide was created using the E. coli ECOR collection. Low density and high density versions were generated, with ~2000 and ~15000 spots respectively, on a 22 mm x 60 mm surface.

The goal was to screen these isolates for the presence or absence of E. coli virulence genes and compare the results with those previously obtained by other methods. The hemolysin gene (hly), a known virulence factor, was used as an example.

To detect the presence or absence of a gene sequence in each genome spot on the array, signals of immobilised sample genomes were compared with a positive control. It was therefore critical that the same number of copies of each genome be compared.

Although all genomic DNA samples were suspended in the spotting buffer at the same concentration before arraying, they could still differ in genome copy number per spot due to variations in genome size and plasmid content. In addition, the exact amount of DNA fixed in each spot could vary due to technical limitations during the printing and post-print processes. To correct for this, the scientists took advantage of the multiplex labelling and detection features of the microarray platform and multichannel laser scanner to measure DNA quantity on the printed spots, employing a dual channel noncompeting hybridisation strategy.
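To illustrate why equal DNA concentration does not guarantee equal genome copy number, the short Python sketch below estimates genome copies per nanogram of DNA for two genome sizes. It is not taken from the study: the function name `genome_copies`, the two genome sizes and the figure of roughly 650 g/mol per base pair of double stranded DNA are assumptions chosen for illustration, and plasmid content is ignored.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one base pair of dsDNA

def genome_copies(mass_ng: float, genome_bp: float) -> float:
    """Estimate the number of genome copies in a given mass of genomic DNA."""
    grams = mass_ng * 1e-9
    grams_per_mole_of_genome = genome_bp * BP_MASS_G_PER_MOL
    return grams / grams_per_mole_of_genome * AVOGADRO

# Hypothetical genome sizes for a smaller and a larger strain (base pairs).
small_genome_bp = 4.6e6
large_genome_bp = 5.5e6

copies_small = genome_copies(1.0, small_genome_bp)
copies_large = genome_copies(1.0, large_genome_bp)

print(f"Copies per ng, smaller genome: {copies_small:.3g}")
print(f"Copies per ng, larger genome:  {copies_large:.3g}")
print(f"Fold difference: {copies_small / copies_large:.2f}")
```

Under these assumptions the same mass of DNA carries about a fifth more copies of the smaller genome, which is the kind of difference the quantification probe is meant to correct for.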

One channel detected the signal from a quantification probe and the other detected the signal from the probe of interest. The 16S ribosomal RNA gene, present in all strains of the E. coli species in the same copy number, was used as the quantification probe and was labelled with Cy5 dye. The other probe contained the DNA sequence of interest, hly, and was labelled with Cy3 dye. Since the genome quantification probe and the gene probe of interest recognise different target sequences, they can be used in the same hybridisation process.

The hybridisation result of each probe was obtained by scanning the slide at a different wavelength, since the probes were labelled with non interfering dyes that excite at different wavelengths. The 16S rRNA gene probe recognises the same number of target sequences per genome in every sample. Its hybridisation signal intensity was therefore taken as an indicator of genome quantity and used to adjust the hly hybridisation signal by means of the Cy3/Cy5 signal ratio. The ratio of this adjusted signal to that of the positive control was then used to determine the presence or absence of the gene of interest, on the basis of a cutoff point established in a previous study.

Using a 50 per cent cutoff point, 12 strains were identified as hly gene positive; this was 100 per cent in agreement with results from experiments using dot blot and Southern hybridisation methods.
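The adjustment and cutoff described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the `Spot` class, the function names and all intensity values are hypothetical, and a spot is assumed to be called positive when its adjusted signal is at least 50 per cent of the positive control's.

```python
from dataclasses import dataclass

@dataclass
class Spot:
    strain: str
    cy3: float  # signal from the gene probe (hly), hypothetical units
    cy5: float  # signal from the 16S rRNA quantification probe

def adjusted_signal(spot: Spot) -> float:
    """Gene probe signal normalised to genome quantity (Cy3/Cy5 ratio)."""
    if spot.cy5 <= 0:
        raise ValueError(f"No quantification signal for {spot.strain}")
    return spot.cy3 / spot.cy5

def call_presence(spots, positive_control: Spot, cutoff: float = 0.5) -> dict:
    """Call a gene present when the adjusted signal reaches the cutoff
    fraction of the positive control's adjusted signal (assumed rule)."""
    reference = adjusted_signal(positive_control)
    return {s.strain: adjusted_signal(s) / reference >= cutoff for s in spots}

# Made-up intensities for a positive control and two test strains.
control = Spot("positive control", cy3=8000.0, cy5=10000.0)
spots = [
    Spot("strain A", cy3=3600.0, cy5=5000.0),  # less DNA in the spot, gene present
    Spot("strain B", cy3=400.0, cy5=8000.0),   # more DNA in the spot, gene absent
]

calls = call_presence(spots, control)
print(calls)
```

In this made-up example strain A has a lower raw Cy3 signal than the control, but it is still called positive once the smaller amount of DNA in its spot is taken into account.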

For the future

This study demonstrates that Library on a Slide is a viable screening platform with currently available array detection technology. The adaptations of fluorescent probes also provide ways for multiplex probing and in-spot DNA quantification beyond that in traditional dot blot hybridisation.

With the development of more dyes and more capable scanners, multiple gene screening can be accomplished in a single experiment. Zhang and his team are in the process of adapting Library on a Slide technology in their study of Escherichia coli, Group B Streptococcus, and Haemophilus influenzae. This, they say, "will enhance our ability to screen large bacterial population to identify genes associated with virulence and transmission."

While Library on a Slide is currently used for determining absence or presence of a gene or part of a gene, improvements in hybridisation signal and glass surface chemistry will allow scanning for finer sequence variations.

For example, it is possible to print the array on a three-dimensional gel matrix that can be used to perform an array primer extension for detecting a single base mutation.

A single Library on a Slide can also be constructed with isolates from several related species, or species that are part of a microbial ecosystem such as the rumen. Such a platform, say the scientists, will enable them to examine the extent of shared genetic elements across species, especially horizontally transferred virulence factors and antibiotic resistance genes.

More importantly, comprehensive Libraries on a Slide can be produced in large quantities and made available to other investigators, providing an efficient and cost effective way for sharing and utilising large strain collections in various comparative genomics studies.
