Nordicana D23 / DOI : 10.5885/45409XD-79A199B76BCC4110

Curated reference database of SSU rRNA for northern marine and freshwater communities of Archaea, Bacteria and microbial eukaryotes

Connie Lovejoy 1, André Comeau 2, Mary Thaler 1

1 Université Laval
2 Dalhousie University


Abstract

High-throughput sequencing technologies, such as Roche 454 pyrosequencing and Illumina, enable semi-quantitative study of communities of single-celled organisms by generating hundreds of thousands of short sequence reads from a single environmental sample. However, identifying the taxa to which these reads belong requires a reliable database of reference sequences.
We maintain databases of taxa from all three domains of life found in marine and freshwater samples in the Canadian Arctic and subarctic, along with an accompanying FASTA-format file of the quality-checked reference sequences. These files are suitable for use in data-processing pipelines for next-generation sequencing built on open-source software such as QIIME, mothur, or UPARSE, when the user wishes to assign taxonomic identities to short reads by sequence similarity.

Table 1. Number of sequences and sequence length for the three taxonomic databases

Domain     Number of sequences   Mean sequence length (range), base pairs
Eukarya    766                   440 (216–657)
Bacteria   33,293                435 (304–571)
Archaea    2,288                 557 (532–591)
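
The summary statistics in Table 1 can be recomputed from any of the FASTA reference files. The following is a minimal Python sketch, not the authors' own tooling: the function names are hypothetical, and it assumes each sequence ID is the first whitespace-delimited token of its header line.

```python
from statistics import mean


def read_fasta_lengths(lines):
    """Return {sequence_id: length} for an iterable of FASTA-formatted lines."""
    lengths = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the ID is the first token after ">" on the header line.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else f"unnamed_{len(lengths)}"
            lengths[current_id] = 0
        elif current_id is not None:
            # Sequence data may be wrapped over several lines.
            lengths[current_id] += len(line)
    return lengths


def summarize(lengths):
    """Return (number of sequences, rounded mean length, shortest, longest)."""
    values = list(lengths.values())
    return len(values), round(mean(values)), min(values), max(values)


def summarize_file(path):
    """Convenience wrapper: summarize a FASTA file on disk."""
    with open(path) as handle:
        return summarize(read_fasta_lengths(handle))
```

Calling `summarize_file` on one of the reference files (the file names are given in the readme of the ZIP archive) returns the count, mean length, and length range in the same order as the columns of Table 1.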


The creation of these databases is described in Comeau et al. (2011, 2012). Briefly, we targeted the V4 variable region of the 18S rRNA gene for Eukarya, and the V6-V8 and V3-V5 variable regions of the 16S rRNA gene for Bacteria and Archaea, respectively. Reference sequences were originally imported from the SILVA database for Archaea and the Greengenes database for Bacteria, and are labeled with the original accession numbers from these databases, while the Eukarya database was assembled de novo, based on taxa found in our studies. We have edited the taxonomic identifications to reflect recent developments in the literature, and have included high-quality sequences from environmental clone libraries alongside cultured representatives when the former represent clades that are widespread in arctic and subarctic aquatic environments. Taxonomic identification of uncultured clones is based on well-supported phylogenetic trees, and these clones have been screened for potential chimeras using UCHIME (Edgar et al. 2011).
Because our focus is on single-celled organisms, the coverage of Metazoa, Fungi, and Streptophyta (land plants) in the Eukarya database is sufficient to identify and remove these sequences from a sample, but the database should not be used for detailed taxonomic analysis within these groups. By the same token, chloroplast reference sequences are included in the Bacteria database primarily so that these sequences can be identified and removed from analysis.
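
The removal step described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than part of the published pipeline: it assumes taxonomy assignments are available as semicolon-delimited strings keyed by read ID, and that the non-target groups appear as rank names spelled as in the text. The actual labels in the database files should be checked before use, and the function names are hypothetical.

```python
# Group names are taken from the paragraph above; their exact spelling in the
# database taxonomy strings is an assumption.
NON_TARGET = ("Metazoa", "Fungi", "Streptophyta", "Chloroplast")


def is_non_target(taxonomy, non_target=NON_TARGET):
    """True if any rank of a semicolon-delimited taxonomy string is a non-target group."""
    ranks = [rank.strip().lower() for rank in taxonomy.split(";")]
    return any(name.lower() in ranks for name in non_target)


def split_assignments(assignments):
    """Split {read_id: taxonomy} into (kept, removed) dictionaries."""
    kept, removed = {}, {}
    for read_id, taxonomy in assignments.items():
        target = removed if is_non_target(taxonomy) else kept
        target[read_id] = taxonomy
    return kept, removed
```

Matching whole rank names, rather than substrings, avoids discarding a read merely because a longer taxon name happens to contain one of the group names.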
These databases have been successfully used in numerous studies of microbial communities in high-latitude coastal and offshore marine environments (e.g. Comeau et al. 2011, Monier et al. 2014), as well as high-latitude lakes and ponds (Comeau et al. 2012, Negandhi et al. 2014, Crevecoeur et al. 2015).

References
Edgar, R.C., B.J. Haas, J.C. Clemente, C. Quince, R. Knight, 2011. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics. doi: 10.1093/bioinformatics/btr381

Data citation

Lovejoy, C., Comeau, A., Thaler, M. 2016. Curated reference database of SSU rRNA for northern marine and freshwater communities of Archaea, Bacteria and microbial eukaryotes, v. 1.1 (2002-2008). Nordicana D23, doi: 10.5885/45409XD-79A199B76BCC4110.

Key references

Comeau, A.M., T. Harding, P.E. Galand, W.F. Vincent, C. Lovejoy, 2012. Vertical distribution of microbial communities in a perennially stratified Arctic lake with saline, anoxic bottom waters. Scientific Reports, 2: 604. DOI: 10.1038/srep00604.
Comeau, A.M., W.K.W. Li, J.-É. Tremblay, E.C. Carmack, C. Lovejoy, 2011. Arctic Ocean microbial community structure before and after the 2007 record sea ice minimum. PLoS One, 6: e27492. DOI: 10.1371/journal.pone.0027492.
Crevecoeur, S., W.F. Vincent, J. Comte, C. Lovejoy, 2015. Bacterial community structure across environmental gradients in permafrost thaw ponds: methanotroph-rich ecosystems. Frontiers in Microbiology. DOI: 10.3389/fmicb.2015.00192.
Monier, A., J. Comte, M. Babin, A. Forest, A. Matsuoka, C. Lovejoy, 2014. Oceanographic structure drives the assembly processes of microbial eukaryotic communities. ISME Journal. DOI: 10.1038/ismej.2014.197.
Negandhi, K., I. Laurion, C. Lovejoy, 2014. Bacterial communities and greenhouse gas emissions of shallow ponds in the High Arctic. Polar Biology. DOI: 10.1007/s00300-014-1555-1.

Contributors

Comte, Jérôme (Université Laval)
Crevecoeur, Sophie (Université Laval)
Monier, Adam (University of Exeter)
Onda, Deo (Université Laval)
Potvin, Marianne (Université Laval)

Status

Published

Version history

You can request data from previous versions at nordicana@cen.ulaval.ca.


Version 1.1 (2002-2008) - Updated March 1, 2016
Version 1.0 (2002-2008) - Updated December 11, 2015

Measurement sites

Site      Latitude   Longitude   Altitude (m)
AO-NW01   75.99      156.87      -5
Lake A    83.03      -75.43      5

Download

The downloadable ZIP file contains a readme file and a data file in text format (ASCII).
Please always include the data citation when using these data.