TY  - DATA
T1  - Supporting data for "The sponge microbiome project"
AU  - Moitinho-Silva, Lucas
AU  - Nielsen, Shaun
AU  - Amir, Amnon
AU  - Gonzalez, Antonio
AU  - Cerrano, Carlo
AU  - Ackermann, Gail L
AU  - Astudillo-Garcia, Carmen
AU  - Easson, Cole
AU  - Sipkema, Detmer
AU  - Liu, Fang
AU  - Steinert, Georg
AU  - Kotoulas, Giorgos
AU  - McCormack, Grace P
AU  - Feng, Guofang
AU  - Bell, James J
AU  - Vicente, Jan
AU  - Bjork, Johannes R
AU  - Montoya, Jose M
AU  - Olson, Julie B
AU  - Reveillaud, Julie
AU  - Steindler, Laura
AU  - Pineda, Mari-Carmen
AU  - Marra, Maria V
AU  - Ilan, Micha
AU  - Taylor, Michael W
AU  - Polymenakou, Paraskevi
AU  - Erwin, Patrick M
AU  - Schupp, Peter J
AU  - Simister, Rachel L
AU  - Knight, Rob
AU  - Thacker, Robert W
AU  - Costa, Rodrigo
AU  - Hill, Russell T
AU  - Lopez-Legentil, Susanna
AU  - Dailianis, Thanos
AU  - Ravasi, Timothy
AU  - Hentschel, Ute
AU  - Li, Zhiyong
AU  - Webster, Nicole S
AU  - Thomas, Torsten
DO  - 10.5524/100332
UR  - http://gigadb.org/dataset/100332
AB  - Marine sponges (phylum Porifera) are a diverse, phylogenetically deep-branching clade known for forming intimate partnerships with complex communities of microorganisms. To date, 16S rRNA gene sequencing studies have largely utilised different extraction and amplification methodologies to target the microbial communities of a limited number of sponge species, severely limiting comparative analyses of sponge microbial diversity and structure. Here, we provide an extensive and standardised dataset that will facilitate sponge microbiome comparisons across large spatial, temporal and environmental scales. Samples from marine sponges (n=3568 specimens), seawater (n=370), marine sediments (n=65) and other environments (n=29) were collected from different locations across the globe. This dataset incorporates at least 269 different sponge species, including several yet unidentified taxa. The V4 region of the 16S rRNA gene was amplified and sequenced from extracted DNA using standardised procedures. Raw sequences (total of 1.1 billion sequences) were processed and clustered with a) a standard protocol using QIIME closed-reference picking resulting in 39,543 Operational Taxonomic Units (OTU) at 97% sequence identity, b) a de novo protocol using Mothur resulting in 518,246 OTUs, and c) a new high-resolution Deblur protocol resulting in 83,908 unique bacterial sequences. Abundance tables, representative sequences, taxonomic classifications and metadata are provided. This dataset represents a comprehensive resource of sponge-associated microbial communities based on 16S rRNA gene sequences that can be used to address overarching hypotheses regarding host-associated prokaryotes, including host-specificity, convergent evolution, environmental drivers of microbiome structure and the sponge-associated rare biosphere.
KW  - Metagenomic
KW  - Marine sponges
KW  - Archaea
KW  - Bacteria
KW  - Symbiosis
KW  - 16S rRNA gene
PY  - 2017
PB  - GigaScience Database
LA  - en
ER  - 