• Figure 1. Community composition of samples containing Melainabacteria.

    (A–C) The relative composition of the human fecal samples A, B, and C, and (D) the aquifer community members. In A and D, the estimated percent relative abundance of the community is plotted; in B and C, coverage is plotted, with estimated percent relative abundance noted on the figure for select members. Organisms are classified at the phylum level. The human fecal sample A community is dominated by Prevotella copri DSM 18205, which accounts for more than 40% of the sequencing reads and is represented by several strains. Sequencing depth for human fecal sample C was not sufficient to accurately estimate roughly 25% of the community abundance, which includes MEL.C3. Aspects of the community composition of the aquifer sample are discussed in Wrighton et al. (2012).

    DOI: http://dx.doi.org/10.7554/eLife.01102.004

    Figure 2. 16S rRNA gene phylogeny of Melainabacteria and Cyanobacteria.

    Trees were built using 16S rRNA gene sequences from MEL.A1, MEL.B1, and MEL.B2 (the 16S rRNA sequence of ACD20 was not recovered). (A) 16S rRNA gene phylogenetic tree with five representative sequences from each phylum obtained from the Greengenes May 2011 database (DeSantis et al., 2006). Bootstrap values greater than 50% are indicated. (B) 16S rRNA gene phylogeny built using one representative sequence from each order within Cyanobacteria from the Greengenes database (May 2011) (DeSantis et al., 2006), except for orders YS2, SM1D11, and mle1-12, from which all sequences were used. For Melainabacteria, the habitats from which the sequences were predominantly derived are indicated and colored according to isolation source (blue = environmental (non-gut); brown = gut). Cyanobacteria are displayed in green. Bootstrap values greater than 70% are indicated by a black square.

    DOI: http://dx.doi.org/10.7554/eLife.01102.006

    Figure 4. The physiological and metabolic landscape of Melainabacteria.

    Metabolic predictions for Melainabacteria based on genes identified in Figure 4—source data 1. Boxes mark genes in pathways detected in the subsurface genome and at least one gut genome (white box), only in the subsurface genome (grey box), or only in at least one gut genome (orange box), and genes missing from pathways in all genomes (red box). Glycolysis proceeds via the canonical Embden-Meyerhof-Parnas (EMP) pathway with the exception of fructose-6-phosphate 1-phosphotransferase (EC:2.7.1.90, gene 3). Names of pathways and fermentation end-products are bolded, and ATP generated by substrate-level phosphorylation is noted. All Melainabacteria genomes sampled lack electron transport chain components (including cytochromes (Cyto), succinate dehydrogenase (sdh), flavins, quinones), terminal respiratory oxidases or reductases, and photosystem I or II (PS1, PS2). The genomes also lack a complete TCA cycle (absent enzymes noted by red boxes); the TCA enzymes that are present are instead linked to the fermentation of the amino acids and organic acids denoted (pathways, blue arrows). Ferredoxin (Fd, green text) is important for hydrogen (H2) production via hydrogenases (yellow background box). Proton translocation (green background box) may be achieved by the activity of trimeric oxaloacetate (OAA) decarboxylase and a sodium-hydrogen antiporter, pyrophosphate (PPi) hydrolysis by pyrophosphatases, an 11-subunit NADH dehydrogenase, and an annotated NiFe hydrogenase (green enzyme). Annotations for the gene numbers are in Figure 4—source data 2. The complete metabolic comparison of the Melainabacteria can be accessed at http://ggkbase.berkeley.edu/genome_summaries/81-MEL-Metabolic-Overview-June2013.

    DOI: http://dx.doi.org/10.7554/eLife.01102.010
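
    The box-color scheme in the caption for Figure 4 amounts to a four-way classification of each gene by where it was detected. The sketch below is only an illustration of that rule, not the authors' pipeline: the function name `box_color` and the presence dictionaries are hypothetical, and the gut genome list is taken from the genomes named for Figure 4 source data 1.

```python
# Genome names as given in the captions; the grouping into one
# subsurface genome and three gut genomes follows the source data title.
SUBSURFACE = "ACD20"
GUT_GENOMES = ("MEL.A1", "MEL.B1", "MEL.B2")


def box_color(presence):
    """Map a gene's presence/absence pattern to the caption's box color.

    `presence` is a hypothetical dict of genome name -> bool.
    Genomes missing from the dict are treated as lacking the gene.
    """
    in_subsurface = presence.get(SUBSURFACE, False)
    in_any_gut = any(presence.get(name, False) for name in GUT_GENOMES)

    if in_subsurface and in_any_gut:
        return "white"   # subsurface and at least one gut genome
    if in_subsurface:
        return "grey"    # subsurface genome only
    if in_any_gut:
        return "orange"  # at least one gut genome only
    return "red"         # missing from all genomes


# Made-up presence patterns, for illustration only.
print(box_color({"ACD20": True, "MEL.B1": True}))
print(box_color({"ACD20": True}))
print(box_color({"MEL.A1": True}))
print(box_color({}))
```

    The four calls cover the four cases in the order the caption lists them.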

    Figure 4—source data 1. Examination of enzymes (steps) in near-complete KEGG-based modules shared among or unique to subsurface ACD20 and gut Melainabacteria genomes MEL.A1, MEL.B1, and MEL.B2.

    Analysis is based on the KEGG Module database (Kanehisa and Goto, 2000).

    DOI: http://dx.doi.org/10.7554/eLife.01102.011

    Figure 4—source data 2. Gene annotations corresponding to the numbers in Figure 4.

    If the gene occurs in both the ACD20 and gut genomes, the reported annotation is based on ACD20.

    DOI: http://dx.doi.org/10.7554/eLife.01102.012

    Figure 6. The phylogeny of the Melainabacteria nitrogenase.

    A maximum likelihood phylogenetic tree constructed with 865 nitrogenase nifH genes from sequenced genomes (Zehr et al., 2003) is shown. nifH groups I and III are shown. The ACD20 nifH (in group III) is denoted in red, while photosynthetic cyanobacterial nifH sequences (in groups I and III) are denoted in green. Relative to group I, group III is characterized by deep bifurcations and long branch lengths (Zehr et al., 2003), which are reflected in the constructed tree by low bootstrap values (<50) for internal branch positions in group III. ACD20 sequences are monophyletic (but with low bootstrap support) with nifH sequences from anaerobic Clostridium and Fusobacterium species.

    DOI: http://dx.doi.org/10.7554/eLife.01102.016

    Figure 7. Putative TLR5 activation region in Melainabacteria flagellin genes.

    Protein sequence alignment of residues 88–103 (Escherichia coli coordinates) for the flagellin genes. The range of residues required for TLR5 activation (Andersen-Nissen et al., 2005) is indicated by the top bracket. Sequences are organized by similarity within these residues. Species whose flagellins are reported (Andersen-Nissen et al., 2005) to be recognized (R) or unrecognized (UR) by TLR5 are noted. Based on visual inspection of the alignment, flagellin genes predicted to be recognized or unrecognized by TLR5 are indicated; genes of ambiguous TLR5 recognition status are unmarked.

    DOI: http://dx.doi.org/10.7554/eLife.01102.017

    Figure 8. Phylogeny of flagella-related genes.

    Supertree (cladogram) of 13 bootstrap ML trees of the flagellar genes shared among the four analyzed genomes. The phylum (or more specific taxonomic identifier) of each species is listed: (F) Firmicutes, (S) Spirochaetes, (E-P) Epsilonproteobacteria, (MEL) Melainabacteria, (A) Aquificae, (T) Thermotogae, (D-P) Deltaproteobacteria, (Pl) Planctomycetes, (B) Bacteroidetes, (A-P) Alphaproteobacteria, (B-P) Betaproteobacteria, (G-P) Gammaproteobacteria. In all 13 individual trees, Melainabacteria branched with Firmicutes and Spirochaetes.

    DOI: http://dx.doi.org/10.7554/eLife.01102.018

    Figure 9. The prevalence of members of Melainabacteria in different environments, including distinct human body habitats.

    In all three panels, the relative abundances of Melainabacteria in different sample types are plotted as box plots (log10 transformed; i.e., 10^−1 = 0.1%). Data were obtained from the QIIME database, derive from a variety of studies, and are publicly available (Figure 9—source data 1): (A) soil, sediment, and water sites, (B) different human body sites (GI = gastrointestinal), (C, left) mammal stool classified by host diet, (C, right) country of origin for human stool. UD = undetermined.

    DOI: http://dx.doi.org/10.7554/eLife.01102.019
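
    To make the axis convention in the Figure 9 caption concrete, the sketch below converts percent relative abundance to and from the log10 scale. It is a minimal illustration only: the function names are hypothetical, and every input value other than the caption's own 0.1% example is made up.

```python
import math


def log10_percent(abundance_percent):
    """Return log10 of a relative abundance expressed in percent."""
    if abundance_percent <= 0:
        # Zero abundance has no finite log10 value; how such samples
        # were handled in the original plots is not stated in the caption.
        raise ValueError("abundance must be positive")
    return math.log10(abundance_percent)


def percent_from_log10(log_value):
    """Invert the transform: an axis value of -1 maps back to 0.1%."""
    return 10.0 ** log_value


# 0.1% is the caption's example; the other values are illustrative.
for pct in (0.1, 1.0, 10.0):
    print(f"{pct}% -> log10 = {log10_percent(pct):.1f}")
```

    On this scale each unit step is a tenfold change in relative abundance, so an axis value of −2 corresponds to 0.01%.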

    Figure 9—source data 1. 16S rRNA gene sequence datasets used to analyze the sources of Melainabacteria.

    DOI: http://dx.doi.org/10.7554/eLife.01102.020

    Figure 10. Single copy gene inventory from reconstructed genomes.

    Data are based on single copy genes (numbers in circles indicate the number of copies found).

    DOI: http://dx.doi.org/10.7554/eLife.01102.021

  • Table 1.

    Samples from which Melainabacteria genomes were recovered

    DOI: http://dx.doi.org/10.7554/eLife.01102.003

    Sample  Environment  # of reads   Abundance (%)  # genomes recovered
    A       Gut          109,557,616  4              1 Complete, 1 Partial
    B       Gut          124,163,248  3              2 Complete
    C       Gut          112,578,264  2              1 Complete, 1 Near Complete, 1 Partial
    ACD     Aquifer      232,878,979  0.7            1 Near Complete
    • Sample, number of reads sequenced, and estimates of the abundance of Melainabacteria in the communities based on 16S rRNA gene survey and coverage information. ACD20 was assembled from three samples (see Wrighton et al., 2012).

  • Table 2.

    Melainabacteria genomes recovered in this study

    DOI: http://dx.doi.org/10.7554/eLife.01102.005

    Sample ID  Coverage  Genome status  Size (bp)  %GC   Scaffolds  N50        Coding features  16S rRNA genes
    ACD20      30x       Near Complete  2,979,548  33.5  19         133,361    2,819            ND
    MEL.A1     73x       Complete       1,867,336  32.9  1          1,867,336  1,832            2
    MEL.A2     5.5x      Partial        1,192,455  30.6  88         16,613     1,386            ND
    MEL.B1     62x       Complete       2,302,307  35.3  21         542,117    2,219            2
    MEL.B2     44x       Complete       2,308,205  36.3  26         375,376    2,222            2
    MEL.C1     26.5x     Complete       2,053,642  34.1  4          1,742,055  2,120            2
    MEL.C2     27.5x     Near Complete  2,159,327  35.3  34         146,232    2,104            2
    MEL.C3     6x        Partial        1,323,478  29.9  93         15,878     1,472            ND
    • ND = not determined. See the section Genome assembly in ‘Materials and methods’ for an explanation of Genome Status.
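
    Table 2 reports an N50 for each assembly without defining it. Under the standard definition, N50 is the scaffold length at which scaffolds of that length or longer contain at least half of the assembled bases. The sketch below computes it under that assumption; the function name `n50` and the multi-scaffold lengths are hypothetical, and the tool the authors actually used is not stated here.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up scaffold lengths: total 1,500 bp, so half is 750 bp.
# 500 alone is short of 750; 500 + 400 = 900 reaches it.
print(n50([500, 400, 300, 200, 100]))  # 400

# A single-scaffold assembly has N50 equal to its size, as in the
# MEL.A1 row (one scaffold of 1,867,336 bp).
print(n50([1867336]))  # 1867336
```

    The single-scaffold case is why the N50 and genome size columns coincide for MEL.A1.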