For example, any given sample almost certainly contains a mixed population of microbes, but the exact species present and their relative abundance are unknown. Standard metagenomic analysis (shotgun sequencing) breaks the DNA of every organism in the sample into short fragments that are sequenced together, so grouping the resulting data by species can be very difficult. Newer techniques permit more accurate analysis through chromatin-level probability maps, so that the individual genomes of microbial species can be reconstructed more accurately within a mixed sample.
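The idea behind chromatin-level maps can be sketched in a few lines. DNA fragments that sit inside the same cell are physically linked far more often than fragments from different cells, so assembled fragments (contigs) that share many links probably belong to the same genome. The sketch below is only an illustration under that assumption, not the method of any particular tool: the function name `group_contigs`, the `min_links` cutoff and the contact counts are all hypothetical.

```python
def group_contigs(contigs, contacts, min_links=5):
    """Group contigs that share at least `min_links` chromatin contacts.

    `contacts` maps a pair of contig names to the number of observed
    links between them (hypothetical counts for this illustration).
    """
    # Each contig starts out in its own group (simple union-find).
    parent = {c: c for c in contigs}

    def find(c):
        # Follow parent pointers to the group representative.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), links in contacts.items():
        # Merge two groups only when the contact evidence is strong enough.
        if links >= min_links:
            parent[find(a)] = find(b)

    groups = {}
    for c in contigs:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


contigs = ["c1", "c2", "c3", "c4"]
contacts = {("c1", "c2"): 40, ("c3", "c4"): 25, ("c2", "c3"): 1}
bins = group_contigs(contigs, contacts)
print(bins)
```

Here the single weak link between c2 and c3 is treated as noise, so the four contigs fall into two groups, each standing in for one candidate genome. Real tools weigh contact counts statistically rather than applying one fixed cutoff.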
The validity of some of the shortcut techniques employed by metagenomic assemblers is typically assessed against published reference metagenomic data. One problem is that the full complement of microbial life in any sample, for example the human gut, is still not fully understood. Metagenomics is only in its initial phases, and surprises are revealed routinely in the analysis of tissue and environmental samples.
For example, a very recent evaluation of the human gut metagenome revealed a previously unknown gut virus. Researchers indicate that this virus, called crAssphage, lives in the gut of more than half of the world’s population and infects a common group of gut bacteria, the Bacteroidetes. The virus was identified through a computer program and had not been detected before, yet it is ubiquitous. Because Bacteroidetes are thought to be connected with diabetes and obesity, the virus may play a role in those chronic diseases. So at this moment, exciting new technologies are deepening our understanding of ourselves as the complex linkages of co-dependent ecologies outlined in The Microcosm Within. Metagenomic assembly is one tool to assess that partnership. Yet there will be many surprises along the path to our fuller understanding of the power of those associations.
Metagenomic Assembly
Metagenomic evaluation of any environment, ecology or tissue sample requires extensive computational power in order to properly assess large volumes of sequence data. That power is needed to produce an accurate representation of the genetic diversity in a sample. New techniques in de novo metagenomic assembly effectively filter the total amount of data to be analyzed; yet vast amounts of computer power are still required. New approaches to that analysis and changes in the filtering methods have enabled improved metagenome assembly and a more accurate characterization of the full range of microbial life in a sample. Complex software is necessary because conventional genome analysis deals with only one species. In metagenomic sample analysis, by contrast, assemblers must also use algorithms to separate species and to attempt to assess their relative abundance. Different assemblers, using differing techniques, can skew results depending on the scalability of the software and variability in the data filtering and reconstruction techniques. Recent studies have attempted to demonstrate that filtering shortcuts can still permit accurate analysis of any sample.
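To make the idea of relative abundance concrete, here is a minimal sketch of one common way to estimate it once sequence reads have been assigned to species. It assumes that a larger genome produces proportionally more reads per cell, so read counts are divided by genome length before being compared. The function name `relative_abundance`, the species labels and the numbers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def relative_abundance(read_assignments, genome_lengths):
    """Estimate each species' share of the sample from assigned reads."""
    counts = Counter(read_assignments)
    # A longer genome yields more reads per cell, so normalize by length.
    coverage = {sp: counts[sp] / genome_lengths[sp] for sp in counts}
    total = sum(coverage.values())
    return {sp: cov / total for sp, cov in coverage.items()}


# Hypothetical sample: 600 reads assigned to one species, 400 to another.
reads = ["species_a"] * 600 + ["species_b"] * 400
lengths = {"species_a": 6_000_000, "species_b": 2_000_000}
abundance = relative_abundance(reads, lengths)
print(abundance)
```

In this made-up sample, species_a supplies more reads but has a genome three times as long, so after normalization it accounts for roughly one third of the community and species_b for roughly two thirds. That gap between raw read counts and estimated abundance is one reason different assemblers and filtering choices can skew results.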