microbiomedataset is part of tidymicrobiome.
Why this package
microbiomedataset is a data infrastructure package for microbiome analysis. It provides a standardized object system for:
- abundance matrices
- sample metadata
- taxonomy metadata
- phylogenetic trees
- reference sequences
- process tracking
The package is designed around the same data-engineering philosophy as massdataset, so microbiome and metabolome data can be analyzed with aligned workflows and linked in cross-omics studies.
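To make the idea of a standardized object concrete, here is a minimal base-R sketch of the consistency such a container has to enforce between its tables. It does not use the package's own classes or functions. The toy data, the table layout, and the helper name `check_dataset` are all hypothetical and only for illustration.

```r
# Toy abundance matrix: taxa (rows) x samples (columns).
abundance <- matrix(
  c(10, 0, 5,
    3, 8, 1,
    0, 2, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("otu1", "otu2", "otu3"), c("s1", "s2", "s3"))
)

# Sample metadata: one row per column of the abundance matrix.
sample_info <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  study_group = c("case", "control", "case"),
  stringsAsFactors = FALSE
)

# Taxonomy metadata: one row per row of the abundance matrix.
taxonomy <- data.frame(
  taxon_id = c("otu1", "otu2", "otu3"),
  Phylum = c("Firmicutes", "Firmicutes", "Bacteroidetes"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE only if identifiers line up across all tables.
check_dataset <- function(abundance, sample_info, taxonomy) {
  identical(colnames(abundance), sample_info$sample_id) &&
    identical(rownames(abundance), taxonomy$taxon_id)
}

is_consistent <- check_dataset(abundance, sample_info, taxonomy)

# Collapse taxa to phylum level by summing counts within each phylum.
phylum_abundance <- rowsum(abundance, group = taxonomy$Phylum)
```

Keeping the identifiers aligned in one place is what allows rank-level summaries such as `phylum_abundance` to be computed without re-matching tables by hand.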
What is new
This package is built around three manuscript-facing ideas.
- A tree-aware microbiome data model. microbiome_dataset extends the massdataset object model with taxonomy, explicit tree-link metadata, and reference sequences.
- A unified analysis and visualization layer. The package standardizes preprocessing, diversity, ordination, differential abundance, network analysis, and publication-style plotting around a single input object.
- A microbiome-metabolome integration layer. microbiomedataset can work directly with massdataset objects for paired sample alignment, correlation analysis, multiblock integration, and mechanism-oriented taxon-pathway-metabolite linking.
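As a rough illustration of what paired sample alignment and correlation analysis involve, the following base-R sketch aligns two toy matrices through a link table and computes Spearman correlations with Benjamini-Hochberg q-values. It is not the package's implementation. The simulated data, the column names `microbiome_id` and `metabolome_id`, and the object names are assumptions made for this example.

```r
set.seed(1)

# Toy genus-level abundances (features x samples) from the microbiome side.
microbiome <- matrix(
  rlnorm(3 * 6), nrow = 3,
  dimnames = list(paste0("genus", 1:3), paste0("mb_", 1:6))
)

# Toy metabolite intensities; sample IDs differ and come in reverse order.
metabolome <- matrix(
  rlnorm(2 * 6), nrow = 2,
  dimnames = list(paste0("metabolite", 1:2), paste0("met_", 6:1))
)

# Hypothetical link table pairing the two sample ID systems.
sample_link <- data.frame(
  microbiome_id = paste0("mb_", 1:6),
  metabolome_id = paste0("met_", 1:6),
  stringsAsFactors = FALSE
)

# Align: reorder both matrices so column i refers to the same subject.
mb <- microbiome[, sample_link$microbiome_id, drop = FALSE]
met <- metabolome[, sample_link$metabolome_id, drop = FALSE]

# One row per taxon-metabolite pair.
pair_table <- expand.grid(
  taxon = rownames(mb), metabolite = rownames(met),
  stringsAsFactors = FALSE
)

# Spearman correlation and p-value for every pair.
pair_stats <- mapply(function(taxon, metabolite) {
  test <- cor.test(mb[taxon, ], met[metabolite, ], method = "spearman")
  c(correlation = unname(test$estimate), p_value = test$p.value)
}, pair_table$taxon, pair_table$metabolite)

pair_table$correlation <- unname(pair_stats["correlation", ])
pair_table$p_value <- unname(pair_stats["p_value", ])

# Benjamini-Hochberg adjustment turns p-values into q-values.
pair_table$q_value <- p.adjust(pair_table$p_value, method = "BH")
```

A table like `pair_table` could then be filtered by absolute correlation and q-value before the remaining pairs are drawn as a network.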
Installation
Install the development version from GitHub:
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("tidymicrobiome/microbiomedataset")
Core workflows
1. Microbiome data infrastructure
# Load the package and the bundled example dataset
library(microbiomedataset)
data("global_patterns", package = "microbiomedataset")
# Print the object
global_patterns
# Plot phylum-level composition
plot_composition(global_patterns, taxonomic_rank = "Phylum", top_n = 8)
2. Differential abundance and ordination
library(microbiomedataset)
data("demo_crossomics", package = "microbiomedataset")
# Aggregate the microbiome data to genus level
microbiome_object <- summarise_taxa(
  demo_crossomics$microbiome_data,
  taxonomic_rank = "Genus"
)
# Run PCoA and color samples by study group
ordination_result <- run_ordination(microbiome_object, method = "PCoA")
plot_ordination(ordination_result, color_by = "study_group")
3. Microbiome-metabolome integration
library(microbiomedataset)
data("demo_crossomics", package = "microbiomedataset")
# Correlate genus-level taxa with metabolites across linked samples
correlation_result <- calculate_correlation(
  microbiome_data = demo_crossomics$microbiome_data,
  metabolome_data = demo_crossomics$metabolome_data,
  sample_link = demo_crossomics$sample_link,
  microbiome_rank = "Genus",
  method = "spearman",
  metabolome_transform = "none"
)
# Build a network from the filtered correlations
network_object <- build_correlation_network(
  correlation_result,
  min_abs_correlation = 0.2,
  max_q_value = 1,
  top_n = 25
)
plot_correlation_network(network_object)
Tutorials
Documentation site: https://tidymicrobiome.github.io/microbiomedataset/
Key tutorials:
- Get started
- Import and preprocess
- Tree and sequence handling
- Microbiome visualization
- Cross-omics workflow
- Advanced visualization
Citation
If you use microbiomedataset, please cite the package and the associated paper when available.
Contact
Xiaotao Shen
xiaotao.shen@outlook.com
