Core Microbiome Venn Diagrams - PRD / PTSD / Compartments

Overview

This script identifies and visualizes the core microbiome of two spider species:

PRD – Pardosa lugubris
PTSD – Parasteatoda tepidariorum

The analysis compares bacterial taxa detected in three sample compartments:

ENV – environmental samples
SILK – silk-associated samples
EGGS – egg-associated samples

Core taxa are identified separately for each species-compartment combination and then compared using Venn diagrams. The script also exports supplementary tables listing shared and unique taxa for each comparison.

Purpose

The goal of this script is to detect bacterial families or genera that are:

unique to a specific compartment,
shared between selected compartments,
shared across the full ENV → SILK → EGGS gradient,
present in both spider species.

This provides a simple and publication-friendly way to explore overlap patterns within the core microbiome.

Input data

The script requires three tab-separated input files:

1. Metadata file

Path: 01_Metadata/metadata_plik.tsv

Required columns:

sample-id
group — species label (PRD / PTSD)
type — compartment label (ENV / SILK / EGGS)

2. Feature table

Path: 06_Exports/deseq2_input/feature-table.tsv

This should be a QIIME2-exported feature table containing ASV counts per sample, with mitochondrial and chloroplast sequences already filtered out.
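
A feature table converted from BIOM to TSV usually starts with a "# Constructed from biom file" comment line, followed by a header row that begins with "#OTU ID". The sketch below shows one way to read such a file in base R. It assumes that layout and uses a small temporary file with made-up counts in place of the real path, so the script's own loading code may differ.

```r
# Toy stand-in for a biom-converted feature table (assumed layout, made-up counts).
feature_lines <- c(
  "# Constructed from biom file",
  "#OTU ID\tS1\tS2\tS3",
  "asv1\t120\t0\t15",
  "asv2\t3\t40\t11"
)
feature_path <- tempfile(fileext = ".tsv")
writeLines(feature_lines, feature_path)

# skip = 1 drops the biom comment line; comment.char = "" keeps the "#OTU ID"
# header row; check.names = FALSE leaves sample IDs untouched.
feature_table <- read.delim(
  feature_path, skip = 1, comment.char = "",
  check.names = FALSE, stringsAsFactors = FALSE
)

# Turn the table into a numeric matrix: ASVs in rows, samples in columns.
rownames(feature_table) <- feature_table[["#OTU ID"]]
counts <- as.matrix(feature_table[, -1, drop = FALSE])
print(counts)
```

Keeping check.names = FALSE matters when sample IDs contain hyphens or start with digits, because R would otherwise rewrite them and they would no longer match the sample-id column in the metadata.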

3. Taxonomy file

Path: 06_Exports/deseq2_input/taxonomy_export/taxonomy.tsv

This file should contain taxonomy assignments for each feature (ASV), exported from QIIME2.
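
QIIME2 taxonomy exports normally store each assignment as a single semicolon-separated string with rank prefixes such as f__ (family) and g__ (genus). The sketch below shows one way to pull a rank out of such strings in base R. The prefix convention is an assumption about the reference database used, and extract_rank is a hypothetical helper rather than a function taken from the script.

```r
# Hypothetical helper: extract one rank (prefix "f" or "g") from taxonomy strings.
# Returns NA where the rank is absent or empty.
extract_rank <- function(taxon, prefix) {
  pattern <- paste0("^.*(?:^|;)\\s*", prefix, "__([^;]*).*$")
  found <- grepl(pattern, taxon, perl = TRUE)
  name <- trimws(sub(pattern, "\\1", taxon, perl = TRUE))
  name[!found | name == ""] <- NA_character_
  name
}

# Toy taxonomy table using the usual QIIME2 column names (made-up rows).
taxonomy <- data.frame(
  `Feature ID` = c("asv1", "asv2", "asv3"),
  Taxon = c(
    "d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacterales; f__Enterobacteriaceae; g__Escherichia-Shigella",
    "d__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Bacillaceae",
    "Unassigned"
  ),
  check.names = FALSE, stringsAsFactors = FALSE
)

taxonomy$family <- extract_rank(taxonomy$Taxon, "f")
taxonomy$genus  <- extract_rank(taxonomy$Taxon, "g")
print(taxonomy[, c("Feature ID", "family", "genus")])
```

Features without a family or genus assignment end up as NA, which makes them easy to drop before aggregation.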

How the script works

The workflow consists of the following main steps:

Load input data
- metadata
- feature table
- taxonomy table
Extract taxonomy
- family names are extracted from taxonomy strings
- genus names are extracted from taxonomy strings
Match valid samples
- only samples present in both metadata and feature table are retained
Aggregate ASV counts
- counts are summed to the selected taxonomic level:
  - family
  - genus
Define core microbiome
- a taxon is considered part of the core microbiome of a group if it:
  - has more than 10 reads in a sample (the taxon then counts as present in that sample),
  - is present in this sense in at least 66% of the samples within that group
Filter taxonomy labels
- ambiguous or non-informative labels are removed, such as:
  - Unassigned
  - Unknown
  - uncultured
  - unresolved subgroup-like labels
  - overly broad taxonomic placeholders
Generate Venn diagrams
- two-set and three-set comparisons are produced
Save outputs
- plots in multiple formats
- CSV tables with shared and unique taxa
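
The sample-matching and aggregation steps above can be sketched in base R as follows. This is a minimal illustration on made-up data, not the script's own code. The object names (counts, metadata, family) are assumptions, and the script itself may rely on dplyr and tidyr for the same operations.

```r
# Toy ASV count matrix: ASVs in rows, samples in columns (made-up values).
counts <- matrix(
  c(120,  0, 15,
     30,  5, 22,
      3, 40, 11),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("asv1", "asv2", "asv3"), c("S1", "S2", "S3"))
)

# Toy metadata; S4 has no counts and S3 has no metadata entry.
metadata <- data.frame(
  `sample-id` = c("S1", "S2", "S4"),
  group = c("PRD", "PRD", "PTSD"),
  type = c("ENV", "SILK", "EGGS"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Family label per ASV, as it would come out of the taxonomy step.
family <- c(asv1 = "Enterobacteriaceae",
            asv2 = "Enterobacteriaceae",
            asv3 = "Bacillaceae")

# Keep only samples present in both the metadata and the feature table.
valid_samples <- intersect(metadata[["sample-id"]], colnames(counts))
counts <- counts[, valid_samples, drop = FALSE]
metadata <- metadata[metadata[["sample-id"]] %in% valid_samples, , drop = FALSE]

# Sum the counts of all ASVs that share a family label.
family_counts <- rowsum(counts, group = family[rownames(counts)])
print(family_counts)
```

In this toy case only S1 and S2 survive the matching step, and the two Enterobacteriaceae ASVs collapse into a single row.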

Core microbiome definition

The default thresholds used in this script are:

Count threshold: > 10 reads per sample
Prevalence threshold: >= 66% of samples within a group

These values can be modified in the Settings section of the script.
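
As an illustration of how the two thresholds combine, the sketch below applies them to a toy family-level count matrix for a single species-compartment group. The matrix and the name group_counts are made up for the example, and the script may implement the same rule differently.

```r
COUNT_THRESHOLD <- 10
PREVALENCE_THRESHOLD <- 0.66

# Toy counts for one group: taxa in rows, samples in columns (made-up values).
group_counts <- matrix(
  c(50, 12, 30,
    11,  0,  4,
     0, 25, 80),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Bacillaceae", "Moraxellaceae", "Rickettsiaceae"),
                  c("S1", "S2", "S3"))
)

# A taxon counts as present in a sample when it exceeds the read threshold.
present <- group_counts > COUNT_THRESHOLD

# Prevalence is the fraction of the group's samples in which the taxon is present.
prevalence <- rowMeans(present)

# Core taxa meet or exceed the prevalence threshold.
core_taxa <- names(prevalence)[prevalence >= PREVALENCE_THRESHOLD]
print(core_taxa)
```

With three samples, a taxon present in two of them has a prevalence of about 0.667 and therefore passes the 0.66 threshold, while a taxon present in only one sample does not.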

Output files

The script creates the following output directories:

plots/ — graphical outputs
supplementary_tables/ — summary and overlap tables
raw_sets/ — exported raw taxon sets

Plot formats

Each Venn diagram is saved as:

.png
.pdf
.svg

Supplementary tables

For each comparison, the script exports CSV files containing:

taxa shared between all sets
taxa shared between selected pairs only
taxa unique to each set
summary counts for all Venn regions
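
The regions listed above correspond to plain set operations on the core taxon sets. The sketch below computes them for a three-set comparison in base R. The taxon sets, the helper pair_only, and the output file name are all hypothetical and only illustrate the idea.

```r
# Toy core sets for one species (made-up memberships).
core_sets <- list(
  ENV  = c("Bacillaceae", "Moraxellaceae", "Rickettsiaceae"),
  SILK = c("Bacillaceae", "Rickettsiaceae", "Anaplasmataceae"),
  EGGS = c("Rickettsiaceae", "Anaplasmataceae")
)

# Hypothetical helper: taxa shared by sets a and b but absent from the third set.
pair_only <- function(a, b, other) setdiff(intersect(a, b), other)

regions <- list(
  shared_all     = Reduce(intersect, core_sets),
  ENV_SILK_only  = pair_only(core_sets$ENV,  core_sets$SILK, core_sets$EGGS),
  ENV_EGGS_only  = pair_only(core_sets$ENV,  core_sets$EGGS, core_sets$SILK),
  SILK_EGGS_only = pair_only(core_sets$SILK, core_sets$EGGS, core_sets$ENV),
  ENV_unique     = setdiff(core_sets$ENV,  union(core_sets$SILK, core_sets$EGGS)),
  SILK_unique    = setdiff(core_sets$SILK, union(core_sets$ENV,  core_sets$EGGS)),
  EGGS_unique    = setdiff(core_sets$EGGS, union(core_sets$ENV,  core_sets$SILK))
)

# One row per Venn region with the number of taxa it contains.
summary_counts <- data.frame(
  region = names(regions),
  n_taxa = lengths(regions),
  row.names = NULL
)

# Hypothetical output location; the script writes to supplementary_tables/.
write.csv(summary_counts,
          file.path(tempdir(), "venn_region_counts.csv"),
          row.names = FALSE)
print(summary_counts)
```

The seven regions are mutually exclusive, so their counts add up to the number of distinct taxa across all three sets, which is a convenient sanity check on the exported tables.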

Main settings

The most important user-defined parameters are:

Taxonomic level

TAX_LEVEL <- "family" # "genus" or "family"

Core microbiome thresholds

COUNT_THRESHOLD <- 10
PREVALENCE_THRESHOLD <- 0.66

Group labels

LEVEL_PRD <- "PRD"
LEVEL_PTSD <- "PTSD"
LEVEL_ENV <- "ENV"
LEVEL_SILK <- "SILK"
LEVEL_EGGS <- "EGGS"

Metadata column names

SPECIES_COL <- "group"
COMPARTMENT_COL <- "type"
SAMPLE_ID_COL <- "sample-id"

Required R packages

dplyr
tibble
ggplot2
readr
tidyr
stringr
ggVennDiagram
patchwork
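
Before running the script, it can help to check which of these packages are already installed. The snippet below is a small convenience sketch and not part of the script. The install step is left commented out and assumes the packages are installed from CRAN.

```r
required_packages <- c("dplyr", "tibble", "ggplot2", "readr",
                       "tidyr", "stringr", "ggVennDiagram", "patchwork")

# requireNamespace() returns TRUE when a package can be loaded, without attaching it.
is_installed <- vapply(required_packages, requireNamespace,
                       logical(1), quietly = TRUE)
missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
  # install.packages(missing_packages)  # uncomment to install (assumes CRAN)
} else {
  message("All required packages are installed.")
}
```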

Example use case

This script is suitable for questions such as:

Which bacterial taxa are shared between environment, silk and eggs?
Which taxa are unique to the environment, silk, or eggs?
Which taxa persist across the full ENV → SILK → EGGS transition?
Are the same core taxa present in both spider species?

IMPORTANT:

The Venn diagrams generated by this script represent presence/overlap of core taxa, not differential abundance. They are useful for identifying:

stable taxa consistently detected in a compartment,
compartment-specific taxa,
taxa potentially associated with transfer across sample types.

Because the analysis is based on core membership thresholds, it complements abundance-based approaches such as heatmaps, differential abundance testing, or compositional analyses.

Notes

The script works on QIIME2-exported feature and taxonomy tables. Taxa are filtered before plotting to improve biological interpretability. The resulting diagrams are intended for descriptive and comparative analysis of core microbiome overlap.
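
The label filtering mentioned here can be sketched as a simple pattern match. The patterns below cover only the Unassigned, Unknown, and uncultured cases named in this README. They are assumptions about how such labels look, and is_informative is a hypothetical helper. The subgroup-like and overly broad labels are not covered, because their exact definitions are not given here.

```r
# Assumed patterns; the script's own list may be longer or stricter.
uninformative <- c("^unassigned$", "^unknown", "uncultured")

# Hypothetical helper: TRUE for labels worth keeping, FALSE for NA, empty,
# or labels matching any uninformative pattern (case-insensitive).
is_informative <- function(label) {
  bad <- is.na(label) | label == ""
  for (p in uninformative) {
    bad <- bad | grepl(p, label, ignore.case = TRUE)
  }
  !bad
}

labels <- c("Bacillaceae", "Unassigned", "uncultured",
            "Unknown_Family", NA, "Rickettsiaceae")
kept <- labels[is_informative(labels)]
print(kept)
```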

Author: Mateusz Glenszczyk
