AMR
1 Preface
Predict antimicrobial resistance (AMR) directly from MALDI-TOF mass spectra using machine learning — a Clojure reproduction of Weis et al. (2022) using the DRIAMS dataset.
This project provides notebooks that walk through the full pipeline — from raw spectra to XGBoost classification — making it convenient for researchers to explore, reproduce, and extend this line of work.
Status
- Preprocessing pipeline — ready and tested
- Machine learning workflows — in progress
- API may change as the project evolves
Setup
1. Get the DRIAMS dataset
Download the dataset from the Dryad repository and extract it so that all site folders sit under a single directory. You may remove the preprocessing subdirectories:
/path/to/DRIAMS/
├── DRIAMS-A/
│ ├── raw/2015/ raw/2016/ raw/2017/ raw/2018/
│ └── id/2015/ id/2016/ id/2017/ id/2018/
├── DRIAMS-B/ ...
├── DRIAMS-C/ ...
└── DRIAMS-D/ ...
We recommend gzipping the raw .txt files to save space. tablecloth reads .txt.gz transparently.
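The compression step can be done with any gzip tool. As a minimal sketch in plain Clojure, assuming you want to keep the original file until you have checked the compressed copy, a helper might look like this. The name `gzip-file!` is hypothetical and not part of this project.

```clojure
(require '[clojure.java.io :as io])
(import '[java.util.zip GZIPOutputStream])

(defn gzip-file!
  "Writes a gzip-compressed copy of the file at `path` to `path` + \".gz\"
  and returns the new path. Hypothetical helper, not part of the project;
  the original file is left in place."
  [path]
  (let [out-path (str path ".gz")]
    ;; with-open closes both streams, which also finishes the gzip trailer.
    (with-open [in  (io/input-stream path)
                out (GZIPOutputStream. (io/output-stream out-path))]
      (io/copy in out))
    out-path))
```

Calling `(gzip-file! "spectrum.txt")` on a hypothetical file would write `spectrum.txt.gz` next to it and return that path.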
2. Configure the data path
Either set the environment variable:
export DRIAMS_BASE_DIR=/path/to/DRIAMS/
Or edit amr.edn in the project root:
{:base-dir "/path/to/DRIAMS/"}
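To illustrate how the two options could fit together, here is a minimal sketch of a lookup that prefers the environment variable and falls back to the EDN file. The function name `resolve-base-dir` and the precedence order are assumptions made for illustration; the project's own lookup code may behave differently.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])

(defn resolve-base-dir
  "Hypothetical helper: returns the DRIAMS base directory from `env`
  (a map of environment variables) or, failing that, from the :base-dir
  key of the EDN file at `config-path`. Returns nil when neither is set.
  The env-first precedence is an assumption, not documented behavior."
  [env config-path]
  (let [from-env    (get env "DRIAMS_BASE_DIR")
        config-file (io/file config-path)]
    (cond
      ;; (seq s) is nil for both nil and the empty string.
      (seq from-env)        from-env
      (.exists config-file) (:base-dir (edn/read-string (slurp config-file)))
      :else                 nil)))
```

From the project root, `(resolve-base-dir (System/getenv) "amr.edn")` would apply this lookup to the real environment. Passing the environment in as a map keeps the function easy to test.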
3. Render the notebooks
From the Clojure REPL:
(require '[dev])
(dev/make-book!)
This produces a Quarto book under docs/.
Key Libraries
This project builds on the Scicloj ecosystem for scientific computing in Clojure:
- tablecloth — dataframe library for tabular data manipulation (built on tech.ml.dataset and dtype-next)
- metamorph.ml — machine learning pipelines (XGBoost classification in this project)
- tableplot — interactive plotting via Plotly
- Ripple (scicloj.ripple.maldi) — MALDIquant-compatible signal preprocessing, binning, and peak detection
- Pocket (scicloj.pocket) — filesystem-based caching for expensive computations
- fastmath — numerical and statistical functions
- Kindly — annotation system for notebook visualizations (used with Clay for rendering)
References
- Weis, C., et al. (2022). Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning. Nature Medicine, 28, 164–174.
License
MIT License — see LICENSE file.
source: notebooks/index.clj