AMR

1 Preface

Predict antimicrobial resistance (AMR) directly from MALDI-TOF mass spectra with machine learning. This project is a Clojure reproduction of Weis et al. (2022) on the DRIAMS dataset.

This project provides notebooks that walk through the full pipeline — from raw spectra to XGBoost classification — making it convenient for researchers to explore, reproduce, and extend this line of work.

Status

  • Preprocessing pipeline — ready and tested
  • Machine learning workflows — in progress
  • API may change as the project evolves

Setup

1. Get the DRIAMS dataset

Download the dataset from the Dryad repository and extract it so that all site folders sit under a single directory. You may remove the preprocessing subdirectories, which leaves a layout like this:

/path/to/DRIAMS/
├── DRIAMS-A/
│   ├── raw/2015/  raw/2016/  raw/2017/  raw/2018/
│   └── id/2015/   id/2016/   id/2017/   id/2018/
├── DRIAMS-B/ ...
├── DRIAMS-C/ ...
└── DRIAMS-D/ ...

We recommend gzipping the raw .txt files to save space. tablecloth reads .txt.gz transparently.
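One way to do the compression is sketched below. The function name compress_raw_spectra is hypothetical and not part of this project, and the sketch assumes the spectra are the .txt files under each site's raw/ folder, as in the layout above. It uses only standard find and gzip.

```shell
# Hypothetical helper (not part of this project): gzip every .txt file
# found under a raw/ folder of the given DRIAMS directory.
# gzip replaces each file.txt with file.txt.gz; files under id/ are left alone.
compress_raw_spectra() {
  base_dir="$1"
  find "$base_dir" -type f -path '*/raw/*' -name '*.txt' -exec gzip {} +
}

# Usage (the path is a placeholder):
# compress_raw_spectra /path/to/DRIAMS
```

Because gzip removes each original after compressing it, consider trying this on a copy of one year's folder first.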

2. Configure the data path

Either set the environment variable:

export DRIAMS_BASE_DIR=/path/to/DRIAMS/

Or edit amr.edn in the project root:

{:base-dir "/path/to/DRIAMS/"}
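Before rendering, it can help to confirm that the configured directory matches the expected layout. The check below is a minimal sketch: check_driams_layout is a hypothetical name, the four site folder names are taken from the layout shown earlier, and the function only looks at its argument or the DRIAMS_BASE_DIR environment variable. It does not read amr.edn.

```shell
# Hypothetical sanity check (not part of this project): verify that the
# DRIAMS base directory contains the four site folders.
# Uses the first argument if given, otherwise DRIAMS_BASE_DIR.
check_driams_layout() {
  base_dir="${1:-${DRIAMS_BASE_DIR:-}}"
  if [ -z "$base_dir" ]; then
    echo "DRIAMS_BASE_DIR is not set" >&2
    return 1
  fi
  status=0
  for site in DRIAMS-A DRIAMS-B DRIAMS-C DRIAMS-D; do
    if [ ! -d "$base_dir/$site" ]; then
      echo "missing site folder: $base_dir/$site" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
# export DRIAMS_BASE_DIR=/path/to/DRIAMS/
# check_driams_layout && echo "layout looks fine"
```

The function reports every missing folder rather than stopping at the first one, so a partially extracted download shows up in a single run.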

3. Render the notebooks

From the Clojure REPL:

(require '[dev])
(dev/make-book!)

This produces a Quarto book under docs/.

Key Libraries

This project builds on the Scicloj ecosystem for scientific computing in Clojure:

  • tablecloth — dataframe library for tabular data manipulation (built on tech.ml.dataset and dtype-next)
  • metamorph.ml — machine learning pipelines (XGBoost classification in this project)
  • tableplot — interactive plotting via Plotly
  • Ripple (scicloj.ripple.maldi) — MALDIquant-compatible signal preprocessing, binning, and peak detection
  • Pocket (scicloj.pocket) — filesystem-based caching for expensive computations
  • fastmath — numerical and statistical functions
  • Kindly — annotation system for notebook visualizations (used with Clay for rendering)

References

License

MIT License — see LICENSE file.

source: notebooks/index.clj