This repository contains the official implementation of our study on spatial gene expression prediction from single H&E histology spots using EfficientNet-B0 and lightweight deep learning strategies.
Our work establishes a reproducible performance ceiling for single-spot prediction tasks, significantly outperforming prior methods such as BLEEP and ST-Net on the GSE240429 human liver dataset.
- Single-spot prediction pipeline (no contextual multi-spot input).
- EfficientNet-B0 backbone with full fine-tuning.
- Morphology-Preserving Augmentation (MPA): carefully designed augmentation strategy preserving biological structure.
- Benchmark results:
  - Up to 80% improvement over BLEEP on highly expressed genes (HEG).
  - Outperformed ResNet-50 while using only 1/5 of the parameters.
- Exploratory analyses:
  - Backbone comparison (ResNet, DenseNet, MobileNet, EfficientNet).
  - Augmentation comparison (ours vs. ST-Net).
  - Cluster-ID experiments.
  - Fine-tuning strategies (layer freezing, hidden layer depth).
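The exact transforms used by MPA are not spelled out in this README. As a rough illustration only, the sketch below shows one family of augmentations that leaves tissue morphology intact: random flips combined with rotations by multiples of 90 degrees. These only permute pixels, so they involve no interpolation, cropping, or colour change. The function names (`rot90`, `hflip`, `dihedral_augment`) and the toy 2x2 patch are hypothetical and are not taken from the repository.

```python
import random


def rot90(patch):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def hflip(patch):
    """Mirror a 2D grid left to right."""
    return [list(row[::-1]) for row in patch]


def dihedral_augment(patch, rng):
    """Apply a random horizontal flip, then 0 to 3 clockwise quarter turns.

    Assumption: this is a stand-in for a morphology-preserving transform,
    not the repository's actual MPA implementation.
    """
    out = [list(row) for row in patch]
    if rng.random() < 0.5:
        out = hflip(out)
    for _ in range(rng.randrange(4)):
        out = rot90(out)
    return out


patch = [[1, 2], [3, 4]]  # toy 2x2 stand-in for an image patch
augmented = dihedral_augment(patch, random.Random(0))
```

Because every output is a rearrangement of the input pixels, the set of pixel values is unchanged, which is one simple way to check that nothing was resampled or cropped.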
BLEEP/
│── 19_make_patches_and_csv.py # Patch extraction + dataset CSV generation
│── 20_train_effb0_sota_split8020.py # EfficientNet-B0 baseline training (80/20 split)
│── 21_train_backbones.py # Backbone comparison (ResNet, DenseNet, etc.)
│── 22_search_freeze_and_hidden_cv5.py # 5-fold CV for layer freezing + hidden size search
│── 23_compare_augs.py # Augmentation comparison (MPA vs ST-Net)
│── GSE240429_data/ # Dataset folder (preprocessed data)
│── requirements.txt # Python dependencies
│── README.md # Project documentation
│── LICENSE # License file
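The script names above refer to an 80/20 split and to 5-fold cross-validation. As a minimal sketch of what such index splits might look like, the helpers below shuffle item indices with a fixed seed and partition them. The function names `split_80_20` and `k_fold` are hypothetical, and whether the actual scripts split by spot, by slide, or by some other unit is not stated here.

```python
import random


def split_80_20(n_items, seed=0):
    """Return (train_idx, test_idx) with roughly 80% of items in train."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    cut = n_items * 4 // 5
    return idx[:cut], idx[cut:]


def k_fold(n_items, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each item is in validation once."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal slices
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


train_idx, test_idx = split_80_20(10)
cv_folds = list(k_fold(10, k=5))
```

Fixing the seed keeps the partition identical across runs, which matters when comparing backbones or augmentations on the same held-out items.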
    git clone https://github.com/Medical-AI-GSE240429/Code.git
    cd Code
    pip install -r requirements.txt

- Dataset: NCBI GEO accession GSE240429.
- Place the preprocessed dataset in GSE240429_data/ (TIFF files excluded).

- Patch extraction and dataset CSV generation:
    python 19_make_patches_and_csv.py
- EfficientNet-B0 baseline:
    python 20_train_effb0_sota_split8020.py
- Backbone comparison:
    python 21_train_backbones.py
- 5-fold CV + hyperparameter search:
    python 22_search_freeze_and_hidden_cv5.py
- Augmentation comparison:
    python 23_compare_augs.py

- EfficientNet-B0 achieved r = 0.315 on HEG, significantly outperforming BLEEP and ST-Net.
- Predicted 50 genes with r ≥ 0.30, compared to only 20 by ResNet-50 with 5× more parameters.
- MPA augmentation provided stable gains compared to ST-Net augmentations.
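The results above are reported as correlation values and as a count of genes reaching r ≥ 0.30. A minimal sketch of how such per-gene numbers could be computed is shown below, assuming r is the Pearson correlation between measured and predicted expression across spots, averaged over genes. That aggregation is an assumption, as are the function names (`pearson_r`, `per_gene_summary`) and the toy gene labels; the repository's evaluation code may differ.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (NaN if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def per_gene_summary(y_true, y_pred, threshold=0.30):
    """Return per-gene r, mean r, and the number of genes with r >= threshold.

    y_true and y_pred map a gene name to a list of per-spot values.
    """
    rs = {g: pearson_r(y_true[g], y_pred[g]) for g in y_true}
    valid = [r for r in rs.values() if not math.isnan(r)]
    mean_r = sum(valid) / len(valid)
    n_above = sum(r >= threshold for r in valid)
    return rs, mean_r, n_above


# Toy data: two hypothetical genes measured on four spots.
y_true = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1, 2, 3, 4]}
y_pred = {"GENE_A": [2, 4, 6, 8], "GENE_B": [4, 3, 2, 1]}
rs, mean_r, n_above = per_gene_summary(y_true, y_pred)
```

In the toy data, one gene is predicted with perfect positive correlation and the other with perfect negative correlation, so only one gene clears the threshold.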
If you use this repository, please cite:
@article{YourPaper2025,
title = {From Histology to Spatial Transcriptomics: Establishing a Lightweight Single-Patch Baseline},
author = {Hyungyum Jang and Hyunsoo Shin and Hawon Lee and Yena Jang and Sunghoon Jung and Minji Jeon},
journal = {},
year = {2025}
}
- Hyungyum Jang†
- Hyunsoo Shin†
- Hawon Lee
- Yena Jang
- Sunghoon Jung*
- Minji Jeon*
Contributions follow the paper: †Equal contribution.
This project is licensed under the terms of the MIT License.