Code for the manuscript "Histopathology and proteomics are synergistic for High-Grade Serous Ovarian Cancer platinum response prediction"
Oz Kilim, Alex Olar, András Biricz, Lilla Madaras, Péter Pollner, Zoltán Szállási, Zsofia Sztupinszki, István Csabai
"Patients with High-Grade Serous Ovarian Cancer (HGSOC) exhibit varied responses to treatment, with 20-30% showing inherent resistance to platinum-based chemotherapy. While hematoxylin-eosin (H&E) pathological slides are used for routine diagnosis of cancer type, they may also contain diagnostically useful information about treatment response. Our study demonstrates that combining H&E-stained Whole Slide Images (WSIs) with proteomic signatures using a multimodal deep learning framework significantly improves the prediction of platinum response in both discovery and validation cohorts. This method outperforms the Homologous Recombination Deficiency (HRD) score in predicting platinum response and overall patient survival. The study sets new performance benchmarks and explores the intersection of histology and proteomics, highlighting phenotypes related to treatment response pathways, including homologous recombination, DNA damage response, nucleotide synthesis, apoptosis, and ER stress. This integrative approach has the potential to enhance personalized treatment and provide insights into the therapeutic vulnerabilities of HGSOC."
- GDC data portal TCGA-OV https://portal.gdc.cancer.gov/projects/TCGA-OV
- mmc2.xlsx (https://www.sciencedirect.com/science/article/pii/S0092867416306730)
- mmc3-2.xlsx (https://www.sciencedirect.com/science/article/pii/S0092867416306730)
- 1-s2.0-S0092867416306730-mmc5.xlsx (https://www.sciencedirect.com/science/article/pii/S0092867416306730)
- FFPE_discovery_globalprotein_imputed.tsv (https://www.dropbox.com/s/7zul3j1vyrxo40c/processed_data.zip?e=1&dl=0)
- PTRC-HGSOC_List_clinical_data.xlsx (https://www.cancerimagingarchive.net/collection/ptrc-hgsoc/)
python3 HGSOC_platinum_response/HGSOC_TCGA_tasks_data_setup.py
python3 HGSOC_platinum_response/classical_models_omics.py
setup_general.sh
Alternatively, follow the instructions at https://github.com/mahmoodlab/CLAM to create the embeddings for the models. To reproduce our analysis, the embeddings for each embedder should be laid out as follows:
CTransPath/
├── h5_files
│   ├── slide_1.h5
│   ├── slide_2.h5
│   └── ...
└── pt_files
    ├── slide_1.pt
    ├── slide_2.pt
    └── ...
UNI/
├── h5_files
│   ├── slide_1.h5
│   ├── slide_2.h5
│   └── ...
└── pt_files
    ├── slide_1.pt
    ├── slide_2.pt
    └── ...
OV-Dino/
├── h5_files
│   ├── slide_1.h5
│   ├── slide_2.h5
│   └── ...
└── pt_files
    ├── slide_1.pt
    ├── slide_2.pt
    └── ...
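Before training, it can help to confirm that every slide has both an `.h5` and a `.pt` file under each embedder directory. The snippet below is a minimal sketch, not part of this repository: the helper name `find_unpaired_slides` is hypothetical, and it assumes only the `h5_files`/`pt_files` layout shown above.

```python
from pathlib import Path


def find_unpaired_slides(embedder_dir):
    """Return sorted slide IDs that have an .h5 file or a .pt file, but not both.

    Assumes the layout <embedder_dir>/h5_files/*.h5 and <embedder_dir>/pt_files/*.pt.
    """
    root = Path(embedder_dir)
    h5_ids = {p.stem for p in (root / "h5_files").glob("*.h5")}
    pt_ids = {p.stem for p in (root / "pt_files").glob("*.pt")}
    # The symmetric difference keeps IDs present in exactly one of the two sets.
    return sorted(h5_ids ^ pt_ids)


if __name__ == "__main__":
    # Directory names taken from the layout above; adjust the paths as needed.
    for embedder in ("CTransPath", "UNI", "OV-Dino"):
        missing = find_unpaired_slides(embedder)
        print(f"{embedder}: {len(missing)} unpaired slide(s)", missing[:5])
```

An empty list for every embedder means the two sub-directories are consistent with each other. This check does not verify the contents of the files.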
4. Running WSI and multimodal models. This uses the embeddings and task definitions to train and test all models on all downstream tasks.
HGSOC_platinum_response/splits
run.sh
notebooks/results_analysis/TCGA_HGSOC_results.ipynb
./heatmap.sh
notebooks/interpretability/vis_multi_heatmaps.ipynb
HGSOC_platinum_response/HRD_results_analysis.ipynb
Dataset | Num WSIs | Microns per pixel | File format |
---|---|---|---|
TCGA_OV | 1481 | 0.5040 | .svs |
CPTAC_OV | 221 | 0.2501 | .svs |
HGSC | 349 | 0.4965 | .svs |
Ovarian Bevacizumab Response | 284 | 0.5 (20X) | .tif |
UBC OCEAN | 538 | 0.5 | .png |
Internal | 42 | 0.424 | .mrxs |
./bulk_ovarian_patching.sh
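The datasets in the table above were scanned at different resolutions, so a fixed pixel patch size covers a different tissue area in each one. The sketch below shows the arithmetic for choosing a native-resolution patch size that spans the same physical area across datasets. It is an illustration only: the function name `native_patch_size` is hypothetical, and the reference of 256 px at 0.5 microns per pixel is an assumption, not a setting confirmed by this repository.

```python
def native_patch_size(native_mpp, target_mpp=0.5, target_px=256):
    """Pixels to read at native resolution so that the patch covers the same
    tissue area as `target_px` pixels at `target_mpp` microns per pixel.

    The defaults (256 px at 0.5 mpp) are assumed for illustration.
    """
    if native_mpp <= 0:
        raise ValueError("native_mpp must be positive")
    # Physical side length in microns, divided by the native pixel size.
    return round(target_px * target_mpp / native_mpp)


# Microns-per-pixel values copied from the table above.
DATASET_MPP = {
    "TCGA_OV": 0.5040,
    "CPTAC_OV": 0.2501,
    "HGSC": 0.4965,
    "Internal": 0.424,
}

if __name__ == "__main__":
    for name, mpp in DATASET_MPP.items():
        print(f"{name}: read {native_patch_size(mpp)} px, then resize to 256 px")
```

Under these assumptions, a slide at roughly 0.25 microns per pixel needs a patch about twice as wide in pixels as one at 0.5 microns per pixel to cover the same tissue.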