This work is licensed under a Creative Commons Attribution 4.0 International License.
Article
Marker Gene-Guided Graph Neural Networks for Enhanced Spatial Transcriptomics Clustering
Haoran Liu 1, Xiang Lin 2 and Zhi Wei 1,*
1 Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
2 Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
* Correspondence: zhiwei@njit.edu
Received: 13 December 2024; Revised: 5 January 2025; Accepted: 10 January 2025; Published: 7 February 2025
Abstract: Recent advances in Spatial Transcriptomics (ST) technologies allow researchers to study relationships between cells while accounting for their spatial locations within tissue. These technologies make it possible to combine gene expression data with spatial information for clustering analysis. Although many clustering methods have been developed, they typically rely on the dataset's intrinsic features and do not incorporate domain knowledge such as marker genes. We argue that incorporating marker gene information can improve the learned cell embeddings and, in turn, the clustering results. In this paper, we introduce MGGNN (Marker Gene-Guided Graph Neural Networks), a novel approach designed to enhance spatial transcriptomics clustering. First, we train the model using a contrastive learning framework based on a Graph Neural Network (GNN). We then fine-tune the model using a few spots labeled according to the expression of marker genes. Simulations and experiments on two real-world datasets show that MGGNN outperforms state-of-the-art methods.
Keywords: spatial transcriptomics (ST); marker gene; graph neural network (GNN); contrastive learning
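The abstract mentions fine-tuning on "a few spots labeled by the expression of marker genes" but does not say how those labels are derived. The sketch below shows one plausible way to obtain such sparse labels and should not be read as the authors' procedure. It assumes normalized expression values, at least two candidate types, and a simple mean-score margin rule. The function name `label_spots_by_markers`, the `min_margin` parameter, and the gene and type names are all hypothetical.

```python
from statistics import mean


def label_spots_by_markers(expr, genes, markers, min_margin=1.0):
    """Label only the spots whose marker scores clearly favor one type.

    expr:       list of per-spot expression rows (assumed normalized);
                columns follow the order of `genes`.
    markers:    dict mapping a type name to a list of marker gene names.
    min_margin: hypothetical threshold on the gap between the best and
                second-best mean marker score.
    Returns a dict {spot_index: type_name} covering confident spots only.
    """
    col = {g: i for i, g in enumerate(genes)}
    # Keep only markers that are actually measured in this dataset.
    usable = {t: [g for g in gs if g in col] for t, gs in markers.items()}
    usable = {t: gs for t, gs in usable.items() if gs}
    if len(usable) < 2:
        raise ValueError("need measured markers for at least two types")

    labels = {}
    for spot, row in enumerate(expr):
        # Score each type by the mean expression of its marker genes.
        scores = {t: mean(row[col[g]] for g in gs) for t, gs in usable.items()}
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        (best, top), (_, second) = ranked[0], ranked[1]
        # Ambiguous spots stay unlabeled, so only a few spots get labels.
        if top - second >= min_margin:
            labels[spot] = best
    return labels


# Toy data with made-up gene and type names.
genes = ["GeneA", "GeneB", "GeneC", "GeneD"]
markers = {"type1": ["GeneA", "GeneB"], "type2": ["GeneC", "GeneD"]}
expr = [
    [3.0, 2.5, 0.1, 0.0],  # clearly type1
    [0.2, 0.1, 2.8, 3.1],  # clearly type2
    [1.0, 1.2, 1.1, 0.9],  # ambiguous, left unlabeled
]
labels = label_spots_by_markers(expr, genes, markers)
print(labels)
```

Leaving ambiguous spots unlabeled keeps the supervised signal small and confident, which matches the "few labeled spots" setting; the unlabeled spots would still be used in the contrastive training stage.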