
Kangjie Zheng (pronounced /kahng-dʒyeh dʒeng/)

Postdoctoral Fellow, Wellcome Sanger Institute

AI for Biology · Genomics · Biomolecules

I develop scalable, data-driven AI models that learn from large-scale biological data to decode the language of life and uncover fundamental principles of biological systems.

About Me

I am a Postdoctoral Fellow at the Wellcome Sanger Institute in Cambridge, UK, working with Dr. Mo Lotfollahi on developing scalable, generalizable foundation models for large-scale biological data. My work spans sequence modeling (proteins, genomes) and structural modeling (3D molecular structures), with the goal of advancing integrative understanding across biological modalities.

Academic Background

  • Postdoctoral Fellow, Wellcome Sanger Institute (Sep 2025 – Present)
  • Ph.D. in Computer Science, Peking University (Aug 2020 – Jun 2025)
    Supervisor: Prof. Ming Zhang
  • B.Eng. in Computer Science, Harbin Institute of Technology (Aug 2016 – Jun 2020)
College: The Honors School of HIT (among the top 10 graduates)

Industry Experience

Research Highlights

A multi-modal, multi-scale suite of foundation models targeting diverse large-scale datasets and practical applications, organized around four key aspects.

Selected Publications

*Equal contribution · See the full list of papers on Google Scholar.

AI for Science

  1. ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling. ICML 2024.
    Kangjie Zheng*, Siyu Long*, Tianyu Lu, Junwei Yang, Xinyu Dai, Ming Zhang, Zaiqing Nie, Wei-Ying Ma, Hao Zhou.
  2. SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision. ICLR 2025.
    Kangjie Zheng, Siyue Liang, Junwei Yang, Bin Feng, Zequn Liu, Wei Ju, Zhiping Xiao, Ming Zhang.
  3. Mol-AE: Auto-Encoder Based Molecular Representation Learning With 3D Cloze Test Objective. ICML 2024.
    Junwei Yang*, Kangjie Zheng*, Siyu Long, Zaiqing Nie, Ming Zhang, Xinyu Dai, Wei-Ying Ma, Hao Zhou.

Language Modeling

  1. ExLM: Rethinking the Impact of [MASK] Tokens in Masked Language Models. ICML 2025.
    Kangjie Zheng, Junwei Yang, Siyue Liang, Bin Feng, Zequn Liu, Wei Ju, Zhiping Xiao, Ming Zhang.
  2. Towards A Unified Training for Levenshtein Transformer. ICASSP 2023.
    Kangjie Zheng, Longyue Wang, Zhihao Wang, Binqi Chen, Ming Zhang, Zhaopeng Tu.
  3. A Decoding Algorithm Based on Directed Acyclic Transformers for Length-Control Summarization. EMNLP Findings 2024.
    Chenyang Huang, Hao Zhou, Cameron Jen, Kangjie Zheng, Osmar Zaiane, Lili Mou.
  4. Gloss Matters: Unlocking the Potential of Non-Autoregressive Sign Language Translation. ACM MM 2025.
    Zhihao Wang, Shiyu Liu, Zhiwei He, Kangjie Zheng, Liangying Shao, Junfeng Yao, Jinsong Su.

GNN & Data Mining

  1. Learning Generalizable Contrastive Representations for Graph Zero-shot Learning. IEEE Transactions on Multimedia (2025).
    Siyu Yi, Zhengyang Mao, Kangjie Zheng, Zhiping Xiao, Ziyue Qiao, Chong Chen, Xian-Sheng Hua, Yongdao Zhou, Ming Zhang, Wei Ju.
  2. Zero-shot Node Classification with Graph Contrastive Embedding Network. Transactions on Machine Learning Research (2023).
    Wei Ju, Yifang Qin, Siyu Yi, Zhengyang Mao, Kangjie Zheng, Luchen Liu, Xiao Luo, Ming Zhang.
  3. Constrained Truth Discovery. IEEE Transactions on Knowledge and Data Engineering (2020).
    Chen Ye, Hongzhi Wang, Kangjie Zheng, Youkang Kong, Rong Zhu, Jing Gao, Jianzhong Li.
  4. Multi-Source Data Repairing Powered by Integrity Constraints and Source Reliability. Information Sciences (2020).
    Chen Ye, Hongzhi Wang, Kangjie Zheng, Jing Gao, Jianzhong Li.

Community Contributions

Academic Service

Program Committee / Reviewer

  • Conference on Neural Information Processing Systems (NeurIPS’25) — Reviewer
  • International Conference on Learning Representations (ICLR’24, ’25) — Reviewer
  • International Conference on Machine Learning (ICML’24, ’25) — Reviewer
  • AAAI Conference on Artificial Intelligence (AAAI’25) — Reviewer
  • ACL Rolling Review (ACL ARR) — Reviewer

Talks & Presentations

Contact Me

Feel free to reach out to collaborate, discuss ideas, or learn more about my work.