Bowen Jin

I am a 4th-year Ph.D. student in Computer Science at the University of Illinois Urbana-Champaign, fortunately advised by Prof. Jiawei Han. Before that, I was an undergraduate student in Electrical Engineering at Tsinghua University, fortunately advised by Prof. Yong Li. I have previously spent time at Apple AIML, Google Research, Amazon Search, and Microsoft Research (both Redmond and Beijing).

My research is supported by the Apple PhD Fellowship and the Yunni and Maxine Pao Memorial Fellowship. For further information, please see my CV (last update: 2025.06.19).

Research Interests: My main research lies at the intersection of large generative models (e.g., large language models and diffusion models), multimodal data, and information networks. In particular, I focus on how large models can integrate text, network, and multimodal data to solve real-world problems, including information retrieval and knowledge discovery. My current research interests are LLM agents, reasoning, and RL.

Large Generative Models + Graphs (Survey): representation learning (Heterformer, Edgeformers), pretraining (Patton), graph plug-in (Graph-CoT) and multimodal synthesis (InstructG2I, GraphGPT-o).
Large Generative Models + IR: reasoning-retrieval interleaved LLM with RL (Search-R1), LLM alignment (LarPO), RAG and long-context LLM (LongRAG), semantic indexer (LMIndexer), generative retrieval (RIPOR) and generative recommendation (UniMP).
Large Generative Models + Science (Survey): general science (MAPLE, ToTER), geospatial science (GeoWise), climate science (CoDiCast), chemistry (ChemRAG) and healthcare (RAM-EHR).

I am actively working on Search-R1, an efficient RL framework for training LLMs that combine DeepSeek-R1-style reasoning with search engine calling (as in OpenAI DeepResearch).

I am also maintaining awesome GitHub repos on Large Language Models on Graphs and Multimodal Learning on Graphs, along with a survey paper. Feel free to have a look!

I will be on the job market starting in Fall 2025 and am open to both academic faculty positions and industrial research roles. If you believe I might be a good fit for your institution or organization, I’d love to connect! Please feel free to reach out at bowenj4[AT]illinois.edu.

News

[2025.5] One paper on weather forecasting with diffusion models has been accepted by IJCAI 2025!
[2025.5] One paper on LLM alignment has been accepted by ICML 2025!

Selected Publications [Full List]

(* denotes equal contribution)

Preprints

An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents
Bowen Jin, Jinsung Yoon, Priyanka Kargupta, Sercan O. Arik, Jiawei Han.
preprint 2025.
[PDF] [Code] [Resource]

Tutorials

Integrating Textual and Graph Data: Advancing Knowledge Discovery with Semantic and Structural Insights
Bowen Jin, Yu Zhang, Yunyi Zhang, Jiawei Han.
SDM 2025 (Tutorial).
[PDF] [Tutorial Page]
Long Context vs. RAG: Strategies for Processing Long Documents in LLMs
Xinze Li, Yushi Bai, Bowen Jin, Fengbin Zhu, Liangming Pan and Yixin Cao.
SIGIR 2025 (Tutorial).
[PDF] [Tutorial Page]
Bridging Text Data and Graph Data: Towards Semantics and Structure-aware Knowledge Discovery
Bowen Jin, Yu Zhang, Sha Li, Jiawei Han.
WSDM 2024 (Tutorial).
[PDF] [Tutorial Page]

Surveys

Large Language Models on Graphs: A Comprehensive Survey
Bowen Jin*, Gang Liu*, Chi Han*, Meng Jiang, Heng Ji, Jiawei Han.
Transactions on Knowledge and Data Engineering (TKDE) 2024.
[PDF] [Repo] 900+ stars
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery
Yu Zhang*, Xiusi Chen*, Bowen Jin*, Sheng Wang, Shuiwang Ji, Wei Wang, Jiawei Han.
EMNLP 2024.
[PDF] [Repo] 500+ stars

2025

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Bowen Jin, Hansi Zeng, Zhenrui Yue, Jinsung Yoon, Sercan O. Arik, Dong Wang, Hamed Zamani, Jiawei Han.
COLM 2025.
[PDF] [Code] [Resource] [Media] [English Record] [Chinese Record] 1000+ stars in two weeks
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective
Bowen Jin, Jinsung Yoon, Zhen Qin, Ziqi Wang, Wei Xiong, Yu Meng, Jiawei Han, Sercan O. Arik.
ICML 2025.
[PDF] [Code]
Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
Bowen Jin, Jinsung Yoon, Jiawei Han, Sercan O. Arik.
ICLR 2025.
[PDF] [Resource]
GRAPHGPT-O: Synergistic Multimodal Comprehension and Generation on Graphs
Yi Fang*, Bowen Jin*, Jiacheng Shen*, Sirui Ding, Qiaoyu Tan, Jiawei Han.
CVPR 2025.
[PDF] [Code]

2024

InstructG2I: Synthesizing Images from Multimodal Attributed Graphs
Bowen Jin, Ziqi Pang, Bingjun Guo, Yu-Xiong Wang, Jiaxuan You, Jiawei Han.
NeurIPS 2024.
[PDF] [Code] [Model] [Project Page] [Media Coverage]
Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs
Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Suhang Wang, Yu Meng, Jiawei Han.
ACL 2024 (Findings).
[PDF] [Code] [Data]
Language Models as Semantic Indexers
Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, et al.
ICML 2024.
[PDF] [Code]
Investigating Instruction Tuning Large Language Models on Graphs
Kerui Zhu*, Bo-Wei Huang*, Bowen Jin*, Yizhu Jiao, Ming Zhong, Kevin Chang, Shou-De Lin, Jiawei Han.
COLM 2024.
[PDF] [Code]

2023

Learning Multiplex Representations on Text-Attributed Graphs with One Language Model Encoder
Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Han Zhao, Jiawei Han.
NeurIPS 2023 (GLFrontiers).
[PDF] [Code]
Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks
Bowen Jin, Yu Zhang, Qi Zhu, Jiawei Han.
KDD 2023.
[PDF] [Code]
Patton: Language Model Pretraining on Text-rich Networks
Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Xinyang Zhang, Qi Zhu, Jiawei Han.
ACL 2023 (Oral).
[PDF] [Code]
Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks
Bowen Jin, Yu Zhang, Yu Meng, Jiawei Han.
ICLR 2023 (Poster).
[PDF] [Code]

Before Ph.D.

Multi-behavior Recommendation with Graph Convolutional Networks
Bowen Jin, Chen Gao, Xiangnan He, Depeng Jin, Yong Li.
SIGIR 2020.
[PDF] [Code]

Education

University of Illinois Urbana-Champaign, Ph.D. in Computer Science (2021 - Present).
Tsinghua University, B.S. in Electrical Engineering and Statistics (2017 - 2021).

Professional Experience

Apple AIML - Research Intern
2025.05-now
Google Cloud Research - Student Researcher
2024.05-2025.05
Amazon Search - Applied Scientist Intern
2023.05-2023.12
Microsoft Research - Research Intern
2022.05-2022.09
Microsoft Research (Asia) - Research Intern
2020.09-2021.03

Professional Service

Reviewer:
- WSDM 2023, KDD 2023, NeurIPS 2023
- ICLR 2024, WWW 2024, SDM 2024, ACL 2024, ICML 2024, COLM 2024, NeurIPS 2024
- ICLR 2025, WWW 2025, ACL 2025, ICML 2025, NeurIPS 2025
Journal Reviewer:
- IEEE Transactions on Knowledge and Data Engineering (TKDE)
- ACM Transactions on Information Systems (TOIS)
- IEEE Transactions on Big Data (TBD)
Guest Instructor:
- TAMU, Spring 2025 - CSCE 689 - Special Topics in NLP for Science
- Northwestern University, Spring 2025 - CS 396: Reasoning and Planning in the Foundation Model Era
- UIUC, Fall 2023 - CS 512: Data Mining Principles
Lead TA:
- UIUC, Spring 2024 - CS 412: Introduction to Data Mining

Talks

[2025.5] Search-R1 at BAAI.
[2025.3] Search-R1 at Jina.AI.
[2025.3] Search-R1 at UIUC-NLP Seminar.
[2024.11] Long-context LLMs and RAG at BuzzRobot.
[2024.10] Multimodal learning on graphs at Emory.

Miscellaneous

I started playing a traditional Chinese instrument, the Sheng, at the age of eight. Here is a concert recording of mine. Hope you enjoy it! :)

I’m a “universal” ball fan and enjoy working out. If you cannot find me in the office, catch me at the gym. 😃

Contact

Email: bowenj4[AT]illinois.edu