Xueqiang (Patrick) Xu
I am currently in the MSCS program at UIUC, where I am advised by Prof. Jiawei Han and work closely with Prof. Jiaxuan You. I completed my undergraduate studies at UIUC (2020–2024), graduating with a B.S. in Computer Science with Highest Honors.
徐学强 / Email / Google Scholar / GitHub / LinkedIn
Research
My research focuses on advancing knowledge-grounded scientific reasoning in large language models.
I study how to extract, structure, and leverage scientific knowledge so that LLMs can reason more reliably,
more transparently, and in ways that meaningfully support scientific discovery.
I approach this goal through three interconnected directions:
- Structured Knowledge Extraction — developing methods that enable LLMs to extract entities, attributes, relations, and hierarchical schemas from scientific literature under weak or zero supervision. My work aims to transform unstructured papers into machine-interpretable scientific knowledge bases.
- Knowledge-Augmented LLM Reasoning — integrating structured knowledge into LLMs' reasoning processes through retrieval, schema guidance, control vectors, and multi-hop reasoning. I investigate how explicit knowledge structures can improve LLM factuality, faithfulness, and problem-solving ability in scientific domains.
- Scientific Agents and Reliability — building LLM-based scientific agents that can reason step-by-step and self-correct. I explore mechanisms for trustworthy, explainable, and robust reasoning pipelines to support real scientific workflows.
These directions reflect a unified goal: combining the strengths of
structured knowledge and large language models to build
AI systems capable of reliable, interpretable, and scientifically meaningful reasoning.
If our interests align, feel free to reach out — I'm always excited to connect and collaborate!
Selected Publications
Zero-Shot Open-Schema Entity Structure Discovery
Xueqiang Xu, Jinfeng Xiao, James Barry, Mohab Elkaref, Jiaru Zou, Pengcheng Jiang, Yunyi Zhang, Max Giammona, Geeth de Mel, and Jiawei Han
EACL Main Conference, 2026
Preprint
We introduce ZOES, a novel approach to entity structure extraction that does not require any schema or annotated samples. ZOES operates via a principled mechanism of enrichment, refinement, and unification, based on the insight that an entity and its associated structure are mutually reinforcing.
s3: You Don't Need That Much Data to Train a Search Agent via RL
Pengcheng Jiang, Xueqiang Xu, Jiacheng Lin, Zifeng Wang, Jimeng Sun, and Jiawei Han
EMNLP Main Conference, 2025
Preprint / Code
In this work, we propose s3, a lightweight, model-agnostic framework that decouples the searcher from the generator and trains the searcher with only 2.4k examples via RL.
Adaptation of Agentic AI
Pengcheng Jiang*, Jiacheng Lin*, Zhiyi Shi*, Zifeng Wang, Luxi He, Yichen Wu, Ming Zhong, Peiyang Song, Qizheng Zhang, Heng Wang, Xueqiang Xu, Hanwen Xu, Pengrui Han, Dylan Zhang, Jiashuo Sun, Chaoqi Yang, Kun Qian, Tian Wang, Changran Hu, Manling Li, Quanzheng Li, Hao Peng, Sheng Wang, Jingbo Shang, Chao Zhang, Jiaxuan You, Liyuan Liu, Pan Lu, Yu Zhang, Heng Ji, Yejin Choi, Dawn Song, Jimeng Sun, and Jiawei Han (* Equal Contribution)
Preprint, 2025
arXiv / Code
Cutting-edge agentic AI systems are built on foundation models that can be adapted to plan, reason, and interact with external tools to perform increasingly complex and specialized tasks. We unify this rapidly expanding research landscape into a systematic framework that spans both agent adaptations and tool adaptations.
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
Yunyi Zhang, Ruozhen Yang*, Xueqiang Xu*, Rui Li*, Jinfeng Xiao, Jiaming Shen, and Jiawei Han (* Equal Contribution)
The Web Conference (WWW), 2025
Preprint / Code
We propose TELEClass, which combines the general knowledge of LLMs and task-specific features mined from an unlabeled corpus. TELEClass automatically enriches the label taxonomy with class-indicative features and utilizes novel LLM-based data annotation and generation methods specifically tailored for hierarchical text classification.
LogiCoL: Logically-Informed Contrastive Learning for Set-based Dense Retrieval
Yanzhen Shen, Sihao Chen, Xueqiang Xu, Yunyi Zhang, Chaitanya Malaviya, and Dan Roth
EMNLP Main Conference, 2025
Preprint
We introduce LogiCoL, a logically-informed contrastive learning objective for dense retrievers that handles queries with logical connectives. LogiCoL learns to respect subset and mutually-exclusive set relations between query results via soft constraints expressed through t-norms, achieving improvements in both retrieval performance and logical consistency.
Awards
- City Scholar at UIUC
- Illinois Scholars Undergraduate Research
- IIDAI Scholar