Xueqiang (Patrick) Xu
I am currently in the MSCS program at UIUC, where I am advised by
Prof. Jiawei Han and work closely with
Prof. Jiaxuan You.
I completed my undergraduate studies at UIUC (2020–2024), graduating with
a Highest Honors B.S. in Computer Science.
徐学强 / Email / Google Scholar / GitHub / LinkedIn
Research
My research focuses on advancing knowledge-grounded scientific reasoning in large language models.
I study how to extract, structure, and leverage scientific knowledge so that LLMs can reason more reliably,
more transparently, and in ways that meaningfully support scientific discovery.
I approach this goal through three interconnected directions:
- Structured Knowledge Extraction — developing methods that enable LLMs to extract entities, attributes, relations, and hierarchical schemas from scientific literature under weak or zero supervision. My work aims to transform unstructured papers into machine-interpretable scientific knowledge bases.
- Knowledge-Augmented LLM Reasoning — integrating structured knowledge into LLMs' reasoning processes through retrieval, schema guidance, control vectors, and multi-hop reasoning. I investigate how explicit knowledge structures can improve LLM factuality, faithfulness, and problem-solving ability in scientific domains.
- Scientific Agents and Reliability — building LLM-based scientific agents that can reason step-by-step and self-correct. I explore mechanisms for trustworthy, explainable, and robust reasoning pipelines that support real scientific workflows.
These directions reflect a unified goal: combining the strengths of
structured knowledge and large language models to build
AI systems capable of reliable, interpretable, and scientifically meaningful reasoning.
If our interests align, feel free to reach out — I'm always excited to connect and collaborate!
Selected Publications
Zero-Shot Open-Schema Entity Structure Discovery
Xueqiang Xu, Jinfeng Xiao,
James Barry,
Mohab Elkaref,
Jiaru Zou,
Pengcheng Jiang,
Yunyi Zhang,
Max Giammona,
Geeth de Mel,
Jiawei Han
We introduce ZOES, a novel approach to entity structure extraction that requires neither a predefined schema nor annotated samples. ZOES operates via a principled mechanism of enrichment, refinement, and unification, based on the insight that an entity and its associated structure are mutually reinforcing.
s3: You Don't Need That Much Data to Train a Search Agent via RL
Pengcheng Jiang,
Xueqiang Xu,
Jiacheng Lin,
Zifeng Wang,
Jimeng Sun,
and Jiawei Han
EMNLP Main Conference, 2025
Code
In this work, we propose s3, a lightweight, model-agnostic framework that decouples the searcher from the generator, using only 2.4k training examples in the RL training process.
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
Yunyi Zhang,
Ruozhen Yang*,
Xueqiang Xu*,
Rui Li*,
Jinfeng Xiao,
Jiaming Shen,
and Jiawei Han (* Equal Contribution)
The Web Conference (WWW), 2025
Code
We propose TELEClass, which combines the general knowledge of LLMs and task-specific features mined from an unlabeled corpus. TELEClass automatically enriches the label taxonomy with class-indicative features and utilizes novel LLM-based data annotation and generation methods specifically tailored for hierarchical text classification.
LogiCoL: Logically-Informed Contrastive Learning for Set-based Dense Retrieval
Yanzhen Shen,
Sihao Chen,
Xueqiang Xu,
Yunyi Zhang,
Chaitanya Malaviya,
and Dan Roth
EMNLP Main Conference, 2025
We introduce LogiCoL, a logically-informed contrastive learning objective for dense retrievers that handles queries containing logical connectives. LogiCoL learns to respect subset and mutually-exclusive set relations between query results via soft constraints expressed through t-norms, achieving improvements in both retrieval performance and logical consistency.
Awards
- City Scholar at UIUC
- Illinois Scholars Undergraduate Research
- IIDAI Scholar
Academic Services
- Conference reviewer: EMNLP 2025.