About Me
I am a Research Scientist at OpenAI, where I work on large-scale language model (LLM) training. My research focuses on building self-improving LLMs that can continuously learn from interaction, feedback, and experience. To support this goal, I have extensively studied and applied reinforcement learning (RL) for post-training, reasoning, and agentic behaviors in large-scale models.
I earned my Ph.D. in Computer Science and Engineering from the University of Notre Dame in 2023, advised by Prof. Meng Jiang. My Ph.D. research was generously supported by the Bloomberg Ph.D. Fellowship. I also enjoyed wonderful internships at Microsoft Research, AI2, and Bloomberg.
What's New
- Check my latest posts on X (Twitter)
- Jan 2026 Four papers have been accepted at ICLR 2026, covering self-improving LLMs and parallel reasoning.
- Aug 2025 One paper has been accepted at EMNLP 2025 on self-evolving agents.
Selected Publications
For a full list of publications, please refer to my Google Scholar page.
Industry Experience
Mentoring
I’ve been fortunate to mentor and work alongside many talented students:
- Chengsong Huang (2025), WUSTL, advised by Prof. Jiaxin Huang. Topic: Self-improving LLM [R-Zero]
- Shangbin Feng (2025), UW at Seattle, advised by Prof. Yulia Tsvetkov. Topic: Multi-Agent [SwitcherLM]
- Zongxia Li (2025), UMD, advised by Prof. Jordan Boyd-Graber. Topic: Self-improving LLM [Vision-SR1]
- Siru Ouyang (2024), UIUC, advised by Prof. Jiawei Han. Topic: LLM agent [RepoGraph]
- Mengzhao Jia (2024), UND, advised by Prof. Meng Jiang. Topic: Multi-modal [Leopard]
- Tong Chen (2023), UW at Seattle, advised by Prof. Luke Zettlemoyer. Topic: RAG [Dense X Retrieval]