Yue Yu

Research Scientist at FAIR, Meta

Taken in Anchorage, Alaska

1 Meta Way, Menlo Park, CA 94025

Hello! I am a research scientist on the FAIR CodeGen Team at Meta Superintelligence Labs. Currently, I work closely with the TBD Lab on improving the agentic coding capabilities of Meta’s next-generation LLM. In past years, I have also worked on a range of topics in LLM post-training, including code reasoning, RL, and instruction following.

Prior to joining Meta, I completed my Ph.D. at the Georgia Institute of Technology in December 2024 and earned my bachelor’s degree (with honors) from the Department of Electronic Engineering at Tsinghua University in July 2019.


Education

Georgia Institute of Technology (2019 - Present)
Ph.D. in Computational Science and Engineering
GPA: 4.00/4.00
Thesis Topic: Towards Efficiently and Effectively Harnessing Large Pre-trained Models via a Data-centric Lens.
Advisor: Prof. Chao Zhang

Tsinghua University (2015 - 2019)
B.Eng. in Electronic Engineering
GPA: 3.87/4.00 (Outstanding Graduate)
Research Focus: Spatio-temporal Data Mining [WWW 2019, UbiComp 2020], Recommender Systems [UbiComp 2019].
Advisor: Prof. Yong Li


Industrial Experience

Meta (Jan 2025 - Present)
Research Scientist, FAIR
Manager: Gabriel Synnaeve
Topic: LLM Post-training/RL for Coding Agents.

Meta (May 2024 - Aug 2024)
Research Intern, GenAI (Llama Post-training Team)
Host: Rui Hou, Manager: Melanie Kambadur
Topic: Self-Critiquing Reward Models [NAACL 2025].

NVIDIA (Jan 2024 - May 2024)
Research Intern, Applied Deep Learning Research Group
Host: Wei Ping, Manager: Mohammad Shoeybi
Topic: LLM Instruction Fine-tuning for Zero-shot Retrieval-Augmented Generation [NeurIPS 2024].

Google Research (May 2023 - Aug 2023)
Research Intern, News Understanding Group
Host: Jiaming Shen, Manager: Jialu Liu
Topic: LLM In-context Learning with Rationales [ACL 2024].

Microsoft Research (May 2021 - Aug 2021)
Research Intern, Productivity and Intelligence Group
Mentor: Chenyan Xiong, Manager: Arnold Overwijk
Topic: Zero-shot Dense Text Retrieval [EMNLP 2022].

News

Sep 25, 2024 Two papers are accepted to NeurIPS 2024 and three papers are accepted to EMNLP 2024. Congratulations to my collaborators!
May 16, 2024 Six papers are accepted to ACL 2024 (4 main conference, 2 Findings).
Oct 25, 2023 Honored to receive the NeurIPS 2023 Scholar Award!
Sep 22, 2023 Three papers are accepted to NeurIPS 2023. Thanks to my collaborators!
May 16, 2023 Check out the recent publications: two first-author papers are accepted to ACL 2023 (1 main conference, 1 Findings), and three co-authored papers are accepted to KDD 2023. Thanks and congratulations to my collaborators!

Selected Publications

  1. Self-Generated Critiques Boost Reward Modeling for Language Models
    Yue Yu, Zhengxing Chen, Aston Zhang, Liang Tan, Chenguang Zhu, Richard Yuanzhe Pang, Yundi Qian, Xuewei Wang, Suchin Gururangan, Chao Zhang, Melanie Kambadur, Dhruv Mahajan, and Rui Hou
    Proceedings of NAACL, 2025.
  2. RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
    Yue Yu, Wei Ping, Zihan Liu, Boxin Wang, Jiaxuan You, Chao Zhang, Mohammad Shoeybi, and Bryan Catanzaro
    Proceedings of NeurIPS, 2024.
  3. Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
    Yue Yu*, Yuchen Zhuang*, Jieyu Zhang*, Yu Meng, Alexander Ratner, Ranjay Krishna, Jiaming Shen, and Chao Zhang
    Proceedings of NeurIPS (D&B Track), 2023.
  4. COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
    Yue Yu, Chenyan Xiong, Si Sun, Chao Zhang, and Arnold Overwijk
    Proceedings of EMNLP, 2022. (Oral)