Publications

Selected publications, grouped by year in reverse chronological order. The full list is available on my Google Scholar profile.

2024

  1. RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
    Yue Yu, Wei Ping, Zihan Liu, Boxin Wang, Jiaxuan You, Chao Zhang, Mohammad Shoeybi, and Bryan Catanzaro
    Proceedings of NeurIPS, 2024.
  2. Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning
    Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, and Michael Bendersky
    Proceedings of ACL, 2024.
  3. ARL2: Aligning Retrievers with Black-box Large Language Models via Self-guided Adaptive Relevance Labeling
    Lingxi Zhang, Yue Yu, Kuan Wang, and Chao Zhang
    Proceedings of ACL, 2024.
  4. RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records
    Ran Xu*, Wenqi Shi*, Yue Yu, Yuchen Zhuang, Bowen Jin, May D. Wang, Joyce C. Ho, and Carl Yang
    Proceedings of ACL, 2024. (Oral)

2023

  1. Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
    Yue Yu*, Yuchen Zhuang*, Jieyu Zhang*, Yu Meng, Alexander Ratner, Ranjay Krishna, Jiaming Shen, and Chao Zhang
    Proceedings of NeurIPS (D&B Track), 2023.
  2. ToolQA: A Dataset for LLM Question Answering with External Tools
    Yuchen Zhuang*, Yue Yu*, Kuan Wang*, Haotian Sun, and Chao Zhang
    Proceedings of NeurIPS (D&B Track), 2023.
  3. Cold-Start Data Selection for Better Few-shot Language Model Fine-tuning: A Prompt-based Uncertainty Propagation Approach
    Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, and Chao Zhang
    Proceedings of ACL, 2023.
  4. ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
    Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, and Chao Zhang
    Proceedings of ACL Findings, 2023.

2022

  1. COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
    Yue Yu, Chenyan Xiong, Si Sun, Chao Zhang, and Arnold Overwijk
    Proceedings of EMNLP, 2022. (Oral)
  2. AcTune: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models
    Yue Yu, Lingkai Kong, Jieyu Zhang, Rongzhi Zhang, and Chao Zhang
    Proceedings of NAACL, 2022. (Oral)
  3. Counterfactual and Factual Reasoning over Hypergraphs for Interpretable Clinical Predictions on EHR
    Ran Xu, Yue Yu, Chao Zhang, Mohammed K. Ali, Joyce C. Ho, and Carl Yang
    Proceedings of ML4H, 2022. (Best Paper Award)

2021

  1. Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach
    Yue Yu*, Simiao Zuo*, Haoming Jiang, Wendi Ren, Tuo Zhao, and Chao Zhang
    Proceedings of NAACL, 2021. (Oral)
  2. SumGNN: multi-typed drug interaction prediction via efficient knowledge graph summarization
    Yue Yu*, Kexin Huang*, Chao Zhang, Lucas M. Glass, Jimeng Sun, and Cao Xiao
    Bioinformatics, 2021.
  3. WRENCH: A Comprehensive Benchmark for Weak Supervision
    Jieyu Zhang, Yue Yu, Yinghao Li, Yujing Wang, Yaming Yang, Mao Yang, and Alexander Ratner
    Proceedings of NeurIPS (D&B Track), 2021. (Oral)

2020

  1. STEAM: Self-supervised taxonomy expansion with mini-paths
    Yue Yu, Yinghao Li, Jiaming Shen, Hao Feng, Jimeng Sun, and Chao Zhang
    Proceedings of KDD, 2020. (Oral)
  2. BOND: BERT-assisted open-domain named entity recognition with distant supervision
    Chen Liang*, Yue Yu*, Haoming Jiang*, Siawpeng Er, Ruijia Wang, Tuo Zhao, and Chao Zhang
    Proceedings of KDD, 2020. (Oral)

2019

  1. Understanding Urban Dynamics via State-sharing Hidden Markov Model
    Tong Xia*, Yue Yu*, Fengli Xu, Funing Sun, Diansheng Guo, Depeng Jin, and Yong Li
    Proceedings of WWW, 2019.
  2. Privacy-preserving cross-domain location recommendation
    Chen Gao, Chao Huang, Yue Yu, Huandong Wang, Yong Li, and Depeng Jin
    Proceedings of IMWUT/UbiComp, 2019.