publications

2024

  1. AIES
    Operationalizing content moderation "accuracy" in the Digital Services Act
    Johnny Wei, Frederike Zufall, and Robin Jia
    2024
  2. COLM
    IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations
    Deqing Fu*, Ghazal Khalighinejad*, Ollie Liu*, Bhuwan Dhingra, Dani Yogatama, and 2 more authors
    Dataset, 2024
  3. ACL Findings
    Proving membership in LLM pretraining data via data watermarks
    Johnny Tian-Zheng Wei*, Ryan Yixiang Wang*, and Robin Jia
    2024
  4. NAACL
    Do Localization Methods Actually Localize Memorized Data in LLMs?
    Ting-Yun Chang, Jesse Thomason, and Robin Jia
    Github, 2024
  5. NAACL
    Efficient End-to-End Visual Document Understanding with Rationale Distillation
    Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, and 2 more authors
    2024
  6. EMNLP
    When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models
    Ting-Yun Chang, Jesse Thomason, and Robin Jia
    2024
  7. NeurIPS
    Pre-trained Large Language Models Use Fourier Features to Compute Addition
    Tianyi Zhou, Deqing Fu, Vatsal Sharan, and Robin Jia
    2024
  8. arXiv
    Language Models can Infer Action Semantics for Classical Planners from Environment Feedback
    Wang Zhu, Ishika Singh, Robin Jia, and Jesse Thomason
    2024
  9. NeurIPS
    Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models
    Deqing Fu, Tian-Qi Chen, Robin Jia, and Vatsal Sharan
    2024

2023

  1. EMNLP
    Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering
    Wang Zhu, Jesse Thomason, and Robin Jia
    Github, 2023
  2. EMNLP
    SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples
    Deqing Fu, Ameya Godbole, and Robin Jia
    Github, 2023
  3. EMNLP Findings
    Estimating Large Language Model Capabilities without Labeled Test Data
    Harvey Yiyun Fu, Qinyuan Ye, Albert Xu, Xiang Ren, and Robin Jia
    Github, 2023
  4. EMNLP Findings
    How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench
    Qinyuan Ye, Harvey Yiyun Fu, Xiang Ren, and Robin Jia
    Github, 2023
  5. ACL
    Data Curation Alone Can Stabilize In-context Learning
    Ting-Yun Chang and Robin Jia
    Github, 2023
  6. ACL
    Contrastive Novelty-Augmented Learning: Anticipating Outliers with Large Language Models
    Albert Xu, Xiang Ren, and Robin Jia
    Github, 2023
  7. ACL
    Are Sample-Efficient NLP Models More Robust?
    Nelson F. Liu, Ananya Kumar, Percy Liang, and Robin Jia
    2023
  8. ACL
    Do Question Answering Modeling Improvements Hold Across Benchmarks?
    Nelson F. Liu, Tony Lee, Robin Jia, and Percy Liang
    2023
  9. ODRUM
    Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions?
    Wang Zhu, Ishika Singh, Yuan Huang, Robin Jia, and Jesse Thomason
    2023
  10. EACL Findings
    Benchmarking Long-tail Generalization with Likelihood Splits
    Ameya Godbole and Robin Jia
    Github, 2023

2022

  1. EMNLP Findings
    Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems
    Wang Zhu, Jesse Thomason, and Robin Jia
    Github, 2022
  2. ICML
    Knowledge base question answering by case-based reasoning over subgraphs
    Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Manzil Zaheer, and 3 more authors
    Github and More Info, 2022
  3. NAACL
    On the Robustness of Reading Comprehension Models to Entity Renaming
    Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, and 2 more authors
    Github, 2022
  4. NAACL
    Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants
    Max Bartolo, Tristan Thrush, Sebastian Riedel, Pontus Stenetorp, Robin Jia, and 2 more authors
    Github, 2022
  5. ACL Findings
    Question Answering Infused Pre-training of General-Purpose Contextualized Representations
    Robin Jia, Mike Lewis, and Luke Zettlemoyer
    Github, 2022
  6. ACL Findings
    Analyzing Dynamic Adversarial Training Data in the Limit
    Eric Wallace, Adina Williams, Robin Jia, and Douwe Kiela
    Github, 2022
  7. ACL
    On Continual Model Refinement in Out-of-Distribution Data Streams
    Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, and 3 more authors
    More Info and Github, 2022

2021

  1. NeurIPS
    Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
    Zhiyi Ma*, Kawin Ethayarajh*, Tristan Thrush*, Somya Jain, Ledell Wu, and 4 more authors
    More Info and Github, 2021
  2. EMNLP
    Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
    Koustuv Sinha, Robin Jia, Dieuwke Hupkes, Joelle Pineau, Adina Williams, and 1 more author
    Github, 2021
  3. EMNLP
    Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation
    Max Bartolo, Tristan Thrush, Robin Jia, Sebastian Riedel, Pontus Stenetorp, and 1 more author
    More Info and Github, 2021
  4. BLACKBOX
    To What Extent do Human Explanations of Model Behavior Align with Actual Model Behavior?
    Grusha Prasad, Yixin Nie, Mohit Bansal, Robin Jia, Douwe Kiela, and 2 more authors
    2021
  5. ACL
    The statistical advantage of automatic NLG metrics at the system level
    Johnny Tian-Zheng Wei and Robin Jia
    Github, 2021
  6. ACL
    Evaluation Examples Are Not Equally Informative: How Should That Change NLP Leaderboards?
    Pedro Rodriguez, Joe Barrow, Alexander Hoyle, John P. Lalor, Robin Jia, and 1 more author
    More Info and Github, 2021
  7. ACL Findings
    Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations
    Ana Valeria Gonzalez, Gagan Bansal, Angela Fan, Robin Jia, and Srinivasan Iyer
    2021
  8. NAACL
    Swords: A Benchmark for Lexical Substitution with Improved Data Coverage and Quality
    Mina Lee*, Chris Donahue*, Robin Jia, Alexander Iyabor, and Percy Liang
    Github and More Info, 2021
  9. NAACL
    Dynabench: Rethinking Benchmarking in NLP
    Douwe Kiela, Max Bartolo, Yixin Nie, Divyansh Kaushik, Atticus Geiger, and 14 more authors
    More Info and Github, 2021
  10. US_PATENT
    N-ary relation prediction over text spans
    Hoifung Poon, Cliff Wong, and Robin Jia
    2021

2020

  1. EMNLP Findings
    On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks
    Stephen Mussmann*, Robin Jia*, and Percy Liang
    More Info and Github, 2020
  2. EMNLP
    With Little Power Comes Great Responsibility
    Dallas Card, Peter Henderson, Urvashi Khandelwal, Robin Jia, Kyle Mahowald, and 1 more author
    Github, 2020
  3. DISSERTATION
    Building Robust Natural Language Processing Systems
    Robin Jia
    2020
  4. ACL
    Selective Question Answering under Domain Shift
    Amita Kamath, Robin Jia, and Percy Liang
    More Info, 2020
  5. ACL
    Robust Encodings: A Framework for Combating Adversarial Typos
    Erik Jones, Robin Jia*, Aditi Raghunathan*, and Percy Liang
    More Info and Github, 2020

2019

  1. EMNLP
    Certified Robustness to Adversarial Word Substitutions
    Robin Jia, Aditi Raghunathan, Kerem Göksel, and Percy Liang
    More Info and Github, 2019
  2. MRQA
    MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
    Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, and 1 more author
    Github, 2019
  3. NAACL
    Document-Level N-ary Relation Extraction with Multiscale Representation Learning
    Robin Jia, Cliff Wong, and Hoifung Poon
    More Info, 2019

2018

  1. ACL
    Know What You Don’t Know: Unanswerable Questions for SQuAD
    Pranav Rajpurkar*, Robin Jia*, and Percy Liang
    More Info, 2018
  2. NAACL
    Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer
    Juncen Li, Robin Jia, He He, and Percy Liang
    More Info, 2018

2017

  1. EMNLP
    Adversarial Examples for Evaluating Reading Comprehension Systems
    Robin Jia and Percy Liang
    More Info, 2017
  2. ICASSP
    Learning Concepts through Conversations in Spoken Dialogue Systems
    Robin Jia, Larry Heck, Dilek Hakkani-Tür, and Georgi Nikolov
    More Info, 2017

2016

  1. ACL
    Data Recombination for Neural Semantic Parsing
    Robin Jia and Percy Liang
    More Info, 2016
  2. MBE
    "Reverse Genomics" Predicts Function of Human Conserved Noncoding Elements
    Amir Marcovitz, Robin Jia, and Gill Bejerano
    2016

2015

  1. PNAS
    Mx1 and Mx2 Key Antiviral Proteins are Surprisingly Lost in Toothed Whales
    Benjamin A. Braun, Amir Marcovitz, J. Gray Camp, Robin Jia, and Gill Bejerano
    2015