  • Evaluating AMR-to-English NLG Evaluation
    Emma Manning, Shira Wein and Nathan Schneider

  • Informative Manual Evaluation of Machine Translation Output
    Maja Popovic

  • Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing
    Brian Thompson and Matt Post

  • Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference
    Ondrej Dusek and Zdenek Kasner

  • Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents
    Sashank Santhanam and Samira Shaikh

  • A proof of concept on triangular test evaluation for Natural Language Generation
    Javier González Corbelle, José María Alonso Moral and Alberto Bugarín Diz

  • This is a Problem, Don’t You Agree? Framing and Bias in Human Evaluation for Natural Language Generation
    Stephanie Schoch, Diyi Yang and Yangfeng Ji

  • NUBIA: NeUral Based Interchangeability Assessor for Text Generation
    Hassan Kane, Muhammed Yusuf Kocyigit, Ali Abdalla, Pelkins Ajanoh and Mohamed Coulibali

  • On the interaction of automatic evaluation and task framing in headline style transfer
    Lorenzo De Mattei, Michele Cafagna, Huiyuan Lai, Felice Dell’Orletta, Malvina Nissim and Albert Gatt

  • Evaluation rules! On the use of grammars and rule-based systems for NLG evaluation
    Emiel van Miltenburg, Chris van der Lee, Thiago Castro-Ferreira and Emiel Krahmer