DREAM

A Challenge Dataset and Models for Dialogue-Based Reading Comprehension


What is DREAM?

DREAM is a multiple-choice Dialogue-based REAding comprehension exaMination dataset. In contrast to existing reading comprehension datasets, DREAM is the first to focus on in-depth multi-turn multi-party dialogue understanding.


DREAM contains 10,197 multiple-choice questions for 6,444 dialogues, collected from English-as-a-foreign-language examinations designed by human experts. DREAM is likely to present significant challenges for existing reading comprehension systems: 84% of answers are non-extractive, 85% of questions require reasoning beyond a single sentence, and 34% of questions also involve commonsense knowledge.
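
To make the task format concrete, here is a minimal Python sketch that loads one split of the dataset and prints a dialogue with its questions. It assumes the JSON layout of the files released at https://github.com/nlpdata/dream (a list of [dialogue_turns, questions, id] triples, with "question", "choice", and "answer" fields per question); the local file path is illustrative.

    import json

    # Load one split of DREAM. The path is hypothetical; the official
    # train/dev/test JSON files are released at https://github.com/nlpdata/dream.
    with open("data/dev.json", encoding="utf-8") as f:
        examples = json.load(f)

    # Assumed layout (based on the released files): each example is a triple
    # [dialogue_turns, questions, example_id], and each question is a dict
    # with "question", "choice" (three options), and "answer" fields.
    dialogue, questions, example_id = examples[0]

    print(f"Dialogue {example_id}:")
    for turn in dialogue:
        print(" ", turn)  # turns look like "M: ..." / "W: ..."

    for q in questions:
        print("Q:", q["question"])
        for i, choice in enumerate(q["choice"]):
            mark = "*" if choice == q["answer"] else " "  # "*" flags the gold answer
            print(f" ({chr(ord('A') + i)}){mark} {choice}")

Note that the answer is given as the full text of the correct choice rather than an option index, reflecting the non-extractive nature of most answers.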

Report Your Results

If you have new results, please send an email to dream@dataset.org with a link to your paper!

Leaderboard

Report Time   Model                              Accuracy (%)  Team / Reference
------------------------------------------------------------------------------
-             Human Ceiling Performance          98.6          Tencent & Cornell & UW & AI2 (Sun et al., 2019)
-             Human Performance                  95.5          Tencent & Cornell & UW & AI2 (Sun et al., 2019)
Oct 01, 2019  RoBERTa-Large + MMM                88.9          MIT & Amazon Alexa AI (Jin et al., 2019)
Jul 21, 2019  XLNet-Large                        72.0          River Valley High School, Singapore (https://github.com/NoviScl/XLNet_DREAM)
Apr 25, 2019  BERT-Large                         66.8          https://github.com/nlpdata/mrc_bert_baseline
Apr 23, 2019  BERT-Base                          63.2          https://github.com/nlpdata/mrc_bert_baseline
Feb 01, 2019  GBDT++ and FTLM++ (ensemble)       59.5          Tencent & Cornell & UW & AI2 (Sun et al., 2019)
Feb 23, 2019  EER + FT                           57.7          Tencent & TTIC & Cornell & UPenn (Wang et al., 2019)
Feb 01, 2019  FTLM++                             57.4          Tencent & Cornell & UW & AI2 (Sun et al., 2019)
Feb 01, 2019  Finetuned Transformer LM (*)       55.5          OpenAI (Radford et al., 2018)
Feb 01, 2019  GBDT++                             52.8          Tencent & Cornell & UW & AI2 (Sun et al., 2019)
Feb 01, 2019  DSW++                              50.1          Tencent & Cornell & UW & AI2 (Sun et al., 2019)
Feb 01, 2019  Co-Matching (*)                    45.5          Singapore Management University & IBM Research (Wang et al., 2018)
Feb 01, 2019  Distance-Based Sliding Window (*)  44.6          Microsoft Research (Richardson et al., 2013)
Feb 01, 2019  Sliding Window (*)                 42.5          Microsoft Research (Richardson et al., 2013)
Feb 01, 2019  Word Matching (*)                  42.0          Microsoft Research (Yih et al., 2013)
Feb 01, 2019  Gated-Attention Reader (*)         41.3          Carnegie Mellon University (Dhingra et al., 2017)
Feb 01, 2019  Stanford Attentive Reader (*)      39.8          Stanford University (Chen et al., 2016)

*: Run and reported by Sun et al., 2019.