The ShARC Leaderboard

The ShARC competition ran from 2018 to 2024 on the end-to-end task of conversational question answering over the ShARC dataset. The competition is now closed to new submissions. To support future work with this resource, the test set was publicly released on 9 February 2024 and is available to download through the data page or Hugging Face Datasets.
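
As a minimal sketch, the released splits should be loadable with the Hugging Face `datasets` library; the dataset identifier `sharc` used below is an assumption and should be checked against the data page and the Hub listing.

```python
# Minimal sketch of loading the released ShARC data with Hugging Face Datasets.
# The dataset identifier "sharc" is an assumption; confirm the exact name and
# the available splits (including the newly released test set) on the data page.
from datasets import load_dataset

sharc = load_dataset("sharc")   # assumed Hub identifier
print(sharc)                    # lists the splits and their sizes
print(sharc["train"][0])        # inspect one rule-text / question / answer example
```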

The competition details remain available, for reference only, on CodaLab.

ShARC: End-to-end Task

| # | Model / Reference | Affiliation | Date | Micro Accuracy [%] | Macro Accuracy [%] | BLEU-1 | BLEU-4 |
|---|---|---|---|---|---|---|---|
| 1 | BiAE | Li Auto Inc. & Beijing University of Posts and Telecommunications | May 2023 | 77.9 | 81.1 | 64.7 | 51.6 |
| 2 | DGM | Shanghai Jiao Tong University | Jan 2021 | 77.4 | 81.2 | 63.3 | 48.4 |
| 3 | ET5 | Beijing Institute of Technology | Jan 2022 | 76.3 | 80.5 | 69.6 | 55.2 |
| 4 | Discern (single model) | The Chinese University of Hong Kong | May 2020 | 73.2 | 78.3 | 64.0 | 49.1 |
| 5 | EMT | Salesforce Research & CUHK | Nov 2019 | 69.4 | 74.8 | 60.9 | 46.0 |
| 6 | EMT + entailment | Salesforce Research & CUHK | Mar 2020 | 69.1 | 74.6 | 63.9 | 49.5 |
| 7 | UrcaNet (ensemble) | IBM Research AI | Dec 2019 | 69.0 | 74.6 | 56.7 | 42.0 |
| 8 | E3 | University of Washington | Feb 2019 | 67.6 | 73.3 | 54.1 | 38.7 |
| 9 | BiSon (single model) | NEC Laboratories Europe | Aug 2019 | 66.9 | 71.6 | 58.8 | 44.3 |
| 10 | UrcaNet (single model) | IBM Research AI | Aug 2019 | 65.1 | 71.2 | 60.5 | 46.1 |
| 11 | [Anonymous] | [Anonymous] | Jul 2019 | 64.9 | 71.4 | 55.3 | 38.9 |
| 12 | BERT-QA | University of Washington | Feb 2019 | 63.6 | 70.8 | 46.2 | 36.3 |
| 13 | Baseline-CM | Bloomsbury AI | May 2018 | 61.9 | 68.9 | 54.4 | 34.4 |
| 14 | Baseline-NMT | Bloomsbury AI | May 2018 | 44.8 | 42.8 | 34.0 | 7.8 |
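
For reference, micro accuracy is the fraction of test examples whose decision (Yes, No, Irrelevant, or More) is predicted correctly, macro accuracy averages the per-class accuracies, and BLEU-1/BLEU-4 score the generated follow-up questions. The sketch below is not the official scoring script; it only illustrates the micro/macro distinction on the decision classes, assuming scikit-learn is installed.

```python
# Hedged illustration (not the official ShARC evaluation script) of micro vs.
# macro accuracy over the decision classes Yes / No / Irrelevant / More.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

LABELS = ["Yes", "No", "Irrelevant", "More"]

def micro_macro_accuracy(gold, pred):
    # Micro accuracy: fraction of all examples classified correctly.
    micro = accuracy_score(gold, pred)
    # Macro accuracy: mean of per-class accuracies, so rare classes count equally.
    cm = confusion_matrix(gold, pred, labels=LABELS)
    per_class = cm.diagonal() / cm.sum(axis=1)
    macro = np.nanmean(per_class)
    return micro, macro

micro, macro = micro_macro_accuracy(
    gold=["Yes", "No", "More", "Irrelevant"],
    pred=["Yes", "No", "Yes", "Irrelevant"],
)
print(f"micro={micro:.3f}, macro={macro:.3f}")
```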