Microsoft Download Center Archive

MSR Abstractive Text Compression Dataset

  • Published:
  • Version: 1.0
  • Category: Tool
  • Language: English

This dataset contains sentences and short paragraphs with corresponding shorter (compressed) versions. There are up to five compressions for each input text, together with quality judgements of their meaning preservation and grammaticality.

  • The dataset was built from source texts in the Open American National Corpus (www.anc.org) using crowd-sourcing. More details can be found in the included README and the paper: “A dataset and evaluation metrics for abstractive compression of sentences and short paragraphs” [Toutanova, Brockett, Tran, and Amershi, EMNLP 2016].

Files

Status: Live

This download is still available on microsoft.com. The downloads below will come directly from the Microsoft Download Center.

File: Release.zip
Size: 17.39 MB
SHA1: e70fced74a0a48f96675c071592ece9db9600a62

File sizes and hashes are retrieved from the Wayback Machine’s indexes. They may not match the latest versions of files hosted on Microsoft servers.
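
To check a downloaded copy against the SHA1 listed above, something like the following Python sketch could be used. The local path `Release.zip` and the helper names `sha1_of_file` and `matches_listed_hash` are assumptions made for illustration and are not part of the download. As noted above, a mismatch may only mean that the file hosted on Microsoft servers has changed since it was indexed.

```python
import hashlib
from pathlib import Path

# SHA1 as listed on this page (taken from the Wayback Machine's indexes).
EXPECTED_SHA1 = "e70fced74a0a48f96675c071592ece9db9600a62"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_hash(path, expected=EXPECTED_SHA1):
    """Compare a file's SHA1 with the expected value, ignoring letter case."""
    return sha1_of_file(path).lower() == expected.lower()


if __name__ == "__main__":
    archive = Path("Release.zip")  # assumed location of the downloaded file
    if archive.exists():
        print("match" if matches_listed_hash(archive) else "mismatch")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use small regardless of the archive size.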

System Requirements

Operating Systems: Android, Apple Mac OS X, Linux, Windows 10, Windows 8

Installation Instructions

    • Click Download and follow the instructions.