VLSP2022 - Vietnamese Abstractive Multi-document Summarization (AbMusu task)

Shared Task Registration Form

Important dates 

  • July 27, 2022: Registration opens.
  • October 1, 2022: Registration closes. Release of the training set.
  • October 15, 2022: Release of the validation set.
  • November 1, 2022: Release of the test set. Challenge starts.
  • November 7, 2022: Run submission deadline. Final results on the private test set. The top 5 teams are announced and invited to submit technical reports.
  • November 15, 2022: Deadline for the top 5 teams to submit technical reports. If any of the top teams do not submit their reports, runner-up teams may submit reports and take their places (runner-up teams are therefore encouraged to write their reports in advance and submit them by this deadline).
  • November 26, 2022: Final winner announcement, result presentation, and award ceremony (workshop day).

Introduction

In the era of information explosion, mining data effectively has huge potential but is a difficult problem that demands time, money, and labour. Multi-document summarization is a natural language processing task that helps address this problem. Given a set of documents as input, a summarization system aims to select or generate the important information to create a brief summary of those documents [1]. It is a complex problem that has gained attention from the research community. Several competitions have been launched in recent years to support research and development in this field for English, such as DocEng 2019 [2] and BioNLP-MEDIQA 2021 [3].

Based on output characteristics, there are two major approaches to automatic summarization: extractive and abstractive summarization. Extractive summarization selects the most crucial sentences (or sections) from the documents, while abstractive summarization rewrites a new summary based on the original important information [4]. Since the early 1950s, various methods have been proposed for extractive summarization, ranging from frequency-based methods [5] to machine learning-based methods [6]. Extractive methods are fast and simple, but their summaries are often far from manually created ones, a shortcoming the abstractive approach can remedy [7]. In the multi-document setting, extractive approaches show significant disadvantages in arranging and combining information from several documents. In recent years, sequence-to-sequence (seq2seq) learning has made abstractive summarization feasible [8]. Encoder-decoder models such as PEGASUS [9], BART [10], and T5 [11] achieve promising results for abstractive multi-document summarization. Research on this problem for Vietnamese text is still in its early stages. Therefore, this shared task is proposed to promote research on abstractive multi-document summarization for Vietnamese text.
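As an illustration of the encoder-decoder approach, the sketch below applies a pretrained seq2seq model to a document cluster by simply concatenating the documents. This is a minimal sketch, not an official baseline: it assumes the Hugging Face transformers library, and the English BART checkpoint shown would have to be replaced with a Vietnamese pretrained model in practice.

    # Minimal sketch of abstractive multi-document summarization with an
    # encoder-decoder model. The checkpoint is illustrative (English BART);
    # a Vietnamese seq2seq checkpoint would be substituted for this task.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    # A simple baseline for the multi-document setting: concatenate the
    # cluster's documents and generate a single summary.
    cluster_docs = [
        "First news article in the cluster ...",
        "Second news article on the same topic ...",
    ]
    joined = " ".join(cluster_docs)

    result = summarizer(joined, max_length=200, min_length=50, do_sample=False)
    print(result[0]["summary_text"])

Note that plain concatenation can exceed the encoder's input limit for long clusters, so real systems typically truncate, rank, or summarize documents hierarchically before generation.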

Task description

The goal of the Vietnamese abstractive multi-document summarization task (AbMusu task) is to develop summarization systems that can automatically create abstractive summaries for a set of documents on a topic. The input is multiple news documents on the same topic, and the output is a single abstractive summary of that cluster. This task focuses on Vietnamese news summarization.

Data

The provided data consists of Vietnamese news articles on various topics, including economy, society, culture, science and technology, etc. It is divided into training, validation, and test datasets. Each dataset contains a number of document clusters, and each cluster has 3-5 documents on the same topic. In the training and validation datasets, a manually created reference abstractive summary is provided for each cluster.
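The cluster structure can be pictured with a small container type. This is only a sketch of the logical structure described above; the actual file format is defined by the organizers' data release, and the field names here are hypothetical.

    # Hypothetical container mirroring the cluster structure described
    # above; the actual file format is defined by the data release.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Cluster:
        cluster_id: str
        documents: List[str]           # 3-5 news articles on one topic
        summary: Optional[str] = None  # reference summary; absent in the test set

    train_example = Cluster(
        cluster_id="cluster_001",
        documents=["doc 1 text ...", "doc 2 text ...", "doc 3 text ..."],
        summary="Manually created reference summary ...",
    )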

The test set is formatted in the same way as the training and validation sets, but without the reference abstractive summaries. The goal of this shared task is to build models that automatically create an abstractive summary for each cluster. Participants must submit their results in the same format as the training and validation datasets for evaluation. After the result submission deadline, participating teams are required to submit a technical report in the format provided by the organizers in order to have their results recognized.

Evaluation method

The official evaluation measures are the ROUGE-2 scores, and ROUGE-2 F1 is the main score for ranking. ROUGE-2 Recall (R), Precision (P), and F1 between the predicted summary and the reference summary are calculated with the following formulas [12]:

ROUGE-2 P = |matched bigrams| / |bigrams in the predicted summary|

ROUGE-2 R = |matched bigrams| / |bigrams in the reference summary|

ROUGE-2 F1 = (2 x ROUGE-2 P x ROUGE-2 R) / (ROUGE-2 P + ROUGE-2 R)
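The computation can be sketched as follows. This is a minimal illustration using whitespace tokenization and clipped bigram counts; the official evaluation presumably relies on a standard ROUGE implementation, and details such as tokenization and Vietnamese word segmentation depend on the organizers' scoring script.

    # Minimal ROUGE-2 sketch: whitespace tokenization, clipped bigram
    # counts. The official scoring script may tokenize differently.
    from collections import Counter

    def bigrams(text):
        tokens = text.lower().split()
        return Counter(zip(tokens, tokens[1:]))

    def rouge2(predicted, reference):
        pred, ref = bigrams(predicted), bigrams(reference)
        # Clipped overlap: a bigram is matched at most as many times
        # as it appears in each summary.
        matched = sum((pred & ref).values())
        p = matched / max(sum(pred.values()), 1)
        r = matched / max(sum(ref.values()), 1)
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    # Example: 3 of 5 bigrams match, so P = R = F1 = 0.6.
    print(rouge2("the cat sat on the mat", "the cat lay on the mat"))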

Registration and contact

To contact us, email: abmusu.vlsp2022@gmail.com

References

[1] Ježek K, Steinberger J. Automatic text summarization (the state of the art 2007 and new challenges). In Proceedings of Znalosti 2008 Feb (pp. 1-12).

[2] Lins RD, Mello RF, Simske S. DocEng'19 Competition on Extractive Text Summarization. In Proceedings of the ACM Symposium on Document Engineering 2019 2019 Sep 23 (pp. 1-2).

[3] Abacha AB, M’rabet Y, Zhang Y, Shivade C, Langlotz C, Demner-Fushman D. Overview of the MEDIQA 2021 shared task on summarization in the medical domain. In Proceedings of the 20th Workshop on Biomedical Language Processing 2021 Jun (pp. 74-85).

[4] Allahyari M, Pouriyeh S, Assefi M, Safaei S, Trippe ED, Gutierrez JB, Kochut K. Text summarization techniques: a brief survey. arXiv preprint arXiv:1707.02268. 2017 Jul 7.

[5] Khan R, Qian Y, Naeem S. Extractive based text summarization using k-means and tf-idf. International Journal of Information Engineering and Electronic Business. 2019 May 1;11(3):33.

[6] Gambhir M, Gupta V. Recent automatic text summarization techniques: a survey. Artificial Intelligence Review. 2017 Jan;47(1):1-66.

[7] El-Kassas WS, Salama CR, Rafea AA, Mohamed HK. Automatic text summarization: A comprehensive survey. Expert Systems with Applications. 2021 Mar 1;165:113679.

[8] Hou L, Hu P, Bei C. Abstractive document summarization via neural model with joint attention. In National CCF Conference on Natural Language Processing and Chinese Computing 2017 Nov 8 (pp. 329-338). Springer, Cham.

[9] Zhang J, Zhao Y, Saleh M, Liu P. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning 2020 Nov 21 (pp. 11328-11339). PMLR.

[10] Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461. 2019 Oct 29.

[11] Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020 Jun;21(140):1-67.

[12] Lin CY. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out 2004 Jul (pp. 74-81).