Reliable Intelligence Identification on Vietnamese SNSs (ReINTEL)

Important dates

  • Sep 10, 2020: Registration open
  • Oct 15, 2020: Registration closed
  • Oct 20, 2020: Challenge started (via CodaLab.Org)
  • Nov 30, 2020: Final results on private test.
  • Dec 01, 2020: Announce top 3 teams to submit technical reports.
  • Dec 10, 2020: Deadline for top 3 teams to submit technical reports.
  • Dec 12, 2020: If any of the top teams do not submit their reports, follow-up teams can submit and take their places (follow-up teams are advised to write their reports in advance and submit by this deadline).
  • Dec 15, 2020: Final winners announcement.
  • Dec 18, 2020: Result presentation and award ceremony (workshop day).

Introduction

This challenge aims to identify whether a piece of information shared on social network sites (SNSs) is reliable or unreliable. With the rapid growth of SNSs such as Facebook, Zalo, and Lotus, there are approximately 65 million Vietnamese users, with an annual growth of 2.7 million in the most recent year, as reported by Digital 2020 [6]. SNSs have become an essential means for users not only to connect with friends but also to freely create and share diverse information [2, 5], such as news. Given this freedom, a number of users tend to spread unreliable information for personal purposes, which affects the online society. Detecting whether news spreading on SNSs is reliable or unreliable has gained significant attention recently [1, 3, 4]. This shared task therefore targets identifying the reliability of information shared on Vietnamese SNSs. It gives participants who are interested in the problem an opportunity to contribute their knowledge to improve the online society for social good.

Data Format

Each instance includes the following 8 main attributes, with or without a binary target label:

  • id: unique id for a news post on SNSs

  • uid: the anonymized id of the owner

  • text: the text content of the news 

  • timestamp: the time when the news is posted 

  • image_links: image urls associated with the news

  • nb_likes: the number of likes the news has received

  • nb_comments: the number of comments the news has received

  • nb_shares: the number of shares the news has received

  • label: a manually annotated label which marks the news as potentially unreliable

    • 1: unreliable

    • 0: reliable
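
The attribute list above can be sketched as a small Python record. This is only an illustration: the task description does not specify the file format or the field types, so the `Post` class, the `parse_post` helper, and the choice of `str`/`int` types are all assumptions.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Post:
    """One instance, with field names taken from the attribute list.

    The types are assumptions; the task description does not state them.
    """
    id: str
    uid: str
    text: str
    timestamp: str
    image_links: str
    nb_likes: int
    nb_comments: int
    nb_shares: int
    label: Optional[int] = None  # 1 = unreliable, 0 = reliable, None = unlabeled


def parse_post(row: Mapping[str, object]) -> Post:
    """Build a Post from a mapping of raw values (hypothetical helper)."""
    raw_label = row.get("label")
    label = None if raw_label in (None, "") else int(raw_label)
    if label not in (None, 0, 1):
        raise ValueError(f"label must be 0 or 1, got {label!r}")
    return Post(
        id=str(row["id"]),
        uid=str(row["uid"]),
        text=str(row["text"]),
        timestamp=str(row["timestamp"]),
        image_links=str(row["image_links"]),
        nb_likes=int(row["nb_likes"]),
        nb_comments=int(row["nb_comments"]),
        nb_shares=int(row["nb_shares"]),
        label=label,
    )


# Made-up example values, for illustration only.
example = parse_post({
    "id": "1", "uid": "u42", "text": "sample post", "timestamp": "1600000000",
    "image_links": "", "nb_likes": "3", "nb_comments": "0", "nb_shares": "1",
    "label": "1",
})
```

Testing instances carry no label, so `label` stays `None` for them.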

Training/Testing Data

Participants will be provided with approximately 8,000 training examples with their respective target labels. The testing set consists of 2,000 examples without labels.

Result submission 

Participants must submit their results in the same order as the testing set, in the following format:

  • id1, label probability 1

  • id2, label probability 2
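
A minimal sketch of writing predictions in this shape is given below. Several details are assumptions, since the task description does not fix them: one `id,probability` pair per line, no header row, six decimal places, and the probability taken to be that of label 1 (unreliable). The `write_submission` helper is hypothetical.

```python
import io
from typing import Iterable, TextIO, Tuple


def write_submission(predictions: Iterable[Tuple[str, float]], out: TextIO) -> None:
    """Write one 'id,probability' line per prediction, preserving input order.

    Assumption: the probability refers to label 1 (unreliable).
    """
    for post_id, prob in predictions:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability for {post_id} is out of range: {prob}")
        out.write(f"{post_id},{prob:.6f}\n")


# Usage with an in-memory buffer; a real run would pass an open file instead.
buffer = io.StringIO()
write_submission([("id1", 0.91), ("id2", 0.07)], buffer)
print(buffer.getvalue(), end="")
```

Because the helper writes rows in the order it receives them, the caller is responsible for passing predictions in the same order as the testing set.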

Evaluation Metric

The submission will be evaluated against the ground-truth labels using the ROC-AUC metric [7].
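
To make the metric concrete, the sketch below computes ROC-AUC from its pairwise definition: the fraction of (unreliable, reliable) pairs in which the unreliable post receives the higher score, with ties counted as half. It is an illustration only, not the organizers' scoring script; the `roc_auc` function and the sample values are made up for this example. In practice, `sklearn.metrics.roc_auc_score` [7] is the usual way to compute this quantity.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC is undefined when only one class is present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = unreliable) and predicted probabilities.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.4, 0.35, 0.1]
auc = roc_auc(y_true, y_score)
print(auc)  # 0.75: three of the four positive/negative pairs are ranked correctly
```

The metric depends only on the ranking of the scores, so well-ordered probabilities matter more than their absolute calibration.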

Organisers:

  • Duc-Trong Le (contact person)
  • Harry Nguyen
  • Xuan-Son Vu

 

Contact Us:

  • reintel-organizers (at) vietnlp.com

 

References

[1] Ruchansky, N., Seo, S., & Liu, Y. (2017, November). CSI: A hybrid deep model for fake news detection. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 797-806).

[2] Shu, K., Sliva, A., Wang, S., Tang, J., & Liu, H. (2017). Fake news detection on social media: A data mining perspective. ACM SIGKDD explorations newsletter, 19(1), 22-36.

[3] Shu, K., Cui, L., Wang, S., Lee, D., & Liu, H. (2019, July). dEFEND: Explainable fake news detection. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 395-405).

[4] Shu, K., Wang, S., & Liu, H. (2019, January). Beyond news contents: The role of social context for fake news detection. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (pp. 312-320).

[5] Zhou, X., Zafarani, R., Shu, K., & Liu, H. (2019, January). Fake news: Fundamental theories, detection strategies and challenges. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (pp. 836-837).

[6] Digital 2020 - Global Digital Overview, https://wearesocial.com/digital-2020

[7] ROC AUC metric, https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html