VLSP 2025 challenge on Semantic Parsing
Important dates
- June 23, 2025: Registration open
- July 5, 2025: Training data release
- July 15, 2025: Public test release
- August 20, 2025: System submission deadline
- August 30, 2025: Private test results release
- September 10, 2025: Technical report submission
- September 27, 2025: Notification of acceptance
- October 3, 2025: Camera-ready deadline
- October 29-30, 2025: Conference dates
Task Description
The objective of building a semantic parser for Vietnamese is to enable accurate understanding and formal representation of Vietnamese sentences by analyzing their syntactic and semantic structures. This parser aims to extract the underlying meaning of text and convert it into structured forms, such as Abstract Meaning Representation (AMR) or logical expressions. Such a tool is essential for enhancing the performance of downstream natural language processing tasks, including machine translation, information extraction, and question answering, especially in the context of Vietnamese, which currently lacks large-scale annotated datasets and semantic resources.
Training and Test Data
The organizers will provide:
- Labeling guidelines
- Training and test datasets of Vietnamese sentences drawn from a variety of sources, such as Vietreebank, the novel "The Little Prince," articles, and restaurant and hotel reviews. All sentences are semantically annotated according to the provided guidelines, and all data is in PENMAN format.
Data Format
- Input: A sentence in Vietnamese.
- Output: A semantic representation in PENMAN format.
For example:
# ::snt The boy wants the girl to believe him.
(w / want-01
    :ARG0 (b / boy)
    :ARG1 (b2 / believe-01
        :ARG0 (g / girl)
        :ARG1 b))
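To make the format concrete, here is a minimal Python sketch that turns a PENMAN string like the one above into a list of triples. The function name `penman_to_triples` and the simplified tokenizer are assumptions made for illustration, not part of the task's official tooling. It handles only the basic constructs shown in the example (nested nodes, roles, re-entrant variables, and `#` metadata lines).

```python
import re

# Simplified tokenizer: parentheses, the "/" separator, :roles,
# quoted strings, and bare symbols (variables, concepts, constants).
TOKEN = re.compile(r'\(|\)|/|:[^\s()]+|"[^"]*"|[^\s()/]+')

def penman_to_triples(text):
    """Return (relation, source, target) triples for one PENMAN graph."""
    # Skip metadata lines such as "# ::snt ...".
    body = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("#")
    )
    tokens = TOKEN.findall(body)
    triples = []
    pos = 0

    def parse_node():
        nonlocal pos
        assert tokens[pos] == "(", "a node must start with '('"
        pos += 1
        var = tokens[pos]
        pos += 1
        if tokens[pos] == "/":
            # "(var / concept" yields an instance triple.
            pos += 1
            triples.append(("instance", var, tokens[pos]))
            pos += 1
        while tokens[pos] != ")":
            role = tokens[pos][1:]  # drop the leading ':'
            pos += 1
            if tokens[pos] == "(":
                target = parse_node()   # nested node
            else:
                target = tokens[pos]    # re-entrant variable or constant
                pos += 1
            triples.append((role, var, target))
        pos += 1  # consume ')'
        return var

    parse_node()
    return triples

EXAMPLE = """# ::snt The boy wants the girl to believe him.
(w / want-01
    :ARG0 (b / boy)
    :ARG1 (b2 / believe-01
        :ARG0 (g / girl)
        :ARG1 b))"""

triples = penman_to_triples(EXAMPLE)
for triple in triples:
    print(triple)
```

Note that the re-entrant variable `b` in `:ARG1 b` produces an edge back to the existing `boy` node rather than a new node, which is how the graph expresses that "him" refers to the boy.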
Evaluation
This shared task uses Smatch to evaluate semantic parsing systems. Smatch scores the similarity of two semantic graphs in terms of their matching triples (edges): it searches for a variable (node) mapping that maximizes the count, M, of matching triples. Given that mapping:
- M is the number of matching triples
- T is the total number of triples in the first semantic graph
- G is the total number of triples in the second semantic graph
- Precision is defined as P = M/T
- Recall is defined as R = M/G
- The Smatch score is the F-score: F = 2 * (P*R)/(P+R)
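The definitions above translate directly into a few lines of Python. This is a small sketch; the function name `smatch_scores` and the zero-division handling are assumptions made for illustration, and the counts in the usage line are made-up inputs, not results from any system.

```python
def smatch_scores(m, t, g):
    """Compute (precision, recall, F-score) from triple counts.

    m: number of matching triples
    t: total triples in the first graph
    g: total triples in the second graph
    """
    # Assumption: an empty graph gets a score of 0 rather than raising an error.
    precision = m / t if t else 0.0
    recall = m / g if g else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Hypothetical counts, chosen only to exercise the formula.
p, r, f = smatch_scores(4, 6, 8)
print(f"P = {p:.3f}, R = {r:.3f}, F = {f:.3f}")
```

Because P and R share the numerator M, the F-score can equivalently be written as 2M / (T + G).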
For example:
First graph:
(a0 / watch
    :ARG0 (a1 / boy)
    :ARG1 (a2 / tv))
Second graph:
(b0 / watch
    :ARG0 (b1 / girl)
    :ARG1 (b2 / boy))
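The best variable mapping is a0(watch)-b0(watch), a1(boy)-b1(girl), a2(tv)-b2(boy). The triples of the two graphs are:
instance(a0, watch) ∧ instance(a1, boy) ∧ instance(a2, tv) ∧ ARG0(a0, a1) ∧ ARG1(a0, a2)
instance(b0, watch) ∧ instance(b1, girl) ∧ instance(b2, boy) ∧ ARG0(b0, b1) ∧ ARG1(b0, b2)
Under this mapping, three triples match (the watch instance, ARG0, and ARG1), so M = 3 and T = G = 5. Therefore P = 3/5, R = 3/5, F = 0.6.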
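The worked example can be checked with a short brute-force search over all variable mappings. This is an illustrative sketch, not the official scorer: exhaustive enumeration is only feasible for tiny graphs, and practical Smatch implementations rely on a heuristic search instead. The helper names `variables` and `best_mapping` are assumptions.

```python
from itertools import permutations

# Triples of the two example graphs, as (relation, source, target).
first = [
    ("instance", "a0", "watch"), ("instance", "a1", "boy"),
    ("instance", "a2", "tv"), ("ARG0", "a0", "a1"), ("ARG1", "a0", "a2"),
]
second = [
    ("instance", "b0", "watch"), ("instance", "b1", "girl"),
    ("instance", "b2", "boy"), ("ARG0", "b0", "b1"), ("ARG1", "b0", "b2"),
]

def variables(triples):
    """Variables are the sources of instance triples."""
    return sorted({src for rel, src, _ in triples if rel == "instance"})

def best_mapping(first, second):
    """Try every one-to-one variable mapping and keep the best match count."""
    vars1, vars2 = variables(first), variables(second)
    # Pad with None so extra variables in the first graph stay unmapped.
    candidates = vars2 + [None] * max(0, len(vars1) - len(vars2))
    second_set = set(second)
    best_m, best_map = -1, None
    for perm in permutations(candidates, len(vars1)):
        mapping = dict(zip(vars1, perm))
        renamed = set()
        for rel, src, tgt in first:
            # Concepts in instance triples are labels, not variables.
            new_tgt = tgt if rel == "instance" else mapping.get(tgt, tgt)
            renamed.add((rel, mapping[src], new_tgt))
        m = len(renamed & second_set)
        if m > best_m:
            best_m, best_map = m, mapping
    return best_m, best_map

m, mapping = best_mapping(first, second)
precision = m / len(first)
recall = m / len(second)
f_score = 2 * precision * recall / (precision + recall)
print(mapping, m, precision, recall, f_score)
```

Mapping a1(boy) to b2(boy) instead would match the boy instance but lose both edges, giving only two matching triples, which is why the search prefers the mapping that aligns the graph structure.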
Submission
Submission instructions will be announced later.
Contact
Zalo Group: https://zalo.me/g/mwxcug903
Organizers
- Nguyen Thi Minh Huyen, email: ntmhuyen@gmail.com
- Ha My Linh, email: halinh.hus@gmail.com
- Vu Xuan Luong
- Pham Thi Duc
- Ngo The Quyen
- Le Ngoc Toan
Sponsors and Partners