VLSP 2025 challenge on Semantic Parsing

Important dates

June 23, 2025: Registration open
July 6, 2025: Training data release
July 15, 2025: Public test release
August 20, 2025: System submission deadline
August 30, 2025: Private test results release
September 10, 2025: Technical report submission
September 27, 2025: Notification of acceptance
October 3, 2025: Camera-ready deadline
October 29-30, 2025: Conference dates

Task Description

A semantic parser for Vietnamese analyzes the syntactic and semantic structure of a sentence in order to produce an accurate, formal representation of its meaning. It extracts the underlying meaning of the text and converts it into a structured form, such as Abstract Meaning Representation (AMR) or a logical expression. Such a tool can improve downstream natural language processing tasks, including machine translation, information extraction, and question answering. It is especially valuable for Vietnamese, which currently lacks large-scale annotated datasets and semantic resources.

Training and Test Data

The organizers will provide:

Labeling guidelines
Training and testing datasets of Vietnamese sentences drawn from various sources, such as VietTreebank, the novel "The Little Prince," articles, and restaurant and hotel reviews. All sentences have been semantically labeled according to the provided schema, and all data is provided in PENMAN format.

Data Format

Input: A sentence in Vietnamese.
Output:
- A semantic representation in PENMAN format

For example:

# ::snt The boy wants the girl to believe him.
(w / want-01
    :ARG0 (b / boy)
    :ARG1 (b2 / believe-01
        :ARG0 (g / girl)
        :ARG1 b))
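
To make the format concrete, here is a minimal sketch of how a PENMAN string like the one above could be read into (source, role, target) triples in Python. The function name `parse_penman` and the tokenizer are illustrative assumptions, not part of the task's tooling. The sketch handles only simple graphs (no alignments, and no constants containing "/" or ":"), so a full-featured reader such as the open-source `penman` package is likely a better choice in practice.

```python
import re

# Tokens: parentheses, the slash, roles such as :ARG0, quoted strings, bare symbols.
TOKEN = re.compile(r'\(|\)|/|:[^\s()]+|"[^"]*"|[^\s()/:]+')


def parse_penman(text):
    """Return (top_variable, triples) for one simple PENMAN graph.

    Triples are (source, role, target); concepts use the role ':instance'.
    """
    # Drop metadata lines such as '# ::snt ...' before tokenizing.
    body = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("#")
    )
    tokens = TOKEN.findall(body)
    pos = 0
    triples = []

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "a node must start with '('"
        var = tokens[pos + 1]
        assert tokens[pos + 2] == "/", "expected '/' after the variable"
        triples.append((var, ":instance", tokens[pos + 3]))
        pos += 4
        while tokens[pos] != ")":
            role = tokens[pos]
            pos += 1
            if tokens[pos] == "(":
                target = node()        # nested node: recurse
            else:
                target = tokens[pos]   # re-entrant variable or constant
                pos += 1
            triples.append((var, role, target))
        pos += 1  # consume ')'
        return var

    top = node()
    return top, triples


example = """
# ::snt The boy wants the girl to believe him.
(w / want-01
    :ARG0 (b / boy)
    :ARG1 (b2 / believe-01
        :ARG0 (g / girl)
        :ARG1 b))
"""

top, triples = parse_penman(example)
for triple in triples:
    print(triple)
```

The re-entrant variable `b` in `:ARG1 b` becomes the triple `("b2", ":ARG1", "b")`, which is how the graph records that the boy is both the one who wants and the one to be believed.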

Evaluation

This shared task uses Smatch to evaluate semantic parsing systems. Smatch scores two semantic graphs by their matching triples (edges): it searches for the variable (node) mapping that maximizes the number of matching triples, M. With that mapping:

M is the number of matching triples
T is the total number of triples in the first semantic graph
G is the total number of triples in the second semantic graph
Precision is defined as P = M/T
Recall is defined as R = M/G
The Smatch score is the F-score: F = 2 * (P*R)/(P+R)
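
The definitions above can be sketched as a small Python helper. The function name `smatch_scores`, the choice to return 0.0 when a denominator is zero, and the counts in the usage line are assumptions made for illustration. They are not taken from the official scorer.

```python
def smatch_scores(m, t, g):
    """Precision, recall and F-score from triple counts.

    m: number of matching triples
    t: total triples in the first graph
    g: total triples in the second graph
    """
    # Assumption: an empty graph yields a score of 0.0 rather than an error.
    precision = m / t if t else 0.0
    recall = m / g if g else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical counts: 4 matching triples, 5 in the first graph, 8 in the second.
p, r, f = smatch_scores(m=4, t=5, g=8)
print(f"P = {p:.3f}, R = {r:.3f}, F = {f:.3f}")
```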

For example:

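First graph:

(a0 / watch
    :ARG0 (a1 / boy)
    :ARG1 (a2 / tv))

Second graph:

(b0 / watch
    :ARG0 (b1 / girl)
    :ARG1 (b2 / boy))

The best mapping is: a0(watch)-b0(watch), a1(boy)-b1(girl), a2(tv)-b2(boy)

Triples of the first graph:
instance(a0, watch) ∧ instance(a1, boy) ∧ instance(a2, tv) ∧ ARG0(a0, a1) ∧ ARG1(a0, a2)

Triples of the second graph:
instance(b0, watch) ∧ instance(b1, girl) ∧ instance(b2, boy) ∧ ARG0(b0, b1) ∧ ARG1(b0, b2)

Under this mapping, three triples match: instance(a0, watch), ARG0(a0, a1), and ARG1(a0, a2). So M = 3 and T = G = 5, which gives P = 3/5, R = 3/5, F = 0.6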
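
As a check on this example, the following Python sketch tries every one-to-one variable mapping and keeps the one with the most matching triples. The function `best_match` and the graph encoding are illustrative assumptions, and edge targets are assumed to be variables. Exhaustive search is only feasible for tiny graphs like these. The Smatch tool itself relies on a heuristic search, so this is not its implementation.

```python
from itertools import permutations


def best_match(first, second):
    """Brute-force the variable mapping that maximizes matching triples.

    Each graph is (instances, edges): instances maps variable -> concept,
    edges is a set of (role, source_variable, target_variable).
    """
    inst1, edges1 = first
    inst2, edges2 = second
    vars1, vars2 = list(inst1), list(inst2)
    # Pad with None so variables of the first graph may stay unmapped.
    padded = vars2 + [None] * max(0, len(vars1) - len(vars2))
    best_m, best_map = -1, {}
    for perm in permutations(padded, len(vars1)):
        mapping = dict(zip(vars1, perm))
        # Instance triples match when the mapped variables share a concept.
        m = sum(1 for v, c in inst1.items() if inst2.get(mapping[v]) == c)
        # Edge triples match when role and both mapped endpoints agree.
        m += sum(
            1 for role, s, t in edges1
            if (role, mapping[s], mapping[t]) in edges2
        )
        if m > best_m:
            best_m, best_map = m, mapping
    return best_m, best_map


first = (
    {"a0": "watch", "a1": "boy", "a2": "tv"},
    {("ARG0", "a0", "a1"), ("ARG1", "a0", "a2")},
)
second = (
    {"b0": "watch", "b1": "girl", "b2": "boy"},
    {("ARG0", "b0", "b1"), ("ARG1", "b0", "b2")},
)

M, mapping = best_match(first, second)
T = len(first[0]) + len(first[1])    # triples in the first graph
G = len(second[0]) + len(second[1])  # triples in the second graph
P, R = M / T, M / G
F = 2 * P * R / (P + R)
print(M, mapping, round(F, 2))
```

Mapping a1(boy) to b2(boy) instead would match the two "boy" instances but lose both edge matches, giving only M = 2, which is why the search settles on M = 3.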

Submission

Submission instructions will be announced later.

Contact

Zalo Group: https://zalo.me/g/mwxcug903

Organizers

Nguyen Thi Minh Huyen, email: ntmhuyen@gmail.com
Ha My Linh, email: halinh.hus@gmail.com
Vu Xuan Luong
Pham Thi Duc
Ngo The Quyen
Le Ngoc Toan

Association for Vietnamese Language and Speech Processing
