Association for Vietnamese Language and Speech Processing

A chapter of VAIP - Vietnam Association for Information Processing

VLSP 2025 challenge on Multimodal Legal QA on Traffic Sign Rules

Important dates

26 June 2025: Call for participants.

07 July 2025: Training set release

05 August 2025: Private test set release 

05-10 August 2025: Testing phase

    + From 05 to 07: Testing for Task 1

    + From 07 to 10: Testing for Task 2

15 August 2025: Top teams announcement and the beginning of paper submission.

30 Aug 2025: Paper submission of top 5 teams. 

27 Sep 2025: Notification of acceptance.

03 Oct 2025: Camera ready.

29 - 30 Oct 2025: Conference (workshop)

Registration

Please register via this link: https://forms.gle/jKGSWKDRjpUwzYhA9

Task Description

Question answering (QA) is a widely applicable problem in artificial intelligence, especially in natural language processing (NLP). Applying QA to the legal domain (legal QA) can help build intelligent support systems that serve users' legal information retrieval needs. Compliance with road traffic safety regulations is an urgent issue for protecting the lives and property of road users. Understanding and strictly following the instructions conveyed by traffic signs and signals is the foundation of safe travel. The VLSP 2025 MLQA-TSR shared task aims to promote NLP research through QA, helping to build systems that support users in understanding the meanings of road traffic signs and the traffic scenarios based on those signs, thereby raising awareness of traffic safety. Notably, this task combines image and text data for the first time, with the goal of developing multimodal models that support research in NLP in particular and AI in general.

The VLSP 2025 MLQA-TSR consists of two sub-tasks:

Subtask 1: Multimodal Retrieval

Input: 

   + A question about the traffic signs, in natural language.

   + An actual image of the traffic signs on the street.

Output: Reference: article(s) in the LAW ON ROAD TRAFFIC ORDER AND SAFETY (36/2024/QH15) or the National Technical Regulation on Traffic Signs and Signals (QCVN 41:2024/BGTVT)

Subtask 2: Question answering

Input: 

    + A question about the traffic signs, in natural language.

    + An actual image of the traffic signs on the street.

    + Reference: term(s) in Regulation on Traffic or National Technical Regulation on Traffic Signs and Signals

Output: The answer to a multiple-choice question (4 options: A, B, C, D) or to a Yes/No question

For example (in Vietnamese language): 

The traffic sign image: 

[Image: Example]

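Question: Các loại xe nào được phép lưu thông vào đoạn đường trên trong khoảng từ 6:00 đến 22:00: (Which types of vehicles are allowed to travel on the road section above between 6:00 and 22:00?)

A. Xe khách 40 chỗ. (40-seat passenger coach)

B. Xe ô tô con. (Passenger car)

C. Xe đầu kéo. (Tractor unit)

D. Ô tô kéo rơ moóc. (Motor vehicle towing a trailer)

Reference: Điều 26.1, P.106(a,b) trong Thông tư 54/2019/TT-BGTVT (Article 26.1, P.106(a,b) in Circular 54/2019/TT-BGTVT)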

Correct answer: B.

Evaluation metric

Subtask 1: F2 score

For one sample, the F2 is computed as: 

     + precision =  the number of correctly retrieved articles / the number of retrieved articles

     + recall = the number of correctly retrieved articles / the number of relevant articles

     + F2 = 5*precision*recall / (4*precision + recall) 

The final F2 score is the average value over all samples.
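The per-sample F2 and its average can be sketched in Python as follows. This is only an illustration of the formulas above, not the organizers' official scorer. The function names, the use of plain strings as article identifiers, and the choice to score a sample as 0 when nothing is correctly retrieved are assumptions made for this sketch.

```python
def f2_single(retrieved, relevant):
    """F2 for one sample, following the formulas above.

    `retrieved` and `relevant` are collections of article identifiers
    (hypothetical string ids here). Duplicates are ignored.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    correct = len(retrieved & relevant)
    # Assumption: a sample with no correctly retrieved article scores 0,
    # which also avoids dividing by zero.
    if correct == 0:
        return 0.0
    precision = correct / len(retrieved)
    recall = correct / len(relevant)
    return 5 * precision * recall / (4 * precision + recall)


def mean_f2(all_retrieved, all_relevant):
    """Average of the per-sample F2 scores over all samples."""
    scores = [f2_single(r, g) for r, g in zip(all_retrieved, all_relevant)]
    # Assumption: an empty evaluation set scores 0.
    return sum(scores) / len(scores) if scores else 0.0
```

Because F2 weights recall more heavily than precision, retrieving one extra wrong article costs less than missing a relevant one: in this sketch, `f2_single(["a", "b"], ["a"])` is about 0.83, while `f2_single(["a"], ["a", "b"])` is about 0.56.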

Subtask 2: Accuracy. 

Accuracy = number of correct answers / total number of questions
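A minimal Python sketch of this accuracy computation is shown below. It is not the official scorer. The function name, the case-insensitive comparison, and the handling of an empty question set are assumptions made for illustration; the actual submission format has not been announced.

```python
def accuracy(predictions, gold):
    """Fraction of questions whose predicted answer matches the gold answer.

    Answers are strings such as "A".."D" or "Yes"/"No".
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    # Assumption: an empty question set scores 0.
    if not gold:
        return 0.0
    # Assumption: surrounding whitespace and letter case are ignored.
    correct = sum(
        p.strip().upper() == g.strip().upper()
        for p, g in zip(predictions, gold)
    )
    return correct / len(gold)
```

For instance, with predictions `["B", "A", "Yes"]` against gold answers `["B", "C", "yes"]`, two of three answers match under these assumptions, giving an accuracy of 2/3.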

Rules

1. Participating teams will be provided with a dataset by the Organizers and may use only that dataset; external datasets are not permitted.

2. Teams are allowed to use open-source large language models (LLMs), and are encouraged to adopt methods that use small but efficient LLMs. Commercial LLMs such as ChatGPT, Claude Sonnet, etc., are not allowed. Any model used must be published or available on HuggingFace or GitHub.

3. Teams must submit a technical report describing their proposed method for the shared task, along with source code capable of reproducing the model or system.

Training and Test Data

To be announced

Submission

The method for submission will be announced later.

Organizers

Minh Le Nguyen - Japan Advanced Institute of Science and Technology (JAIST)

Ngan Luu-Thuy Nguyen - University of Information Technology, Vietnam National University, Ho Chi Minh City (VNUHCM-UIT)

Kiet Van Nguyen - University of Information Technology, Vietnam National University, Ho Chi Minh City (VNUHCM-UIT)

Vu Tran - Japan Advanced Institute of Science and Technology (JAIST)

Trung Vo - Japan Advanced Institute of Science and Technology (JAIST)

Son Thanh Luu - University of Information Technology, Vietnam National University, Ho Chi Minh City (VNUHCM-UIT), and Japan Advanced Institute of Science and Technology (JAIST)

Hiep Nguyen - Japan Advanced Institute of Science and Technology (JAIST)

Khanh Tran - University of Information Technology, Vietnam National University, Ho Chi Minh City (VNUHCM-UIT)

Contact

Mr. Son Thanh Luu (sonlt@uit.edu.vn)

Sponsors and Partners

VinBIGDATA, VinIF, AIMESOFT, bee, Dagoras, zalo, VTCC, VCCorp

IOIT, HUS, USTH, UET, TLU, UIT, INT2, jaist, VIETLEX