VLSP 2025 Challenge on Vietnamese Legal Small Language Models (LegalSLM)
Update July 5, 2025:
Registration will close on July 6, 2025.
Update July 3, 2025:
Training data and base models have been released here:
https://huggingface.co/VLSP2025-LegalSML
Registration here:
https://forms.gle/nD1b88WprhhBiSiP8
Important Dates
- June 23, 2025: Registration open
- July 3, 2025: Training data and base models release
- July 15, 2025: Public test release
- August 15, 2025: System submission deadline
- August 25, 2025: Private test results release
- September 5, 2025: Technical report submission
- September 27, 2025: Notification of acceptance
- October 3, 2025: Camera-ready deadline
- October 29-30, 2025: Conference dates
Task Overview
With the rapid advancement of Large Language Models such as ChatGPT, Gemini, Claude, DeepSeek, and Qwen, the demand for intelligent tools to process legal texts is growing significantly. While Legal NLP research has made substantial progress in languages like English, Japanese, and Chinese, foundational research for Vietnamese legal text processing remains limited.
However, developing and deploying large-scale LLMs poses significant challenges, especially given Vietnam's resource constraints. This raises an important research question: can small models achieve performance comparable to large models when specialized for a specific domain? This task aims to address this question by developing small-to-medium sized language models specialized for the Vietnamese legal domain, focusing on legal question answering and consultation.
Developing small language models also creates opportunities for more Vietnamese research groups to participate, working within limited resources while developing intelligent and efficient methodologies.
Objectives
The primary goals of this challenge are to:
- Build specialized Vietnamese legal language models capable of accurate legal question answering and consultation
- Develop small-to-medium sized models (≤ 4B parameters) to enhance practical deployability and accessibility
- Establish benchmarks for evaluating Vietnamese legal language understanding and reasoning capabilities
- Accelerate development of practical AI tools for the Vietnamese legal sector
Task Description
Participants will develop language models specialized for the Vietnamese legal domain that can handle three core evaluation tasks:
- Legal Citation Usefulness: Determining whether a legal citation is useful for answering a specific legal question (True/False classification)
- Multiple-Choice Legal QA: Testing comprehensive Vietnamese legal knowledge through multiple-choice questions
- Free-Text Legal QA: Generating accurate and coherent narrative answers to Vietnamese legal questions
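The first two tasks produce closed-form predictions (a True/False label or an answer option), so teams can score them locally by exact match. The sketch below assumes plain accuracy as the metric and a simple list-of-labels layout; neither is specified in this announcement, and the official metrics and formats may differ. The function name `accuracy` and all sample labels are hypothetical.

```python
from typing import Hashable, Sequence


def accuracy(predictions: Sequence[Hashable], gold: Sequence[Hashable]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical citation-usefulness labels (True = the citation is useful).
citation_gold = [True, False, True, True]
citation_pred = [True, False, False, True]

# Hypothetical multiple-choice answers.
mcq_gold = ["A", "C", "B", "D"]
mcq_pred = ["A", "C", "B", "A"]

citation_acc = accuracy(citation_pred, citation_gold)
mcq_acc = accuracy(mcq_pred, mcq_gold)
print(f"citation usefulness accuracy: {citation_acc:.2f}")
print(f"multiple-choice accuracy: {mcq_acc:.2f}")
```

Free-text answers cannot be scored by exact match and would need a different measure, which this sketch does not attempt to cover.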
Data and Resources Provided
Training Data
- Vietnamese legal corpus: Preprocessed legal texts extracted from official Vietnamese codes, statutes, and legal documents
- Legal news and articles: Additional legal domain content including legal news and commentary
- Additional datasets: Teams are encouraged to supplement the training data with other publicly available or legally acquired legal-domain datasets.
Base Models
- Qwen3-1.7B and Qwen3-4B: Pre-trained on the provided legal corpus to serve as specialized base models
- Alternative models: Teams may use the provided base models or build on any other open-source model of their choice
- Parameter constraint: All models must be ≤ 4B parameters
Evaluation Data
- Public evaluation dataset: Released for initial model validation and development
- Private evaluation dataset: Used for final ranking and evaluation (held by organizers)
Model Requirements
- Parameter limit: All submitted models must have ≤ 4B parameters
- Model flexibility: Teams can fine-tune provided base models or use alternative open-source architectures
- Focus areas: Emphasis on continual pretraining, fine-tuning and instruction tuning for legal domain specialization
- No external information access: Models must not use any external information sources during inference (no search mechanisms, external APIs, or real-time data retrieval allowed)
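Before submitting, a team may want to confirm that a model respects the parameter limit. The following is a minimal sketch that counts weights from tensor shapes. It assumes "4B" means 4 × 10^9 and that every weight tensor counts toward the total; how the organizers actually count parameters is not stated here. The names `PARAM_LIMIT`, `count_parameters`, `within_limit`, and the toy shapes are hypothetical.

```python
from math import prod
from typing import Iterable, Sequence

# Assumed reading of "≤ 4B parameters"; the official counting rule is not given.
PARAM_LIMIT = 4_000_000_000


def count_parameters(shapes: Iterable[Sequence[int]]) -> int:
    """Total number of scalar weights across tensors with the given shapes."""
    return sum(prod(shape) for shape in shapes)


def within_limit(shapes: Iterable[Sequence[int]], limit: int = PARAM_LIMIT) -> bool:
    """True if the total weight count does not exceed the limit."""
    return count_parameters(shapes) <= limit


# Hypothetical toy model: an embedding matrix, a square projection, and a bias.
toy_shapes = [(32_000, 512), (512, 512), (512,)]
total = count_parameters(toy_shapes)
print(f"{total:,} parameters, within limit: {within_limit(toy_shapes)}")
```

For a PyTorch model, the shapes could be collected with `(tuple(p.shape) for p in model.parameters())`. Because a nominally "4B" checkpoint may not contain exactly 4 × 10^9 weights, treat the threshold above as an assumption and confirm the counting rule with the organizers.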
Inference Guide
Details on input/output formats and instructions for running inference will be announced in a later update.
Evaluation Process
Public Testing Phase
- Public test set released for initial model validation
- Leaderboard rankings based on public performance
- Allows teams to gauge model performance during development
Final Evaluation
- Teams submit their final models to organizers
- Models evaluated on private test set to ensure fair and unbiased results
- Final rankings based on performance across all three evaluation tasks
Submission Requirements
- Model submission: Final trained models (≤ 4B parameters)
- Technical report: Detailed methodology, experiments, and results analysis
- System description: Implementation details and architectural choices
Organizers
Lê Anh Cường, Ton Duc Thang University (TDTU): leanhcuong@tdtu.edu.vn
Nguyễn Việt Hà, University of Engineering and Technology, Vietnam National University (UET-VNU)
Nguyễn Phương Thái, UET-VNU
Dương Trọng Chí, TDTU
Lê Võ Quyết Thắng, TDTU
Nguyễn Phước Nguyên, TDTU
Nguyễn Trọng Hiếu, TDTU
Nguyễn Thị Thùy Linh, UET-VNU
Nguyễn Ngọc Khương, Hai Phong University (HPU)
Sponsors and Partners