VLSP 2025 Challenge on DRiLL: The challenge of Deep Retrieval in the expansive Legal Landscape
Important dates
- June 20, 2025: Registration open
- July 01, 2025: Training data release
- July 15, 2025: Public test release
- August 08, 2025: Private test release
- August 12, 2025: System submission deadline
- August 15, 2025: Private test results release
- August 30, 2025: Technical report submission
- September 27, 2025: Notification of acceptance
- October 03, 2025: Camera-ready deadline
- October 29-30, 2025: Conference dates
Registration
https://bit.ly/vlsp-2025-drill
Task Description
With the rapid advancement of artificial intelligence, particularly generative models in natural language processing (NLP) such as ChatGPT, DeepSeek, and Qwen, the demand for intelligent tools to process legal texts is growing significantly. While Legal NLP research has seen substantial progress in languages like English, Japanese, and Chinese, foundational research for Vietnamese legal text processing remains relatively underdeveloped. In this shared task, we introduce one of the first initiatives aimed at advancing Vietnamese Legal NLP.
Information Retrieval (IR) is a core task in NLP, concerned with identifying which pieces of information are most relevant to a given query. In the legal domain, the Legal Document Retrieval task focuses on determining which legal articles are relevant to a specific legal question. The task can be formalized as follows: given a set of questions Q = {q1, q2, ..., qn} and a corpus of articles A = {a1, a2, ..., am}, the goal is to identify, for each question q ∈ Q, a subset A′ ⊂ A in which every article ai ∈ A′ is considered “relevant” to q.
An article is called “relevant” to a query if a Yes/No answer to the query is entailed by the meaning of the article.
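To make the input and output of the task concrete, here is a minimal Python sketch of the retrieval interface: one question and a corpus go in, a list of article ids comes out. The token-overlap scorer, the function names `tokenize` and `retrieve`, the `top_k` parameter, and the toy English texts are all illustrative assumptions; this is not a baseline provided by the organizers, and a real system would need Vietnamese word segmentation and a much stronger ranking model.

```python
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    # Naive lowercase whitespace tokenization (assumption for illustration only).
    return set(text.lower().split())


def retrieve(question: str, articles: Dict[int, str], top_k: int = 2) -> List[int]:
    """Return ids of up to top_k articles sharing the most tokens with the question."""
    q_tokens = tokenize(question)
    scored = []
    for article_id, text in articles.items():
        overlap = len(q_tokens & tokenize(text))
        if overlap > 0:  # articles with no shared token are never returned
            scored.append((overlap, article_id))
    # Highest overlap first; ties broken by smaller id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for _, article_id in scored[:top_k]]


# Toy corpus (hypothetical texts, not taken from the released data).
toy_articles = {
    0: "scope of the circular on banking civil servant grades",
    1: "codes for banking controller grades",
    2: "reporting regime for credit rating service enterprises",
}
print(retrieve("reporting regime of credit rating enterprises", toy_articles))
```

The returned list plays the role of A′ for a single question; running `retrieve` over every question in Q yields the full prediction set.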
LLM Usage
You may use LLMs whose training data and/or models are publicly available (e.g., on Hugging Face or similar sites), but you may not use LLMs whose models are closed (e.g., GPT-4o, Gemini, ...). For reproducibility, please describe in the paper how to obtain the model.
Evaluation metrics
- Automatic Evaluation: Recall, Precision, Macro-F2
- Human Evaluation
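The automatic metrics can be sketched as follows. The F2 score is the F-beta score with beta = 2, i.e. F2 = 5PR / (4P + R), which weights recall more heavily than precision. This page does not spell out how the macro average is taken, so the sketch assumes that precision, recall, and F2 are computed per question and then averaged over all questions; the function names and the dictionary layout are hypothetical, and this is not the official scorer.

```python
from typing import Dict, Set


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta = 2 gives F2 = 5PR / (4P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def macro_scores(gold: Dict[int, Set[int]], pred: Dict[int, Set[int]]) -> Dict[str, float]:
    """Average per-question precision, recall and F2 (assumed macro definition)."""
    if not gold:
        return {"precision": 0.0, "recall": 0.0, "f2": 0.0}
    p_sum = r_sum = f_sum = 0.0
    for qid, gold_ids in gold.items():
        pred_ids = pred.get(qid, set())  # a missing prediction counts as empty
        hits = len(gold_ids & pred_ids)
        p = hits / len(pred_ids) if pred_ids else 0.0
        r = hits / len(gold_ids) if gold_ids else 0.0
        p_sum += p
        r_sum += r
        f_sum += f_beta(p, r)
    n = len(gold)
    return {"precision": p_sum / n, "recall": r_sum / n, "f2": f_sum / n}


# Question 11938 has two gold articles; the system returns one of them.
gold = {11938: {27053, 27071}}
pred = {11938: {27053}}
print(macro_scores(gold, pred))
```

In this example precision is 1.0 and recall is 0.5, so F2 = 2.5 / 4.5, roughly 0.556: missing a relevant article costs more than returning an irrelevant one would.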
Data Format
Training data:
[
{
"id": 11938,
"question": "Chế độ báo cáo của doanh nghiệp kinh doanh dịch vụ xếp hạng tín nhiệm quy định thế nào?",
"relevant_laws": [
27053,
27071
]
}
]
Legal corpus:
[
{
"id": 0,
"law_id": "14/2022/TT-NHNN",
"content": [
{
"aid": 0,
"content_Article": "1. Thông tư này quy định mã số, tiêu chuẩn chuyên môn, nghiệp vụ và xếp lương đối với các ngạch công chức chuyên ngành Ngân hàng.\n\n2. Thông tư này áp dụng đối với công chức làm việc tại các đơn vị thuộc Ngân hàng Nhà nước Việt Nam (gọi tắt là Ngân hàng Nhà nước)."
},
{
"aid": 1,
"content_Article": "1. Kiểm soát viên cao cấp ngân hàng Mã số: 07.044 2. Kiểm soát viên chính ngân hàng Mã số: 07.045 3. Kiểm soát viên ngân hàng Mã số: 07.046 4. Thủ kho, thủ quỹ ngân hàng Mã số: 07.048 5. Nhân viên Tiền tệ - Kho quỹ Mã số: 07.047 "
},
......
]
}
]
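A small Python sketch of how the two samples above might be joined. It flattens the corpus into a lookup keyed by `aid` and resolves the `relevant_laws` of a training example against it. It assumes that the integers in `relevant_laws` refer to `aid` values in the corpus; the samples suggest this but the page does not state it, so check it against the released data. The helper names `index_articles` and `resolve_gold` and the shortened placeholder texts are hypothetical.

```python
import json
from typing import Dict, List, Tuple


def index_articles(corpus: List[dict]) -> Dict[int, Tuple[str, str]]:
    """Map each article's aid to (law_id, article text)."""
    index = {}
    for law in corpus:
        for article in law["content"]:
            index[article["aid"]] = (law["law_id"], article["content_Article"])
    return index


def resolve_gold(example: dict, index: Dict[int, Tuple[str, str]]) -> List[Tuple[str, str]]:
    # ASSUMPTION: entries of "relevant_laws" are aid values; unknown ids are skipped.
    return [index[aid] for aid in example["relevant_laws"] if aid in index]


# Tiny stand-in data in the same shape as the samples (texts are placeholders).
corpus = json.loads("""
[{"id": 0, "law_id": "14/2022/TT-NHNN",
  "content": [{"aid": 0, "content_Article": "article text 0"},
              {"aid": 1, "content_Article": "article text 1"}]}]
""")
example = {"id": 1, "question": "placeholder question", "relevant_laws": [1]}

index = index_articles(corpus)
print(resolve_gold(example, index))
```

With the real files, the inline strings would be replaced by `json.load` calls on the released training and corpus files.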
Submission guidelines
The instructions for submission will be announced at a later time.
Organizers
- Thi-Hai-Yen Vuong - VNU University of Engineering and Technology (VNU-UET) - yenvth@vnu.edu.vn
- Ha-Thanh Nguyen - National Institute of Informatics (NII), Japan
- Trong-Khoi Dao - VNU University of Law (VNU-UL)
- Tan-Minh Nguyen - Japan Advanced Institute of Science and Technology (JAIST)
- Hoang-Trung Nguyen - VNU University of Engineering and Technology (VNU-UET)
- Hoang-Quynh Le - VNU University of Engineering and Technology (VNU-UET)
Sponsors and Partners