VLSP 2025 Speech Quality Assessment
Important dates
June 23, 2025: Registration open
July 1, 2025: Training data and public test release
August 14, 2025: Private test release
August 14, 2025: System submission deadline
August 14, 2025: Private test results release
August 30, 2025: Technical report submission
September 27, 2025: Notification of acceptance
October 3, 2025: Camera-ready deadline
October 29-30, 2025: Conference dates
General Description
With the advancement of information and communication technology, connecting with others via the Internet and telecommunication systems has become effortless. However, speech transmitted over these networks is often degraded, which can cause annoyance or misunderstandings. Consequently, Speech Quality Assessment (SQA) is crucial for evaluating the performance of communication systems and has drawn interest from telephone companies and Internet service providers. In this task, participants will work with a Vietnamese dataset in which each degraded speech sample is assigned a quality score from 1 to 5. The objective is to develop a model that predicts the channel quality score of a given speech sample.
This year, in addition to the data provided by the organizers, teams can utilize external resources like pretrained models and open datasets. Before the competition starts, teams are encouraged to propose external resources. The organizers will review these suggestions and select resources based on criteria such as accuracy, popularity, and size to ensure fairness. During the competition, teams are only permitted to use the resources approved by the organizers.
Dataset
In this competition, teams will receive speech recordings captured over a mobile network, along with quality scores ranging from 1 to 5. We first recorded the original speech using high-quality equipment. Then, using Nemo Handy software [1], we made phone calls between two mobile phones: the calling phone played back the recorded speech, while the receiving phone stored the transmitted audio. By comparing the received audio with the original, Nemo Handy provided a POLQA [2] quality score for the channel. The speech is stored in .wav format with an 8 kHz sampling rate, and each sample includes a channel score.
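As a quick sanity check on the audio format described above, the following minimal sketch reads the header of a .wav file with Python's standard `wave` module and compares its sampling rate with the stated 8 kHz. It assumes the files are uncompressed PCM, which the task description does not confirm; the function names and the `EXPECTED_RATE_HZ` constant are illustrative, not part of any official toolkit.

```python
import wave

# Sampling rate stated in the dataset description (8 kHz).
EXPECTED_RATE_HZ = 8000


def describe_wav(source):
    """Return (sample_rate_hz, channels, duration_seconds) of a PCM .wav file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def has_expected_rate(source):
    """Check whether the file uses the sampling rate stated for the dataset."""
    rate, _, _ = describe_wav(source)
    return rate == EXPECTED_RATE_HZ
```

Files that fail this check may need resampling before being fed to a model trained on the provided data.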
Evaluation Metrics
Submissions are evaluated with two widely used metrics, the Pearson Correlation Coefficient (PCC) and the Mean Squared Error (MSE):
PCC: measures the linear correlation between the predicted scores and the ground-truth scores.
MSE: measures the average of the squared differences between the predicted scores and the ground-truth scores.
A higher PCC and a lower MSE indicate a better model. The two are therefore combined into a single overall score (higher is better):
Final_Score = 0.7 * PCC - 0.3 * MSE
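For local validation, the score can be reproduced with a few lines of plain Python. This is a minimal sketch based only on the formula above, not the organizers' official scoring script; the function names are illustrative, and details such as rounding or how scores are aggregated over files are assumptions.

```python
import math


def pearson_correlation(predicted, target):
    """Pearson Correlation Coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(target) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("PCC is undefined when one sequence is constant")
    return cov / math.sqrt(var_p * var_t)


def mean_squared_error(predicted, target):
    """Average of the squared differences between predictions and targets."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def final_score(predicted, target):
    """Final_Score = 0.7 * PCC - 0.3 * MSE (higher is better)."""
    pcc = pearson_correlation(predicted, target)
    mse = mean_squared_error(predicted, target)
    return 0.7 * pcc - 0.3 * mse
```

Since PCC is at most 1 and MSE is at least 0, the best attainable score is 0.7, reached only by perfect predictions. MSE is unbounded, so a model whose predictions correlate well with the labels but are shifted or badly scaled can still receive a low or negative score.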
Contact
Zalo Group: https://zalo.me/g/jrpmsi296
Registration
https://forms.gle/F9tQjCBUBpM52Zcr9
Organizers
Tạ Bảo Thắng, Hanoi University of Science and Technology, tabaothang97@gmail.com
Lê Minh Tú, WorldQuant, minhtutx@gmail.com
Đỗ Văn Hải, Thuyloi University, haidv@tlu.edu.vn
References
[1] https://www.keysight.com/us/en/assets/7018-05575/flyers/5992-2050.pdf
[2] John G. Beerends, Christian Schmidmer, Jens Berger, Matthias Obermann, Raphael Ullmann, Joachim Pomy, and Michael Keyhl. "Perceptual Objective Listening Quality Assessment (POLQA), the Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part I - Temporal Alignment." Journal of the Audio Engineering Society 61, no. 6 (2013): 366-384.
[3] G. Mittag, B. Naderi, A. Chehadi, and S. Möller. "NISQA: A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets." In Proc. Interspeech 2021, pp. 2127-2131, 2021.
[4] Minh Tu Le, Bao Thang Ta, Phi Le Nguyen, and Van Hai Do. "A Gaussian Distribution Labeling Method for Speech Quality Assessment." International Conference on Computational Data and Social Networks. Singapore: Springer Nature Singapore, 2023.
[5] Bao Thang Ta, Minh Tu Le, Van Hai Do, and Huynh Thi Thanh Binh. "Enhancing No-Reference Speech Quality Assessment with Pairwise, Triplet Ranking Losses, and ASR Pretraining." In Proc. Interspeech 2024, pp. 2700-2704, 2024.
Sponsors and Partners