VLSP 2021 - Vietnamese Speaker Verification

Important dates

  • Aug 5, 2021: Registration open

  • Aug 30, 2021: Registration closed

  • Sep 6, 2021: Start of dataset building

  • Sep 20, 2021: End of dataset building

  • Oct 1, 2021: Challenge started (via [aihub.vn](http://aihub.vn))

  • Nov 6, 2021: Private test set release for SV-T1 and SV-T2

  • Nov 7, 2021: Private test results announcement

  • Nov 8, 2021: Top 3 teams announced and invited to submit technical reports

  • Nov 25, 2021: Deadline for top 3 teams to submit technical reports

  • Nov 26, 2021: Result announcement and presentation (workshop day)

Description

VLSP2021 Speaker Verification will feature two evaluation tasks. Teams can participate in one of the tasks or both.

Task-01 (SV-T1): Focusing on the development of SV models with limited data. For this task, the organizer will provide a training set with over 1000 speaker identities. Participants can only use this dataset for model development. Any use of additional data for model training is prohibited.

Task-02 (SV-T2): Focusing on testing the robustness of SV systems. For this task, participants can use the released training set and any other data.

Public pre-trained models may be used for system training and development in both tasks and must be specified and shared with other teams. Non-speech audio and data (e.g., noise samples, impulse responses, ...) may be used and should be noted in the technical report.

Final standings for both tasks will be decided based on private test results on Nov 7, 2021.

Training data

You will be provided with the training dataset after participating in the dataset building phase.

Evaluation data

Private evaluation sets will be made available for the two tasks SV-T1 and SV-T2. The SV-T1 test set combines in-domain and out-of-domain speakers. The sets of training speakers and test speakers are mutually exclusive.

In the evaluation sets, each record is a single line containing two fields separated by a tab character, in the following format:

enrollment_wav<TAB>test_wav<NEWLINE>

where

enrollment_wav - The enrollment utterance
test_wav - The test utterance

Example evaluation set:

enrollment_wav    test_wav
file1.wav         file2.wav
file1.wav         file3.wav
file1.wav         file4.wav
...
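The pair list above can be read with a short sketch like the following (the `read_trial_pairs` helper and file names are illustrative, not part of the challenge kit):

```python
import csv

def read_trial_pairs(path):
    """Read an evaluation pair list: a header line, then one
    enrollment_wav<TAB>test_wav record per line."""
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header line
        return [(row[0], row[1]) for row in reader]
```

Each returned tuple is one trial: the enrollment utterance and the test utterance to compare against it.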

Evaluation metric

The performance of the models will be evaluated by the Equal Error Rate (EER), the operating point at which the False Acceptance Rate (FAR) equals the False Rejection Rate (FRR).
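A minimal way to compute EER from a list of scores and ground-truth labels is to sweep a threshold over the scores and take the point where FAR and FRR are closest. This is a sketch only; the organizers' exact scoring script is not specified here.

```python
import numpy as np

def compute_eer(scores, labels):
    """Equal Error Rate sketch.

    scores: similarity scores (higher = more likely same speaker)
    labels: 1 for target (same-speaker) trials, 0 for impostor trials
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    eer, best_gap = 1.0, float("inf")
    # Sweep candidate thresholds over the observed scores.
    for t in np.sort(np.unique(scores)):
        far = np.mean(scores[labels == 0] >= t)  # impostors accepted
        frr = np.mean(scores[labels == 1] < t)   # targets rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer
```

With many trials, the FAR and FRR curves cross smoothly and this threshold sweep approximates the true EER; production scorers typically interpolate between ROC points instead.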

Submission Guidelines

Multiple submissions are allowed; the final evaluation result is based on the submission with the lowest EER.

The submission file is composed of a header line followed by one line per testing pair, giving the pair and the cosine similarity output by the system for that pair. The order of the pairs in the submission file must follow the order of the released pair list. Each line must contain 3 fields separated by a tab character, in the following format:

enrollment_wav<TAB>test_wav<TAB>score<NEWLINE>

where

enrollment_wav - The enrollment utterance
test_wav - The test utterance
score - The cosine similarity

For example:

enrollment_wav    test_wav     score
file1.wav         file2.wav    0.81285
file1.wav         file3.wav    0.01029
file1.wav         file4.wav    0.45792
...
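Putting the format together, a submission file could be generated as in the sketch below. The embedding dictionary stands in for the output of a participant's own speaker model; all names here are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def write_submission(path, pairs, embeddings):
    """Write the tab-separated submission file with a header line.

    pairs: list of (enrollment_wav, test_wav) in pair-list order
    embeddings: dict mapping wav filename -> embedding vector
                (hypothetical output of the participant's model)
    """
    with open(path, "w", newline="") as f:
        f.write("enrollment_wav\ttest_wav\tscore\n")
        for enroll, test in pairs:
            score = cosine_similarity(embeddings[enroll], embeddings[test])
            f.write(f"{enroll}\t{test}\t{score:.5f}\n")
```

Because the rows are written by iterating over the pair list itself, the submission automatically preserves the required ordering.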