I-MSV 2022: Indic-Multilingual Speaker Verification Challenge 2022

cocosda22 Thu, 08/11/2022 - 05:41

Timeline

Challenge registration open: August 12th, 2022
Release of training and developmental data: September 7th, 2022
Release of public test data: September 15th, 2022
Release of private test data: October 10th, 2022
Submission of scores: October 15th, 2022
Announcing the Top 3 to be presented at the conference: October 20th, 2022
Technical report submission: November 15th, 2022
Announcing the ranking winners: November 26th, 2022

About the Challenge

Speaker verification (SV) is the task of verifying whether an input utterance matches a claimed identity. Although SV technologies have been researched extensively, work on multilingual conversation remains limited. In a country like India, almost all speakers are multilingual, and speaking style changes from one geographical region to another. Consequently, developing a multilingual SV (MSV) system on data collected in the Indian scenario is more challenging. With this motivation, the COCOSDA Indic-Multilingual Speaker Verification (I-MSV) Challenge 2022 has been designed to understand and compare state-of-the-art SV techniques using speech data from 13 Indian languages collected using five different sensors. The goal of the challenge is to make SV systems robust to language and sensor variations between enrolment and testing. I-MSV Challenge 2022 involves two types of tasks: constrained SV and unconstrained SV.

I-MSV Challenge 2022 Rules

The I-MSV challenge consists of two tracks: Track 1 (Constrained SV) and Track 2 (Unconstrained SV). The following rules apply to both tracks:

Submissions other than the defined tasks will not be included in this challenge.
Constrained SV: Participants may use only the speech data released as part of the constrained SV challenge to develop their SV system.
Unconstrained SV: Participants are free to use any publicly available speech data in addition to the audio data released as a part of unconstrained SV.
Metric for evaluation: Equal Error Rate (EER) will be used as the metric for performance evaluation for the defined test scenarios.
Participating teams need to share their final SV systems, along with a write-up in O-COCOSDA format (https://vlsp.org.vn/cocosda2022/paper-submission), which should briefly describe:
- The databases used, with appropriate citations
- The methods used to build the system
- A GitHub link with a proper code structure and details
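
The EER named in the rules above is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The organizers' exact scoring procedure is not given here, so the following is only a minimal sketch under the usual definition: it sweeps a threshold over the sorted scores and reports the point where FAR and FRR are closest. The function name `equal_error_rate` and the label convention (1 for a target trial, 0 for a non-target trial) are assumptions made for illustration.

```python
def equal_error_rate(scores, labels):
    """Approximate EER for similarity scores (higher = more likely target).

    labels: 1 for target trials, 0 for non-target trials (assumed convention).
    """
    n_target = sum(1 for label in labels if label == 1)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Sort trials by descending score and lower the threshold step by step.
    trials = sorted(zip(scores, labels), key=lambda t: -t[0])

    accepted_targets = 0
    accepted_nontargets = 0
    # Threshold above every score: nothing accepted, so FAR = 0 and FRR = 1.
    best_gap = 1.0
    eer = 0.5

    i = 0
    while i < len(trials):
        # Accept every trial tied at the current score in one step.
        current = trials[i][0]
        while i < len(trials) and trials[i][0] == current:
            if trials[i][1] == 1:
                accepted_targets += 1
            else:
                accepted_nontargets += 1
            i += 1

        far = accepted_nontargets / n_nontarget   # false acceptance rate
        frr = 1.0 - accepted_targets / n_target   # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0

    return eer
```

With a finite trial list FAR and FRR rarely coincide exactly, which is why the sketch averages the two at their closest point; scoring tools may interpolate differently and report slightly different values.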

Nature of Speech Data

Developmental data: The developmental data consists of speech data in Indian languages, collected in multiple sessions using five different sensors.
Enrolment data: The enrolment data consists of English utterances captured in multiple sessions using only a headphone as the sensor.
Public test data: Public test data will be provided for two conditions: matched and mismatched.
- Matched test condition: The language and sensor used for test data collection are the same as those used for the enrolment data.
- Mismatched test condition: The language and sensor used for test data collection differ from those used for the enrolment data.
Private test data: In the private test data, the test utterances are collected in different languages using five sensors, including the sensor used for collecting the enrolment data. The duration of the test utterances varies from 10 to 60 seconds.

Submission Guidelines

Multiple submissions are allowed, subject to a limit in each phase; the evaluation result is based on the submission with the lowest EER.
The submission file begins with a header line; each subsequent line must contain three fields separated by tab characters, in the following format:

test_wav<TAB>target_speaker_ID<TAB>score<NEWLINE>
where
test_wav - The test speech file
target_speaker_ID - The ID of the claimed (target) speaker
score - The similarity score, in the range of 0 to 1

For example:

test_wav<TAB>target_speaker_ID<TAB>score
abc.wav<TAB>1001<TAB>0.97285
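
The format above can be produced with Python's standard `csv` module. This is only a sketch: the helper name `format_submission`, the five-decimal score formatting, and the range check are illustrative assumptions, not requirements stated by the organizers.

```python
import csv
import io


def format_submission(trials):
    """Return the submission text for (test_wav, target_speaker_ID, score) tuples.

    The header names follow the example above; everything else is an assumption.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")

    # Header line, then one tab-separated line per trial.
    writer.writerow(["test_wav", "target_speaker_ID", "score"])
    for test_wav, speaker_id, score in trials:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {score}")
        writer.writerow([test_wav, speaker_id, f"{score:.5f}"])

    return buffer.getvalue()


if __name__ == "__main__":
    # Prints the header followed by the example trial from the text.
    print(format_submission([("abc.wav", "1001", 0.97285)]), end="")
```

Rejecting out-of-range scores before writing keeps a file from being submitted with values outside the stated 0 to 1 range.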

Organizer:

Team NLTM Speaker recognition, India
Details about the team: https://sr-meity.github.io/Manuals/

Contact:

For any queries, please write to us at jagabandhu.mishra.18@iitdh.ac.in
