- Challenge registration open: August 12th, 2022
- Release of training and developmental data: September 7th, 2022
- Release of public test data: September 15th, 2022
- Release of private test data: October 10th, 2022
- Submission of scores: October 15th, 2022
- Announcing the Top 3 to be presented at the conference: October 20th, 2022
- Technical report submission: November 15th, 2022
- Announcing the ranking winners: November 26th, 2022
About the Challenge
Speaker verification (SV) is the task of verifying whether an input utterance matches a claimed identity. Although there is ample research on SV technologies, development for multilingual conversation is limited. In a country like India, almost all speakers are multilingual, and speaking style changes from one geographical region to another. Consequently, developing a multilingual SV (MSV) system on data collected in the Indian scenario is more challenging. With this motivation, the COCOSDA INDIC-Multilingual Speaker Verification (I-MSV) Challenge 2022 has been designed to understand and compare state-of-the-art SV techniques using speech data from 13 Indian languages collected using five different sensors. The goal of this challenge is to make SV systems robust to language and sensor variations between enrolment and testing. The I-MSV Challenge 2022 involves two types of tasks, namely constrained SV and unconstrained SV.
I-MSV Challenge 2022 Rules
The I-MSV challenge consists of two tracks, namely Track 1 (Constrained SV) and Track 2 (Unconstrained SV). The rules for both tracks are as follows:
- Submissions other than the defined tasks will not be included in this challenge.
- Constrained SV: Participants are not allowed to use speech data other than the speech data released as a part of the constrained SV challenge for the development of the SV system.
- Unconstrained SV: Participants are free to use any publicly available speech data in addition to the audio data released as a part of unconstrained SV.
- Metric for evaluation: Equal Error Rate (EER) will be used as the metric for performance evaluation for the defined test scenarios.
- Participating teams need to share their final SV systems, along with a write-up in O-COCOSDA format (https://vlsp.org.vn/cocosda2022/paper-submission), which should include:
  - The databases used, with appropriate citations
  - A brief description of the methods used to build the system
  - A GitHub link with proper code structure and details
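To illustrate the evaluation metric named in the rules, the following is a minimal sketch of how an Equal Error Rate could be estimated from a list of trial scores. It assumes that higher scores mean "more likely the same speaker" and uses a simple threshold sweep without interpolation; the function name `equal_error_rate` and the example scores are hypothetical, and the organisers' actual scoring tool may differ in detail.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of trials where the claimed identity is correct.
    nontarget_scores: scores of impostor trials.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target trial")

    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    # Also consider a threshold above every score ("reject everything").
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer, eer_threshold = None, None, None
    for t in thresholds:
        # A trial is accepted when its score is at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer, eer_threshold = gap, (far + frr) / 2, t
    return eer, eer_threshold


# Hypothetical scores, for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(targets, nontargets)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With few trials the false-acceptance and false-rejection curves may not cross exactly, which is why the sketch reports the average of the two rates at the closest point.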
Nature of Speech Data
- Developmental data: Developmental data consists of speech data in Indian languages, collected in multiple sessions using five different sensors.
- Enrolment data: The enrolment data consists of English utterances captured in multiple sessions using only a headphone as the sensor.
- Public test data: Public test data will be provided for two conditions, namely matched and mismatched test conditions.
  - Matched test condition: The language and sensor used for test data collection are the same as for the enrolment data.
  - Mismatched test condition: The language and sensor used for test data collection differ from those of the enrolment data.
- Private test data: In the private test data, the test utterances are collected in different languages using five sensors, including the sensor used for collecting the enrolment data. The duration of the test utterances varies from 10 to 60 seconds.
- Multiple submissions are allowed, subject to a limit in each phase; the evaluation result is based on the submission with the lowest EER.
- The submission file begins with a header line; each subsequent line must contain 3 fields separated by a tab character, in the following format:
test_wav - The test speech
score - The similarity score, in the range 0 to 1.
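Many SV back-ends produce scores outside the 0 to 1 range. As a minimal sketch, assuming a system that scores trials with the cosine similarity between two speaker embeddings (a common choice, but not one mandated by the challenge), the raw score can be mapped linearly into the required range. The function names and example vectors below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / norm


def cosine_to_unit_interval(cosine_score):
    """Map a cosine score from [-1, 1] to [0, 1], clipping rounding overshoot."""
    clipped = max(-1.0, min(1.0, cosine_score))
    return (clipped + 1.0) / 2.0


# Hypothetical 2-dimensional embeddings, for illustration only.
enrol = [1.0, 0.0]
test = [0.0, 1.0]
score = cosine_to_unit_interval(cosine_similarity(enrol, test))
print(score)
```

Because the mapping is strictly increasing, it preserves the ordering of trials and so leaves the EER unchanged.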
Team NLTM Speaker recognition, India
Details about the team: https://sr-meity.github.io/Manuals/
For any queries, kindly write to us at email@example.com