Vietnamese Text-To-Speech on Common Datasets | Association for Vietnamese Language and Speech Processing

Important dates

Sep 10, 2020: Registration open
Sep 30, 2020: Registration closed
Oct 1, 2020: Start of dataset creation
Oct 15, 2020: End of dataset creation
Oct 31, 2020: Training dataset and TTS API specification requirements released
Nov 20, 2020: TTS API submission; Start of evaluation phase
Dec 7, 2020: End of evaluation phase
Dec 15, 2020: Technical report submission
Dec 18, 2020: Result announcement (workshop day)

The VLSP Text-To-Speech (TTS) Challenge 2020 is designed to understand and compare research techniques for building Vietnamese corpus-based TTS synthesizers on the same data. The basic challenge is to take the released speech database and build a TTS system with a voice trained on that data. The synthetic utterances that each synthesizer produces for the test sentences will then be evaluated through listening tests.

Participants must take part in building the dataset before receiving it. The main task is to transcribe a portion of the dataset or to correct its existing transcription.

Training data

You will receive the training dataset after taking part in building it. The dataset contains about 5-6 hours of speech from a single speaker.

Test data

You must submit a TTS API specification so that the organizers can use your API to synthesize utterances from the text in the test set. You will receive your synthetic utterances when the evaluation phase ends. The synthesized utterances will be presented to three groups of listeners: speech experts, volunteers, and undergraduates.
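
To make the API-based workflow concrete, here is a minimal sketch of how a client might call a submitted TTS API over HTTP. The official API specification requirements are not reproduced on this page, so everything specific here is an assumption for illustration only: the endpoint URL, the JSON field name `text`, the optional bearer token, and the idea that the service returns raw audio bytes. The helper names `build_tts_request` and `synthesize` are hypothetical.

```python
import json
from urllib import request


def build_tts_request(endpoint: str, text: str, token: str = "") -> request.Request:
    """Build (but do not send) a POST request asking a TTS service to synthesize `text`.

    Assumption: the service accepts a JSON body of the form {"text": "..."}.
    The real request format is defined by the organizers' API specification.
    """
    # ensure_ascii=False keeps Vietnamese diacritics as UTF-8 instead of \u escapes.
    payload = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    headers = {"Content-Type": "application/json; charset=utf-8"}
    if token:
        # Assumption: bearer-token authentication; the real scheme may differ.
        headers["Authorization"] = f"Bearer {token}"
    return request.Request(endpoint, data=payload, headers=headers, method="POST")


def synthesize(endpoint: str, text: str, token: str = "", timeout: float = 30.0) -> bytes:
    """Send the request and return the response body.

    Assumption: the response body is the synthesized audio (for example, WAV bytes).
    """
    with request.urlopen(build_tts_request(endpoint, text, token), timeout=timeout) as resp:
        return resp.read()
```

With a service running at a placeholder address such as `http://localhost:8000/tts`, calling `synthesize("http://localhost:8000/tts", "Xin chào")` would return the response bytes, which could then be written to a file for listening tests. Separating request construction from sending makes the request format easy to inspect without network access.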