The TC-STAR project is envisaged as a long-term effort to advance research in all core technologies for Speech-to-Speech Translation (SST). SST technology combines Automatic Speech Recognition (ASR), Spoken Language Translation (SLT), and Text-to-Speech (TTS) synthesis. The objectives of the project are ambitious: to achieve a breakthrough in SST that significantly reduces the gap between human and machine translation performance. The project targets a selection of unconstrained conversational speech domains (speeches and broadcast news) and three languages: European English, European Spanish, and Mandarin Chinese. Accurate translation of unrestricted speech is well beyond the capability of today's state-of-the-art research systems. Advances are therefore needed in the state-of-the-art technologies for both speech recognition and speech translation.
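As a rough illustration of this cascaded architecture, the minimal sketch below simply chains the three components in sequence (ASR, then SLT, then TTS). The Recognizer, Translator, and Synthesizer interfaces and the CascadedSST class are hypothetical placeholders introduced only for illustration; they are not part of TC-STAR or of any particular toolkit.

from dataclasses import dataclass
from typing import Protocol

class Recognizer(Protocol):
    def transcribe(self, audio: bytes) -> str: ...   # speech -> source-language transcript

class Translator(Protocol):
    def translate(self, text: str) -> str: ...       # source-language text -> target-language text

class Synthesizer(Protocol):
    def synthesize(self, text: str) -> bytes: ...    # target-language text -> speech waveform

@dataclass
class CascadedSST:
    """Hypothetical speech-to-speech translation cascade (placeholder names)."""
    asr: Recognizer
    slt: Translator
    tts: Synthesizer

    def translate_speech(self, audio: bytes) -> bytes:
        transcript = self.asr.transcribe(audio)        # 1. recognize the source speech
        translation = self.slt.translate(transcript)   # 2. translate the transcript
        return self.tts.synthesize(translation)        # 3. synthesize target-language speech

One practical consequence of this cascaded design is that recognition errors propagate into translation and synthesis, which is why progress is needed in every component of the chain.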
Weekly pick from the #MeetweenScientificWatch: “Video-SALMONN: Speech-enhanced audio-visual large language models” – Redefining video comprehension with speech-aware AV-LLMs and groundbreaking QA accuracy. 🎥🎤🤖
I’m glad to announce that our work “How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System?” has been accepted to the Transactions of @aclanthology (TACL)! 🎉
The preprint is available here:
The new @iwslt shared task on instruction-following speech models is out! Test sets will be available on April 1st, and participants must submit their models by April 15th. Check out the description for more info (or get in touch with us):
📢 First Call for Papers 📢
The 22nd @iwslt event will be co-located with @aclmeeting
31 July – 1 August 2025 – Vienna, Austria
Scientific paper submissions due March 15, 2025
More details here:
@marcfede @esalesk @ELRAnews @shashwatup9k @MarineCarpuat @_janius_