EU-BRIDGE aimed to develop automatic transcription and translation technology enabling innovative multimedia captioning and translation services for audiovisual documents between European and non-European languages. The project delivered streaming technology that converts speech from lectures, meetings, and telephone conversations into text in another language. To this end, EU-BRIDGE brought together academic, engineering, and business expertise to create competitive offerings for existing needs in translation, communication, content processing, and publishing. The four use cases were: Captioning Translation for TV broadcasts, University Lecture Translations, European Parliament Translations, and Unified Communication Translation. The prospective users of the project were European companies operating in the audiovisual market (in particular TV captioning and translation).
🇦🇹 I’ll be in Vienna for #ACL2025NLP!
Interested in training a SpeechLLM without a lot of params or data? Come to my poster:
🖼️ Mon, 18:00
Also into Speech Summarization? Join my IWSLT talk in collab with @fbk_mt:
🎤 Fri, 14:00
Happy to chat - come say hi! 😎
Papers in 🧵
Sara Papi, Maike Züfle, Marco Gaido, Beatrice Savoldi, Danni Liu, Ioannis Douros, Luisa Bentivogli, Jan Niehues, "MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks,"
Our pick of the week by @mgaido91: "WhisperKit: On-device Real-time ASR with Billion-Scale Transformers" by Atila Orhon, Arda Okan, Berkin Durmus, @zachnagengast, and Eduardo Pacheco (ICML 2025)
#speech #speechtech #whisper #ASR #realtime
A couple of weeks before presenting our large-scale speech model compression task at IWSLT, here is one of the first attempts to bring large-scale models to edge devices: https://arxiv.org/pdf/2507.10860... Hope to see more work along this direction!
Our pick of the week by @FBKZhihangXie: "Adversarial Speech-Text Pre-Training for Speech Translation" by Chenxuan Liu, Liping Chen, Weitai Zhang, Xiaoxi Li, Peiwang Tang, Mingjia Yu, Sreyan Ghosh, and Zhongyi Ye (ICASSP 2025)
#speech #speechprocessing #speechtech #translation
🚀 AdvST: Adversarial training aligns speech and text distributions without parallel data! Combines adversarial learning + hidden-state swapping to fix length mismatch & boost low-resource speech translation. https://ieeexplore.ieee.org/document/10888294
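The core idea of adversarial speech-text alignment can be sketched in a few lines: a discriminator tries to tell speech hidden states from text hidden states, while a gradient reversal layer pushes the encoders to make the two distributions indistinguishable. The sketch below is a minimal illustration of this general technique in PyTorch, not the paper's actual implementation; all module and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates gradients in the backward pass,
    so minimizing the discriminator loss maximizes encoder confusion."""
    @staticmethod
    def forward(ctx, x):
        return x

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

class ModalityDiscriminator(nn.Module):
    """Predicts whether a hidden state came from speech (label 0) or text (label 1)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, h):
        # Reverse gradients before classifying, then output one logit per state.
        return self.net(GradReverse.apply(h)).squeeze(-1)

# Toy hidden states standing in for a speech encoder's frames (8) and
# a text encoder's tokens (5) — note the length mismatch between modalities.
dim = 16
speech_h = torch.randn(8, dim, requires_grad=True)
text_h = torch.randn(5, dim, requires_grad=True)

disc = ModalityDiscriminator(dim)
logits = disc(torch.cat([speech_h, text_h]))
labels = torch.cat([torch.zeros(8), torch.ones(5)])
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
loss.backward()  # encoders upstream of GradReverse receive reversed gradients
```

In a full system the reversed gradients would flow into the speech and text encoders, nudging their hidden states toward a shared distribution without requiring parallel speech-text pairs; the hidden-state swapping for the length mismatch is a separate mechanism not shown here.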