We are very happy to announce that the following paper, accepted at the Transactions of ACL (TACL) journal, is published and available:
🇦🇹 I'll be in Vienna for #ACL2025NLP!
Interested in training a SpeechLLM without a lot of params or data? Come to my poster:
🖼️ Mon, 18:00
Also into Speech Summarization? Join my IWSLT talk in collab with @fbk_mt:
🎤 Fri, 14:00
Happy to chat - come say hi!
Papers in 🧵
Sara Papi, Maike Züfle, Marco Gaido, Beatrice Savoldi, Danni Liu, Ioannis Douros, Luisa Bentivogli, Jan Niehues, "MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks,"
Our pick of the week by @mgaido91: "WhisperKit: On-device Real-time ASR with Billion-Scale Transformers" by Atila Orhon, Arda Okan, Berkin Durmus, @zachnagengast, and Eduardo Pacheco (ICML 2025)
#speech #speechtech #whisper #ASR #realtime
A couple of weeks before we present our large-scale speech model compression task at IWSLT, here is one of the first attempts to bring large-scale models to edge devices: https://arxiv.org/pdf/2507.10860... Hope to see more work in this direction!
Our pick of the week by @FBKZhihangXie: "Adversarial Speech-Text Pre-Training for Speech Translation" by Chenxuan Liu, Liping Chen, Weitai Zhang, Xiaoxi Li, Peiwang Tang, Mingjia Yu, Sreyan Ghosh, and Zhongyi Ye (ICASSP 2025)
#speech #speechprocessing #speechtech #translation
AdvST: Adversarial training aligns speech and text distributions without parallel data! Combines adversarial learning + hidden-state swapping to fix length mismatch & boost low-resource speech translation. https://ieeexplore.ieee.org/document/10888294