The TC-STAR project is envisaged as a long-term effort to advance research in all core technologies for Speech-to-Speech Translation (SST). SST technology combines Automatic Speech Recognition (ASR), Spoken Language Translation (SLT), and Text-to-Speech (TTS) synthesis. The objectives of the project are ambitious: making a breakthrough in SST that significantly reduces the gap between human and machine translation performance. The project targets a selection of unconstrained conversational speech domains (speeches and broadcast news) and three languages: European English, European Spanish, and Mandarin Chinese. Accurate translation of unrestricted speech is well beyond the capability of today's state-of-the-art research systems. Therefore, advances are needed to improve the state-of-the-art technologies for speech recognition and speech translation.
What Matters in Data for DPO? I asked myself this question a few days ago while trying to understand how to generate a preference dataset for #DPO. This recent #NeurIPS paper answered some of my questions. The findings are simple but crucial for data creation:
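For readers new to DPO, a minimal sketch may help ground the post: DPO trains directly on preference pairs (prompt, chosen response, rejected response) using a contrastive log-sigmoid loss against a frozen reference model. The record fields and log-probability values below are illustrative, not from the paper.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of the chosen/rejected
    response under the trainable policy or the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): the loss shrinks as the policy favors the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A preference record in the shape typically used for DPO training data
# (field names are a common convention, not a fixed standard):
pair = {"prompt": "Translate 'hola' to English.",
        "chosen": "Hello.",
        "rejected": "Goodbye."}

low = dpo_loss(-2.0, -8.0, -4.0, -6.0)   # policy prefers chosen: positive margin
high = dpo_loss(-8.0, -2.0, -6.0, -4.0)  # policy prefers rejected: negative margin
```

The quality of the (chosen, rejected) pairs — what the post is about — determines what the margin term actually teaches the policy.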
Come and join our group!
We offer 2 fully funded PhD positions:
- Human-Centred Evaluation Frameworks for Multilingual Technologies (A6)
- Multimedia Personalization with Multimodal Large Language Models (A7)
Deadline: 15 May 2026
Details: https://iecs.unitn.it/education/admission/call-for-application
Our pick of the week by @FBKZhihangXie: "Detecting Hallucination in SpeechLLMs at Inference Time Using Attention Maps" by @JWaldendorf, Bashar Awwad Shiekh Hasan and Evgenii Tsymbalov
#SpeechLLM #Hallucination
New paper: Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps
http://arxiv.org/abs/2604.19565
Lightweight inference-time detection for SpeechLLM hallucinations via audio attention.
Attention classifiers beat uncertainty baselines on ASR and S2TT.
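To make the idea concrete, here is a toy sketch of one plausible attention-based signal, assuming a cross-attention map from generated tokens to source positions where the leading positions are audio frames. This heuristic and its threshold are my own illustration; the paper trains classifiers on attention features rather than using this rule.

```python
import numpy as np

def hallucination_score(attn, n_audio):
    """Toy grounding score from a (generated_tokens x source_positions)
    attention map, where the first `n_audio` columns are audio frames.

    Hypothetical heuristic: generated tokens that attend little to the
    audio are more likely hallucinated, so a high score (low average
    attention mass on audio) flags a suspect output.
    """
    attn = attn / attn.sum(axis=-1, keepdims=True)  # normalize each token's row
    audio_mass = attn[:, :n_audio].sum(axis=-1)     # per-token mass on audio
    return 1.0 - audio_mass.mean()

# Grounded output: most attention lands on the 2 audio columns.
grounded = np.ones((3, 4)); grounded[:, :2] *= 9.0
# Hallucinated output: attention drifts away from the audio.
drifting = np.ones((3, 4)); drifting[:, 2:] *= 9.0
```

A real inference-time detector would feed such per-token features to a trained classifier instead of thresholding a single scalar.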
New Shared Task: Model Compression for Machine Translation at #WMT2026 (co-located with #EMNLP2026)!
Test data out on June 18th, submissions by July 2nd!
Can you shrink an LLM and keep translation quality high?
https://www2.statmt.org/wmt26/model-compression.html #NLP #ML #LLM #ModelCompression