The TOSCA-MP project aimed to develop user-centric content annotation and search tools for professionals in networked media production and archiving (television, radio, online), addressing their specific use cases and workflow requirements. The project brought together 10 partners from 6 European countries, including industry partners providing solutions for the media industry, public service broadcasters and their European association, a university, and research centres. TOSCA-MP investigated scalable, distributed content processing methods for advanced multimodal information extraction and semantic enrichment. Other key technology areas included search across heterogeneous networked content repositories and novel user interfaces. An open-standards-based, service-oriented framework integrated the components of the system.
Our pick of the week by @beomseok_lee_: "ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs" by Pooneh Mousavi, @yingzhi_wang, @mirco_ravanelli, and @CemSubakan (2025)
#SLU #speech #multimodal #LLM
Speech-language models show promise in multimodal tasks, but how well are speech & text actually aligned?
This paper https://arxiv.org/abs/2505.19937 proposes a new metric to measure layer-wise correlation between the two, with a focus on SLU tasks.
Hi! We are studying how AI is used in Italy, and to do that we have built a survey!
https://bocconi.eu.qualtrics.com/jfe/form/SV_2nTelXaXvJlinbg (it's anonymous and takes ~10 min; taking part or sharing it helps us a lot)
We are also interested in reaching people who don't work on AI!
Our pick of the week by @apierg: "Agree to Disagree? A Meta-Evaluation of LLM Misgendering" by Arjun Subramonian, Vagrant Gautam, Preethi Seshadri, Dietrich Klakow, @kaiwei_chang, @YizhouSun (2025).
#LLM #misgendering #gender
Super interesting paper by Subramonian et al: "Agree to Disagree? A Meta-Evaluation of LLM Misgendering" https://arxiv.org/abs/2504.17075
Turns out, misgendering is messier than just pronouns. I'd love to see this analysis extended to grammatical gender languages! #LLM #AI #ethics @fbk_mt
New tech report out! Meet FAMA, our open-science speech foundation model family for both ASR and ST in English and Italian.
The models are live and ready to try on @huggingface
#ASR #ST #OpenScience #MultilingualAI