The TOSCA-MP project aimed to develop user-centric content annotation and search tools for professionals in networked media production and archiving (television, radio, online), addressing their specific use cases and workflow requirements. The project brought together 10 partners from 6 European countries, including industry partners providing solutions for the media industry, public service broadcasters and their European association, a university, and research centres. TOSCA-MP investigated scalable, distributed content processing methods for advanced multimodal information extraction and semantic enrichment. Other key technology areas included search methods across heterogeneous networked content repositories and novel user interfaces. A service-oriented framework based on open standards integrated the components of the system.
Our pick of the week by @FBKZhihangXie: "#Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in #SpeechLLMs" by @WangDingdo2603, Junan Li, @HelenMeng_CUHK, et al. (#EMNLP2025)
#SLU #SpeechTech
New paper: Speech Discrete Tokens or Continuous Features?
https://aclanthology.org/2025.emnlp-main.1266.pdf
🧩 A comprehensive benchmark of SpeechLLMs using HuBERT/WavLM with Qwen & LLaMA.
✨ Continuous features outperform overall, while discrete tokens excel at phoneme-level detail.
Exciting news from the @FBK_MT group!
Four of our members @BeatriceSavoldi, @lina_conti, @negri_teo & @luisabentivogli are attending #EMNLP2025 in Suzhou 🇨🇳 with 5 accepted papers!
Come to our sessions & let's connect:
https://mt.fbk.eu/fbk-mt-at-emnlp-2025/
We're also hiring postdocs!
Congratulations to our PhD student @DennisFucci on a very successful thesis defense!
Many thanks to the evaluation committee members @debora_nozza, @mirco_ravanelli, and Leonardo Badino for their insightful feedback and appreciation of his work!
#nlproc