Tasks that ultimately require knowledge-based multimedia techniques (content-oriented search, assessment, abstracting, etc.) are still to a major extent carried out manually. PATExpert’s overall scientific goal is to change the paradigm currently followed for patent processing from textual processing (viewing patents as text blocks enriched by “canned” picture material, sequences of morpho-syntactic tokens, or collections of syntactic structures) to semantic processing (viewing patents as multimedia knowledge objects). PATExpert developed a multimedia content representation formalism based on Semantic Web technologies for selected technology areas and investigated the retrieval, classification, multilingual generation of concise patent information, assessment, and visualization of patent material encoded in this formalism, taking into account the information needs of all user types defined in a user typology. PATExpert’s technological goal was to develop a showcase that demonstrates the viability of PATExpert’s approach to content representation for real applications. The composition and the competence of the Consortium ensured the achievement of these goals.
Our pick of the week by @FBKZhihangXie: "PHRASED: Phrase Dictionary Biasing for Speech Translation" by Peidong Wang, Jian Xue, Rui Zhao, @ChenJunkun, Aswin Shanmugam Subramanian, and Jinyu Li (2025).
#Speech #SpeechAI #Translation #ST #SpeechTranslation
🚀 Boost rare-phrase translation in speech! Uses **bilingual dictionaries** to dynamically bias outputs.
✅ **+21%** recall in streaming ST
✅ **+85%** in multimodal LLMs
🔗: http://arxiv.org/abs/2506.09175
FAMA is the first open-science speech foundation model for Italian and English, developed by FBK. It recognizes and translates speech using only public data and tools: over 150,000 hours of open audio, with fully accessible code and processes.
@fbk_stek @fbk_mt
https://magazine.fbk.eu/it/news/la-prima-famiglia-di-modelli-open-science-per-il-riconoscimento-vocale-e-la-traduzione-del-parlato/
Emanuele Pianta Award for the Best Master’s Thesis in Computational Linguistics submitted at an Italian university and defended between August 1st 2024 and July 31st 2025
- Deadline: August 1st, 2025 (11:59 pm CEST)
- All details online: https://clic2025.unica.it/emanuele-pianta-award-for-the-best-masters-thesis/
Our pick of the week by @DennisFucci: "Speech Representation Analysis Based on Inter- and Intra-Model Similarities" by Yassine El Kheir, Ahmed Ali, and Shammur Absar Chowdhury (ICASSP Workshops 2024)
#speech #speechtech
Findings from https://ieeexplore.ieee.org/document/10669908 show that speech SSL models converge on similar embedding spaces, but via different routes. While overall representations align, individual neurons learn distinct localized concepts.
Interesting read! @fbk_mt