mGeNTE
by Beatrice Savoldi | Jan 13, 2025 | Corpora
mGeNTE (Multilingual Gender-Neutral Translation Evaluation) is a natural, multilingual corpus designed to benchmark gender-neutral language and automatic translation. mGeNTE is built upon European Parliament speech data extracted...
by Beomseok Lee | Aug 21, 2024 | Corpora
Spoken Language Understanding (SLU) involves interpreting spoken input using Natural Language Processing (NLP). Voice assistants like Alexa and Siri are real-world examples of SLU applications. The core tasks in SLU include...
by Mauro Cettolo | Apr 30, 2024 | Corpora
Ready-to-use version of the multilingual TED talk transcriptions, prepared for MT research
by Dennis Fucci | Oct 20, 2023 | Corpora
Text corpora for Spanish, French, and Italian containing gendered words referring to the first-person speaker
by Beatrice Savoldi | Oct 19, 2023 | Corpora
The INclusive Evaluation Suite (INES) is a test set designed to assess MT systems' ability to produce gender-inclusive translations for the German→English language pair. By design, each German source sentence in INES includes an...
by Beatrice Savoldi | Oct 9, 2023 | Corpora
GeNTE (Gender-Neutral Translation Evaluation) is a natural, bilingual corpus designed to benchmark the ability of machine translation systems to generate gender-neutral translations. Built from European Parliament speeches,...
by Marco Gaido | Jul 7, 2023 | Corpora
EC Short Clips is a test set dedicated to evaluating automatic subtitling systems.
by Marco Gaido | Jul 7, 2023 | Corpora
EuroParl Interviews is a test set dedicated to evaluating automatic subtitling systems.
by Matteo Negri | Jun 1, 2023 | Corpora
Multilingual benchmark built from European Parliament speeches and annotated with Named Entities and Terminology
by Mauro Cettolo | May 30, 2023 | Corpora
Annotation of dubbing segments based on the Heroes corpus
by Beatrice Savoldi | May 30, 2023 | Corpora
This multilingual dataset was created within the TOSCA-MP project as ground truth data for the evaluation of automatic transcription and spoken language translation technologies.