Corpora

mGeNTE

mGeNTE (Multilingual Gender-Neutral Translation Evaluation) is a natural, multilingual corpus designed to benchmark gender-neutral language and automatic translation. mGeNTE is built upon European Parliament speech data extracted...

MOSEL

The MOSEL corpus is a multilingual dataset collection including up to 950K hours of open-source speech recordings covering the 24 official languages of the European Union. We collect data by surveying labeled and unlabeled...

Speech-MASSIVE

Spoken Language Understanding (SLU) involves interpreting spoken input using Natural Language Processing (NLP). Voice assistants like Alexa and Siri are real-world examples of SLU applications. The core tasks in SLU include...

INES

The INclusive Evaluation Suite (INES) is a test set designed to assess MT systems' ability to produce gender-inclusive translations for the German→English language pair. By design, each German source sentence in INES includes an...

GeNTE

GeNTE (Gender-Neutral Translation Evaluation) is a natural, bilingual corpus designed to benchmark the ability of machine translation systems to generate gender-neutral translations. Built from European Parliament speeches,...


📢📢 We invite proposals for @iwslt 2026 shared tasks! For further information on this initiative, please refer to the call for tasks: https://iwslt.org/assets/pdfs/IWSLT2026-Call_for_Tasks.pdf
Submission deadline: September 30th, 2025

Heading home after an exciting and intense @aclmeeting in Vienna! We had a great time presenting our work and connecting with the community.

Thanks to everyone who came by!

#acl2025 #nlproc
🇦🇹 I’ll be in Vienna for #ACL2025NLP!

Interested in training a SpeechLLM without a lot of params or data? Come to my poster:
🖼️ Mon, 18:00

Also into Speech Summarization? Join my IWSLT talk in collab with @fbk_mt:
🎤 Fri, 14:00

Happy to chat - come say hi! 😎
Papers in 🧵

Sara Papi, Maike Züfle, Marco Gaido, Beatrice Savoldi, Danni Liu, Ioannis Douros, Luisa Bentivogli, Jan Niehues, "MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks,"