WAGS
Ready-to-use version for MT research purposes of the multilingual transcriptions of TED talks
by Mauro Cettolo | Apr 30, 2024 | Corpora
Ready-to-use version for MT research purposes of the multilingual transcriptions of TED talks
by Dennis Fucci | Oct 20, 2023 | Corpora
Text corpora for Spanish, French, and Italian containing gendered words referring to the first-person speaker
by Beatrice Savoldi | Oct 19, 2023 | Corpora
The INclusive Evaluation Suite (INES) is a test set designed to assess MT systems' ability to produce gender-inclusive translations for the German→English language pair. By design, each German source sentence in INES includes an...
by Beatrice Savoldi | Oct 9, 2023 | Corpora
GeNTE (Gender-Neutral Translation Evaluation) is a natural, bilingual corpus designed to benchmark the ability of machine translation systems to generate gender-neutral translations. Built from European Parliament speeches,...
by Marco Gaido | Jul 7, 2023 | Corpora
EC Short Clips is a test set dedicated to evaluating automatic subtitling systems.
by Marco Gaido | Jul 7, 2023 | Corpora
EuroParl Interviews is a test set dedicated to evaluating automatic subtitling systems.
by Roldano Cattoni | Jul 4, 2023 | Corpora
MuST-C is a multilingual speech translation corpus whose size and quality facilitate the training of end-to-end systems for speech translation from English into several languages. For each target language, MuST-C comprises...
by Luisa Bentivogli | Jun 1, 2023 | Corpora
MuST-SHE: a multilingual benchmark allowing for a fine-grained analysis of gender bias in Machine Translation and Speech Translation.
by Matteo Negri | Jun 1, 2023 | Corpora
Multilingual benchmark built from European Parliament speeches and annotated with Named Entities and Terminology
by Mauro Cettolo | May 30, 2023 | Corpora
Ready-to-use version for MT research purposes of the multilingual transcriptions of TED talks
by Mauro Cettolo | May 30, 2023 | Corpora
Annotation of dubbing segments based on the Heroes corpus