WAGS
Ready-to-use version for MT research purposes of the multilingual transcriptions of TED talks
by Mauro Cettolo | Apr 30, 2024 | Corpora
Ready-to-use version for MT research purposes of the multilingual transcriptions of TED talks
by Dennis Fucci | Oct 20, 2023 | Corpora
Text corpora for Spanish, French, and Italian containing gendered words referring to the first-person speaker
by Beatrice Savoldi | Oct 19, 2023 | Corpora
The INclusive Evaluation Suite (INES) is a test set designed to assess MT systems' ability to produce gender-inclusive translations for the German→English language pair. By design, each German source sentence in INES includes an...
by Beatrice Savoldi | Oct 9, 2023 | Corpora
GeNTE (Gender-Neutral Translation Evaluation) is a natural, bilingual corpus designed to benchmark the ability of machine translation systems to generate gender-neutral translations. Built from European Parliament speeches,...
by Marco Gaido | Jul 7, 2023 | Corpora
EC Short Clips is a test set for evaluating automatic subtitling systems.
by Marco Gaido | Jul 7, 2023 | Corpora
EuroParl Interviews is a test set for evaluating automatic subtitling systems.
by Matteo Negri | Jun 1, 2023 | Corpora
Multilingual benchmark built from European Parliament speeches and annotated with Named Entities and Terminology
by Mauro Cettolo | May 30, 2023 | Corpora
Annotation of dubbing segments based on the Heroes corpus
by Beatrice Savoldi | May 30, 2023 | Corpora
This multilingual dataset was created within the TOSCA-MP project as ground truth data for the evaluation of automatic transcription and spoken language translation technologies.
by Marco Gaido | May 30, 2023 | Corpora
The largest freely available synthetic corpus for Automatic Post-Editing
by Matteo Negri | May 30, 2023 | Corpora
English-Italian corpus with annotated bilingual terms in the IT domain
by Beatrice Savoldi | May 30, 2023 | Corpora
Cross-Lingual Textual Entailment Dataset
Our pick of the week by @lina_conti: "Exploring NMT Explainability for Translators Using NMT Visualising Tools" by Gonzalez-Saez, @MariamNakhle, @MeLlamoJamesT, @raheel_qader, @didier_schwab, et al., 2024.
#NMT #NLP #NLProc #explainiableAI #XAI
Our pick of the week by @beomseok_lee_: "DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding" by Shon, @shinjiw_at_cmu, et al., 2024.
#SLU #speech #LLM
Our pick of the week by @BeatriceSavoldi: "Rethinking Model Evaluation as Narrowing the Socio-Technical Gap" by @QVeraLiao and @ZiangXiao, 2023.
#Human #HumanCentered #Model #Evaluation #ModelEvaluation #AI
📢Come and join our group!
We offer a fully funded 3-year PhD position with the IECS Doctorate School @UniTrento:
Speech Translation in the LLM Era (Area A6)
⏱️Deadline: August 5th, 2024, h 4.00pm (CEST)
📌Application Details: https://iecs.unitn.it/education/admission/call-for-application
#NLProc @FBK_research