1 October 2022 to 31 March 2024

Bias Mitigation and Gender Neutralization Techniques for Automatic Translation

With language technologies entering widespread use and being deployed at a massive scale, their societal impact has raised concern both within and outside the scientific community. Indeed, while such technologies bring undeniable advantages in many contexts, it is also evident that they come with inherent risks, such as reproducing (or even amplifying) real-world asymmetries by codifying and entrenching various kinds of biases.

Within this project, we aim to make automatic translation technology more reliable and inclusive with respect to gender. We pursue this goal from two orthogonal perspectives:

  • Objective 1: develop new gender bias mitigation techniques able to reduce the tendency of current ST systems to overproduce masculine forms and perpetuate gender stereotypes in their outputs;
  • Objective 2: go beyond the masculine/feminine dichotomy and develop resources and methods for gender-neutral translation, where unnecessary and potentially discriminatory gender specifications are avoided.

Project Results: objective 1 (gender bias mitigation)

Publications
Datasets

Project Results: objective 2 (gender-neutral translation)

Publications
Datasets
  • GeNTE: the first natural benchmark for gender-neutral translation, available for English-Italian. GeNTE is publicly released together with a reference-free evaluation metric, which is trained on synthetic gender-neutral data generated with GPT.
  • INES: a synthetic test suite for assessing gender-neutral translation in the German-English direction, which was used in the “Test Suites” task at WMT 2023.
Workshop

Open source code

The code developed during the project is released as open source in our public GitHub repository, FBK-fairseq, where it is organized by related publication.