simulstream
by Marco Gaido | Jan 20, 2026 | Software
simulstream is a Python library for simultaneous/streaming speech recognition and translation. It...
by Marco Gaido | May 17, 2024 | Software
Open source repository with the code and models used in recent papers
by Marco Gaido | May 17, 2024 | Software
SubSONAR evaluates the quality of SRT files using the multilingual multimodal SONAR model. The evaluation accounts for the semantic similarity (computed as a cosine similarity) between each subtitle block and the corresponding...
by Marco Gaido | May 17, 2024 | Software
pangolinn is a Python library for neural network developers that contains test suites aimed at...
by Matteo Negri | May 30, 2023 | Software
A neural adaptive machine translation system that adapts to context and learns from corrections
by Dennis Fucci | May 30, 2023 | Software
AQET (Adaptive Quality Estimation Tool) is an open-source package for Machine Translation Quality Estimation that continuously learns from post-edited sentences.
by Andrea Piergentili | May 30, 2023 | Software
An extension of MGIZA++ that aligns sentence pairs in online mode.
by Dennis Fucci | May 30, 2023 | Software
The IRST Language Modeling (IRSTLM) Toolkit features algorithms and data structures suitable to estimate, store, and access very large n-gram language models.
by Dennis Fucci | May 30, 2023 | Software
Moses is a statistical machine translation system that allows you to automatically train translation models for any language pair.