X-Media addresses knowledge management in complex distributed environments. It studies, develops, and implements large-scale methodologies and techniques for knowledge management that support the sharing and reuse of knowledge distributed across different media (images, documents, and data) and repositories (databases, knowledge bases, document repositories, etc.).
Our pick of the week by @FBKZhihangXie: "When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation" by Anna Min et al., 2025.
Today's task: model compression!!
🎯 Goal: Compress a large, general-purpose multimodal model, making speech translation more efficient ⚡️, deployable 📲, and sustainable ♻️, while preserving translation quality ⭐️
#AI #SpeechTech #ModelCompression #LLMcompression
First up, a new task for 2025:
*Instruction-following for speech processing!*
Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.
🔗:
📢 Free workshop on 5 February (05/02): “The state of the art in speech recognition technologies” (original title: “Lo stato dell'arte nelle tecnologie per il riconoscimento del parlato”).
YouTube live stream: https://www.youtube.com/live/i4x7w8fIIXo?si=wYvvrO3-MSh7Yik4
Registration: https://www.eventbrite.com/e/biglietti-lo-stato-dellarte-nelle-tecnologie-per-il-riconoscimento-del-parlato-1109098797359?aff=oddtdtcreator