Our pick of the week by @sarapapi: "Retrieval-Augmented Generation for AI-Generated Content: A Survey" by Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, Jie Jiang, Bin Cui.
#RAG #survey
An interesting survey about #RAG and its interplay with #multimodality: "Retrieval-Augmented Generation for AI-Generated Content: A Survey"
https://arxiv.org/pdf/2402.19473
@fbk_mt
Our pick of the week by @mgaido91: "Context-Driven Dynamic #Pruning for Large #Speech #Foundation Models" by Masao Someki, Shikhar Bharadwaj, Atharva Anand Joshi, Chyi-Jiunn Lin, Jinchuan Tian, Jee-weon Jung, @shinjiw_at_cmu, et al. (#INTERSPEECH2025).
As we are organizing the second edition of the IWSLT model compression task, we are happy to see new work on pruning large speech models based on external context (speaker, acoustic events, language).
https://arxiv.org/pdf/2505.18860
@fbk_mt
Our pick of the week by @FBKZhihangXie: "SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation" by Chenyang Le, Bing Han, Jinshun Li, Songyong Chen, and Yanmin Qian (2025)
#Speech #Simultaneous #Translation #MOE #SpeechTech
🚀 SimulMEGA: MoE Routers as advanced policy makers for Simultaneous Speech Translation 🎧🌍
Mixture-of-Experts routing → smarter decisions on when & how to translate, balancing latency vs quality in real-time speech. Paper link at https://arxiv.org/pdf/2509.01200v1
Our pick of the week by @beomseok_lee_: "#Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in #SpeechLLMs" by Dingdong Wang, Junan Li, Mingyu Cui, Dongchao Yang, Xueyuan Chen, Helen Meng (#EMNLP2025)
#SLU #SpeechTech
🤔 Ever wondered how discrete tokens vs. continuous features behave in SpeechLLMs?
This new work dives into 6 SLU tasks and reveals some interesting takeaways!
https://arxiv.org/abs/2508.17863