JUMAS addresses the need for an infrastructure that optimises the information workflow in order to facilitate later analysis. New models and techniques will be developed for representing and automatically extracting the semantics embedded in multiple data sources. The most important goal of the JUMAS system is to collect, enrich, and share multimedia documents annotated with embedded semantics while minimising manual transcription activity. JUMAS is tailored to situations in which multiple cameras and audio sources are used to record assemblies in which people debate, and in which event sequences need to be semantically reconstructed for future consultation. The JUMAS prototype will be tested interworking with legacy systems, but the system can be viewed as supporting business processes and problem-solving in a variety of domains.
Excited to share our work on Speech Foundation Models for data crowdsourcing at COLING 2025!
Our co-author Laurent Besacier (@laurent_besacie) at NAVER LABS Europe will be presenting -- don't miss it.
Details: https://mt.fbk.eu/1-paper-accepted-at-coling-2025
Exciting news: @iwslt is co-located with #ACL2025NLP again this year!
Interested in speech processing? Check out the new task on instruction following: any model can participate!
Data release: April 1
Submission deadline: April 15
Don't miss it! #NLP #SpeechTech
Weekly pick from the #MeetweenScientificWatch: "Video-SALMONN: Speech-enhanced audio-visual large language models", redefining video comprehension with speech-aware AV-LLMs and groundbreaking QA accuracy.
I'm glad to announce that our work "How 'Real' is Your Real-Time Simultaneous Speech-to-Text Translation System?" has been accepted at the Transactions of the ACL (TACL, @aclanthology)!
The preprint is available here: