Audio Visual Tutorials

amiteliav/CSD-Audio-Visual

Building upon our prior work on audio-only CSD, this repository presents a multimodal approach that incorporates visual information to enhance performance. This new model expands the capabilities of ...

Frontiers

Embedding-based pair generation for contrastive representation learning in audio-visual surveillance data

Smart cities deploy various sensors such as microphones and RGB cameras to collect data to improve the safety and comfort of the citizens. As data annotation is expensive, self-supervised methods such ...

IEEE

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing

Abstract: There has been a long-standing quest for a unified audio-visual-text model to enable various multimodal understanding tasks, which mimics the listening, seeing, and reading process of human ...

Frontiers

Neural speech tracking in a virtual acoustic environment: audio-visual benefit for unscripted continuous speech

1 Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany 2 Department of Medical Physics and Acoustics, Carl von Ossietzky University of ...
