Abstract: Acoustic features play an important role in improving the quality of synthesised speech. Currently, the Mel spectrogram is the acoustic feature employed by most acoustic models.
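For readers unfamiliar with the feature named above, the following is a minimal sketch of how a log-Mel spectrogram can be computed from a waveform. It is not taken from any of the works listed here. The function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `log_mel_spectrogram`), the parameter defaults (16 kHz sampling rate, 512-point FFT, hop of 128 samples, 40 Mel bands), and the choice of the common 2595·log10(1 + f/700) Mel formula are all assumptions made for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # One common Hz-to-mel mapping (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are equally spaced on the mel scale.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=128, n_mels=40):
    # Frame the signal, apply a Hann window, and take the power spectrum.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft//2 + 1)
    # Project onto the mel filterbank and compress with a log.
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T  # (n_frames, n_mels)
    return np.log(mel + 1e-10)


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 440.0 * t)  # one second of a 440 Hz sine
    features = log_mel_spectrogram(tone, sr=sr)
    print(features.shape)
```

The sketch keeps everything in NumPy so the steps (framing, windowing, power spectrum, Mel projection, log compression) stay visible. Production systems typically use a library implementation, whose normalisation and Mel-scale conventions may differ from the ones assumed here.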
Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance ...
A study published in the journal Information Sciences introduces a novel framework for speech emotion recognition using dual-channel spectrograms and optimized deep features. Their proposed ...
We incorporate effective components of the TasNet into a frequency-domain separation method. We introduce a solution for directly optimizing the separation criterion in frequency-domain networks. Our exp ...
This repo is the official implementation of "Accuracy Enhancement Method for Speech Emotion Recognition from Spectrogram using Temporal Frequency Correlation and Positional Information Learning ...