Journal of Zhejiang University SCIENCE C 2010 Vol.11 No.3 P.160-174


Evaluating single-channel speech separation performance in transform-domain

Author(s):  Pejman MOWLAEE, Abolghasem SAYADIYAN, Hamid SHEIKHZADEH

Affiliation(s):  Department of Electronic Engineering, Amirkabir University of Technology, Tehran 15875-4413, Iran

Corresponding email(s):   pmowlaee@ieee.org, {eeas335, hsheikh}@aut.ac.ir

Key Words:  Single-channel separation (SCS), Magnitude spectrum, Vector quantization (VQ), Subband perceptually weighted transformation (SPWT), Spectral distortion (SD)

Single-channel separation (SCS) is a challenging scenario where the objective is to segregate speaker signals from their mixture with high accuracy. In this research, a novel framework called subband perceptually weighted transformation (SPWT) is developed to offer a perceptually relevant feature to replace the commonly used magnitude of the short-time Fourier transform (STFT). The main objectives of the proposed SPWT are to lower the spectral distortion (SD) and to improve the ideal separation quality. The performance of the SPWT is compared with that obtained using the mixmax and Wiener filter methods. A comprehensive statistical analysis is conducted to compare the SPWT quantization performance, as well as the ideal separation quality, with that of the log-spectrum and magnitude-spectrum features. Our evaluations show that the SPWT provides lower SD values and a more compact distribution of SD, leading to more acceptable subjective separation quality as evaluated using the mean opinion score.
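As an illustration of the spectral distortion (SD) measure mentioned in the abstract, the sketch below computes a per-frame root-mean-square log-spectral distance, in dB, between a reference and an estimated magnitude spectrum. This is one common textbook definition and is an assumption here: the paper's exact SD formula is not given on this page, and the function name, the `eps` floor, and the frames-by-bins array layout are hypothetical choices made for the example.

```python
import numpy as np

def spectral_distortion_db(ref_mag, est_mag, eps=1e-12):
    """Per-frame RMS log-spectral distance in dB (assumed SD definition).

    ref_mag, est_mag: arrays of shape (frames, bins) holding magnitude spectra.
    eps: small floor to avoid log10(0); a hypothetical safeguard.
    """
    ref_db = 20.0 * np.log10(np.maximum(ref_mag, eps))
    est_db = 20.0 * np.log10(np.maximum(est_mag, eps))
    # Root-mean-square of the dB difference across frequency bins, per frame
    return np.sqrt(np.mean((ref_db - est_db) ** 2, axis=-1))
```

Under this definition, identical spectra give 0 dB, and a uniform tenfold magnitude error gives 20 dB in every frame. A feature with better quantization behavior would show lower values and a tighter spread across frames.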

A. Barari@Department of Civil Engineering, Aalborg University, Denmark; Editor-in-chief of Int. J. of Int. Commu. in Civil Eng. (ICCE); Editor-in-chief of Int. J. of Math. and Com.; Editor of Int. J. of Res. and Rev. in Appl. Sciences, http://www.arpapress.com/ijrr<ab@civil.aau.dk>

2010-02-24 18:12:14

This paper presents a subband transformation as an alternative to the previously used STFT features in the single-channel speech separation problem. Through experiments and by analyzing the upper-bound performance of different features, this paper shows that the proposed subband transformation improves the separation performance as well as the quantization behaviour. The proposed transform-based method is compared to the well-known mask-based and STFT-based methods, and it is demonstrated that the proposed method achieves a higher perceived speech quality in the separated signals than other benchmark methods in the single-channel speech separation problem.

Reihaneh Lavafi@PhD student of North Dakota State University & Editor for International Journal of Digital Multimedia Broadcasting (Hindawi Publishing Corporation)<reihaneh.lavafi@ndsu.edu>

2010-02-24 06:48:29

In this paper, a novel approach is proposed to solve single-channel speech separation. To this end, the predominantly used STFT features are replaced by transformation-based features. Through experiments, it is shown that using these features leads to better speech separation performance and higher perceived speech quality. Listening experiments also show that the proposed approach results in improved performance compared to previous mask-based (both binary mask and Wiener filtering) and STFT-based methods. I recommend reading this paper, since the introduction and literature review are very comprehensive and the simulation results are very insightful. Upper-bound performance for separation is also studied.

Alireza@PhD student at Aalborg University<alr@iet.aau.dk>

2010-02-22 08:15:54

I highly recommend this paper for researchers in the field of single-channel speech separation. Since this paper presents a comprehensive literature review in its introduction, it is strongly recommended for those new to the field and is very helpful in that respect. It also presents new results confirming that the proposed method outperforms previous methods.

Mehdi Hosseini@M.Sc student at electrical engineering, Amirkabir University of Technology<sm.hosseini62@gmail.com>

2010-02-21 22:03:06

This paper proposes a novel approach to solve the open problem of single-channel speech separation. To this end, the authors introduce subband features in perceptual subbands to replace the STFT features predominantly used in previous papers. The subband solution is derived theoretically. The authors confirm that the proposed method outperforms the others through PESQ scores and informal listening tests. The paper also presents the spectral distortion results of the STFT and the proposed SPWT approach, showing the upper bounds of the speech separation problem.

Emin Devrim fidan@Master Student on Youth Policy in Dumlupinar University<eminfidan@gmail.com>

2010-02-20 22:42:12

This paper focuses on presenting a novel approach to tackle an open problem, single-channel speech separation. The paper presents comprehensive and insightful information in its introduction. The proposed subband transformation is derived theoretically, and the superiority of the proposed method is confirmed by evaluating the performance in terms of listening tests and PESQ scores. Spectral distortion is also used as an objective measure to show the effectiveness of the proposed approach in terms of quantization behavior compared to the STFT feature vectors predominantly used in the separation problem.

A.Kimiaeifar@Editor of international Journal of Research and Reviews in Applied Sciences Editor of International Journal of Mathematics & Computation Department of Mechanical Engineering, Aalborg University, Pontoppidanstraede 101, DK-9220 Aalborg East, Denmark<a.kimiaeifar@gmail.com>

2010-02-20 22:13:39

The paper presents a novel approach to single-channel speech separation, which has been an open problem in signal and speech processing for decades.
By introducing a subband transformation, the paper shows improved results compared to the STFT-based and mask-based methods predominantly used for single-channel speech separation. The paper is based on firm mathematical derivations and on comprehensive simulation and experimental results that confirm the superiority of the proposed method. More specifically, the authors provide a webpage with the wave files for assessing the perceived speech quality of the separated signals and for comparing the output files of the proposed method with those obtained by other separation methods. The listening results, along with the PESQ scores, show that the proposed method outperforms the other benchmarks. The paper also presents the upper-bound performance obtained by the quantizer used. More specifically, the quantization performance for the STFT and the proposed SPWT is evaluated for different codebook sizes and other parameter settings.
All in all, I really enjoyed the introduction, which was very insightful for me as a literature review on single-channel speech separation. In addition, the discussions and the simulation results carefully show how the proposed method achieves better performance than other methods.

