Selected Papers in Speech Signal Processing: Handling Background Noise in Neural Speech Generation

Category: Other. Posted 2021-03-25 21:44:21

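Note: the Selected Papers in Speech Signal Processing (DSP) series shares papers without translating them directly. What I write is my own summary of each paper and my personal view of it. If you repost, please credit the source.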

You are welcome to follow the WeChat official account: 低调奋进

Handling Background Noise in Neural Speech Generation

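This paper from Google was updated on 2021-02-23. It studies how a speech codec should handle background noise so that the vocoder synthesizes higher-quality speech. Paper link: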

https://arxiv.org/pdf/2102.11906.pdf

(This paper is of the experience-sharing kind.)

1 Background

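The audio quality of low-bit-rate speech codecs (for an introduction to speech codecs, see http://www.ece.ubc.ca/~brucew/ebook/VOIP/004.pdf) has improved greatly with the development of neural-network vocoders. When the input speech contains noise, however, the quality of the codec output drops. The paper therefore experiments with ways of handling that noise so that the synthesized audio sounds better.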

2 Design

The main idea is to put a denoiser model in front of the vocoder to remove the noise. The experiments compare the following five schemes:

1) c2c: clean-to-clean

2) n2n: noisy-to-noisy

3) n2c: noisy-to-clean

4) dc2c: a denoiser model is applied in front of c2c

5) dn2n: a denoiser model is applied in front of n2n
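
The five schemes can be written down as a small configuration table. This is only a sketch. Reading "x-to-y" as "trained on x input to reproduce y output" is my interpretation of the abbreviations, and the `Scheme` class, its field names, and `pipeline_stages` are hypothetical names introduced for illustration, not anything from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """One experimental configuration (hypothetical structure for illustration)."""
    name: str
    train_input: str     # assumed: kind of audio fed to the codec during training
    train_target: str    # assumed: kind of audio the codec is trained to reproduce
    denoise_first: bool  # whether a denoiser runs in front of the codec


SCHEMES = [
    Scheme("c2c", "clean", "clean", denoise_first=False),
    Scheme("n2n", "noisy", "noisy", denoise_first=False),
    Scheme("n2c", "noisy", "clean", denoise_first=False),
    Scheme("dc2c", "clean", "clean", denoise_first=True),
    Scheme("dn2n", "noisy", "noisy", denoise_first=True),
]


def pipeline_stages(scheme):
    """List the processing stages that the input audio passes through."""
    stages = []
    if scheme.denoise_first:
        stages.append("denoiser")
    stages.append(f"codec trained {scheme.train_input}-to-{scheme.train_target}")
    return stages


for scheme in SCHEMES:
    print(scheme.name, "->", " | ".join(pipeline_stages(scheme)))
```

Written this way, dc2c and dn2n differ from c2c and n2n only by the extra denoiser stage, which matches how the list above describes them.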

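The WaveGRU vocoder designed in the paper is shown in Figure 1: the encoder converts the waveform into log mel spectra, and the decoder converts the log mel spectra back into a speech waveform. The denoiser model, TasNet, is shown in Figure 2.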
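
To make the encoder side concrete, here is a minimal sketch of turning a waveform into log mel spectra. It is a generic textbook computation, not the paper's encoder. The frame size, hop, number of mel bands, and the 2595/700 mel formula are my assumptions, and a naive DFT is used so the example needs only the standard library.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, expressed as fractional FFT bin positions.
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) * n_fft / sample_rate
             for i in range(n_mels + 2)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        bank.append([max(0.0, min((k - lo) / (mid - lo), (hi - k) / (hi - mid)))
                     for k in range(n_bins)])
    return bank


def power_spectrum(frame):
    """Hann-windowed power spectrum via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2
            for k in range(n // 2 + 1)]


def log_mel(signal, sample_rate=16000, n_fft=64, hop=32, n_mels=8, eps=1e-10):
    """Return one list of n_mels log energies per frame (placeholder sizes)."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        spec = power_spectrum(signal[start:start + n_fft])
        frames.append([math.log(sum(w * p for w, p in zip(row, spec)) + eps)
                       for row in bank])
    return frames


# Usage: a 1 kHz test tone sampled at 16 kHz.
tone = [math.sin(2.0 * math.pi * 1000.0 * t / 16000.0) for t in range(256)]
features = log_mel(tone)
print(len(features), "frames of", len(features[0]), "mel bands")
```

A decoder such as the one described above would take frames like `features` as conditioning and generate the waveform. That part is a trained neural network and is not sketched here.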

3 Experiments

The experiments first compare MOS for clean and noisy input. Clean input scores higher (Figure 3). The schemes above compare as follows:

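1) c2c: handles clean speech well but cannot handle noisy speech;

2) n2n: improves the quality of noisy speech, at the cost of clean-speech quality;

3) n2c: improves the quality of noisy speech but causes phonemes to be dropped;

4) dc2c: handles both clean and noisy data well;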

Table 1 shows that putting the denoiser in front of n2n improves audio quality.
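
For readers unfamiliar with MOS (mean opinion score), the sketch below shows how such a score is typically computed from listener ratings, with a normal-approximation 95% confidence interval. The ratings are made-up placeholders, not data from the paper, and `mos_with_ci` is a hypothetical helper name.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean score, half-width of an approximate 95% confidence interval)."""
    mean = statistics.mean(ratings)
    if len(ratings) < 2:
        return mean, 0.0
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


# Placeholder listener ratings on a 1-5 scale (illustrative only).
example_ratings = [4.0, 5.0, 3.0, 4.0, 4.0]
mos, ci = mos_with_ci(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```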

4 Summary

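The paper tries different strategies for handling background noise in neural speech generation, so that the system copes well with both clean and noisy data.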

Reposted from blog.csdn.net/liyongqiang2420/article/details/115183825
