Introduction VC aims to convert the non-linguistic information of the sp...
Introduction ASR systems can be categorized into three classes by its ou...
Background Automatic Speech Recognition (ASR) uses both an acoustic model (...
Introduction In the previous articles, we learned that the CTC loss makes...
Introduction Keyword Spotting (KWS) aims at detecting predefined key-wor...
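Multi-headed Attention A single attention head may concentrate most of its weight in one place and so cannot extract rich information; several heads need to be combined. Fu...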
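Attention Mechanism RNN Encoder-Decoder Model In paper [1], the attention mechanism evolves from the RNN encoder-decoder model. In the RNN encoder-decoder model, the encoder takes the input sequence, and is the encoder RNN at...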
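Background In handwriting recognition and speech recognition, the input data and the recognized output differ in length, and both lengths are variable. Training a neural network directly requires pre-segmentation and alignment to obtain the correspondence between them, which is hard to do. CTC...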
Network Architecture The network can be divided into 3 parts: Head, Region Proposal Network (RPN), Classification Network R...