screen wav2letter/build/Train train --flagsfile wav2letter/tutorials/1-librispeech_clean/train.cfg
Step 1: Data Preparation
For a speech recognition dataset, we usually have access to a set of audio files and their transcriptions. Create an experiment path and download the dataset.
>W2LDIR=/home/$USER/w2l
>mkdir -p $W2LDIR
>wget -qO- http://www.openslr.org/resources/12/train-clean-100.tar.gz | tar xvz -C $W2LDIR
>wget -qO- http://www.openslr.org/resources/12/dev-clean.tar.gz | tar xvz -C $W2LDIR
>wget -qO- http://www.openslr.org/resources/12/test-clean.tar.gz | tar xvz -C $W2LDIR
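These files are hard to download this way. If wget keeps failing, fetch the archives with a download manager and extract them yourself.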
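If the archives were fetched manually, they still need to be unpacked into the experiment directory. Below is a minimal sketch of that step using only the Python standard library. The function name `extract_archive` and the `~/Downloads` location in the usage comment are assumptions for illustration, not part of the tutorial.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract an already-downloaded .tar.gz archive into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)  # same effect as `tar xz -C dest`
    return dest


# Usage (paths are hypothetical; adjust to where the archives were saved):
#   for name in ("train-clean-100", "dev-clean", "test-clean"):
#       extract_archive(Path.home() / "Downloads" / (name + ".tar.gz"),
#                       Path.home() / "w2l")
```

After extraction, the audio should sit under the LibriSpeech/ subdirectory of the destination, which is the path the prepare_data.py command expects as --src.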
wav2letter/tutorials/1-librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
wav2letter/tutorials/1-librispeech_clean/prepare_lm.py --dst $W2LDIR
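prepare_lm.py needs to download 3-gram.arpa.gz, which is again painfully slow. If you can't stand the wait, fetch it with a download manager as before, then modify prepare_lm.py as follows: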
# Decompress the manually downloaded archive.
os.system(
    "gunzip -c {lm}.arpa.gz > {o}".format(lm=lm, o=arpa_file)
)
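The same decompression can be done without shelling out to gunzip. The following is a rough sketch, not the tutorial's code: `gunzip_if_needed` is a hypothetical helper, and the file names in the usage comment are assumed from the 3-gram.arpa.gz archive mentioned above.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_if_needed(gz_path, out_path):
    """Decompress gz_path to out_path; do nothing if out_path already exists.

    Returns True if the file was decompressed, False if it was skipped.
    """
    gz_path, out_path = Path(gz_path), Path(out_path)
    if out_path.exists():
        return False
    if not gz_path.exists():
        raise FileNotFoundError("download %s manually first" % gz_path)
    # Stream the data so a large language model file is never held in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return True


# Usage (hypothetical paths):
#   gunzip_if_needed("/home/xxx/w2l/3-gram.arpa.gz", "/home/xxx/w2l/3-gram.arpa")
```

Skipping the work when the output already exists makes it safe to rerun the script after a partial failure.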
Step 2: Training the Acoustic Model
First, replace each [...] in wav2letter/tutorials/1-librispeech_clean/train.cfg with the correct path:
--datadir=/home/xxx/w2l/
--tokensdir=/home/xxx/w2l/
--rundir=/home/xxx/w2l/saved_models
--archdir=/home/xxx/wav2letter/tutorials/1-librispeech_clean/
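A leftover placeholder or a mistyped directory in the flags file is easy to miss, so a quick check before launching training can save a failed run. This is a minimal sketch under stated assumptions: `find_bad_paths` is a hypothetical helper, it only looks at the four flags listed above, and it assumes rundir may not exist yet, so that flag is only checked for a leftover placeholder.

```python
from pathlib import Path

PLACEHOLDER = "[...]"
# Assumption: these three directories must exist before training starts.
MUST_EXIST = ("datadir", "tokensdir", "archdir")
PATH_FLAGS = MUST_EXIST + ("rundir",)


def find_bad_paths(flagsfile):
    """Return (flag, value) pairs that still hold a placeholder or a missing directory."""
    bad = []
    for raw in Path(flagsfile).read_text().splitlines():
        line = raw.strip()
        if not line.startswith("--") or "=" not in line:
            continue  # not a --flag=value line
        name, value = line[2:].split("=", 1)
        if name not in PATH_FLAGS:
            continue
        if PLACEHOLDER in value:
            bad.append((name, value))
        elif name in MUST_EXIST and not Path(value).is_dir():
            bad.append((name, value))
    return bad


# Usage (hypothetical path):
#   for flag, value in find_bad_paths("/home/xxx/wav2letter/tutorials/1-librispeech_clean/train.cfg"):
#       print("check --%s=%s" % (flag, value))
```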
Then run:
/home/xxx/wav2letter/build/Train train --flagsfile /home/xxx/wav2letter/tutorials/1-librispeech_clean/train.cfg
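After training, the logs are saved under the rundir directory you configured.
In my case, though, it core dumped. Reportedly the problem is in libsnd (presumably libsndfile); I'll look into it when I have time.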
terminate called after throwing an instance of 'terminate called recursively
terminate called recursively
terminate called recursively
*** Aborted at 1565019127 (unix time) try "date -d @1565019127" if you are using GNU date ***
terminate called recursively
terminate called recursively
std::runtime_error'
what(): loadSoundInfo: unknown format or could not open stream
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
terminate called recursively
PC: @ 0x7f8ae13dd428 gsignal
*** SIGABRT (@0x3e8000007d1) received by PID 2001 (TID 0x7f8ab1fff700) from PID 2001; stack trace: ***
@ 0x7f8ae20c7390 (unknown)
@ 0x7f8ae13dd428 gsignal
@ 0x7f8ae13df02a abort
@ 0x7f8b0033906d __gnu_cxx::__verbose_terminate_handler()
@ 0x7f8b002aa436 __cxxabiv1::__terminate()
@ 0x7f8b0032e349 __cxa_call_terminate
@ 0x7f8b002a9088 __gxx_personality_v0
@ 0x7f8b347c9aab __trunctfsf2
@ 0x7f8b347c9f49 __trunctfdf2
@ 0x5dc36b w2l::loadSoundInfo()
@ 0x5d22ec _ZN3w2l23W2lNumberedFilesDataset15loadSampleSizesEv._omp_fn.0
@ 0x7f8b347e56d5 gomp_display_affinity_thread
@ 0x7f8ae20bd6ba start_thread
@ 0x7f8ae14af41d clone
@ 0x0 (unknown)
Aborted (core dumped)
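The error message "loadSoundInfo: unknown format or could not open stream" does not say which file failed. One way to narrow it down is to scan the data directory for audio files that are empty or do not start with the expected magic bytes. The sketch below is a diagnostic idea only, not a confirmed fix: `find_suspect_audio` is a hypothetical helper, and it assumes the audio files carry a .flac or .wav extension.

```python
from pathlib import Path

# First four bytes of a well-formed file, keyed by extension.
MAGIC = {".flac": b"fLaC", ".wav": b"RIFF"}


def find_suspect_audio(root):
    """Return audio files under root whose header does not match their extension."""
    suspects = []
    for path in sorted(Path(root).rglob("*")):
        magic = MAGIC.get(path.suffix.lower())
        if magic is None or not path.is_file():
            continue  # not an audio file this check knows about
        with open(path, "rb") as fh:
            head = fh.read(4)
        if head != magic:  # empty, truncated, or not really audio
            suspects.append(path)
    return suspects


# Usage (hypothetical path):
#   for p in find_suspect_audio("/home/xxx/w2l"):
#       print("suspect:", p)
```

If this reports nothing, the files themselves are probably intact, which would point away from the data and toward the audio-decoding library suspected above.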