TensorFlow: Simple Audio Recognition

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/kangear/article/details/82559834

Original article: https://www.tensorflow.org/tutorials/sequences/audio_recognition
I worked through the tutorial myself; this post summarizes where my experience differed from the original.

Before reading this post, first read "Installing TensorFlow on Ubuntu". Install TensorFlow, then move on to Simple Audio Recognition.

Training

I followed the tutorial for training and hit no errors; it just took a long time. On an i7 processor it ran for one whole night.

Training Finished

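After training I ran speech recognition, and it worked. It felt a bit slow, though, at about 2 seconds, which was slightly disappointing. I did not give up and kept going.
The tutorial only uses the Python version; I worked out how to use the C++ one: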

bazel run tensorflow/examples/speech_commands:label_wav_cc -- --graph=/tmp/my_frozen_graph.pb --labels=/tmp/speech_commands_train/conv_labels.txt --wav=/tmp/speech_dataset/left/a5d485dc_nohash_0.wav

This is much faster, taking only 0.3 seconds.
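The 2-second and 0.3-second figures above are rough impressions. A minimal sketch of one way to get comparable numbers is to time each command from a small script. The helper name time_command and the placeholder command are assumptions for illustration, not part of the tutorial, and a measurement like this includes process startup and model loading, not just inference.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command: replace it with the label_wav invocation to time.
    command = [sys.executable, "-c", "pass"]
    print("elapsed: %.3f s" % time_command(command))
```

Running each variant several times and comparing the fastest runs gives a fairer picture than a single measurement.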

Running the Model in an Android App

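The prebuilt APK would not download, so I had to bite the bullet and build it myself. I just about got it to compile and tested it on Android. I have uploaded the APK I built so you can install and try it directly: https://blog.csdn.net/kangear/article/details/82052938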

Streaming Accuracy

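This section covers streaming recognition of spoken command words. I expected it to be simple, but it took me two days on and off to get it working. Installing bazel and each of the two bazel run commands take a long time, possibly because I was running them in a fresh copy of the tf source. What the long bazel run command actually does is build tf as a library, then build the specified example, and finally run that example.
Two points are worth highlighting:
1. If tf was installed under Virtualenv, the bazel commands must be run inside that Virtualenv; otherwise you will hit a pile of Python-related errors.
2. Even while following point 1, I still ran into quite a few problems.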
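Point 1 can be checked before starting a long bazel build. The following is a minimal sketch, assuming it is run with the same python3 that bazel will pick up; the function name in_virtual_environment is made up for illustration.

```python
import os
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # The classic virtualenv tool sets sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside virtual environment:", in_virtual_environment())
    # VIRTUAL_ENV is exported by the environment's activate script.
    print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV", "(not set)"))
```

If the script reports that no environment is active, activate the one where tf was installed before running the bazel commands.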

Problems:

AttributeError: '_NamespacePath' object has no attribute 'sort'
ImportError: No module named 'keras_applications'

Solution:

pip3 install --upgrade pip setuptools keras_applications
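To confirm the fix took effect in the active environment before re-running bazel, a short check like the one below may help. This is a sketch only: the helper missing_modules is hypothetical, and it checks nothing more than whether the listed top-level modules can be found by the current interpreter.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules touched by the pip3 command above.
    required = ["pip", "setuptools", "keras_applications"]
    missing = missing_modules(required)
    if missing:
        print("still missing:", ", ".join(missing))
    else:
        print("all required modules found")
```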

After a successful run, the output looks like this:

....
2018-09-09 16:14:38.200902: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 42.1% correctly, 2.4% wrongly, 0.0% false positives 
2018-09-09 16:14:40.008166: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 498300ms: on: 0.716484 (Correct)
2018-09-09 16:14:40.008190: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.9% correctly, 2.4% wrongly, 0.0% false positives 
2018-09-09 16:14:42.398641: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 508320ms: down: 0.700959 (Correct)
2018-09-09 16:14:42.398670: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.1% matched, 41.8% correctly, 2.4% wrongly, 0.0% false positives 
2018-09-09 16:14:42.875008: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 510300ms: go: 0.750342 (Correct)
2018-09-09 16:14:42.875034: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.4% matched, 42.1% correctly, 2.3% wrongly, 0.0% false positives 
2018-09-09 16:14:43.983504: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 514950ms: stop: 0.713842 (Correct)
2018-09-09 16:14:43.983530: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.8% matched, 42.4% correctly, 2.3% wrongly, 0.0% false positives 
2018-09-09 16:14:46.054264: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 523410ms: go: 0.702253 (Correct)
2018-09-09 16:14:46.054293: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.6% matched, 42.3% correctly, 2.3% wrongly, 0.0% false positives 
2018-09-09 16:14:50.519370: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 540630ms: yes: 0.719828 (Wrong)
2018-09-09 16:14:50.519399: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 43.6% matched, 40.9% correctly, 2.8% wrongly, 0.0% false positives 
2018-09-09 16:14:51.325376: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 543870ms: left: 0.711105 (Correct)
2018-09-09 16:14:51.325408: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.0% matched, 41.2% correctly, 2.7% wrongly, 0.0% false positives 
2018-09-09 16:14:52.322932: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 547710ms: down: 0.712574 (Correct)
2018-09-09 16:14:52.322968: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.5% correctly, 2.7% wrongly, 0.0% false positives 
2018-09-09 16:14:56.220347: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 562950ms: left: 0.701853 (Correct)
2018-09-09 16:14:56.220376: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 43.6% matched, 41.0% correctly, 2.7% wrongly, 0.0% false positives 
2018-09-09 16:14:56.813999: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 565410ms: yes: 0.703025 (Correct)
2018-09-09 16:14:56.814025: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 43.9% matched, 41.3% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:14:57.257761: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 567270ms: right: 0.727554 (Correct)
2018-09-09 16:14:57.257784: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.2% matched, 41.6% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:14:58.166058: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 571050ms: yes: 0.714452 (Correct)
2018-09-09 16:14:58.166088: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 41.9% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:15:00.302022: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 579840ms: up: 0.707446 (Correct)
2018-09-09 16:15:00.302048: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.8% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:15:02.618039: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 589200ms: left: 0.718778 (Correct)
2018-09-09 16:15:02.618067: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.2% matched, 41.6% correctly, 2.5% wrongly, 0.0% false positives 
2018-09-09 16:15:04.055505: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 594990ms: right: 0.704798 (Correct)
2018-09-09 16:15:04.055531: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.2% matched, 41.7% correctly, 2.5% wrongly, 0.0% false positives 
2018-09-09 16:15:04.841136: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 598200ms: _unknown_: 0.70086 (Correct)
2018-09-09 16:15:04.841166: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 42.0% correctly, 2.5% wrongly, 0.0% false positives 
2018-09-09 16:15:05.253720: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 42.0% correctly, 2.5% wrongly, 0.0% false positives 
(tensorflow-dev) tony@pc:~/Work/01_atom/tf-demo/tf2$ 

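As you can see, the example runs directly and does the comparison automatically, deciding by itself whether each detection is Correct or Wrong. It does not show how long it took from the audio occurring to the word being recognized.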
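Since the output is plain text, the per-detection lines can be tallied with a short script. The sketch below assumes the line layout seen in the log above (an offset in ms, a label, a score, and a verdict in parentheses); the regular expression and the name tally_detections are my own, not part of the TensorFlow example, and the format may differ in other versions.

```python
import re
from collections import Counter

# Matches detection lines such as "...cc:298] 523410ms: go: 0.702253 (Correct)".
# The running summary lines ("44.6% matched, ...") do not match.
DETECTION = re.compile(r"\] (\d+)ms: (\S+): ([0-9.]+) \(([^)]+)\)")


def tally_detections(log_text):
    """Return (detections, counts) parsed from the streaming accuracy log text."""
    detections = []
    counts = Counter()
    for line in log_text.splitlines():
        match = DETECTION.search(line)
        if match is None:
            continue
        offset_ms, label, score, verdict = match.groups()
        detections.append((int(offset_ms), label, float(score), verdict))
        counts[verdict] += 1
    return detections, counts


# Three lines copied from the log above.
SAMPLE = """\
2018-09-09 16:14:46.054264: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 523410ms: go: 0.702253 (Correct)
2018-09-09 16:14:46.054293: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.6% matched, 42.3% correctly, 2.3% wrongly, 0.0% false positives
2018-09-09 16:14:50.519370: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 540630ms: yes: 0.719828 (Wrong)
"""

if __name__ == "__main__":
    detections, counts = tally_detections(SAMPLE)
    for offset_ms, label, score, verdict in detections:
        print(offset_ms, label, score, verdict)
    print(dict(counts))
```

The offsets are positions in the test audio, not latencies, so this tally still does not answer how long recognition takes.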

RecognizeCommands

I have not read the corresponding section yet.

Advanced Training

I have not read the corresponding section yet; I will come back to it when I need it.

Reposted from blog.csdn.net/kangear/article/details/82559834