TensorFlow: Simple Audio Recognition

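Original article: https://www.tensorflow.org/tutorials/sequences/audio_recognition
I worked through the tutorial as an experiment; this post summarizes where my experience differed from the original.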

Before reading this post, read "Installing TensorFlow on Ubuntu" first. Install TensorFlow, then move on to Simple Audio Recognition.

Training

I followed the tutorial for training and hit no errors; it just took a long time. On an i7 processor it ran for one night.

Training Finished

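After training I ran speech recognition, and it worked. It just felt a bit slow, around 2 s, which was a little disappointing. I did not give up and kept going.
The article only uses the Python version; I worked out how to use the C++ one: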

bazel run tensorflow/examples/speech_commands:label_wav_cc -- --graph=/tmp/my_frozen_graph.pb --labels=/tmp/speech_commands_train/conv_labels.txt --wav=/tmp/speech_dataset/left/a5d485dc_nohash_0.wav

This is considerably faster, taking only about 0.3 s.
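One way to measure such timings more systematically is a small helper that runs a command and reports wall-clock time. This is only a sketch: the name time_command is hypothetical, and the placeholder command should be replaced with the actual invocation. Keep in mind that timing a bazel run command measures the whole invocation (including Bazel's own startup and build check), not model inference alone.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    completed = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,  # discard output so printing does not skew timing
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, completed.returncode


if __name__ == "__main__":
    # Placeholder command (assumption): substitute the real command as a
    # list of arguments, e.g. ["bazel", "run", "<target>", "--", "<flags>"].
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, took {elapsed:.3f} s")
```

Running the helper several times and comparing the results gives a better picture than a single run, since the first invocation usually pays extra startup cost.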

Running the Model in an Android App

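The prebuilt APK would not download, so I had no choice but to build it myself. I barely got it to compile and then tested it on Android. I have uploaded the APK I built so that it can be installed and tried directly: https://blog.csdn.net/kangear/article/details/82052938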

Streaming Accuracy

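This section is mainly about streaming recognition of spoken command words. I assumed it would be simple, but it took me two days on and off to get working. Installing bazel and each of the two bazel run invocations take a long time, possibly because I was running them in a fresh checkout of the tf source. In essence, the long bazel run command builds tf into a library, then builds the specified example, and finally runs that example.
Two points are worth stressing:
1. If tf was installed inside a Virtualenv, the bazel commands must also be run with that Virtualenv active; otherwise you will hit a pile of Python-related errors.
2. Even while following point 1, I still ran into quite a few problems.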
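Point 1 can be checked before starting a long bazel run. The following is a minimal sketch, with the hypothetical helper name in_virtual_environment, that reports whether the current Python interpreter belongs to a virtual environment. It only inspects the interpreter that runs the script; whether bazel picks up the same interpreter depends on how the tf build was configured, which this check does not cover.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtual_environment())
```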

Problems:

AttributeError: '_NamespacePath' object has no attribute 'sort'
ImportError: No module named 'keras_applications'

Fix:

pip3 install --upgrade pip setuptools keras_applications
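To confirm that the install took effect in the active environment, one can check whether the modules are importable before rerunning bazel. This is a sketch under the assumption that the importable module names match the package names used in the pip command; missing_modules is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the pip command; adjust the list as needed.
required = ["setuptools", "keras_applications"]
missing = missing_modules(required)
if missing:
    print("still missing:", ", ".join(missing))
else:
    print("all required modules are importable")
```

The check must be run with the same interpreter (and the same Virtualenv) that bazel will use, otherwise the result says nothing about the build.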

After a successful run, the output looks like this:

....
2018-09-09 16:14:38.200902: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 42.1% correctly, 2.4% wrongly, 0.0% false positives 
2018-09-09 16:14:40.008166: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 498300ms: on: 0.716484 (Correct)
2018-09-09 16:14:40.008190: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.9% correctly, 2.4% wrongly, 0.0% false positives 
2018-09-09 16:14:42.398641: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 508320ms: down: 0.700959 (Correct)
2018-09-09 16:14:42.398670: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.1% matched, 41.8% correctly, 2.4% wrongly, 0.0% false positives 
2018-09-09 16:14:42.875008: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 510300ms: go: 0.750342 (Correct)
2018-09-09 16:14:42.875034: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.4% matched, 42.1% correctly, 2.3% wrongly, 0.0% false positives 
2018-09-09 16:14:43.983504: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 514950ms: stop: 0.713842 (Correct)
2018-09-09 16:14:43.983530: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.8% matched, 42.4% correctly, 2.3% wrongly, 0.0% false positives 
2018-09-09 16:14:46.054264: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 523410ms: go: 0.702253 (Correct)
2018-09-09 16:14:46.054293: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.6% matched, 42.3% correctly, 2.3% wrongly, 0.0% false positives 
2018-09-09 16:14:50.519370: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 540630ms: yes: 0.719828 (Wrong)
2018-09-09 16:14:50.519399: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 43.6% matched, 40.9% correctly, 2.8% wrongly, 0.0% false positives 
2018-09-09 16:14:51.325376: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 543870ms: left: 0.711105 (Correct)
2018-09-09 16:14:51.325408: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.0% matched, 41.2% correctly, 2.7% wrongly, 0.0% false positives 
2018-09-09 16:14:52.322932: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 547710ms: down: 0.712574 (Correct)
2018-09-09 16:14:52.322968: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.5% correctly, 2.7% wrongly, 0.0% false positives 
2018-09-09 16:14:56.220347: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 562950ms: left: 0.701853 (Correct)
2018-09-09 16:14:56.220376: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 43.6% matched, 41.0% correctly, 2.7% wrongly, 0.0% false positives 
2018-09-09 16:14:56.813999: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 565410ms: yes: 0.703025 (Correct)
2018-09-09 16:14:56.814025: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 43.9% matched, 41.3% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:14:57.257761: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 567270ms: right: 0.727554 (Correct)
2018-09-09 16:14:57.257784: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.2% matched, 41.6% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:14:58.166058: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 571050ms: yes: 0.714452 (Correct)
2018-09-09 16:14:58.166088: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 41.9% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:15:00.302022: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 579840ms: up: 0.707446 (Correct)
2018-09-09 16:15:00.302048: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.8% correctly, 2.6% wrongly, 0.0% false positives 
2018-09-09 16:15:02.618039: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 589200ms: left: 0.718778 (Correct)
2018-09-09 16:15:02.618067: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.2% matched, 41.6% correctly, 2.5% wrongly, 0.0% false positives 
2018-09-09 16:15:04.055505: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 594990ms: right: 0.704798 (Correct)
2018-09-09 16:15:04.055531: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.2% matched, 41.7% correctly, 2.5% wrongly, 0.0% false positives 
2018-09-09 16:15:04.841136: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 598200ms: _unknown_: 0.70086 (Correct)
2018-09-09 16:15:04.841166: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 42.0% correctly, 2.5% wrongly, 0.0% false positives 
2018-09-09 16:15:05.253720: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.5% matched, 42.0% correctly, 2.5% wrongly, 0.0% false positives 
(tensorflow-dev) tony@pc:~/Work/01_atom/tf-demo/tf2$

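As the output shows, this example runs directly and does the comparison automatically, deciding by itself whether each result is Correct or Wrong. It does not show how much time passed between a word appearing in the audio and it being recognized.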
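The per-word verdicts in such a log can be tallied with a short script. The sketch below assumes the line layout seen in the output above (audio offset in ms, label, score, verdict in parentheses); the names RESULT_PATTERN and tally_verdicts are hypothetical, and the three sample lines are copied from the log.

```python
import re
from collections import Counter

# Matches e.g. "] 498300ms: on: 0.716484 (Correct)"; summary lines from
# accuracy_utils.cc do not match and are skipped.
RESULT_PATTERN = re.compile(r"\] (\d+)ms: (\S+): ([0-9.]+) \(([^)]+)\)")


def tally_verdicts(lines):
    """Count how often each verdict (e.g. Correct, Wrong) appears."""
    counts = Counter()
    for line in lines:
        match = RESULT_PATTERN.search(line)
        if match:
            counts[match.group(4)] += 1
    return counts


sample = [
    "2018-09-09 16:14:40.008166: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 498300ms: on: 0.716484 (Correct)",
    "2018-09-09 16:14:40.008190: I tensorflow/examples/speech_commands/accuracy_utils.cc:131] 44.3% matched, 41.9% correctly, 2.4% wrongly, 0.0% false positives",
    "2018-09-09 16:14:50.519370: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 540630ms: yes: 0.719828 (Wrong)",
    "2018-09-09 16:14:51.325376: I tensorflow/examples/speech_commands/test_streaming_accuracy.cc:298] 543870ms: left: 0.711105 (Correct)",
]

for verdict, count in sorted(tally_verdicts(sample).items()):
    print(f"{verdict}: {count}")
```

To process a saved log file instead of the sample list, pass an open file object to tally_verdicts, since it only needs an iterable of lines.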

RecognizeCommands

I have not read the corresponding section yet.

Advanced Training

I have not read the corresponding section yet; I will come back to it when I need it.