Speech Recognition
1.Kaldi
This is one of the newer speech recognition toolkits, but it has made a name for itself fast! Development began in 2009 at a workshop at Johns Hopkins University called “Low Development Cost, High Quality Speech Recognition for New Languages and Domains”.
After a couple of years of work on the project, the code for Kaldi was released on May 14, 2011. Kaldi quickly gained a reputation for being easy to work with.
Daniel Povey, who was one of the original developers, still maintains and updates Kaldi, so don’t expect this toolkit to go stale anytime soon. Here are all the resources you’ll need for Kaldi:
- Programming Language: C++
- About: http://kaldi-asr.org/doc/about.html
- History: http://kaldi-asr.org/doc/history.html
- Downloading and Installing: http://kaldi-asr.org/doc/install.html
- Tutorial: http://kaldi-asr.org/doc/tutorial.html
2.CMUSphinx
CMUSphinx, or Sphinx for short, is actually a group of speech recognition systems developed at Carnegie Mellon University. There are several packages, each designed for different tasks and applications.
One of these is Pocketsphinx, a version of Sphinx that can be used in embedded systems. Take a look at the resources below for everything you need to know about Sphinx:
- Programming Language: Java (Sphinx4); C (Pocketsphinx)
- About: http://cmusphinx.sourceforge.net/wiki/about
- History: http://www.cs.cmu.edu/~rsingh/homepage/sphinx_history.html
- Downloading and Installing: http://cmusphinx.sourceforge.net/wiki/download
- Tutorial: http://cmusphinx.sourceforge.net/wiki/tutorial
3.HTK
Hidden Markov Model Toolkit (HTK) was made for building and manipulating hidden Markov models (HMMs). An HMM is a statistical model in which an observed sequence is generated by a chain of hidden states. While HTK is mainly used for speech recognition, it has also been used for speech synthesis and DNA sequencing.
HTK was originally developed at the Machine Intelligence Laboratory in the Cambridge University Engineering Department. Today, Microsoft holds the copyright to the original HTK code, but it encourages changes to the source code.
New versions of HTK are released regularly, with the latest release in December 2015.
- Programming Language: C
- About: http://htk.eng.cam.ac.uk/
- History: http://htk.eng.cam.ac.uk/docs/history.shtml
- Downloading and Installing: http://htk.eng.cam.ac.uk/download.shtml
- Tutorial: http://htk.eng.cam.ac.uk/docs/docs.shtml (must be registered to access)
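To make the HMMs at the heart of HTK a little more concrete, here is a minimal sketch of the forward algorithm, which computes how likely an observation sequence is under a given HMM. It is plain Python and does not use HTK or its API. The function name `forward`, the two state names, and every probability below are invented for illustration only.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under the HMM, using the forward algorithm."""
    if not observations:
        return 1.0  # the empty sequence has probability 1

    # alpha[s]: probability of the observations seen so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}

    for obs in observations[1:]:
        # Sum over every previous state, then emit the new observation
        alpha = {
            s: emit_p[s][obs]
            * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }

    # Total likelihood: sum over all possible final states
    return sum(alpha.values())


# Toy two-state model; all numbers are made up for illustration.
states = ("voiced", "silence")
start_p = {"voiced": 0.6, "silence": 0.4}
trans_p = {
    "voiced": {"voiced": 0.7, "silence": 0.3},
    "silence": {"voiced": 0.4, "silence": 0.6},
}
emit_p = {
    "voiced": {"loud": 0.8, "quiet": 0.2},
    "silence": {"loud": 0.1, "quiet": 0.9},
}

likelihood = forward(["loud", "quiet", "quiet"], states, start_p, trans_p, emit_p)
print(likelihood)
```

In a real recognizer the hidden states would correspond to parts of phonemes and the observations to acoustic feature vectors, but the recursion is the same idea.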
4.Simon
Simon is a speech recognition toolkit with an easy-to-use interface. Its simple structure and friendly user interface are among its biggest strengths. Simon uses CMUSphinx, HTK, and Julius (described below) as the foundation of its toolkit.
Simon is best known as a speech recognition tool for Linux, although it also works on Windows.
- Programming Language: C++
- About: https://simon.kde.org/
- History: n/a
- Downloading and Installing: https://simon.kde.org/download
- Tutorial: https://userbase.kde.org/Simon/Handbook
5.Julius
Julius is a two-pass large vocabulary continuous speech recognition (LVCSR) engine. First developed in 1997, Julius continues to be developed by the Interactive Speech Technology Consortium.
Currently, Japanese is the only language for which models are fully available with Julius. A sample English acoustic model is available, but it cannot be used for commercial purposes. The VoxForge project is working on an English acoustic model for Julius.
- Programming Language: C
- About: http://julius.osdn.jp/en_index.php
- History: n/a
- Downloading and Installing: http://julius.osdn.jp/en_index.php?q=index-en.html#download_julius
- Tutorial: http://julius.osdn.jp/en_index.php?q=index-en.html#documents_and_note
Machine Translation
1.OpenNMT
2.TensorFlow
3.fairseq
4.Moses
5.THUMT "THUMT: An Open Source Toolkit for Neural Machine Translation"
6.Sockeye https://github.com/awslabs/sockeye