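Preface
A recent project needed speech evaluation. A colleague who had worked on this before recommended iFlytek (科大讯飞) speech evaluation, so I integrated the SDK following the developer guide on the official site. It evaluates the user's speaking ability and returns reasonable scores, but I ran into many small problems along the way, so this post records the development process and the pitfalls.
Main text
1. Integrating the SDK:
For how to integrate the SDK, see the integration guide on the iFlytek site; I won't repeat it here.
Link: https://doc.xfyun.cn/msc_android/%E8%AF%AD%E9%9F%B3%E8%AF%84%E6%B5%8B.html
2. Writing the speech evaluation utility class:
Two places in the app use evaluation, so I wrapped it in a utility class for convenience. Here is the code: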
/**
* @ClassName: SpeechEvaluatorUtil
* @Description: Speech evaluation utility class
* @author: jesse
* @date: 2018-06-29
*/
public class SpeechEvaluatorUtil {
private static final String TAG = SpeechEvaluatorUtil.class.getSimpleName();
public static final String EVA_RECORD_PATH = Environment.getExternalStorageDirectory().getAbsolutePath() + "/msc/ise.wav";
private static SpeechEvaluator mIse;
public static void init(Context context) {
if (mIse == null) {
mIse = SpeechEvaluator.createEvaluator(context, null);
}
}
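/**
* Evaluates audio captured directly from the microphone.
* The SDK saves the recording to EVA_RECORD_PATH.
* @param evaText the sentence to evaluate
* @param mEvaluatorListener speech evaluation callback
*/
public static void startSpeechEva(String evaText, EvaluatorListener mEvaluatorListener) {
setParams();
// The audio save path is set in setParams(); pcm and wav are supported. Saving to the SD card requires the WRITE_EXTERNAL_STORAGE permission.
// Note: the AUDIO_FORMAT parameter only takes effect with an updated version of the 语记 (Yuji) app.
mIse.startEvaluating(evaText, null, mEvaluatorListener);
}
// Evaluate by writing audio data directly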
public static void startEva(byte[] audioData,String evaText,EvaluatorListener mEvaluatorListener){
setParams();
// Only needed when audio is supplied through writeAudio
mIse.setParameter(SpeechConstant.AUDIO_SOURCE,"-1");
int ret = mIse.startEvaluating(evaText, null, mEvaluatorListener);
// Once startEvaluating has returned successfully, write the audio
// directly to run the evaluation
if (ret != ErrorCode.SUCCESS) {
Log.i(TAG,"Evaluation failed, error code: " + ret);
} else {
if(audioData != null) {
// Wait briefly so the audio is not written too early, which makes it fail
try{
Thread.sleep(100);
}catch (InterruptedException e) {
Log.d(TAG,"InterruptedException :"+e);
}
mIse.writeAudio(audioData,0,audioData.length);
mIse.stopEvaluating();
}else{
Log.i(TAG,"audioData == null");
}
}
}
private static void setParams() {
Log.i(TAG, "setParams()");
// Evaluation language: English
mIse.setParameter(SpeechConstant.LANGUAGE, "en_us");
// Evaluation category: sentence
mIse.setParameter(SpeechConstant.ISE_CATEGORY, "read_sentence");
mIse.setParameter(SpeechConstant.RESULT_LEVEL,"plain");
mIse.setParameter(SpeechConstant.ISE_AUDIO_PATH, EVA_RECORD_PATH);
mIse.setParameter(SpeechConstant.AUDIO_FORMAT, "wav");
}
// Stop evaluation
public static void stopSpeechEva() {
if (mIse.isEvaluating()) {
mIse.stopEvaluating();
}
}
// Cancel evaluation
public static void cancelSpeechEva() {
mIse.cancel();
}
}
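Two evaluation methods are implemented here:
The first evaluates audio captured directly from the microphone (startSpeechEva()). It produces a WAV file of about 44 KB at EVA_RECORD_PATH. There is a major pitfall, though: the file does not immediately overwrite the previous recording; the update lags by roughly 0.7 to 1.2 s. As a result, if you play the recording back right after it finishes, you hear the previous recording! Unfortunately that was exactly the feature I needed, so I dropped the first method and switched to the second.
The second records the audio yourself, generates a WAV file, and then converts it to a byte array for evaluation (startEva()). Because you do the recording yourself, there is no delay in refreshing the file, and playback right after recording works, which solves the pitfall of the first method. It has a pitfall of its own, however: scores come out much lower than with the first method (pronunciation that scores 90 with the first method scores about 70 with the second). If anyone knows how to fix this, please leave a comment and let me know.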
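One thing that may be worth ruling out for the lower scores of the second method is an audio format mismatch. The sketch below is not part of the original code: it reads the format fields of a canonical 44-byte WAV header and checks them against 16 kHz, mono, 16-bit PCM. That target format is an assumption that should be verified against the iFlytek documentation, and WavFormatCheck and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Reads the format fields of a canonical 44-byte PCM WAV header. */
final class WavFormatCheck {

    // Assumed target format; confirm these values against the SDK documentation.
    static final int EXPECTED_SAMPLE_RATE = 16000;
    static final int EXPECTED_CHANNELS = 1;
    static final int EXPECTED_BITS = 16;

    /** Returns true if the header describes PCM audio in the assumed target format. */
    static boolean matchesExpectedFormat(byte[] wav) {
        if (wav == null || wav.length < 44) {
            return false; // too short to hold a canonical header
        }
        ByteBuffer buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        int audioFormat = buf.getShort(20); // 1 means PCM
        int channels = buf.getShort(22);
        int sampleRate = buf.getInt(24);
        int bitsPerSample = buf.getShort(34);
        return audioFormat == 1
                && channels == EXPECTED_CHANNELS
                && sampleRate == EXPECTED_SAMPLE_RATE
                && bitsPerSample == EXPECTED_BITS;
    }

    /** Builds a header-only WAV byte array, used here only to demonstrate the check. */
    static byte[] headerFor(int sampleRate, int channels, int bits) {
        int blockAlign = channels * bits / 8;
        ByteBuffer buf = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(ascii("RIFF")).putInt(36).put(ascii("WAVE"));
        buf.put(ascii("fmt ")).putInt(16);
        buf.putShort((short) 1).putShort((short) channels);
        buf.putInt(sampleRate).putInt(sampleRate * blockAlign);
        buf.putShort((short) blockAlign).putShort((short) bits);
        buf.put(ascii("data")).putInt(0);
        return buf.array();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(matchesExpectedFormat(headerFor(16000, 1, 16))); // true
        System.out.println(matchesExpectedFormat(headerFor(44100, 1, 16))); // false
    }
}
```

A recording made at 44100 Hz, as in the AudioRecord utility class of step 3, would fail this check, which at least flags the mismatch before the audio is submitted.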
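3. Writing the recording utility classes
I wrote two utility classes, one recording with MediaRecorder and one with AudioRecord. The second method in step 2 uses the AudioRecord one (AudioRecordUtil). Both are included here.
MediaRecorder utility class: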
/**
* @ClassName: MediaRecordUtil
* @Description: Recording utility class
* @author: jesse
* @date: 2018-06-15
*/
public class MediaRecordUtil {
// File path
private String filePath;
// Folder path
private String FolderPath;
private MediaRecorder mMediaRecorder;
private final String TAG = MediaRecordUtil.class.getSimpleName();
public static final int MAX_LENGTH = 1000 * 60 * 10;// Maximum recording length in ms (10 minutes)
private OnAudioStatusUpdateListener audioStatusUpdateListener;
/**
* Files are stored under sdcard/ShushanRecord by default
*/
public MediaRecordUtil(){
// Default save path: /sdcard/ShushanRecord/
this(Environment.getExternalStorageDirectory().getAbsolutePath()+"/ShushanRecord/");
}
public MediaRecordUtil(String filePath) {
File path = new File(filePath);
if(!path.exists())
path.mkdirs();
this.FolderPath = filePath;
}
private long startTime;
private long endTime;
/**
* Starts recording in AMR format.
*/
public void startRecord() {
// Start recording
/* ① Initial: instantiate the MediaRecorder */
if (mMediaRecorder == null)
mMediaRecorder = new MediaRecorder();
try {
/* ② setAudioSource/setVideoSource */
mMediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);// use the microphone
/*
* ② Output file format: THREE_GPP/MPEG_4/RAW_AMR/DEFAULT. THREE_GPP is the 3gp container
* (H.263 video, AMR audio); RAW_AMR is audio only and requires the AMR_NB encoder.
* AMR_NB is used here so the content matches the .amr extension
*/
mMediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.AMR_NB);
/* ② Audio encoder: AAC/AMR_NB/AMR_WB/DEFAULT */
mMediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
filePath = FolderPath + DateUtils.createFileName() + ".amr" ;
Log.i(TAG,"utils : filePath == "+filePath);
/* ③ Prepare */
mMediaRecorder.setOutputFile(filePath);
// mMediaRecorder.setMaxDuration(MAX_LENGTH);
mMediaRecorder.prepare();
/* ④ Start */
mMediaRecorder.start();
// AudioRecord audioRecord.
/* Record the start time */
startTime = System.currentTimeMillis();
// updateMicStatus();
} catch (IllegalStateException e) {
e.printStackTrace();
Log.i(TAG, "call startAmr(File mRecAudioFile) failed!" + e.toString());
} catch (IOException e) {
e.printStackTrace();
Log.i(TAG, "call startAmr(File mRecAudioFile) failed!" + e.toString());
}
}
/**
* Stops recording.
* @return the recording duration in milliseconds
*/
public long stopRecord() {
if (mMediaRecorder == null)
return 0L;
endTime = System.currentTimeMillis();
// Some readers reported that stop() can throw on Android 5.0 and above. The Google documentation confirms this can happen, so catch the exception and clean up. Thanks for the feedback!
try {
mMediaRecorder.setOnErrorListener(null);
mMediaRecorder.setOnInfoListener(null);
mMediaRecorder.setPreviewDisplay(null);
mMediaRecorder.stop();
mMediaRecorder.release();
// Guard against a missing listener; otherwise the NullPointerException lands in the catch below and deletes the recording
if (audioStatusUpdateListener != null)
audioStatusUpdateListener.onStop(filePath);
filePath = "";
}catch (RuntimeException e){
mMediaRecorder.release();
File file = new File(filePath);
if (file.exists())
file.delete();
filePath = "";
Log.i(TAG,"stopRecord : "+e.toString());
e.printStackTrace();
}finally {
mMediaRecorder = null;
}
return endTime - startTime;
}
/**
* Cancels recording and deletes the partial file
*/
public void cancelRecord(){
// Nothing to cancel if recording never started
if (mMediaRecorder == null)
return;
try {
mMediaRecorder.stop();
mMediaRecorder.reset();
mMediaRecorder.release();
mMediaRecorder = null;
}catch (RuntimeException e){
mMediaRecorder.reset();
mMediaRecorder.release();
mMediaRecorder = null;
}
File file = new File(filePath);
if (file.exists())
file.delete();
filePath = "";
}
private final Handler mHandler = new Handler();
private Runnable mUpdateMicStatusTimer = new Runnable() {
public void run() {
// updateMicStatus();
}
};
private int BASE = 1;
private int SPACE = 100;// Sampling interval in ms
public void setOnAudioStatusUpdateListener(OnAudioStatusUpdateListener audioStatusUpdateListener) {
this.audioStatusUpdateListener = audioStatusUpdateListener;
}
/**
* Updates the microphone level
*/
private void updateMicStatus() {
if (mMediaRecorder != null) {
double ratio = (double)mMediaRecorder.getMaxAmplitude() / BASE;
double db = 0;// decibels
if (ratio > 1) {
db = 20 * Math.log10(ratio);
if(null != audioStatusUpdateListener) {
audioStatusUpdateListener.onUpdate(db, System.currentTimeMillis()-startTime);
}
}
mHandler.postDelayed(mUpdateMicStatusTimer, SPACE);
}
}
public interface OnAudioStatusUpdateListener {
/**
* Recording in progress
* @param db current sound level in decibels
* @param time elapsed recording time
*/
public void onUpdate(double db, long time);
/**
* Recording stopped
* @param filePath path of the saved file
*/
public void onStop(String filePath);
}
}
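The level calculation in updateMicStatus() is the usual 20 * log10(ratio) formula for converting an amplitude ratio to decibels. A minimal standalone sketch of that arithmetic, runnable without Android; AmplitudeToDb is a hypothetical name, and the amplitude argument stands in for the value that MediaRecorder.getMaxAmplitude() would return:

```java
/** Converts a raw amplitude to decibels relative to a base amplitude. */
final class AmplitudeToDb {

    /** Returns 20 * log10(amplitude / base), or 0 when the ratio is not above 1. */
    static double toDecibels(int amplitude, int base) {
        double ratio = (double) amplitude / base;
        return ratio > 1 ? 20 * Math.log10(ratio) : 0;
    }

    public static void main(String[] args) {
        // 1000 is 10^3 times the base, so the level is 20 * 3 = 60 dB
        System.out.println(toDecibels(1000, 1));
        // 32767 is the largest 16-bit amplitude, which gives roughly 90 dB
        System.out.println(toDecibels(32767, 1));
    }
}
```

With BASE = 1 as in the class above, the reported level therefore ranges from 0 to roughly 90 dB.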
AudioRecord utility class:
/**
* @ClassName: AudioRecordUtil
* @Description: Records audio in WAV format
* @author: jesse
* @date: 2018-07-21
*/
public class AudioRecordUtil {
private static AudioRecordUtil mInstance;
private AudioRecord recorder;
// Audio source
private static int audioSource = MediaRecorder.AudioSource.MIC;
// Sample rate in Hz
private static int audioRate = 44100;
// Channel configuration: mono
private static int audioChannel = AudioFormat.CHANNEL_IN_MONO;
// Sample format (bit depth)
private static int audioFormat = AudioFormat.ENCODING_PCM_16BIT;
// Buffer size
private static int bufferSize = AudioRecord.getMinBufferSize(audioRate,audioChannel,audioFormat);
// Recording state; volatile because the writer thread reads it
private volatile boolean isRecording = false;
// Buffer for the PCM samples
private byte [] noteArray;
// PCM file
private File pcmFile;
// WAV file
private File wavFile;
// File output stream
private OutputStream os;
// Base directory
private String basePath = Environment.getExternalStorageDirectory().getAbsolutePath()+"/eva/";
// WAV file path
private String outFileName = basePath+"/eva.wav";
// PCM file path
private String inFileName = basePath+"/eva.pcm";
private AudioRecordUtil(){
createFile();// create the files
recorder = new AudioRecord(audioSource,audioRate,audioChannel,audioFormat,bufferSize);
}
public synchronized static AudioRecordUtil getInstance(){
if(mInstance == null){
mInstance = new AudioRecordUtil();
}
return mInstance;
}
// Thread that reads the recorded PCM data
class WriteThread implements Runnable{
public void run(){
writeData();
}
}
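// Start recording
public void startRecord(){
isRecording = true;
recorder.startRecording();
}
// Stop recording
public void stopRecord(){
isRecording = false;
recorder.stop();
}
// Writes the recorded data to the PCM file; the file writing is not optimized
public void writeData(){
noteArray = new byte[bufferSize];
// Open the file output stream
try {
os = new BufferedOutputStream(new FileOutputStream(pcmFile));
}catch (IOException e){
// Without a stream there is nothing to write to
e.printStackTrace();
return;
}
while(isRecording == true){
int recordSize = recorder.read(noteArray,0,bufferSize);
if(recordSize>0){
try{
// Write only the bytes actually read
os.write(noteArray,0,recordSize);
}catch(IOException e){
e.printStackTrace();
}
}
}
if (os != null) {
try {
os.close();
}catch (IOException e){
e.printStackTrace();
}
}
}
// Produce a playable audio file from the PCM data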
public void convertWaveFile() {
FileInputStream in = null;
FileOutputStream out = null;
long totalAudioLen = 0;
long totalDataLen;
long longSampleRate = AudioRecordUtil.audioRate;
int channels = 1;
long byteRate = 16 *AudioRecordUtil.audioRate * channels / 8;
byte[] data = new byte[bufferSize];
try {
in = new FileInputStream(inFileName);
out = new FileOutputStream(outFileName);
totalAudioLen = in.getChannel().size();
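// The RIFF size field excludes the 8 bytes of the "RIFF" id and the size field itself, so add 36 (44 - 8)
totalDataLen = totalAudioLen + 36;
WriteWaveFileHeader(out, totalAudioLen, totalDataLen, longSampleRate, channels, byteRate);
int len;
while ((len = in.read(data)) != -1) {
// Write only the bytes actually read
out.write(data, 0, len);
}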
in.close();
out.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
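/* Every file format needs its header at the start of the file to identify it. WAV uses the RIFF structure, in which each part is a chunk: the RIFF WAVE chunk, the fmt chunk, the fact chunk and the data chunk. The fact chunk is optional. */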
private void WriteWaveFileHeader(FileOutputStream out, long totalAudioLen, long totalDataLen, long longSampleRate,
int channels, long byteRate) throws IOException {
byte[] header = new byte[44];
header[0] = 'R'; // RIFF
header[1] = 'I';
header[2] = 'F';
header[3] = 'F';
header[4] = (byte) (totalDataLen & 0xff);// RIFF chunk size
header[5] = (byte) ((totalDataLen >> 8) & 0xff);
header[6] = (byte) ((totalDataLen >> 16) & 0xff);
header[7] = (byte) ((totalDataLen >> 24) & 0xff);
header[8] = 'W';//WAVE
header[9] = 'A';
header[10] = 'V';
header[11] = 'E';
//FMT Chunk
header[12] = 'f'; // 'fmt '
header[13] = 'm';
header[14] = 't';
header[15] = ' ';// trailing space pads the id to 4 bytes
// Size of the fmt chunk
header[16] = 16; // 4 bytes: size of 'fmt ' chunk
header[17] = 0;
header[18] = 0;
header[19] = 0;
// Audio format: 1 means PCM
header[20] = 1; // format = 1
header[21] = 0;
// Number of channels
header[22] = (byte) channels;
header[23] = 0;
// Sample rate in Hz
header[24] = (byte) (longSampleRate & 0xff);
header[25] = (byte) ((longSampleRate >> 8) & 0xff);
header[26] = (byte) ((longSampleRate >> 16) & 0xff);
header[27] = (byte) ((longSampleRate >> 24) & 0xff);
// Byte rate: sample rate * channels * bits per sample / 8
header[28] = (byte) (byteRate & 0xff);
header[29] = (byte) ((byteRate >> 8) & 0xff);
header[30] = (byte) ((byteRate >> 16) & 0xff);
header[31] = (byte) ((byteRate >> 24) & 0xff);
// Block align, the size of one sample frame in bytes: channels * bits per sample / 8
header[32] = (byte) (channels * 16 / 8);
header[33] = 0;
// Bits per sample
header[34] = 16;
header[35] = 0;
//Data chunk
header[36] = 'd';//data
header[37] = 'a';
header[38] = 't';
header[39] = 'a';
header[40] = (byte) (totalAudioLen & 0xff);
header[41] = (byte) ((totalAudioLen >> 8) & 0xff);
header[42] = (byte) ((totalAudioLen >> 16) & 0xff);
header[43] = (byte) ((totalAudioLen >> 24) & 0xff);
out.write(header, 0, 44);
}
// Create the directory first, then the files in it
public void createFile(){
File baseFile = new File(basePath);
if(!baseFile.exists())
baseFile.mkdirs();
pcmFile = new File(basePath+"/eva.pcm");
wavFile = new File(basePath+"/eva.wav");
if(pcmFile.exists()){
pcmFile.delete();
}
if(wavFile.exists()){
wavFile.delete();
}
try{
pcmFile.createNewFile();
wavFile.createNewFile();
}catch(IOException e){
}
}
// Read an audio file into a byte array
public static byte[] getAudioData(String audioPath){
byte[] buffer = null;
try {
File file = new File(audioPath);
FileInputStream fis = new FileInputStream(file);
ByteArrayOutputStream bos = new ByteArrayOutputStream(1000);
byte[] b = new byte[1000];
int n;
while ((n = fis.read(b)) != -1) {
bos.write(b, 0, n);
}
fis.close();
bos.close();
buffer = bos.toByteArray();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return buffer;
}
// Start the thread that writes the recorded data
public void recordData(){
new Thread(new WriteThread()).start();
}
public String getOutFileName() {
return outFileName;
}
}
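The header bytes above can also be produced with a little-endian ByteBuffer, which avoids the manual shifting and masking. The following is a minimal sketch under stated assumptions, not the original implementation: PcmToWavSketch, wavHeader, convert and toWav are hypothetical names, the audio is assumed to be 16-bit PCM, and the code works on in-memory streams so it runs without Android. The copy loop writes only the bytes actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Wraps raw 16-bit PCM data in a canonical 44-byte WAV header. */
final class PcmToWavSketch {

    /** Builds the 44-byte header for pcmLength bytes of 16-bit PCM. */
    static byte[] wavHeader(long pcmLength, int sampleRate, int channels) {
        int bitsPerSample = 16;
        int blockAlign = channels * bitsPerSample / 8;
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put(ascii("RIFF")).putInt((int) (pcmLength + 36)).put(ascii("WAVE"));
        b.put(ascii("fmt ")).putInt(16);                    // size of the fmt chunk
        b.putShort((short) 1).putShort((short) channels);   // 1 means PCM
        b.putInt(sampleRate).putInt(sampleRate * blockAlign); // sample rate, byte rate
        b.putShort((short) blockAlign).putShort((short) bitsPerSample);
        b.put(ascii("data")).putInt((int) pcmLength);
        return b.array();
    }

    /** Writes the header, then copies the PCM stream using the actual read count. */
    static void convert(InputStream pcm, long pcmLength, OutputStream wav,
                        int sampleRate, int channels) throws IOException {
        wav.write(wavHeader(pcmLength, sampleRate, channels));
        byte[] buffer = new byte[4096];
        int n;
        while ((n = pcm.read(buffer)) != -1) {
            wav.write(buffer, 0, n); // never write the stale tail of the buffer
        }
    }

    /** In-memory convenience wrapper, used here only to demonstrate convert(). */
    static byte[] toWav(byte[] pcm, int sampleRate, int channels) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            convert(new ByteArrayInputStream(pcm), pcm.length, out, sampleRate, channels);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] wav = toWav(new byte[5000], 44100, 1); // 5000 bytes of silence as sample input
        System.out.println(wav.length); // 5044: 44-byte header plus 5000 PCM bytes
    }
}
```

On a device the same convert() method could take a FileInputStream and FileOutputStream opened with try-with-resources, so both streams are closed even when the copy fails.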
4. Running the evaluation
1. Start recording
AudioRecordUtil.getInstance().startRecord();
AudioRecordUtil.getInstance().recordData();
2. Stop recording
AudioRecordUtil.getInstance().stopRecord();
AudioRecordUtil.getInstance().convertWaveFile();
3. Start the evaluation
SpeechEvaluatorUtil.startEva(AudioRecordUtil.getAudioData(recordPath),text,ReadReciteExamFragment.this);
Note: call SpeechEvaluatorUtil.init(getContext()) before using the evaluator; I do this in the fragment's onCreate method.
4. Show the score
This step requires implementing the EvaluatorListener interface. Only the onResult callback is shown here:
@Override
public void onResult(EvaluatorResult result, boolean isLast) {
Log.d(TAG,"onresult : isLast == "+isLast);
if (isLast) {
StringBuilder builder = new StringBuilder();
builder.append(result.getResultString());
if(oldScorePad != null && oldScorePad.getVisibility() == View.VISIBLE){
oldScorePad.setVisibility(View.GONE);
}
showScorePad(parseXml(builder.toString()));
}
}
The parseXml method returns the score directly. The maximum is 5 points (multiply by 20 to convert to a 100-point scale):
private float parseXml(String xmlStr){
float totalScore = 0f;
try {
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser xmlPullParser = factory.newPullParser();
xmlPullParser.setInput(new StringReader(xmlStr));
int eventType = xmlPullParser.getEventType();
String value;
while(eventType != XmlPullParser.END_DOCUMENT) {
String nodeName = xmlPullParser.getName();
switch (eventType){
case XmlPullParser.START_TAG:
if("total_score".equals(nodeName)){
value = xmlPullParser.getAttributeValue(0);
totalScore = Float.parseFloat(value);
}
break;
case XmlPullParser.END_TAG:
break;
}
eventType = xmlPullParser.next();
}
}catch (XmlPullParserException xppe){
Log.i(TAG,xppe.toString());
}catch (IOException ioe){
Log.i(TAG,ioe.toString());
}
return totalScore;
}
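XmlPullParser ships with Android rather than the JDK, so the same extraction can be tried off-device with the DOM parser from javax.xml. This is a minimal sketch, assuming the score is the first attribute of a total_score element as in parseXml above. The sample XML in main is a made-up fragment, not real SDK output, and ScoreParseSketch is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Extracts the total score from a result string and converts it to a 100-point scale. */
final class ScoreParseSketch {

    /** Returns the first attribute of the first total_score element, or 0 if absent or malformed. */
    static float parseTotalScore(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("total_score");
            if (nodes.getLength() == 0) {
                return 0f;
            }
            NamedNodeMap attrs = nodes.item(0).getAttributes();
            if (attrs == null || attrs.getLength() == 0) {
                return 0f;
            }
            return Float.parseFloat(attrs.item(0).getNodeValue());
        } catch (Exception e) {
            return 0f; // parse errors fall back to 0, as in parseXml
        }
    }

    /** Converts a 5-point score to a 100-point scale. */
    static int toHundredScale(float score) {
        return Math.round(score * 20);
    }

    public static void main(String[] args) {
        // Hypothetical result fragment for illustration only
        String xml = "<result><total_score value=\"4.5\"/></result>";
        System.out.println(toHundredScale(parseTotalScore(xml))); // 90
    }
}
```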
Here is a complete recording snippet for reference:
ExamAudioPlayUtil.stopPlay();
coverPopup = PopupWindowUtil.showCoverPopupWindow(getActivity(),rootView);
final Button btn = btnList.get(index).get(index);
AudioRecordUtil.getInstance().startRecord();
AudioRecordUtil.getInstance().recordData();
pb.countBack(pb,(int) ((end-begin)*1000));
Timer timer = new Timer();
timer.schedule(new TimerTask() {
@Override
public void run() {
AudioRecordUtil.getInstance().stopRecord();
AudioRecordUtil.getInstance().convertWaveFile();
mHandler.sendMessage(mHandler.obtainMessage(0,btn));
SpeechEvaluatorUtil.startEva(AudioRecordUtil.getAudioData(recordPath),text,ReadReciteExamFragment.this);
ExamAudioPlayUtil.playAudio(pbRecordPlay, recordPath, new MediaPlayer.OnCompletionListener() {
@Override
public void onCompletion(MediaPlayer mp) {
coverPopup.dismiss();
}
});
}
},(long)((end-begin)*1000));
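In the snippet above, convertWaveFile() runs right after stopRecord(), while the writer thread started by recordData() may still be flushing and closing the PCM file. Below is a minimal sketch of one way to avoid that race, assuming the utility keeps a reference to its writer thread and joins it before converting. RecorderShutdownSketch is a hypothetical stand-in that simulates the recorder loop with a sleep rather than calling AudioRecord.

```java
/** Simulates a recorder whose writer thread must finish before the file is converted. */
final class RecorderShutdownSketch {
    private volatile boolean recording;
    private volatile boolean fileClosed;
    private Thread writer;

    /** Starts the simulated writer thread. */
    void start() {
        recording = true;
        fileClosed = false;
        writer = new Thread(() -> {
            while (recording) {
                try {
                    Thread.sleep(5); // stand-in for recorder.read(...) followed by os.write(...)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            fileClosed = true; // stand-in for os.close()
        });
        writer.start();
    }

    /** Signals the writer to stop and waits up to timeoutMs for it; true means the file was closed. */
    boolean stopAndWait(long timeoutMs) {
        recording = false;
        try {
            writer.join(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return fileClosed;
    }

    public static void main(String[] args) {
        RecorderShutdownSketch recorder = new RecorderShutdownSketch();
        recorder.start();
        if (recorder.stopAndWait(2000)) {
            // Only at this point would convertWaveFile() be safe to call
            System.out.println("safe to convert");
        }
    }
}
```

Since the TimerTask already runs off the main thread, a short join there does not block the UI; on the main thread the wait should be kept brief or moved to a background thread.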
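That's all.
Summary
iFlytek's speech evaluation scores are quite accurate, but there are some pitfalls, which this post has described. Again, if anyone knows how to fix them, please leave a comment with the method. I also hope this post helps Android engineers who want to integrate iFlytek speech evaluation.
As an aside, the blogging experience on CSDN keeps getting better!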