Fast Translation with the Baidu Translate API Using Java Multithreading

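I'm not sure why, but I suddenly felt like blogging. Maybe I just wanted somewhere to write things down. Sentimental, literary writing is beyond me, so technical posts it is. If I get anything wrong, please point it out; I'd be very grateful.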

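API setup: first apply for a Baidu Translate API account on Baidu's site. Baidu Translate is quite decent: the free quota is 2 million characters per month, which is basically enough for non-commercial use. Thanks, Baidu. Besides the API itself, Baidu also provides demos in several programming languages; here we use the Java version. Feel free to play around with it yourself.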

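My goal is to have Baidu Translate translate a dozen or so English documents of varying length, from a dozen or so KB to over 100 KB each, about 600 KB in total.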

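In my tests, the single-threaded run took about an hour, while the multithreaded run took about 12 minutes. Toward the end only the longest document was still running, so the longest file sets the total time!! Without it, the run would have been much shorter!!!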

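I used to think multithreading and parallel computing were seriously impressive, but once I had actually done it, it turned out to be almost too simple. It basically comes down to wrapping the code that has to be executed repeatedly in the run() method. One very basic point about multithreading: a child thread's execution does not block the main thread. Once that sinks in, you are most of the way there.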
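To make that point concrete, here is a minimal sketch that is not part of the translation program. The class names `StartThenJoinDemo` and `Worker` are made up for illustration, and the "work" is just a string prefix rather than an API call. It shows start() returning immediately, and join() being called only after every thread has been started, so the threads still run concurrently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StartThenJoinDemo {

    // Hypothetical stand-in for a per-file translation thread.
    static class Worker extends Thread {
        private final String filename; // input passed through a field, since run() takes no arguments
        private String result;         // output, read by the main thread after join()

        Worker(String filename) {
            this.filename = filename;
        }

        @Override
        public void run() {
            // The real work (calling the translation API) would go here.
            result = "ZH" + filename;
        }

        String getResult() {
            return result;
        }
    }

    public static List<String> runAll(List<String> filenames) throws InterruptedException {
        List<Worker> workers = new ArrayList<>();
        for (String name : filenames) {
            Worker w = new Worker(name);
            w.start(); // returns immediately; run() executes on the new thread
            workers.add(w);
        }
        List<String> results = new ArrayList<>();
        for (Worker w : workers) {
            w.join(); // wait only after every thread has been started
            results.add(w.getResult());
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAll(Arrays.asList("a.txt", "b.txt", "c.txt")));
    }
}
```

Joining in a second loop lets the main thread know when all the work is finished without serializing it.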

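import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang.StringEscapeUtils;
// TransApi comes from Baidu's Java demo; import it from whichever package you put it in

// "extends Thread" makes this class a thread
public class MultiTranslate extends Thread{
	private static final String APP_ID = "your APP_ID";
	private static final String SECURITY_KEY = "your API secret key";
	// Thread.run() takes no parameters, so the input is passed in through a field
	private final String filename;
	public MultiTranslate(String filename) {
		super();
		this.filename = filename;
	}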
	
	@Override
	public void run() {
		String threadName=Thread.currentThread().getName();
		// Create the Baidu Translate API client
		TransApi api = new TransApi(APP_ID, SECURITY_KEY);
		// Locate the document to translate. The documents live in the project's source folder in Eclipse,
		// so they are loaded from the classpath; filename is the file name
		String url = MultiTranslate.class.getClassLoader().getResource(filename).getPath();
		// The API returns JSON with several fields; this regex keeps only the translation ("dst")
		String questionRegex = "\"dst\":\"(.*)\"";
		Pattern pattern = Pattern.compile(questionRegex);
		try {
			String newFile = URLDecoder.decode(url, "UTF-8");
			// try-with-resources closes both streams even if a translation call fails.
			// The translation goes to a new file whose name carries the prefix "ZH" (opened in append mode)
			try (BufferedReader reader = new BufferedReader(new InputStreamReader(
					new FileInputStream(newFile), "utf-8"));
					FileOutputStream out = new FileOutputStream(new File("ZH"+filename),true)) {
				String line = null;
				while ((line = reader.readLine()) != null) {
					String[] words=line.split(" ");
					StringBuilder translated=new StringBuilder();
					for(int i=0;i<words.length;i++){
						// Call the API to translate one word
						String get = new String(api.getTransResult(words[i], "auto", "zh"));
						Matcher matcher = pattern.matcher(get);
						while (matcher.find()) {
							String getword = matcher.group(1);
							// The API returns Unicode escapes (\\uXXXX);
							// StringEscapeUtils.unescapeJava() from org.apache.commons.lang decodes them
							String newgetword = StringEscapeUtils.unescapeJava(getword);
							translated.append(newgetword+" ");
						}
					}
					String newline=translated.toString().trim()+"\n";
					System.out.println(filename+" "+threadName+" newline="+newline);
					out.write(newline.getBytes("utf-8"));
				}
			}
		} catch (Exception e) {
			// Keep the original exception as the cause so the real error is not lost
			throw new RuntimeException("Failed to translate document: "+filename, e);
		}
	}
	
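	public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException {
		// Folder containing the documents; run() loads them by name from the classpath
		String path="xxx";
		ArrayList<String> filenameList=getAllFile(path);
		for (int i = 0; i < filenameList.size(); i++) {
			// One thread per document: the loop creates the threads, and each runs its own translation task
			Thread myThread=new MultiTranslate(filenameList.get(i));
			// start() launches a new thread, which then executes run()
			myThread.start();
			// Calling join() here, right after start(), would make the main thread wait for each child
			// to finish before starting the next one, so the program would effectively be single-threaded.
			//myThread.join();
		}
	}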
	/**
	 * Collects the names of all files under a folder, including files in subfolders.
	 * 
	 * @param filepath
	 *            path of the folder
	 * @return the file names, without their directories
	 * @throws FileNotFoundException
	 * @throws IOException
	 */
	public static ArrayList<String> getAllFile(String filepath)
			throws FileNotFoundException, IOException {
		File file = new File(filepath);
		ArrayList<String> pathList=new ArrayList<String>();
		// listFiles() returns null when the path is not a readable directory
		File[] filelist = file.listFiles();
		if (filelist == null) {
			throw new FileNotFoundException("Not a readable directory: " + filepath);
		}
		for (File onefile : filelist) {
			if (onefile.isDirectory()) {
				// Keep the names found by the recursive call instead of discarding them
				pathList.addAll(getAllFile(onefile.getPath()));
			} else {
				pathList.add(getDocName(onefile.getPath()));
			}
		}
		return pathList;
	}
	
	// Returns the document's file name, i.e. the last component of the path
	public static String getDocName(String path) {
		// File.getName() uses the platform's path separator, so this is not tied to Windows backslashes
		return new File(path).getName().trim();
	}
}
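One way to keep the longest document from dominating the run time is to hand out work per line rather than per file, using a fixed-size thread pool from java.util.concurrent. The following is only a sketch under assumptions: `PooledTranslateSketch` and `translateAll` are made-up names, and `translateLine` is a hypothetical stand-in for the real API call (the example passes a stub that upper-cases text, so it runs without network access). A fixed pool also caps how many requests are in flight at once, which may matter if your account has a rate limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class PooledTranslateSketch {

    /**
     * Translates lines with a fixed number of worker threads and returns the
     * results in the original order. translateLine is a hypothetical stand-in
     * for the real translation call.
     */
    public static List<String> translateAll(List<String> lines, int threads,
            Function<String, String> translateLine)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String line : lines) {
                // One task per line, so a long document is spread over all workers
                Callable<String> task = () -> translateLine.apply(line);
                futures.add(pool.submit(task));
            }
            List<String> translated = new ArrayList<>();
            for (Future<String> f : futures) {
                // get() blocks until that task is done; reading in submit order keeps line order
                translated.add(f.get());
            }
            return translated;
        } finally {
            pool.shutdown(); // let the worker threads exit once the queue is empty
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = Arrays.asList("first line", "second line", "third line");
        // Stub translator: upper-cases the text instead of calling any API
        List<String> out = translateAll(lines, 2, s -> s.toUpperCase());
        for (String s : out) {
            System.out.println(s);
        }
    }
}
```

Because the futures are read back in the order they were submitted, the output lines stay in document order even though the tasks finish in arbitrary order.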

Please credit the original source when reposting. Thank you.
