Detecting Chinese characters and Chinese punctuation in Java

In Java, Chinese text is usually detected with a regular expression:


    import java.util.regex.Pattern;

    Pattern pattern = Pattern.compile("[\u4e00-\u9fcc]+");
    // find() returns true if any part of str falls in the range
    System.out.println(pattern.matcher(str).find());

Or, to test whether the whole string consists of Chinese characters:


    // matches() returns true only if the entire string falls in the range
    System.out.println(str.matches("[\u4e00-\u9fcc]+"));
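The two snippets above do not answer the same question: find() succeeds when any substring matches, while matches() requires the whole input to match. A minimal sketch of the difference follows; the class and method names are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

final class FindVersusMatches {
    // Same character range as in the snippets above.
    private static final Pattern CHINESE = Pattern.compile("[\u4e00-\u9fcc]+");

    /** True if the string contains at least one character in the range. */
    static boolean containsChinese(String s) {
        return CHINESE.matcher(s).find();
    }

    /** True only if the string is non-empty and every character is in the range. */
    static boolean isAllChinese(String s) {
        return CHINESE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String mixed = "Java\u4e2d\u6587";   // "Java" followed by U+4E2D U+6587
        String pure = "\u4e2d\u6587";        // U+4E2D U+6587 only

        System.out.println(containsChinese(mixed)); // true
        System.out.println(isAllChinese(mixed));    // false
        System.out.println(isAllChinese(pure));     // true
    }
}
```

Compiling the pattern once into a constant also avoids recompiling it on every call, which String.matches does internally.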


 
 

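This approach misses some rare Chinese characters, because the range U+4E00 to U+9FCC lies entirely within the basic CJK Unified Ideographs block and excludes the extension blocks. It also cannot match Chinese punctuation such as “ ” , ; and so on.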
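The limitation can be reproduced with a few sample code points. The sketch below compares the fixed range with the Han script property `\p{IsHan}`, which java.util.regex supports since Java 7. That property is not part of the original approach and is shown only as one possible alternative; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class RegexRangeLimits {
    // The fixed range from the text: U+4E00 to U+9FCC only.
    private static final Pattern FIXED_RANGE = Pattern.compile("[\u4e00-\u9fcc]");

    // Assumption for illustration: match by Unicode script instead of a fixed range.
    private static final Pattern HAN_SCRIPT = Pattern.compile("\\p{IsHan}");

    static boolean inFixedRange(String s) {
        return FIXED_RANGE.matcher(s).find();
    }

    static boolean inHanScript(String s) {
        return HAN_SCRIPT.matcher(s).find();
    }

    public static void main(String[] args) {
        String extensionA = "\u3400";          // U+3400, CJK Extension A
        String extensionB = "\uD840\uDC00";    // U+20000, CJK Extension B (surrogate pair)
        String fullwidthComma = "\uFF0C";      // U+FF0C, fullwidth comma

        System.out.println(inFixedRange(extensionA));     // false
        System.out.println(inFixedRange(extensionB));     // false
        System.out.println(inHanScript(extensionA));      // true
        System.out.println(inHanScript(extensionB));      // true
        System.out.println(inFixedRange(fullwidthComma)); // false
        System.out.println(inHanScript(fullwidthComma));  // false
    }
}
```

Neither pattern matches the fullwidth comma, so punctuation still needs a separate check.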

In Java, character-related functionality is handled mainly by the Character class. Here we use the nested types it provides to detect Chinese.

1. Character.UnicodeBlock

Detecting Chinese characters:

    private static boolean isChineseByBlock(char c) {
        // Note: a single char covers only the Basic Multilingual Plane. Extensions B, C, D
        // and the Compatibility Ideographs Supplement lie in supplementary planes, so those
        // checks can only succeed when the block is looked up by code point: UnicodeBlock.of(int).
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C // since JDK 1.7
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D // since JDK 1.7
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT;
    }
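Because the method above takes a char, it sees supplementary-plane ideographs only as two separate surrogate values. The following is a minimal sketch of a code-point-based variant, assuming Java 8 or later for String.codePoints(). The class name, method names, and the use of a set are illustrative choices, not part of the original code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class ChineseBlocks {
    // The same blocks as in the char-based method, collected in a set.
    private static final Set<Character.UnicodeBlock> IDEOGRAPH_BLOCKS = new HashSet<>(Arrays.asList(
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS,
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A,
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B,
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C,
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D,
            Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS,
            Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS_SUPPLEMENT));

    /** Looks up the block by full code point, so supplementary planes are covered. */
    static boolean isChineseCodePoint(int codePoint) {
        // UnicodeBlock.of returns null for code points outside any block;
        // contains(null) is simply false for this set.
        return IDEOGRAPH_BLOCKS.contains(Character.UnicodeBlock.of(codePoint));
    }

    /** True if any code point of the string lies in one of the ideograph blocks. */
    static boolean containsChinese(String s) {
        return s.codePoints().anyMatch(ChineseBlocks::isChineseCodePoint);
    }

    public static void main(String[] args) {
        String extensionB = "\uD840\uDC00"; // U+20000 as a surrogate pair

        System.out.println(isChineseCodePoint(extensionB.charAt(0)));      // false
        System.out.println(isChineseCodePoint(extensionB.codePointAt(0))); // true
        System.out.println(containsChinese("abc" + extensionB));           // true
    }
}
```

The first call in main passes only the high surrogate, which belongs to the surrogate block rather than to any ideograph block. That is exactly what a char-based check sees.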



Detecting Chinese punctuation:

    private static boolean isChinesePunctuation(char c) {
        // These blocks are broader than punctuation alone; for example,
        // HALFWIDTH_AND_FULLWIDTH_FORMS also contains fullwidth Latin letters and digits.
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        return ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS
                || ub == Character.UnicodeBlock.VERTICAL_FORMS; // since JDK 1.7
    }
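As a usage illustration, the same block test can be used to strip Chinese punctuation from a string. This is a sketch under stated assumptions: it needs Java 8 or later for String.codePoints(), and the class and helper names are hypothetical. It repeats the block check so that the example is self-contained.

```java
final class ChinesePunctuationDemo {
    /** Same five blocks as the method above, looked up by code point. */
    static boolean isInPunctuationBlock(int codePoint) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(codePoint);
        return ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS
                || ub == Character.UnicodeBlock.VERTICAL_FORMS;
    }

    /** Returns the input with every code point from those blocks removed. */
    static String stripPunctuation(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints()
                .filter(cp -> !isInPunctuationBlock(cp))
                .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // U+4F60 U+597D, fullwidth comma U+FF0C, U+4E16 U+754C, fullwidth exclamation mark U+FF01
        String greeting = "\u4F60\u597D\uFF0C\u4E16\u754C\uFF01";
        System.out.println(stripPunctuation(greeting).length()); // 4

        // Caveat: fullwidth Latin capital A (U+FF21) is in the same block as the fullwidth comma.
        System.out.println(isInPunctuationBlock(0xFF21)); // true
        // ASCII punctuation is in BASIC_LATIN and is not matched.
        System.out.println(isInPunctuationBlock(','));    // false
    }
}
```

If fullwidth letters and digits must be kept, the block test would need an additional filter, for example on Character.isLetterOrDigit.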


2. Character.UnicodeScript (available since JDK 1.7)

Character.UnicodeScript provides a more concise way to detect Chinese characters:


    private static boolean isChineseByScript(char c) {
        // UnicodeScript.of takes an int code point; the char argument is widened automatically
        Character.UnicodeScript sc = Character.UnicodeScript.of(c);
        return sc == Character.UnicodeScript.HAN;
    }
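The script test can also be applied to a whole string by iterating over code points, so that ideographs outside the Basic Multilingual Plane are counted as well. A minimal sketch follows, assuming Java 8 or later; the class and method names are hypothetical. Note that the Han script covers the ideographs shared by Chinese, Japanese, and Korean text, so this check alone cannot distinguish Chinese from Japanese kanji.

```java
final class HanScriptDemo {
    /** True if the code point belongs to the Han script. */
    static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    /** Counts the Han code points in a string (a surrogate pair counts once). */
    static long countHan(String s) {
        return s.codePoints().filter(HanScriptDemo::isHan).count();
    }

    public static void main(String[] args) {
        // "Java", U+4E2D, U+6587, fullwidth comma U+FF0C, then U+20000 as a surrogate pair
        String text = "Java\u4E2D\u6587\uFF0C\uD840\uDC00";
        System.out.println(countHan(text)); // 3

        System.out.println(isHan(0x3042));  // false: hiragana letter A is not Han
    }
}
```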


Reposted from blog.csdn.net/seabiscuityj/article/details/80338891