Converting MySQL latin1-encoded data to UTF-8

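MySQL's latin1 is not the same as standard latin1 (ISO-8859-1), nor is it exactly cp1252. Compared with ISO-8859-1, it assigns characters to the range 0x80-0x9f (for example, 0x80 is the Euro sign). Compared with cp1252, it also defines the five bytes that cp1252 leaves unassigned: 0x81, 0x8d, 0x8f, 0x90 and 0x9d.

The official documentation describes this: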

http://dev.mysql.com/doc/refman/5.0/en/charset-we-sets.html
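
To see the gap from the Java side, the following minimal sketch decodes the five bytes, plus 0x80, with Java's built-in charsets and prints the resulting code points. The class name Latin1Gaps and the helper decodeByte are made up for this illustration, and the sketch assumes the JDK in use ships the windows-1252 charset (the common ones do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1Gaps {
    // The five bytes that cp1252 leaves unassigned but MySQL latin1 maps to U+0081, U+008D, ...
    static final int[] GAP_BYTES = {0x81, 0x8d, 0x8f, 0x90, 0x9d};

    // Decode a single byte with the given charset and return the resulting char.
    public static char decodeByte(int b, Charset cs) {
        return new String(new byte[] {(byte) b}, cs).charAt(0);
    }

    public static void main(String[] args) {
        // Assumption: "windows-1252" is available in this JDK.
        Charset cp1252 = Charset.forName("windows-1252");
        Charset iso = StandardCharsets.ISO_8859_1;

        for (int b : GAP_BYTES) {
            System.out.printf("0x%02x  cp1252 -> U+%04X  iso-8859-1 -> U+%04X%n",
                    b, (int) decodeByte(b, cp1252), (int) decodeByte(b, iso));
        }
        // 0x80 is the Euro sign in cp1252 and MySQL latin1, but a control code in ISO-8859-1.
        System.out.printf("0x80  cp1252 -> U+%04X  iso-8859-1 -> U+%04X%n",
                (int) decodeByte(0x80, cp1252), (int) decodeByte(0x80, iso));
    }
}
```

Neither built-in charset matches MySQL on both counts: ISO-8859-1 keeps the five bytes as U+0081 and so on but has no Euro sign at 0x80, while cp1252 has the Euro sign but does not map the five bytes to those code points.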

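As a result, recovering the original bytes in Java with the standard ISO-8859-1 or cp1252 charset, i.e. s.getBytes("iso-8859-1") or s.getBytes("cp1252"), can produce garbled text. The following method works around this.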

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

    public String convertCharset(String s)
    {
        if (s == null)
        {
            return null;
        }
        Charset cp1252 = Charset.forName("windows-1252");
        int length = s.length();
        byte[] buffer = new byte[length];
        for (int i = 0; i < length; ++i)
        {
            char c = s.charAt(i);
            // MySQL maps 0x81 to Unicode 0x0081, 0x8d to 0x008d, 0x8f to 0x008f,
            // 0x90 to 0x0090, and 0x9d to 0x009d, so these chars map straight back to bytes.
            if (c == 0x0081 || c == 0x008d || c == 0x008f || c == 0x0090 || c == 0x009d)
            {
                buffer[i] = (byte) c;
            }
            else
            {
                // Every other char goes through cp1252; unmappable chars become '?'.
                buffer[i] = Character.toString(c).getBytes(cp1252)[0];
            }
        }
        // The recovered bytes are the UTF-8 data that was originally stored.
        return new String(buffer, StandardCharsets.UTF_8);
    }
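
As a usage illustration, here is a small self-contained sketch that applies the same idea to a garbled right double quotation mark (U+201D). Its UTF-8 bytes are E2 80 9D, and 0x9d is one of the five special bytes, so it is a case where the special handling matters. The class name MysqlLatin1Demo and the helper naiveConvert are hypothetical names for this example, and convertCharset is a compact static variant of the method above rather than a different algorithm.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MysqlLatin1Demo {
    // Assumption: "windows-1252" is available in this JDK.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Map each char back to the byte MySQL latin1 stored, then decode the bytes as UTF-8.
    public static String convertCharset(String s) {
        if (s == null) {
            return null;
        }
        byte[] buffer = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean gap = c == 0x81 || c == 0x8d || c == 0x8f || c == 0x90 || c == 0x9d;
            buffer[i] = gap ? (byte) c : String.valueOf(c).getBytes(CP1252)[0];
        }
        return new String(buffer, StandardCharsets.UTF_8);
    }

    // For comparison only: re-encode with plain cp1252, without the five special cases.
    public static String naiveConvert(String s) {
        return new String(s.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Bytes E2 80 9D read as MySQL latin1: U+00E2, U+20AC (Euro sign), U+009D.
        String garbled = "\u00e2\u20ac\u009d";

        System.out.println("special-cased conversion recovers U+201D: "
                + "\u201d".equals(convertCharset(garbled)));
        System.out.println("plain cp1252 conversion recovers U+201D:  "
                + "\u201d".equals(naiveConvert(garbled)));
    }
}
```

Strings that contain none of the five characters behave the same under both helpers, so the difference only shows up for UTF-8 sequences that include one of those bytes.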
