Java String.getBytes() Explained

Basic concepts

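In JVM memory, a String is represented as Unicode text (exposed through the API as a sequence of UTF-16 code units), independent of any external byte encoding.
UTF-8 is one of the ways to encode Unicode as bytes.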
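
A minimal sketch of this distinction, assuming the GBK charset is available in the runtime (it ships with standard JDK builds, but it is not one of the charsets every Java platform must support). The class name `EncodingLengths` and the helper `encodedLength` are made up for illustration. The same two-character string has one in-memory length but a different byte length for each target charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingLengths {
    // Number of bytes the text occupies once encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d"; // "你好", written with escapes so the source file encoding does not matter

        System.out.println(text.length());                                     // 2 UTF-16 code units
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));       // 6 (3 bytes per character)
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE));    // 4 (2 bytes per character)
        System.out.println(encodedLength(text, Charset.forName("GBK")));       // 4 (2 bytes per character)
    }
}
```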

JDK

    /**
     * Encodes this {@code String} into a sequence of bytes using the named
     * charset, storing the result into a new byte array.
     */
    public byte[] getBytes(String charsetName) throws UnsupportedEncodingException {
        // implementation omitted
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the byte array.
     */
    public String(byte bytes[], Charset charset) {
        // implementation omitted
    }

getBytes(String charsetName)

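Encodes the string with the charset named by charsetName (Unicode → charsetName) and returns the resulting bytes.
getBytes() with no argument encodes with the platform default charset.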
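
A hedged sketch of the two behaviours described above. It uses the `getBytes(Charset)` overload with `StandardCharsets`, which avoids the checked `UnsupportedEncodingException`, and compares the no-argument `getBytes()` against `Charset.defaultCharset()`. The class and method names are illustrative only. Note that the default charset depends on the environment: JDK 18 and later default to UTF-8 unless overridden, while earlier versions derive it from the operating system locale, so code that relies on `getBytes()` can behave differently across machines.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class GetBytesDemo {
    // Encode with an explicit charset; the Charset overload declares no checked exception.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // True when the no-argument getBytes() matches encoding with the platform default charset.
    static boolean defaultMatches(String text) {
        return Arrays.equals(text.getBytes(), text.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d"; // "你好"

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println(Arrays.toString(encodeUtf8(text))); // [-28, -67, -96, -27, -91, -67]
        System.out.println(defaultMatches(text));              // true
    }
}
```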

String(byte bytes[], Charset charset)

Decodes the bytes with charset (charset → Unicode) and returns the resulting string.
String(byte bytes[]) decodes with the platform default charset.
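
As an illustration of decoding, the sketch below starts from raw bytes and turns them back into a string. The byte values are the UTF-8 and GBK encodings of "你好"; the class name `DecodeDemo` and the helper `decode` are hypothetical, and GBK is assumed to be available in the runtime. The same text is recovered from two different byte sequences, provided each is decoded with the charset that produced it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Decode raw bytes into a String using the charset they were encoded with.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "你好" encoded as UTF-8 (6 bytes) and as GBK (4 bytes).
        byte[] utf8Bytes = {(byte) 0xE4, (byte) 0xBD, (byte) 0xA0, (byte) 0xE5, (byte) 0xA5, (byte) 0xBD};
        byte[] gbkBytes = {(byte) 0xC4, (byte) 0xE3, (byte) 0xBA, (byte) 0xC3};

        String fromUtf8 = decode(utf8Bytes, StandardCharsets.UTF_8);
        String fromGbk = decode(gbkBytes, Charset.forName("GBK"));

        System.out.println(fromUtf8);                 // 你好
        System.out.println(fromUtf8.equals(fromGbk)); // true
    }
}
```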

Examples

Correct usage

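String s = "浣犲ソ"; // mojibake: the UTF-8 bytes of "你好" wrongly decoded as GBK
String ss = new String(s.getBytes("GBK"), "UTF-8"); // re-encode as GBK to recover the original bytes, then decode them as UTF-8
System.out.println(ss); // 你好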

String str = "你好";
System.out.println(new String(str.getBytes("UTF-8"), "UTF-8")); // same charset for encoding and decoding: prints 你好
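
The repair trick in the first snippet can be sketched end to end as follows. The class `MojibakeRepair` and its methods `garble` and `repair` are hypothetical names, and GBK is assumed to be available. The repair works here only because every pair of UTF-8 bytes of "你好" happens to form a valid GBK character, so the wrong decoding loses no information; that does not hold for arbitrary text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    static final Charset GBK = Charset.forName("GBK");

    // Simulates the mistake: UTF-8 bytes read as if they were GBK.
    static String garble(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), GBK);
    }

    // Undoes it: re-encode as GBK to get the UTF-8 bytes back, then decode them correctly.
    static String repair(String garbled) {
        return new String(garbled.getBytes(GBK), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4f60\u597d"; // "你好"
        String garbled = garble(original);

        System.out.println(garbled.equals(original));         // false: the text is now mojibake
        System.out.println(repair(garbled).equals(original)); // true: the round trip was lossless
    }
}
```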

Incorrect usage

System.out.println(new String(str.getBytes("UTF-8"), "GBK")); // charsets differ: with str = "你好" this prints the mojibake 浣犲ソ
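
A mismatch is not always reversible. In the sketch below (class and method names are illustrative, GBK assumed available), the GBK bytes of "你好" are decoded as UTF-8. They are not well-formed UTF-8, so `new String` silently substitutes the replacement character U+FFFD and the original bytes are lost. One way to catch this early, shown here as an option and not as the only approach, is a strict `CharsetDecoder` that reports malformed input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class MismatchDemo {
    // Strict check: true only if the bytes are well-formed in the given charset.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d"; // "你好"
        byte[] gbkBytes = text.getBytes(Charset.forName("GBK"));

        // Lenient decoding with the wrong charset replaces bad bytes with U+FFFD.
        String wrong = new String(gbkBytes, StandardCharsets.UTF_8);
        System.out.println(wrong.contains("\uFFFD"));                     // true

        // Strict decoding reports the problem instead of hiding it.
        System.out.println(isValid(gbkBytes, StandardCharsets.UTF_8));    // false
        System.out.println(isValid(gbkBytes, Charset.forName("GBK")));    // true
    }
}
```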

