Chinese encoding issues when using Python's json module

json.loads()

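  1. json.loads() converts a JSON string into Python objects according to the following mapping:

    [Figure: mapping table from JSON types to Python types]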
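  2. In Python 2, json.loads() by default decodes a str input as utf-8 and returns unicode objects. If the Chinese text in the input is not utf-8 encoded, pass the name of its encoding, like this:
json_instance = json.loads(input, encoding="gbk")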

The relevant part of this function's docstring reads:

If ``s`` is a ``str`` instance and is encoded with an ASCII based encoding other than utf-8 (e.g. latin-1) then an appropriate ``encoding`` name must be specified. Encodings that are not ASCII based (such as UCS-2) are not allowed and should be decoded to ``unicode`` first.

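In other words, if the string's encoding is not ASCII-based (utf-8 is ASCII-based), the string must first be decoded to unicode, and only then passed to this function to produce Python objects.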
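
The docstring quoted above describes Python 2, where `str` is a byte string. On recent Python 3 versions `json.loads()` no longer takes an `encoding` argument, so decoding first is the general route. A minimal sketch, assuming Python 3 and made-up payloads (the names `text`, `raw_gbk` and `raw_utf16` are illustrative, not from the original post):

```python
import json

# Hypothetical payloads: the same JSON text encoded two different ways.
text = '{"name": "中文", "tags": ["a", "b"], "extra": null}'
raw_gbk = text.encode("gbk")        # ASCII-based, but not utf-8
raw_utf16 = text.encode("utf-16")   # not ASCII-based

# Decode the bytes to a text string first, then parse the text.
from_gbk = json.loads(raw_gbk.decode("gbk"))
from_utf16 = json.loads(raw_utf16.decode("utf-16"))

print(from_gbk["name"])
print(from_gbk == from_utf16)
```

Both calls should yield the same dict, since the parser only ever sees already-decoded text. Which encoding to decode with still has to be known from the data source.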

json.dumps()

  1. The default encoding of json.dumps() is also "utf-8".
  2. The ensure_ascii parameter
    Defaults to True. Its meaning is as follows:
If ensure_ascii is true (the default), all non-ASCII characters in the output are escaped with \uXXXX sequences, and the result is a str instance consisting of ASCII characters only. If ensure_ascii is false, some chunks written to fp may be unicode instances. This usually happens because the input contains unicode strings or the encoding parameter is used. Unless fp.write() explicitly understands unicode (as in codecs.getwriter()) this is likely to cause an error.
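  3. The separators parameter
    A tuple of (item_separator, key_separator) that sets the separators used in the JSON string. When indent is not set, the default is (', ', ': '), which includes spaces; to drop the trailing spaces, pass (',', ':').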
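
The effect of ensure_ascii and separators can be seen in a short sketch. It assumes Python 3, where json.dumps() always returns a text `str`, and uses an illustrative `data` dict that is not from the original post:

```python
import json

data = {"name": "中文", "id": 1}   # illustrative sample data

# Default (ensure_ascii=True): non-ASCII characters are escaped as \uXXXX.
escaped = json.dumps(data)
print(escaped)    # {"name": "\u4e2d\u6587", "id": 1}

# ensure_ascii=False keeps the Chinese characters as they are.
readable = json.dumps(data, ensure_ascii=False)
print(readable)   # {"name": "中文", "id": 1}

# Compact separators drop the space after ',' and ':'.
compact = json.dumps(data, ensure_ascii=False, separators=(",", ":"))
print(compact)    # {"name":"中文","id":1}
```

All three strings parse back to the same dict; the options only change how the output is written, not what it means.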