jsonpath is used to parse deeply nested JSON data.
Official jsonpath documentation
Installation
pip install jsonpath
Syntax
JSONPath | Description |
---|---|
$ | Root node |
. or [] | Child node |
.. | Recursive descent: selects every matching node, regardless of its position |
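To make the difference between `.` and `..` concrete, here is a minimal pure-Python sketch. The helper `find_all` is hypothetical and written only for illustration; it is not part of the jsonpath package and not how the library is implemented. The sample data is a cut-down stand-in for a nested dictionary.

```python
def find_all(node, key):
    """Yield every value stored under `key`, at any depth (like `$..key`)."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain more matches.
            yield from find_all(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_all(item, key)


# Illustrative data only.
data = {"store": {"book": [{"price": 8.95}, {"price": 12.99}],
                  "bicycle": {"price": 19.95}}}

# `$.store.bicycle.price`: each `.` step is one plain lookup from the root.
print(data["store"]["bicycle"]["price"])   # 19.95

# `$..price`: collect the key wherever it appears.
print(list(find_all(data, "price")))       # [8.95, 12.99, 19.95]
```

A `.` path names every step explicitly, while `..` skips the intermediate structure, which is what makes it convenient for deeply nested responses.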
Usage
The root node of a dictionary is its outermost pair of braces.
jsonpath()
Returns a list of matching results.
import jsonpath
dict_data = {
    "store": {
        "book": [
            {"category": "reference",
             "author": "Nigel Rees",
             "title": "Sayings of the Century",
             "price": 8.95},
            {"category": "fiction",
             "author": "Evelyn Waugh",
             "title": "Sword of Honour",
             "price": 12.99},
            {"category": "fiction",
             "author": "Herman Melville",
             "title": "Moby Dick",
             "isbn": "0-553-21311-3",
             "price": 8.99},
            {"category": "fiction",
             "author": "J. R. R. Tolkien",
             "title": "The Lord of the Rings",
             "isbn": "0-395-19395-8",
             "price": 22.99},
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95,
        },
    }
}
print(jsonpath.jsonpath(dict_data, "$.store.bicycle.price"))
>>[19.95]
print(jsonpath.jsonpath(dict_data, "$..price"))
>>[8.95, 12.99, 8.99, 22.99, 19.95]
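One detail worth keeping in mind: as far as I know, the jsonpath package returns `False`, not an empty list, when an expression matches nothing, so looping over the result directly can raise a `TypeError`. The sketch below shows one way to guard against that. The helper name `as_list` is hypothetical and not part of the library.

```python
def as_list(result):
    """Normalize a jsonpath() return value: False (no match) becomes []."""
    return [] if result is False else result


# With the library installed, usage would look like (assumption):
#   prices = as_list(jsonpath.jsonpath(dict_data, "$..price"))
print(as_list(False))      # []
print(as_list([19.95]))    # [19.95]
```

Wrapping the call this way keeps later `for` loops safe without changing the result when matches are found.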
Exercise
Scrape the data for European and American films listed under bilibili's movie category.
import json
import jsonpath
import requests

# Ranking API for the partition used in this exercise (tid=145), page 1
url = "https://api.bilibili.com/archive_rank/getarchiverankbypartion?jsonp=jsonp&tid=145&pn=1"
headers = {"User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Mobile Safari/537.36"}
response = requests.get(url, headers=headers)
html_dict = json.loads(response.content)
# jsonpath() returns False when nothing matches, so fall back to an empty list
movie = jsonpath.jsonpath(html_dict, "$..data..archives..title") or []
# Open the file once and append one title per line
with open("bilibili.txt", "a", encoding="utf-8") as f:
    for i in movie:
        f.write(i + "\n")