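While idly browsing some data today, I happened to notice that the 7x24 live feed on Xuangubao (选股宝) and the live news feed on Wallstreetcn (华尔街见闻) are served by the same API, as follows:
Wallstreetcn top news page: 华尔街见闻
Xuangubao: 选股宝
Endpoint requested: https://api-prod.wallstreetcn.com/apiv1/content/lives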
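About the parameters: this run scrapes Wallstreetcn's live news. The request takes a channel parameter, so you can choose whichever channel you want to scrape, or add some concurrency and fetch several channels at once. Its value can be global-channel, a-stock-channel, us-stock-channel, forex-channel, commodity-channel or blockchain-channel, which correspond to the Top News, A-shares, US Stocks, Forex, Commodities and Blockchain sections on Wallstreetcn.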
pc_params = {'channel': 'global-channel',
             'client': 'pc',
             'cursor': 0,
             'limit': 40}
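To make switching channels less error-prone, the channel values listed above can be kept in a small lookup table. The following is only a sketch: the names CHANNELS and build_params are hypothetical helpers introduced here for illustration, and the English section labels are my own translations, not identifiers used by the API.

```python
# Channel values taken from the list above, mapped to the site section
# each one feeds. The labels are translations for readability only.
CHANNELS = {
    'global-channel': 'Top News',
    'a-stock-channel': 'A-shares',
    'us-stock-channel': 'US Stocks',
    'forex-channel': 'Forex',
    'commodity-channel': 'Commodities',
    'blockchain-channel': 'Blockchain',
}


def build_params(channel='global-channel', cursor=0, limit=40):
    """Return the query parameters for one page of one channel.

    Hypothetical helper: it only rebuilds the same dict as pc_params,
    rejecting channel names that are not in the list above.
    """
    if channel not in CHANNELS:
        raise ValueError('unknown channel: %r' % (channel,))
    return {'channel': channel, 'client': 'pc', 'cursor': cursor, 'limit': limit}


if __name__ == '__main__':
    # Same dict as pc_params, but for the A-shares channel
    print(build_params('a-stock-channel'))
```

Calling build_params() with no arguments gives the same dictionary as pc_params, so it can be dropped into the request unchanged.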
Code:
import time
from collections import OrderedDict

import pandas as pd
import requests


def getNewsDetail(item_list):
    """Extract the time, id and text of each raw news item."""
    news_list = []
    for item in item_list:
        news = OrderedDict()
        # display_time is passed to time.localtime, i.e. treated as epoch seconds
        news['time'] = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(item['display_time']))
        news['id'] = item['id']
        news['content'] = item['content_text']
        news_list.append(news)
    return news_list

APIurl = 'https://api-prod.wallstreetcn.com/apiv1/content/lives'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
           'Accept': 'application/json, text/plain, */*'}
pc_params = {'channel': 'global-channel',
             'client': 'pc',
             'cursor': 0,
             'limit': 40}

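news_list = []
# Fetch 10 pages; each response carries the cursor of the next page
for _ in range(10):
    # TLS verification is left on (the default); a timeout avoids hanging forever
    resp = requests.get(APIurl, headers=headers, params=pc_params, timeout=10)
    resp.raise_for_status()
    content = resp.json()['data']
    pc_params['cursor'] = content['next_cursor']
    news_list.extend(getNewsDetail(content['items']))

# print(news_list)
df = pd.DataFrame(news_list)
# Writing .xlsx needs an Excel engine such as openpyxl to be installed
df.to_excel('华尔街.xlsx')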
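Running the script generates a file named 华尔街.xlsx in the current directory, containing the time, id and content of each scraped news item.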
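The idea of fetching several channels at once, mentioned in the parameter notes, could be sketched roughly as follows. This is only one possible approach, not part of the original script: it assumes the response shape used by the code above (a data object with items and next_cursor), uses the standard library's urllib and ThreadPoolExecutor rather than requests, and the function names fetch_page, fetch_channel and fetch_all are hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = 'https://api-prod.wallstreetcn.com/apiv1/content/lives'


def fetch_page(channel, cursor, limit=40):
    """Fetch one page of one channel and return its 'data' object."""
    query = urlencode({'channel': channel, 'client': 'pc',
                       'cursor': cursor, 'limit': limit})
    req = Request(API_URL + '?' + query, headers={'User-Agent': 'Mozilla/5.0'})
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)['data']


def fetch_channel(channel, pages=10, fetch=fetch_page):
    """Follow next_cursor for `pages` pages and collect the raw items."""
    items, cursor = [], 0
    for _ in range(pages):
        data = fetch(channel, cursor)
        items.extend(data['items'])
        cursor = data['next_cursor']
    return items


def fetch_all(channels, pages=10, fetch=fetch_page):
    """Fetch each channel in its own thread; returns {channel: items}."""
    with ThreadPoolExecutor(max_workers=max(1, len(channels))) as pool:
        futures = {ch: pool.submit(fetch_channel, ch, pages, fetch)
                   for ch in channels}
        return {ch: fut.result() for ch, fut in futures.items()}


if __name__ == '__main__':
    result = fetch_all(['global-channel', 'a-stock-channel'], pages=2)
    for channel, items in result.items():
        print(channel, len(items))
```

Threads suit this job because the work is almost entirely waiting on the network. The fetch argument is there so the paging logic can be exercised with a stub instead of the live API; each channel's items can then be passed through getNewsDetail and written to its own sheet or file.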