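All About plotly (4)
This is the fourth installment of the All About plotly series. It summarizes the scientific charts most often used in the official plotly documentation, mainly choropleth maps and Mapbox-based map annotation.
Note first that plotly.figure_factory requires plotly 2.5.1 or later, plus several packages that K-lab does not ship by default: shapely==1.6.3, geopandas==0.3.0, and pyshp==1.2.10. The following code installs everything needed.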
In [ ]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: rgb(51, 51, 51); font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: none; border-radius: 2px; width: 922.313px;">!pip install -i https://pypi.tuna.tsinghua.edu.cn/simple plotly==2.5.1
!pip install -i https://pypi.tuna.tsinghua.edu.cn/simple shapely==1.6.3
!pip install -i https://pypi.tuna.tsinghua.edu.cn/simple geopandas==0.3.0
!pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pyshp==1.2.10
</pre>
In [1]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: rgb(51, 51, 51); font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: none; border-radius: 2px; width: 922.313px;">import plotly
plotly.__version__
</pre>
Out[1]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: inherit; font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: 0px; border-radius: 0px; width: 933.5px; vertical-align: baseline;">'2.5.1'</pre>
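The 2.5.1+ requirement can also be checked in code rather than by eye. The sketch below is only an illustration: `parse_version` and `meets_minimum` are hypothetical helpers, not part of plotly, and they assume a simple dotted version string such as the one returned by `plotly.__version__`.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.5.1' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first piece that has no leading number.
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="2.5.1"):
    """Return True when `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


print(meets_minimum("2.5.1"))
print(meets_minimum("2.4.0"))
```

In a notebook one would pass `plotly.__version__` as `installed`. Comparing tuples of integers avoids the string-comparison trap where '2.10.0' sorts before '2.5.1'.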
Initialize plotly's inline rendering in the notebook.
In [2]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: rgb(51, 51, 51); font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: none; border-radius: 2px; width: 922.313px;">from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected = True)
</pre>
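At present plotly's built-in maps cover only the world, the United States (including states and cities), Europe, Asia, Africa, North America, and South America, so national maps of other countries cannot be embedded.
To choose one, set layout.geo.scope to one of the enumerated values: "world" | "usa" | "europe" | "asia" | "africa" | "north america" | "south america".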
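As a minimal sketch of the setting described above, the layout can be built as a plain dict with geo.scope restricted to the listed values. `VALID_SCOPES` and `make_geo_layout` are hypothetical names introduced for illustration; only the scope values themselves come from the text.

```python
# Scope values listed in the text above.
VALID_SCOPES = ("world", "usa", "europe", "asia", "africa",
                "north america", "south america")


def make_geo_layout(title, scope="world"):
    """Build a plain layout dict with layout.geo.scope set (hypothetical helper)."""
    if scope not in VALID_SCOPES:
        raise ValueError("unsupported scope: %r" % (scope,))
    return dict(title=title, geo=dict(scope=scope))


layout = make_geo_layout("Asia only", scope="asia")
print(layout["geo"]["scope"])
```

The resulting dict can be passed as the `layout` part of a figure dict. Checking the scope up front makes an unsupported value fail with a clear message.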
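Choropleth Maps
First, we use a plotly choropleth map to show how total agricultural export value in 2011 was distributed across US states.
The input data contains each state's two-letter abbreviation, its total export value, and the separate export value of each commodity category.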
In [5]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: rgb(51, 51, 51); font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: none; border-radius: 2px; width: 922.313px;">import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2011_us_ag_exports.csv')
for col in df.columns:
    df[col] = df[col].astype(str)
# Color gradient for the colorbar
colorscale = [[0.0, 'rgb(242,240,247)'], [0.2, 'rgb(218,218,235)'], [0.4, 'rgb(188,189,220)'],
              [0.6, 'rgb(158,154,200)'], [0.8, 'rgb(117,107,177)'], [1.0, 'rgb(84,39,143)']]
# Hover text; '<br>' starts a new line inside the tooltip
df['text'] = (df['state'] + '<br>' +
              'Beef ' + df['beef'] + ' Dairy ' + df['dairy'] + '<br>' +
              'Fruits ' + df['total fruits'] + ' Veggies ' + df['total veggies'] + '<br>' +
              'Wheat ' + df['wheat'] + ' Corn ' + df['corn'])
data = [ dict(
    type='choropleth',
    colorscale = colorscale,
    autocolorscale = False,
    # Locations are given by each state's code, i.e. its abbreviation
    locations = df['code'],
    # Color depth encodes total exports: the higher the total, the deeper the purple
    z = df['total exports'].astype(float),
    locationmode = 'USA-states',
    text = df['text'],
    marker = dict(
        line = dict(
            color = 'rgb(255,255,255)',
            width = 2
        ) ),
    # Colorbar title
    colorbar = dict(
        title = "Millions USD")
) ]
layout = dict(
    title = '2011 US Agricultural Exports by State',
    geo = dict(
        scope='usa',
        projection=dict( type='albers usa' )
    )
)
fig = dict( data=data, layout=layout )
iplot(fig)
</pre>
[Figure: choropleth map of 2011 US agricultural exports by state. Colorbar title: Millions USD, with ticks from 2k to 16k.]
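To make the link between export value and shade concrete, the sketch below interpolates the same six-stop colorscale in plain Python. It assumes that z is normalized linearly between a minimum and a maximum and that colors are blended linearly between neighboring stops; this illustrates the idea and is not plotly's actual rendering code. `parse_rgb` and `color_for` are hypothetical helpers.

```python
import re

# The colorscale used in the choropleth above: [position, color] stops.
COLORSCALE = [[0.0, 'rgb(242,240,247)'], [0.2, 'rgb(218,218,235)'],
              [0.4, 'rgb(188,189,220)'], [0.6, 'rgb(158,154,200)'],
              [0.8, 'rgb(117,107,177)'], [1.0, 'rgb(84,39,143)']]


def parse_rgb(color):
    """Extract the three integer channels from an 'rgb(r,g,b)' string."""
    return tuple(int(n) for n in re.findall(r"\d+", color))


def color_for(z, zmin, zmax, colorscale=COLORSCALE):
    """Linearly interpolate the colorscale at the normalized position of z."""
    t = (z - zmin) / float(zmax - zmin)
    t = min(max(t, 0.0), 1.0)  # clamp values outside [zmin, zmax]
    for (lo, lo_color), (hi, hi_color) in zip(colorscale, colorscale[1:]):
        if t <= hi:
            frac = (t - lo) / (hi - lo)
            a, b = parse_rgb(lo_color), parse_rgb(hi_color)
            return tuple(int(round(x + frac * (y - x))) for x, y in zip(a, b))
    return parse_rgb(colorscale[-1][1])


print(color_for(0, 0, 100))
print(color_for(100, 0, 100))
```

The smallest value maps to the pale first stop and the largest to the deep purple last stop, which matches the comment in the code cell that higher exports appear more purple.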
Next, we use the choropleth map's world geography to show how total GDP differed across countries in 2014.
In [4]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: rgb(51, 51, 51); font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: none; border-radius: 2px; width: 922.313px;">import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_world_gdp_with_codes.csv')
data = [ dict(
    type = 'choropleth',
    locations = df['CODE'],
    z = df['GDP (BILLIONS)'],
    text = df['COUNTRY'],
    colorscale = [[0, "rgb(5, 10, 172)"], [0.35, "rgb(40, 60, 190)"], [0.5, "rgb(70, 100, 245)"],
                  [0.6, "rgb(90, 120, 245)"], [0.7, "rgb(106, 137, 247)"], [1, "rgb(220, 220, 220)"]],
    autocolorscale = False,
    reversescale = True,
    marker = dict(
        line = dict(
            color = 'rgb(180,180,180)',
            width = 0.5
        ) ),
    colorbar = dict(
        autotick = False,
        tickprefix = '$',
        title = 'GDP<br>Billions US$'),
) ]
layout = dict(
    title = '2014 Global GDP',
    geo = dict(
        showframe = True,
        showcoastlines = False,
        projection = dict(
            type = 'Mercator'
        )
    )
)
fig = dict( data=data, layout=layout )
iplot( fig, validate=False )
</pre>
[Figure: world choropleth map of 2014 GDP by country. Colorbar title: GDP, Billions US$, with ticks at 0, 5k, 10k, and 15k.]
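The GDP example sets reversescale = True. As a rough illustration of what that flag means, the sketch below mirrors a colorscale so that the color at position p moves to position 1 - p. `reverse_colorscale` is a hypothetical helper, and the assumption is that reversing only mirrors the stops without changing the colors themselves.

```python
# The colorscale from the GDP example above.
GDP_COLORSCALE = [[0, "rgb(5, 10, 172)"], [0.35, "rgb(40, 60, 190)"],
                  [0.5, "rgb(70, 100, 245)"], [0.6, "rgb(90, 120, 245)"],
                  [0.7, "rgb(106, 137, 247)"], [1, "rgb(220, 220, 220)"]]


def reverse_colorscale(colorscale):
    """Return the colorscale with its colors mirrored: stop p becomes 1 - p."""
    # round() removes float noise such as 1 - 0.7 == 0.30000000000000004.
    return [[round(1 - pos, 10), color] for pos, color in reversed(colorscale)]


for stop in reverse_colorscale(GDP_COLORSCALE):
    print(stop)
```

After mirroring, the light grey sits at the low end and the dark blue at the high end, so under this reading the countries with the largest GDP are drawn in the darkest blue.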
Reference
- Choropleth Maps in Python (https://plot.ly/python/choropleth-maps/)
- 2011 US agricultural export data (https://raw.githubusercontent.com/plotly/datasets/master/2011_us_ag_exports.csv)
- 2014 world GDP data (https://raw.githubusercontent.com/plotly/datasets/master/2014_world_gdp_with_codes.csv)
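Mapbox-Based Map Annotation
Because of the limitations of plotly's built-in maps noted in the previous part, we use the Mapbox interface that plotly provides to draw scatter points and lines on a map.
Note that this requires an access token, which you can request at Mapbox Access Token (https://www.mapbox.com/studio).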
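Rather than pasting the token into the notebook, one option is to read it from an environment variable. This is a minimal sketch under assumptions: the variable name `MAPBOX_ACCESS_TOKEN` and the helper `load_mapbox_token` are hypothetical conventions, not something plotly or Mapbox require.

```python
import os


def load_mapbox_token(env=os.environ, var_name="MAPBOX_ACCESS_TOKEN"):
    """Return the token stored in `var_name`, or raise a clear error if unset."""
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError("set %s before drawing Mapbox traces" % var_name)
    return token


# Demonstration with a stand-in mapping instead of the real environment.
print(load_mapbox_token({"MAPBOX_ACCESS_TOKEN": "pk.example"}))
```

The returned string would then be assigned to `mapbox_access_token` in the code cell, which keeps the real token out of shared notebooks.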
In [3]:
<pre style="box-sizing: border-box; overflow: auto; font-size: inherit; line-height: inherit; display: block; color: rgb(51, 51, 51); font-family: "Source Code Pro", monospace !important; padding: 0px; margin: 0px; word-break: break-all; overflow-wrap: break-word; background-color: transparent; border: none; border-radius: 2px; width: 922.313px;">from plotly.graph_objs import *
mapbox_access_token = 'ADD_YOUR_TOKEN_HERE'
data = Data([
    Scattermapbox(
        # Latitude and longitude of the scatter points
        lat=['45.7017','45.7017'],
        lon=['-73.5673','-74.9673'],
        mode='markers',
        marker=Marker(
            size=20,
            color=['#00ffee','#eeff00']
        ),
        # Hover text for the scatter points
        text=['Montreal'],
        name='scatter'
    ),
    Scattermapbox(
        # Latitude and longitude of the line endpoints
        lat=['45.9017','45.2017'],
        lon=['-72.0673','-72.9673'],
        mode='lines',
        line=Line(
            color='#ff00ee',
            width=5
        ),
        # Hover text for the line
        text=['Montreal'],
        name='line'
    )
])
layout = Layout(
    title='Mapbox-based map annotation',
    autosize=True,
    hovermode='closest',
    mapbox=dict(
        accesstoken=mapbox_access_token,
        bearing=0,
        # Map center; keep it close to the plotted points
        center=dict(
            lat=45.5,
            lon=-73.5
        ),
        pitch=0,
        zoom=7
    ),
)
fig = dict(data=data, layout=layout)
iplot(fig)
</pre>
[Figure: Mapbox map titled "Mapbox-based map annotation", with two legend entries: scatter and line.]
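Studying plotly's Mapbox-based implementation shows that it uses the Mapbox map as a canvas: given latitude and longitude coordinates, it produces the same effects as Scatter, whether marker annotations (mode='markers') or lines connecting points (mode='lines').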
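Since the map center should stay near the plotted coordinates, it can be derived from the data instead of typed by hand. The sketch below averages the latitudes and longitudes and builds the mapbox part of the layout as a plain dict. `mapbox_center` and `mapbox_layout` are hypothetical helpers, and the simple average assumes the points do not straddle the 180° meridian.

```python
def mapbox_center(lats, lons):
    """Average the coordinates (given as numbers or numeric strings)."""
    lat_values = [float(v) for v in lats]
    lon_values = [float(v) for v in lons]
    return dict(lat=sum(lat_values) / len(lat_values),
                lon=sum(lon_values) / len(lon_values))


def mapbox_layout(token, lats, lons, zoom=7):
    """Build a layout dict whose map is centered on the given points."""
    return dict(
        hovermode='closest',
        mapbox=dict(
            accesstoken=token,
            bearing=0,
            center=mapbox_center(lats, lons),
            pitch=0,
            zoom=zoom,
        ),
    )


layout = mapbox_layout('ADD_YOUR_TOKEN_HERE', ['45.0', '46.0'], ['-73.0', '-74.0'])
print(layout['mapbox']['center'])
```

The coordinates are accepted as strings because the code cell passes them that way. The dict can be used in place of the hand-written `mapbox=dict(...)` block.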
Reference
- Scatter Plots on Mapbox in Python (https://plot.ly/python/scattermapbox/)
- Mapbox Access Token (https://www.mapbox.com/studio)
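For map visualization, plotly's features are still incomplete. For example, it provides no geo map for countries other than the United States, so those must be built through the Mapbox interface. Hopefully a future release will extend the values accepted by layout.geo.scope.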