Data source: https://pan.baidu.com/s/1EFqJFXf70t2Rubkh6D19aw (extraction code: syqg)
Exploring Alcohol Consumption Data
Step 1: Import the necessary libraries
import pandas as pd
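Step 2: Import the data from the following path
# Raw string, so the backslashes in the Windows path are not read as escape sequences
path1=r'pandas_exercise\exercise_data\drinks.csv'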
Step 3: Name the DataFrame drinks
drinks=pd.read_csv(path1)
print(drinks.head())
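The Step 4 output further down lists only five continent codes (AF, AS, EU, OC, SA), with no group for North America. One likely explanation, assuming the file uses the code NA for that continent, is that read_csv treats the string NA as a missing value by default, and groupby then drops those rows. The sketch below reproduces the effect on a made-up three-row sample (not the real file) and shows keep_default_na=False as one possible way to keep the code as text.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same layout as drinks.csv (values are made up).
csv_text = "country,beer_servings,continent\nA,10,NA\nB,20,EU\nC,30,AF\n"

# Default parsing: the string "NA" is converted to NaN.
default_df = pd.read_csv(io.StringIO(csv_text))
print(default_df["continent"].isna().sum())

# keep_default_na=False disables the default missing-value markers,
# so "NA" is kept as an ordinary string.
kept_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)
print(kept_df["continent"].tolist())
```

Note that keep_default_na=False applies to every column, so it would need more care if other columns contain cells that really are missing.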
Step 4: Which continent consumes more beer on average?
print(drinks.groupby('continent').beer_servings.mean())
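Printing the group means leaves the reader to pick out the largest value by eye. A minimal sketch of answering the question directly, using a small made-up DataFrame in place of drinks (the values are illustrative only): sort the means, or ask for the index label of the maximum with idxmax.

```python
import pandas as pd

# Hypothetical stand-in for drinks; the numbers are made up.
sample = pd.DataFrame({
    "continent": ["EU", "EU", "AS", "AS", "AF"],
    "beer_servings": [200, 100, 30, 50, 60],
})

beer_mean = sample.groupby("continent")["beer_servings"].mean()

# Highest average first
ranked = beer_mean.sort_values(ascending=False)
print(ranked)

# Label of the continent with the highest average
top_continent = beer_mean.idxmax()
print(top_continent)
```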
Step 5: Print the descriptive statistics of wine consumption (wine_servings) for each continent
print(drinks.groupby('continent').wine_servings.describe())
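Step 6: Print the mean consumption of each type of drink for each continent
# numeric_only=True skips the text column country, which cannot be averaged
print(drinks.groupby('continent').mean(numeric_only=True))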
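Step 7: Print the median consumption of each type of drink for each continent
# numeric_only=True skips the text column country here as well
print(drinks.groupby('continent').median(numeric_only=True))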
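Steps 6 and 7 aggregate every remaining column, including the text column country. Older pandas versions silently dropped such columns, while in recent versions (2.0 and later, to my knowledge) mean() and median() raise a TypeError unless numeric_only=True is passed. Another option is to select the numeric columns explicitly before aggregating. The sketch below shows this on a made-up sample; the column list is an assumption chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for drinks; the numbers are made up.
sample = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "beer_servings": [10, 30, 100, 200],
    "wine_servings": [0, 4, 50, 150],
    "continent": ["AS", "AS", "EU", "EU"],
})

# Pick the numeric columns explicitly so the text column is never aggregated.
numeric_cols = ["beer_servings", "wine_servings"]
grouped = sample.groupby("continent")[numeric_cols]

means = grouped.mean()
medians = grouped.median()
print(means)
print(medians)
```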
Step 8: Print the mean, maximum, and minimum spirit consumption (spirit_servings) for each continent
print(drinks.groupby('continent').spirit_servings.agg(['mean','max','min']))
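Passing a list to agg names the result columns after the functions themselves. If more descriptive column names are wanted, pandas also supports named aggregation through keyword arguments. A minimal sketch on a made-up sample follows; the output column names (mean_spirit and so on) are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for drinks; the numbers are made up.
sample = pd.DataFrame({
    "continent": ["AS", "AS", "EU", "EU"],
    "spirit_servings": [0, 100, 50, 250],
})

# Named aggregation: keyword = output column name, value = aggregation function.
spirit_stats = sample.groupby("continent")["spirit_servings"].agg(
    mean_spirit="mean",
    max_spirit="max",
    min_spirit="min",
)
print(spirit_stats)
```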
Output
# Step 3
country beer_servings ... total_litres_of_pure_alcohol continent
0 Afghanistan 0 ... 0.0 AS
1 Albania 89 ... 4.9 EU
2 Algeria 25 ... 0.7 AF
3 Andorra 245 ... 12.4 EU
4 Angola 217 ... 5.9 AF
[5 rows x 6 columns]
# Step 4
continent
AF 61.471698
AS 37.045455
EU 193.777778
OC 89.687500
SA 175.083333
Name: beer_servings, dtype: float64
# Step 5
count mean std min 25% 50% 75% max
continent
AF 53.0 16.264151 38.846419 0.0 1.0 2.0 13.00 233.0
AS 44.0 9.068182 21.667034 0.0 0.0 1.0 8.00 123.0
EU 45.0 142.222222 97.421738 0.0 59.0 128.0 195.00 370.0
OC 16.0 35.625000 64.555790 0.0 1.0 8.5 23.25 212.0
SA 12.0 62.416667 88.620189 1.0 3.0 12.0 98.50 221.0
# Step 6
beer_servings ... total_litres_of_pure_alcohol
continent ...
AF 61.471698 ... 3.007547
AS 37.045455 ... 2.170455
EU 193.777778 ... 8.617778
OC 89.687500 ... 3.381250
SA 175.083333 ... 6.308333
[5 rows x 4 columns]
# Step 7
beer_servings ... total_litres_of_pure_alcohol
continent ...
AF 32.0 ... 2.30
AS 17.5 ... 1.20
EU 219.0 ... 10.00
OC 52.5 ... 1.75
SA 162.5 ... 6.85
[5 rows x 4 columns]
# Step 8
mean max min
continent
AF 16.339623 152 0
AS 60.840909 326 0
EU 132.555556 373 0
OC 58.437500 254 0
SA 114.750000 302 25