Exercise 5 - Merging
Exploring fictitious name data
Step 1 Import the necessary libraries
Run the following code
import pandas as pd
import numpy as np
Step 2 Define the raw data below, from which the data frames will be created
Run the following code
raw_data_1 = {
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}
raw_data_2 = {
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}
raw_data_3 = {
    'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
    'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]}
Step 3 Build a data frame from each dictionary above and name them data1, data2, data3
Run the following code
data1 = pd.DataFrame(raw_data_1,columns = ['subject_id','first_name','last_name'])
data2 = pd.DataFrame(raw_data_2,columns = ['subject_id','first_name','last_name'])
data3 = pd.DataFrame(raw_data_3,columns = ['subject_id','test_id'])
Step 4 Concatenate data1 and data2 along the row axis and name the result all_data
Run the following code
all_data = pd.concat([data1,data2])
all_data
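Stacking two frames row-wise keeps each frame's original index, so the labels 0 to 4 appear twice in the result. The sketch below rebuilds the same sample data so it runs on its own and compares the default behaviour with ignore_index=True; the names stacked and renumbered are illustrative and not part of the exercise.

```python
import pandas as pd

# Same sample frames as in the exercise, rebuilt so this snippet is self-contained.
data1 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
data2 = pd.DataFrame({
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# Default: each frame keeps its own row labels, so the index repeats.
stacked = pd.concat([data1, data2])
print(stacked.index.tolist())     # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

# ignore_index=True discards the old labels and numbers the rows 0..9.
renumbered = pd.concat([data1, data2], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A unique index is usually easier to work with if rows are later selected by label.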
Step 5 Concatenate data1 and data2 along the column axis and name the result all_data_col
Run the following code
all_data_col = pd.concat([data1,data2],axis = 1)
all_data_col
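Concatenating along columns aligns rows by index label and places the frames side by side, which leaves three column names duplicated. One possible way to keep the two halves distinguishable is the keys argument of pd.concat, sketched below on the same sample data; the names plain and labelled and the key strings are illustrative choices.

```python
import pandas as pd

# Same sample frames as in the exercise, rebuilt so this snippet is self-contained.
data1 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
data2 = pd.DataFrame({
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# Plain column-wise concat: 5 rows, 6 columns, with repeated column names.
plain = pd.concat([data1, data2], axis=1)
print(plain.shape)              # (5, 6)
print(plain.columns.is_unique)  # False

# keys adds an outer column level naming the source frame of each column.
labelled = pd.concat([data1, data2], axis=1, keys=['data1', 'data2'])
print(labelled['data2']['first_name'].tolist())
```

Note that this pairs rows by position in the index, not by subject_id, so row 0 holds subject '1' next to subject '4'.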
Step 6 Print data3
Run the following code
data3
Step 7 Merge all_data and data3 on the values of subject_id
Run the following code
pd.merge(all_data,data3,on='subject_id')
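pd.merge defaults to how='inner', so rows whose subject_id is missing from either side are dropped. In this data, subject '6' has no entry in data3, and subjects '9', '10' and '11' have no name record. The sketch below, on the same sample data, counts the rows and shows how how='left' would keep subject '6' with a missing test_id; the names merged and kept are illustrative.

```python
import pandas as pd

# Same sample data as in the exercise, rebuilt so this snippet is self-contained.
data1 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
data2 = pd.DataFrame({
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})
data3 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
    'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]})
all_data = pd.concat([data1, data2])

# Default inner merge: only subject_ids present on both sides survive.
merged = pd.merge(all_data, data3, on='subject_id')
print(len(merged))                            # 9
print('6' in merged['subject_id'].tolist())   # False

# Left merge: every row of all_data is kept; subject '6' gets a NaN test_id.
kept = pd.merge(all_data, data3, on='subject_id', how='left')
print(len(kept))                              # 10
print(kept['test_id'].isna().sum())           # 1
```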
Step 8 Join data1 and data2 on subject_id (inner join)
Run the following code
pd.merge(data1,data2,on='subject_id',how='inner')
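Because data1 and data2 share the column names first_name and last_name, the merge result distinguishes them with the default suffixes _x and _y. A small sketch on the same sample data shows the suffixes parameter producing more readable names; the suffix strings and the name inner are illustrative choices.

```python
import pandas as pd

# Same sample frames as in the exercise, rebuilt so this snippet is self-contained.
data1 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
data2 = pd.DataFrame({
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# Only subject_ids '4' and '5' occur in both frames, so the inner join has 2 rows.
inner = pd.merge(data1, data2, on='subject_id', how='inner',
                 suffixes=('_data1', '_data2'))
print(inner['subject_id'].tolist())  # ['4', '5']
print(inner.columns.tolist())
```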
Step 9 Merge data1 and data2 so that every record from both frames is kept, matched or not (outer join)
Run the following code
pd.merge(data1,data2,on='subject_id',how='outer')
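An outer join keeps every subject_id from both frames and fills the missing side with NaN. To see where each row came from, the indicator parameter of pd.merge adds a _merge column; the sketch below applies it to the same sample data, with outer as an illustrative name.

```python
import pandas as pd

# Same sample frames as in the exercise, rebuilt so this snippet is self-contained.
data1 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
data2 = pd.DataFrame({
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# indicator=True adds a '_merge' column: 'left_only', 'right_only' or 'both'.
outer = pd.merge(data1, data2, on='subject_id', how='outer', indicator=True)
print(len(outer))  # 8 distinct subject_ids across both frames
print(outer[['subject_id', '_merge']])
```

Subjects '1' to '3' are marked left_only, '4' and '5' are marked both, and '6' to '8' are marked right_only.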
Further exploration
# Left join data1 and data2 on subject_id
pd.merge(data1,data2,on='subject_id',how='left')
# Right join data1 and data2 on subject_id
pd.merge(data1,data2,on='subject_id',how='right')
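A left join keeps all rows of data1 and a right join keeps all rows of data2; in both cases the columns from the other frame are NaN where no matching subject_id exists. The following sketch checks this on the same sample data, using the illustrative names left and right.

```python
import pandas as pd

# Same sample frames as in the exercise, rebuilt so this snippet is self-contained.
data1 = pd.DataFrame({
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
data2 = pd.DataFrame({
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# Left join: subjects '1'..'5' are kept; '1'..'3' have no match in data2.
left = pd.merge(data1, data2, on='subject_id', how='left')
print(left['subject_id'].tolist())        # ['1', '2', '3', '4', '5']
print(left['first_name_y'].isna().sum())  # 3

# Right join: subjects '4'..'8' are kept; '6'..'8' have no match in data1.
right = pd.merge(data1, data2, on='subject_id', how='right')
print(right['subject_id'].tolist())        # ['4', '5', '6', '7', '8']
print(right['first_name_x'].isna().sum())  # 3
```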