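Continuing the earlier exercises. For reference, the previous articles are:
- pandas Examples - Getting to Know Your Data - Chipotle
- pandas Examples - Filtering and Sorting - Chipotle
- pandas Examples - Data Visualization - Chipotle
- pandas Examples - Getting to Know Your Data - Occupation
- pandas Examples - Selecting and Filtering - Euro 12
- pandas Examples - Selecting and Filtering - Fictional Army
- pandas Examples - Aggregation - Alcohol Consumption
- pandas Examples - Aggregation - Occupation
- pandas Examples - Aggregation - Regiment
- pandas Examples - Apply - Student Alcohol Consumption
- pandas Examples - Apply - Crime Rates
- pandas Examples - Merge - MPG Cars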
Load the dataset:
import pandas as pd

raw_data_1 = {
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}
raw_data_2 = {
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}
raw_data_3 = {
    'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
    'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]}
df1 = pd.DataFrame(raw_data_1)
df2 = pd.DataFrame(raw_data_2)
df3 = pd.DataFrame(raw_data_3)
1. Join the two dataframes along rows and assign to all_data
In other words, stack the two DataFrames on top of each other, row by row.
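In the previous article we used the append function for this:
df1.append(df2)
That simply appends one frame to the other. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so it only runs on older versions. There is another function that does the same job, pandas.concat:
all_data = pd.concat([df1, df2])
To ignore the original index and renumber the rows, use:
pd.concat([df1, df2], ignore_index=True)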
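To make the effect of ignore_index concrete, here is a minimal sketch. The frames `top` and `bottom` are small stand-ins invented for illustration, not the dataset loaded above.

```python
import pandas as pd

# Illustrative stand-in frames (hypothetical names and data)
top = pd.DataFrame({'subject_id': ['1', '2'], 'first_name': ['Alex', 'Amy']})
bottom = pd.DataFrame({'subject_id': ['4', '5'], 'first_name': ['Billy', 'Brian']})

# Default behaviour: each frame keeps its own row labels, so labels repeat
stacked = pd.concat([top, bottom])
print(stacked.index.tolist())     # [0, 1, 0, 1]

# ignore_index=True drops the old labels and renumbers from 0
renumbered = pd.concat([top, bottom], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Repeated row labels are legal, but a lookup such as stacked.loc[0] then returns two rows, which is usually why renumbering is preferred.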
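2. Join the two dataframes along columns and assign to all_data_col
This is similar to the previous exercise, except that this time the frames are concatenated along the columns:
all_data_col = pd.concat([df1, df2], axis=1)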
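One thing worth checking with axis=1 is that rows are paired by index label, not by subject_id, and that column names are simply repeated. A small sketch, using made-up frames `left` and `right` of different lengths so the alignment is visible:

```python
import pandas as pd

# Illustrative stand-in frames (hypothetical names and data)
left = pd.DataFrame({'subject_id': ['1', '2', '3'],
                     'first_name': ['Alex', 'Amy', 'Allen']})
right = pd.DataFrame({'subject_id': ['4', '5'],
                      'first_name': ['Billy', 'Brian']})

side_by_side = pd.concat([left, right], axis=1)
print(side_by_side.shape)             # (3, 4)
print(side_by_side.columns.tolist())  # ['subject_id', 'first_name', 'subject_id', 'first_name']

# Index label 2 exists only in `left`, so the right-hand columns are NaN there
print(side_by_side.iloc[2].isna().sum())  # 2
```

Because the column names are duplicated, selecting side_by_side['subject_id'] returns both columns; a key-based merge is the better tool when the rows should be matched on subject_id.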
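3. Merge all_data and data3 along the subject_id value
Take the row-wise concatenation from exercise 1 (all_data = pd.concat([df1, df2])) and combine it with data3, which is df3 in the code here.
This calls for the merge function:
pd.merge(all_data, df3, on='subject_id')
I will cover how to use this function in a separate article shortly. For now, note that merge defaults to an inner join, the same as a plain JOIN in SQL.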
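The inner-join default can be verified with a short sketch. The frames `people` and `tests` below are hypothetical, reduced versions of the data, chosen so that each side has one key the other lacks.

```python
import pandas as pd

# Illustrative stand-in frames (hypothetical names and data)
people = pd.DataFrame({'subject_id': ['1', '2', '4', '6'],
                       'first_name': ['Alex', 'Amy', 'Billy', 'Bran']})
tests = pd.DataFrame({'subject_id': ['1', '2', '4', '9'],
                      'test_id': [51, 15, 61, 1]})

# No `how` argument: only keys present in both frames survive
inner = pd.merge(people, tests, on='subject_id')
print(inner['subject_id'].tolist())  # ['1', '2', '4']

# Spelling out how='inner' gives the identical result
explicit = pd.merge(people, tests, on='subject_id', how='inner')
print(inner.equals(explicit))        # True
```

Subject '6' (no test) and subject '9' (no person) are both dropped, which is exactly what an SQL INNER JOIN would do.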
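4. Merge only the data that has the same 'subject_id' on both data1 and data2
Now join df1 and df2 to each other.
merge works here as well, and its default inner join keeps only the subject_id values found in both frames:
pd.merge(df1, df2, on='subject_id')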
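Since both frames carry columns with the same names besides the key, merge has to rename them. The sketch below, on invented frames `a` and `b`, shows the default _x/_y suffixes and the suffixes parameter for choosing clearer ones.

```python
import pandas as pd

# Illustrative stand-in frames (hypothetical names and data)
a = pd.DataFrame({'subject_id': ['1', '4', '5'],
                  'first_name': ['Alex', 'Alice', 'Ayoung']})
b = pd.DataFrame({'subject_id': ['4', '5', '6'],
                  'first_name': ['Billy', 'Brian', 'Bran']})

# Overlapping non-key columns get _x (left) and _y (right) by default
matched = pd.merge(a, b, on='subject_id')
print(matched.columns.tolist())   # ['subject_id', 'first_name_x', 'first_name_y']

# suffixes lets you pick more descriptive endings
labelled = pd.merge(a, b, on='subject_id', suffixes=('_a', '_b'))
print(labelled.columns.tolist())       # ['subject_id', 'first_name_a', 'first_name_b']
print(labelled['subject_id'].tolist()) # ['4', '5']
```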
5. Merge all values in data1 and data2, with matching records from both sides where available
This reminds me of FULL JOIN in SQL.
merge has a setting for this too, the how parameter. The documentation describes the 'outer' option as:
outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
pd.merge(df1, df2, on='subject_id', how='outer')
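To see which side each row of an outer join came from, merge accepts indicator=True, which adds a _merge column. A minimal sketch on invented two-row frames `a` and `b`:

```python
import pandas as pd

# Illustrative stand-in frames (hypothetical names and data)
a = pd.DataFrame({'subject_id': ['1', '4'], 'first_name': ['Alex', 'Alice']})
b = pd.DataFrame({'subject_id': ['4', '6'], 'first_name': ['Billy', 'Bran']})

# how='outer' keeps every key from either frame; unmatched cells become NaN
full = pd.merge(a, b, on='subject_id', how='outer', indicator=True)
print(full['subject_id'].tolist())  # ['1', '4', '6']
print(full['_merge'].tolist())      # ['left_only', 'both', 'right_only']
```

Filtering on the _merge column afterwards is a convenient way to pull out only the unmatched rows from one side.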
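That's it for this set of exercises. They mainly revolve around two functions, concat and merge; see the next article for how to use them.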