1. Select all rows of a DataFrame where a given column equals a given value, in a single command:
df.loc[df['columnName']=='the value']
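The comparison inside the brackets produces a boolean Series, and `.loc` keeps the rows where it is True. A minimal sketch on made-up data (the column `n` and the sample values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample data; "n" is an extra column added for illustration.
df = pd.DataFrame({"columnName": ["the value", "other", "the value"],
                   "n": [1, 2, 3]})

mask = df["columnName"] == "the value"   # boolean Series, one entry per row
matches = df.loc[mask]                   # rows where the mask is True

# Conditions can be combined with & and |, each wrapped in parentheses.
both = df.loc[(df["columnName"] == "the value") & (df["n"] > 1)]
```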
2. Remove duplicate values from a column
task_id_sets = df['taskid'].drop_duplicates()
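`drop_duplicates()` returns a Series that keeps the first occurrence of each value together with its original index label. If only the distinct values are needed, `unique()` is a possible alternative that returns an array. A small sketch with invented `taskid` values:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"taskid": [7, 7, 3, 7, 3]})

task_id_sets = df["taskid"].drop_duplicates()  # Series, original index labels kept
task_id_array = df["taskid"].unique()          # NumPy array of distinct values
```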
3. Convert a pandas DataFrame to a NumPy array
arr = df.values  # NumPy array of the data; df.to_numpy() is the equivalent newer method
4. Count the occurrences of each value in a column [by default, the values become the index of the result]
task_id_all_data['tac_photo'].value_counts()
5. Count the occurrences of each value in a column [renaming both the value column and the count column]
tac_photo_times=task_id_all_data['tac_photo'].value_counts().rename_axis('tac_photo').reset_index(name='counts')
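To make the difference between items 4 and 5 concrete, here is a sketch on invented `tac_photo` values. The first result is a Series indexed by value; the second is a regular two-column DataFrame:

```python
import pandas as pd

# Hypothetical sample data.
task_id_all_data = pd.DataFrame(
    {"tac_photo": ["a.jpg", "b.jpg", "a.jpg", "a.jpg"]}
)

# Series: the index holds the distinct values, the data holds the counts,
# sorted from most to least frequent.
counts = task_id_all_data["tac_photo"].value_counts()

# DataFrame with explicit column names for the values and the counts.
tac_photo_times = (
    counts.rename_axis("tac_photo").reset_index(name="counts")
)
```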
6. Select a specified subset of columns
df_selected = df[['doh_dt','taskcode','tachograph_single_info','taskid','tac_photo']]
7. Create an empty DataFrame
df = pd.DataFrame(columns=["ebayno", "p_sku", "sale", "sku"])  # empty DataFrame with only column names
8. Set the column names
tac_photo_split.columns=['http','pic','date','ID','num','random_value','jpg']
9. Copy the index into a column
df['index'] = df.index
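The assignment above copies the index into a new column and leaves the index itself in place. If the goal is to move the index into a column and fall back to a default 0..n-1 index, `reset_index()` is one option. A sketch on made-up data (the `sku` values and index labels are assumptions):

```python
import pandas as pd

# Hypothetical sample data with a non-default index.
df = pd.DataFrame({"sku": ["x", "y"]}, index=[10, 20])

# Copy: the index stays, and a column named "index" is added.
with_copy = df.copy()
with_copy["index"] = with_copy.index

# Move: an unnamed index becomes a column called "index",
# and the frame gets a fresh 0..n-1 index.
moved = df.reset_index()
```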
10. Select a given row by its index label
task_id_all_data.loc[[0]]
11. Select the value at a given row and column position
task_id_all_data.iloc[0,5]
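Items 10 and 11 differ in two ways: `.loc` looks rows up by label while `.iloc` uses integer positions, and a list such as `[0]` returns a DataFrame while a bare `0` returns a Series. A sketch with invented data, using two columns instead of the six implied above:

```python
import pandas as pd

# Hypothetical sample data.
task_id_all_data = pd.DataFrame(
    {"taskid": [11, 12, 13], "tac_photo": ["a.jpg", "b.jpg", "c.jpg"]}
)

row_frame = task_id_all_data.loc[[0]]   # one-row DataFrame (list of labels)
row_series = task_id_all_data.loc[0]    # Series (single label)
cell = task_id_all_data.iloc[0, 1]      # scalar: first row, second column
```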
12. Sort by a column
task_uuid_all_data.sort_values(by=["tac_time1"],inplace=True,ascending=[True])
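With `inplace=True` the frame is modified directly and the call returns None, so its result should not be assigned. A possible alternative is to keep the returned copy, optionally renumbering the rows afterwards. A sketch on made-up data (the column `v` and the date strings are assumptions):

```python
import pandas as pd

# Hypothetical sample data.
task_uuid_all_data = pd.DataFrame(
    {"tac_time1": ["2021-03-02", "2021-03-01", "2021-03-03"], "v": [1, 2, 3]}
)

# Returns a sorted copy; the original frame is left unchanged.
sorted_df = task_uuid_all_data.sort_values(by="tac_time1", ascending=True)

# Optional: drop the old index labels and renumber from 0.
sorted_df = sorted_df.reset_index(drop=True)
```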
13. Concatenate two text columns into one
merge_df['type_action'] = merge_df['type'] +"(" + merge_df['action']+")"
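The `+` concatenation assumes both columns hold strings with no missing values. If a column may contain gaps, one way to handle it is to fill them first. A sketch on invented data, where the empty-string fill value is an assumption:

```python
import pandas as pd

# Hypothetical sample data with one missing action.
merge_df = pd.DataFrame({"type": ["click", "view"],
                         "action": ["open", None]})

# Replace missing actions with an empty string before concatenating.
merge_df["type_action"] = (
    merge_df["type"] + "(" + merge_df["action"].fillna("") + ")"
)
```

For non-string columns, calling `.astype(str)` on the column before concatenating serves a similar purpose.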
14. Group by columns and count
a=merge_df.groupby(["index", "type"], as_index=False)['task_uuid'].count()
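With `as_index=False` the grouping keys stay as ordinary columns, and the counted column keeps its original name. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample data.
merge_df = pd.DataFrame({
    "index": [0, 0, 0, 1],
    "type": ["a", "a", "b", "a"],
    "task_uuid": ["u1", "u2", "u3", "u4"],
})

# One row per (index, type) pair; "task_uuid" now holds the count
# of non-missing task_uuid values in each group.
a = merge_df.groupby(["index", "type"], as_index=False)["task_uuid"].count()
```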
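15. Convert an integer column to strings
merge_df['index'] = merge_df['index'].astype(str)  # assign the result back; the conversion returns a new Series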
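Both `apply(str)` and `astype(str)` return a new Series rather than changing the column in place, so the result has to be assigned back for the conversion to stick. A sketch on invented data:

```python
import pandas as pd

# Hypothetical sample data.
merge_df = pd.DataFrame({"index": [1, 2, 3]})

converted = merge_df["index"].apply(str)  # new Series; merge_df is unchanged
still_int = merge_df["index"].tolist()    # the column still holds integers

# Assigning back replaces the column with its string version.
merge_df["index"] = merge_df["index"].astype(str)
```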