Pandas_Numpy_cheatsheet

Pandas Cheatsheet

First, refer to pandas DataFrame cheatsheet.pdf

Series

DataFrame

Initialization

Load and write data from other sources

  • csv
  • MySQL
  • Hadoop (impyla(as_pandas), happybase)

Working with row and column index

df.index
df.columns
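
A minimal sketch of what these two attributes hold. The frame and its labels are made up for illustration only:

```python
import pandas as pd

# Hypothetical frame used only for illustration.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

row_labels = list(df.index)      # row labels
col_labels = list(df.columns)    # column labels

# Both attributes can be reassigned to relabel the frame.
df.index = ["r1", "r2", "r3"]
df.columns = ["a", "b"]
```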

Work with columns of data (axis=1)

Work with rows of data (axis=0)

Work with cells

  • TODO: write a comprehensive summary of pandas indexing

Join/combine DataFrame

Split DataFrame

  • use list comprehension

target = [x[11] for x in dataset]
train = [x[0:11] for x in dataset]
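
If the data is already in a DataFrame, the same split can probably be done with positional indexing instead of a list comprehension. This is a sketch that assumes 12 columns, with the target in the last one; the `dataset` values are invented:

```python
import pandas as pd

# Hypothetical dataset: 3 rows, 12 columns (11 features + 1 target).
dataset = [[r * 100 + c for c in range(12)] for r in range(3)]
df = pd.DataFrame(dataset)

target = df.iloc[:, 11]      # Series holding the 12th column
train = df.iloc[:, 0:11]     # DataFrame holding the first 11 columns
```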

Work with whole DataFrame

Work with dates, times and their indexes

Work with strings

Work with missing and non-finite values

Basic Statistics

Work with Categorical data

Annoying Part:

Copy vs View

Chained direct indexing may return a new copy of the data rather than a view, so it is not recommended for modifying values.
http://stackoverflow.com/questions/20625582/how-to-deal-with-this-pandas-warning
From what I gather, SettingWithCopyWarning was created to flag potentially confusing "chained" assignments, such as the following, which don't always work as expected, particularly when the first selection returns a copy. [see GH5390 and GH5597 for background discussion.]

df[df['A'] > 2]['B'] = new_val # new_val not set in df
The warning offers a suggestion to rewrite as follows:

df.loc[df['A'] > 2, 'B'] = new_val
However, the warning can also fire for a different usage, where the filtered frame is reassigned and then modified:

df = df[df['A'] > 2]
df['B'] = new_val
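
A small sketch of the two safer patterns, using an invented frame and a placeholder `new_val`. Taking an explicit `.copy()` is one common way to make the second case unambiguous; it is offered here as a suggestion, not as something the quoted answer prescribes:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})
new_val = 0  # placeholder value for illustration

# Single .loc call: the assignment reaches the original frame.
df.loc[df["A"] > 2, "B"] = new_val

# When the filtered frame is meant to be independent, take an explicit
# copy so later writes clearly do not target the original.
subset = df[df["A"] > 2].copy()
subset["B"] = 99
```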

modify in place vs return a new value
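
As one illustration of this distinction (chosen here as an example, not taken from the original notes), `sort_values` returns a new frame by default and returns `None` when called with `inplace=True`:

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 1, 2]})

sorted_df = df.sort_values("A")              # returns a new frame
before = df["A"].tolist()                    # df itself is unchanged

result = df.sort_values("A", inplace=True)   # modifies df, returns None
after = df["A"].tolist()
```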

index of row and column

Selecting rows and selecting columns by direct indexing look extremely similar, with a subtle difference:
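
The note does not spell the difference out. One likely candidate, stated here as an assumption, is that `df[...]` picks columns when given a label or list of labels but picks rows when given a slice or a boolean mask:

```python
import pandas as pd

# Hypothetical frame used only for illustration.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

col = df["A"]                 # a single label selects a column (Series)
cols = df[["A", "B"]]         # a list of labels selects columns (DataFrame)
rows = df[0:2]                # a slice selects rows by position
mask_rows = df[df["A"] > 1]   # a boolean Series selects rows
```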

change column name
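
A minimal sketch of two common ways to do this, with invented column names:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

renamed = df.rename(columns={"A": "a"})   # returns a new frame, df untouched
df.columns = ["x", "y"]                   # relabels every column on df itself
```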

change of column order
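
A minimal sketch, assuming the desired order is known up front; the column names are invented:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

reordered = df[["C", "A", "B"]]               # select columns in the new order
same = df.reindex(columns=["C", "A", "B"])    # same result via reindex
```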
