1. Query the column
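
        import numpy as np

        # Use .loc with a boolean mask; chained indexing such as
        # df['col'][mask] = value may assign to a temporary copy.
        titanic_df.loc[titanic_df['Embarked'] == 'S', 'Embarked'] = 0
        titanic_df.loc[titanic_df['Embarked'] == 'Q', 'Embarked'] = 1
        titanic_df.loc[titanic_df['Embarked'] == 'C', 'Embarked'] = 2
        titanic_df['Embarked'] = titanic_df['Embarked'].astype(np.int64)
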
2. map()

        titanic_df['Embarked'] = titanic_df['Embarked'].map({'S': 0, 'Q': 1, 'C': 2})

3. apply()

        def get_number(c):
            dic = {'S': 0, 'Q': 1, 'C': 2}
            return dic[c]

        titanic_df['Embarked'] = titanic_df['Embarked'].apply(get_number)

4. LabelEncoder()

        import numpy as np
        from sklearn import preprocessing

        lbl = preprocessing.LabelEncoder()
        # Fit on the labels from both frames so train and test share one encoding.
        lbl.fit(np.unique(list(titanic_df['Embarked'].values) + list(test_df['Embarked'].values)))
        titanic_df['Embarked'] = lbl.transform(list(titanic_df['Embarked'].values))
        test_df['Embarked'] = lbl.transform(list(test_df['Embarked'].values))

5. pd.to_numeric()
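This function was introduced in pandas 0.17. It only parses values that already look like numbers (for example `'1'`), so it does not apply directly to letter codes such as `'S'`. A similar [question][1] has already been asked.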
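
As a rough illustration of what `pd.to_numeric()` does, here is a minimal sketch on made-up data. The series `s` and `letters` are hypothetical toy inputs, not the Titanic columns; `errors='coerce'` is used so that values that cannot be parsed become `NaN` instead of raising an error.

```python
import pandas as pd

# Hypothetical toy column: numbers stored as strings, plus one bad value.
s = pd.Series(['1', '2', 'x'])

# errors='coerce' turns entries that cannot be parsed into NaN.
converted = pd.to_numeric(s, errors='coerce')

# Letter codes are not numeric strings, so every entry is coerced to NaN.
letters = pd.Series(['S', 'Q', 'C'])
letters_converted = pd.to_numeric(letters, errors='coerce')

print(converted)
print(letters_converted)
```

So for a column like `Embarked`, one of the mapping approaches is probably the better fit, and `pd.to_numeric()` is more useful once the values are already digit strings.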
And the list goes on.
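
One practical difference between options 2 and 3 is how they treat values that are not in the mapping, such as missing entries. The sketch below uses a hypothetical toy series (not the real `titanic_df`) and assumes the column may contain `NaN`; the names `codes`, `embarked`, and `apply_failed` are only for illustration.

```python
import numpy as np
import pandas as pd

codes = {'S': 0, 'Q': 1, 'C': 2}

# Hypothetical toy column with one missing entry.
embarked = pd.Series(['S', 'C', np.nan, 'Q'])

# map() leaves values without a dictionary entry as NaN (float64 result).
mapped = embarked.map(codes)

# The nullable 'Int64' dtype keeps integer codes alongside the missing value.
as_int = mapped.astype('Int64')

# A plain dictionary lookup inside apply() raises KeyError for the NaN entry.
try:
    embarked.apply(lambda c: codes[c])
    apply_failed = False
except KeyError:
    apply_failed = True

print(as_int)
print(apply_failed)
```

If the column can contain missing values, it may be simpler to fill or drop them first, whichever approach is used.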
[1]: http://stackoverflow.com/questions/15891038/pandas-change-data-type-of-columns