Pandas Tips and Tricks for Beginners
Lesson 1: Unexpected behavior after vertically stacking data frames
In [26]:
import pandas as pd
Example
In [27]:
food = pd.DataFrame([["ramen",100],["strawberry",1000]],columns=["Item","Price"]); food
Out[27]:
In [28]:
morefood = pd.DataFrame([["Roll Cake",300],["Rice",50]],columns=["Item","Price"]); morefood
Out[28]:
In [29]:
allthefood = pd.concat([food,morefood],sort=False)
Selecting the row labelled 0 returns two rows, because pd.concat keeps each data frame's original index labels.
In [30]:
allthefood.loc[0,:]
Out[30]:
Hmm? Σ(・ิ¬・ิ)
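To see why two rows come back, it can help to look at the index itself. The sketch below rebuilds the two frames from the example and inspects the labels after stacking; the variable names index_labels, has_unique_index and rows_labelled_zero are only illustrative.

```python
import pandas as pd

food = pd.DataFrame([["ramen", 100], ["strawberry", 1000]], columns=["Item", "Price"])
morefood = pd.DataFrame([["Roll Cake", 300], ["Rice", 50]], columns=["Item", "Price"])
allthefood = pd.concat([food, morefood], sort=False)

# Each input frame carried its own labels 0 and 1, so the labels repeat after stacking.
index_labels = allthefood.index.tolist()
has_unique_index = allthefood.index.is_unique

# .loc selects by label, so every row labelled 0 is returned.
rows_labelled_zero = allthefood.loc[0, :]

print(index_labels)
print(has_unique_index)
print(rows_labelled_zero)
```

Note that .iloc[0] selects by position rather than by label, so it would still return a single row here.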
Solution
In [24]:
allthefood = pd.concat([food,morefood]).reset_index(drop=True)
Be sure to reset the index when vertically stacking data frames to avoid unexpected behavior later on.
In [25]:
allthefood.loc[0,:]
Out[25]:
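As a possible alternative to calling reset_index afterwards, pd.concat accepts an ignore_index argument that builds a fresh 0..n-1 index while stacking. The sketch below also tries verify_integrity=True, which asks concat to raise an error when labels overlap; the flag name overlap_detected is just for illustration.

```python
import pandas as pd

food = pd.DataFrame([["ramen", 100], ["strawberry", 1000]], columns=["Item", "Price"])
morefood = pd.DataFrame([["Roll Cake", 300], ["Rice", 50]], columns=["Item", "Price"])

# ignore_index=True discards the original labels and numbers the rows 0..n-1.
allthefood = pd.concat([food, morefood], ignore_index=True)
first_row = allthefood.loc[0, :]
print(allthefood.index.tolist())
print(first_row)

# verify_integrity=True makes concat complain about duplicate labels instead of keeping them.
try:
    pd.concat([food, morefood], verify_integrity=True)
    overlap_detected = False
except ValueError:
    overlap_detected = True
print(overlap_detected)
```

Either approach leaves exactly one row labelled 0, so .loc[0, :] behaves as a beginner would expect.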
Lesson 2: Assigning values to a filtered data frame
In [41]:
import pandas as pd
import numpy as np
Example
In [51]:
mycats = pd.DataFrame([["Russian Blue","male",1,"Chekhov",np.nan],["Bengal","female",.5,"Nina",np.nan]],columns=["Breed","Sex","Age","Name","Favorite napping spot"]); mycats
Out[51]:
In [53]:
mycats[mycats.Age<1]['Favorite napping spot'] = 'couch'
Ach! (╯︵╰,)
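A rough sketch of what goes wrong: selecting rows with a boolean mask produces a new object, so the assignment lands on that temporary copy and mycats itself never changes. The warning is silenced here only to keep the demo output short, and still_missing is an illustrative name.

```python
import warnings

import numpy as np
import pandas as pd

mycats = pd.DataFrame(
    [["Russian Blue", "male", 1, "Chekhov", np.nan],
     ["Bengal", "female", .5, "Nina", np.nan]],
    columns=["Breed", "Sex", "Age", "Name", "Favorite napping spot"],
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the chained-assignment warning for this demo
    # Two indexing steps: the first builds a copy, the second writes to that copy.
    mycats[mycats.Age < 1]["Favorite napping spot"] = "couch"

# The original frame is untouched: every napping spot is still missing.
still_missing = mycats["Favorite napping spot"].isna().all()
print(still_missing)
```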
Solution
In [57]:
mycats.loc[mycats.Age<1,'Favorite napping spot'] = 'couch'
In [58]:
mycats
Out[58]:
When the warning message says:
Try using .loc[row_indexer,col_indexer] = value instead
this is exactly what you should do.
In the example above, the row indexer is mycats.Age<1 and the column indexer is 'Favorite napping spot'.
In [62]:
mycats.Age<1
Out[62]:
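Putting the pieces together, here is a minimal sketch that names the row indexer and column indexer separately before handing them to .loc. The names is_kitten and column are illustrative, and the cast to object dtype is a precaution added for this sketch: recent pandas versions warn about (and may reject) writing a string into a float column that holds only NaN.

```python
import numpy as np
import pandas as pd

mycats = pd.DataFrame(
    [["Russian Blue", "male", 1, "Chekhov", np.nan],
     ["Bengal", "female", .5, "Nina", np.nan]],
    columns=["Breed", "Sex", "Age", "Name", "Favorite napping spot"],
)

column = "Favorite napping spot"
# Precaution (an assumption of this sketch): let the column hold strings.
mycats[column] = mycats[column].astype(object)

# Row indexer: a boolean Series with one True/False per row.
is_kitten = mycats.Age < 1
print(is_kitten.tolist())

# One .loc call with both indexers writes directly into mycats.
mycats.loc[is_kitten, column] = "couch"
print(mycats[["Name", column]])
```

Because the selection and the assignment happen in a single indexing step, there is no intermediate copy for the value to get lost in.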