[ Pandas: Get an if statement/.loc to return the index for that row ]
I've got a dataframe with 2 columns and I'm adding a 3rd.
I want the 3rd column to be dependent on the value of the 2nd, returning either a set answer or the corresponding index for that row.
An example of the dataframe is below:
print(df)
         Amount  Percentage
Country
Belgium      20       .0952
France       50       .2380
Germany      60       .2857
UK           80       .3809
Now I want my new third column to say 'Other' if the percentage is below 25% and to say the name of the country if the percentage is above 25%. So this is what I've written:
df['Country'] = 'Other'
df.loc[df['Percentage'] > 0.25, 'Country'] = df.index
Unfortunately my output doesn't give the equivalent index; it just gives the index in order:
print(df)
         Amount  Percentage  Country
Country
Belgium      20       .0952    Other
France       50       .2380    Other
Germany      60       .2857  Belgium
UK           80       .3809   France
Obviously I want to see Germany across from Germany and UK across from UK. How can I get it to give me the index which is in the same row as the number which trips the threshold in my code?
Answer 1
You can try numpy.where:
import numpy as np
df['Country'] = np.where(df['Percentage'] > 0.25, df.index, 'Other')
print(df)
         Amount  Percentage  Country
Country
Belgium      20      0.0952    Other
France       50      0.2380    Other
Germany      60      0.2857  Germany
UK           80      0.3809       UK
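If you want to run this from scratch, here is a minimal sketch. The constructor call is my assumption, since the question only shows the printed frame; the np.where line is the same as above.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question. The constructor call is an
# assumption: only the printed output is shown in the question.
df = pd.DataFrame(
    {
        "Amount": [20, 50, 60, 80],
        "Percentage": [0.0952, 0.2380, 0.2857, 0.3809],
    },
    index=pd.Index(["Belgium", "France", "Germany", "UK"], name="Country"),
)

# np.where chooses element by element: the index label where the condition
# holds, the fallback string everywhere else.
df["Country"] = np.where(df["Percentage"] > 0.25, df.index, "Other")
print(df)
```

Because np.where works position by position over the whole frame, the condition and the index always have the same length, so nothing can slip out of step.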
Or create a Series from the index with to_series:
df['Country']='Other'
df.loc[df['Percentage']>0.25, 'Country']=df.index.to_series()
print(df)
         Amount  Percentage  Country
Country
Belgium      20      0.0952    Other
France       50      0.2380    Other
Germany      60      0.2857  Germany
UK           80      0.3809       UK
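As far as I understand it, the to_series version works because .loc aligns a Series on its labels when assigning, instead of consuming values in order. A small sketch to illustrate, using a cut-down copy of the sample frame (the constructor call and the deliberate reversal are my own additions for the demonstration):

```python
import pandas as pd

# Cut-down copy of the sample frame; the constructor call is an assumption.
df = pd.DataFrame(
    {"Percentage": [0.0952, 0.2380, 0.2857, 0.3809]},
    index=pd.Index(["Belgium", "France", "Germany", "UK"], name="Country"),
)
df["Country"] = "Other"

# A Series whose labels and values are both the country names.
names = df.index.to_series()

# Reverse the Series on purpose. If the assignment were positional, the
# selected rows would receive the wrong names; because it is aligned on
# labels, each selected row still receives its own name.
df.loc[df["Percentage"] > 0.25, "Country"] = names.iloc[::-1]
print(df["Country"].tolist())
```

A bare df.index carries no labels of its own to align on, which would explain why the original attempt handed out names in order from the top.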
Answer 2
To use the method you were trying to implement:
df['Country'] = 'Other'
df.loc[df['Percentage'] > 0.25, 'Country'] = df.loc[df['Percentage'] > 0.25].index
>>> df
         Amount  Percentage  Country
Country
Belgium      20      0.0952    Other
France       50      0.2380    Other
Germany      60      0.2857  Germany
UK           80      0.3809       UK
Because the filter is the same on both sides, it is often best to use a mask on large datasets so that you only do the comparison once:
mask = df['Percentage'] > 0.25
df.loc[mask, 'Country'] = df.loc[mask].index
# Delete the mask once finished with it to save memory if needed.
del mask
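If you would rather skip the "fill with a default, then overwrite" step, one possible variant (not taken from the answers above) is Series.where, which keeps values where the mask is True and substitutes a fallback elsewhere. A minimal sketch, again assuming a hand-built copy of the sample frame:

```python
import pandas as pd

# Hand-built copy of the sample frame; the constructor call is an assumption.
df = pd.DataFrame(
    {
        "Amount": [20, 50, 60, 80],
        "Percentage": [0.0952, 0.2380, 0.2857, 0.3809],
    },
    index=pd.Index(["Belgium", "France", "Germany", "UK"], name="Country"),
)

# Compute the comparison once and reuse it.
mask = df["Percentage"] > 0.25

# Start from the index as a Series, keep the names where the mask is True,
# and substitute 'Other' where it is False.
df["Country"] = df.index.to_series().where(mask, "Other")
print(df)
```

This still evaluates the comparison only once, and the result is built in a single expression.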