
Pandas Conditional Groupby Count

Given this data frame:

import pandas as pd
df = pd.DataFrame(
    {'A' : ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
     'D' : [2, 4, 4, 2, 5, 4, 3, 2]})

Solution 1:

Does this warning matter in this case?

I see that warning for a lot of things, and it's never once made a difference to me. I just ignore it.

Also, how does pandas know to match the rows up correctly if it's taking them from another dataframe?

pandas is using the index of the DataFrame. Here's your example, rewritten slightly for clarity:

df2 = df.query('A=="foo" and D==2')
df2['Dcount'] = len(df2)

The resulting DataFrame is

     A  D  Dcount
0  foo  2       2
3  foo  2       2

Notice the 0 and 3 in the index? That's what pandas uses to line everything up. So I could just follow the above with

df['Dcount'] = df2['Dcount']

and I will get your same result. The right-hand side of that assignment is a Series, so the index is built-in.
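To make the alignment concrete, here is a minimal, self-contained sketch of the steps above. The `.copy()` call is my addition, not part of the original snippet: it is a common way to get an independent frame, assuming the warning in question is pandas' SettingWithCopyWarning. Rows of df whose index labels do not appear in df2 are filled with NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'D': [2, 4, 4, 2, 5, 4, 3, 2],
})

# .copy() is an assumption added here: it makes df2 an independent frame,
# so adding a column to it does not touch (or warn about) df.
df2 = df.query('A == "foo" and D == 2').copy()
df2['Dcount'] = len(df2)

# The right-hand side is a Series, so pandas aligns on index labels:
# labels 0 and 3 receive a value, every other row becomes NaN.
df['Dcount'] = df2['Dcount']

matched_labels = list(df.index[df['Dcount'].notna()])
print(df)
print(matched_labels)
```

Because unmatched rows become NaN, the new column ends up with a float dtype even though the counts themselves are whole numbers.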

On the other hand, I would get an error if I had tried to assign an array:

df['Dcount'] = df2['Dcount'].values  # raises ValueError: 2 values for a DataFrame with 8 rows
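A short sketch of that failure, under the same setup as the example data. A bare array carries no index, so pandas can only match it by position and requires the lengths to agree. The `.copy()` call and the `error_message` variable are illustrative additions of mine.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'D': [2, 4, 4, 2, 5, 4, 3, 2],
})

# .copy() is an assumption added here to get an independent frame.
df2 = df.query('A == "foo" and D == 2').copy()
df2['Dcount'] = len(df2)

error_message = None
try:
    # .values strips the index, leaving a 2-element array for an 8-row frame.
    df['Dcount'] = df2['Dcount'].values
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If the array happened to have the same length as df, the assignment would succeed but would be matched purely by position, which is rarely what you want after filtering.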
