Pandas: Detect Value Change In String Column With Groupby; Ignoring First Entry
How can I create an indicator variable that detects changes in a column, using groupby, while ignoring the first row of each new group?
import pandas as pd # generate d
Solution 1:
Let's try testing both ne and notna, to keep the not-equals comparison from matching the NaN created by groupby shift:
s = df.groupby('case')["val1"].shift(1)
df['out'] = (s.ne(df['val1']) & s.notna()).astype(int)
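To see why the notna mask is needed, here is a minimal sketch on a small toy frame. The data is hypothetical and not taken from the question. It shows that the shifted series has a missing value on the first row of every group, and that masking with notna keeps those rows at 0:

```python
import pandas as pd

# Hypothetical toy data, not from the original question.
df = pd.DataFrame({
    "case": ["A", "A", "B", "B"],
    "val1": ["Cat1", "Cat2", "Cat2", "Cat2"],
})

# Previous value within each group; the first row of each group
# has no predecessor, so shift leaves it missing.
s = df.groupby("case")["val1"].shift(1)

# Comparison alone versus comparison masked by notna.
naive = s.ne(df["val1"]).astype(int)
masked = (s.ne(df["val1"]) & s.notna()).astype(int)

print(pd.DataFrame({"prev": s, "naive": naive, "masked": masked}))
```

Only the second row of group A changes value relative to its predecessor, so it is the only row the masked version flags.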
Or use fillna to fill the NaN values with the corresponding values from val1, so that the first row of each group compares equal to itself:
df['out'] = (
    df.groupby('case')["val1"].shift(1)
      .fillna(df['val1'])
      .ne(df['val1']).astype(int)
)
df:
   case  val1  expectation  out
0     A  Cat1            0    0
1     A  Cat1            0    0
2     A  Cat1            0    0
3     A  Cat1            0    0
4     B  Cat1            0    0
5     B  Cat2            1    1
6     B  Cat2            0    0
7     B  Cat1            1    1
8     C  Cat2            0    0
9     C  Cat1            1    1
10    C  Cat1            0    0
11    C  Cat2            1    1
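Because the question's data-generation code is cut off, the following is a self-contained sketch that rebuilds the frame from the result table above (the case, val1 and expectation columns are transcribed from that output, which is an assumption about the original input). It checks that both approaches agree with each other and with the expectation column:

```python
import pandas as pd

# Data transcribed from the result table; assumed to match the
# question's original (truncated) input.
df = pd.DataFrame({
    "case": list("AAAABBBBCCCC"),
    "val1": ["Cat1", "Cat1", "Cat1", "Cat1",
             "Cat1", "Cat2", "Cat2", "Cat1",
             "Cat2", "Cat1", "Cat1", "Cat2"],
    "expectation": [0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Previous value within each group (missing on each group's first row).
prev = df.groupby("case")["val1"].shift(1)

# Approach 1: compare, then mask out rows with no predecessor.
out_mask = (prev.ne(df["val1"]) & prev.notna()).astype(int)

# Approach 2: fill missing predecessors with the row's own value,
# so the comparison is False there.
out_fill = prev.fillna(df["val1"]).ne(df["val1"]).astype(int)

df["out"] = out_mask
print(df)
```

The two variants should be interchangeable on this data. The fillna version avoids the extra boolean mask, while the notna version states the "ignore the first row" rule more explicitly.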