
Pandas: Detect Value Change In String Column With Groupby; Ignoring First Entry

How can I create an indicator variable that detects changes in a column, using groupby, while ignoring the first row of each new group? import pandas as pd # generate d

Solution 1:

Let's try testing both ne and notna to keep the not-equals comparison from matching the NaN created by groupby shift:

s = df.groupby('case')["val1"].shift(1)
df['out'] = (s.ne(df['val1']) & s.notna()).astype(int)
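
To see why the notna mask matters, here is a minimal sketch on a small made-up frame (the `toy` data and the names `prev`, `naive` and `masked` are assumptions for illustration, not the asker's data). The first row of each group has no predecessor, so its shifted value is NaN, and ne alone would count that as a change:

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "case": ["A", "A", "B", "B"],
    "val1": ["Cat1", "Cat1", "Cat1", "Cat2"],
})

# Previous value within each group; the first row of each group becomes NaN
prev = toy.groupby("case")["val1"].shift(1)

# ne alone treats NaN as "different", so the start of every group is flagged
naive = prev.ne(toy["val1"]).astype(int)

# Masking with notna removes those first-row false positives
masked = (prev.ne(toy["val1"]) & prev.notna()).astype(int)

print(naive.tolist())   # [1, 0, 1, 1]
print(masked.tolist())  # [0, 0, 0, 1]
```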

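Or use fillna to fill the NaN values in the shifted Series with the corresponding values from val1, so the first row of each group compares equal to itself: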

df['out'] = (
    df.groupby('case')["val1"].shift(1)
        .fillna(df['val1'])
        .ne(df['val1']).astype(int)
)

df:

   case  val1  expectation  out
0     A  Cat1            0    0
1     A  Cat1            0    0
2     A  Cat1            0    0
3     A  Cat1            0    0
4     B  Cat1            0    0
5     B  Cat2            1    1
6     B  Cat2            0    0
7     B  Cat1            1    1
8     C  Cat2            0    0
9     C  Cat1            1    1
10    C  Cat1            0    0
11    C  Cat2            1    1
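
For a self-contained check, the sketch below rebuilds the frame from the printed result (the original data-generation code is not shown here, so this construction is an assumption) and confirms that both variants match the expectation column. The names `prev`, `out_mask` and `out_fill` are introduced only for this example:

```python
import pandas as pd

# Data rebuilt from the printed output above; the original construction is assumed
df = pd.DataFrame({
    "case": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "val1": ["Cat1", "Cat1", "Cat1", "Cat1",
             "Cat1", "Cat2", "Cat2", "Cat1",
             "Cat2", "Cat1", "Cat1", "Cat2"],
    "expectation": [0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Previous val1 within each case; NaN on the first row of every group
prev = df.groupby("case")["val1"].shift(1)

# Variant 1: flag a change only where a previous value exists
out_mask = (prev.ne(df["val1"]) & prev.notna()).astype(int)

# Variant 2: fill the first-row NaN with the row's own value, then compare
out_fill = prev.fillna(df["val1"]).ne(df["val1"]).astype(int)

df["out"] = out_mask
print(df)
print((df["out"] == df["expectation"]).all())
```

The two variants agree on this data. If val1 itself contained missing values they could diverge, so that case would be worth checking separately.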
