2 Answers

Contributed 1804 experience points, received over 8 likes
Here is one approach, using groupby with a custom function passed to transform:
# check which Binary values are 1 and group the series by User
g = df.Binary.eq(1).groupby(df.User)
# transform to either idxmax or the last index depending
# on whether there are any Trues or not
m = g.transform(lambda x: x.idxmax() if x.any() else x.index[-1])
# keep the rows whose index is smaller than or equal to m
out = df[df.index <= m]
print(out)
     User  Binary
0   UserA       0
1   UserA       0
2   UserA       0
3   UserA       1
8   UserB       0
9   UserB       0
10  UserB       0
11  UserB       0
12  UserB       0
13  UserB       1
16  UserC       0
17  UserC       0
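
For reference, a self-contained sketch of this approach. The sample frame is an assumption, since the question's data is not shown here; its values were picked only so that the result has the same index as the output printed above. The name cutoff is illustrative as well.

```python
import pandas as pd

# Hypothetical sample data (assumed, not taken from the question)
df = pd.DataFrame({
    "User": ["UserA"] * 8 + ["UserB"] * 8 + ["UserC"] * 2,
    "Binary": [0, 0, 0, 1, 0, 1, 0, 0,
               0, 0, 0, 0, 0, 1, 0, 0,
               0, 0],
})

# True where Binary == 1, grouped by User
g = df["Binary"].eq(1).groupby(df["User"])

# For every row, the label to cut at: the first 1 in its group,
# or the group's last label when the group has no 1 at all
cutoff = g.transform(lambda x: x.idxmax() if x.any() else x.index[-1])

# Keep each user's rows up to and including that label
out = df[df.index <= cutoff]
print(out)
```

Note that the comparison df.index <= cutoff works on index labels, so it relies on the labels being unique and increasing, as they are with the default RangeIndex.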

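Contributed 1796 experience points, received over 4 likes
The idea is to reverse the row order with DataFrame.iloc, take the cumulative sum of Binary within each group, and keep the rows where that sum equals its group maximum. This also works correctly when a group contains only 0s or only 1s: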
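def f(x):
    # x is one user's Binary values in reversed order
    s = x.cumsum()
    # the sum reaches its maximum at the first 1 (in original order) and before it
    return s.eq(s.max())

df = df[df.iloc[::-1].groupby('User')['Binary'].transform(f).sort_index()]
print(df)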
     User  Binary
0   UserA       0
1   UserA       0
2   UserA       0
3   UserA       1
8   UserB       0
9   UserB       0
10  UserB       0
11  UserB       0
12  UserB       0
13  UserB       1
16  UserC       0
17  UserC       0
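
To see why the reversed cumulative sum works, the sketch below splits the same computation into steps on a small frame. The data and the names rev_cumsum, group_max and steps are assumptions made for illustration; groupby(...).cumsum() is used directly here instead of the custom function f, which should give the same mask.

```python
import pandas as pd

# Hypothetical sample data, assumed for illustration only
df = pd.DataFrame({
    "User": ["UserA"] * 5 + ["UserB"] * 3,
    "Binary": [0, 0, 1, 0, 1, 0, 0, 0],
})

# Cumulative sum of Binary per user, computed from the last row backwards,
# then put back into the original row order
rev_cumsum = df.iloc[::-1].groupby("User")["Binary"].cumsum().sort_index()

# Largest value of that sum per user, broadcast back to every row
group_max = rev_cumsum.groupby(df["User"]).transform("max")

# A row is kept when its reversed sum already counts every 1 in the group,
# i.e. when it is the first 1 or comes before it
steps = df.assign(rev_cumsum=rev_cumsum, keep=rev_cumsum.eq(group_max))
print(steps)

result = df[steps["keep"]]
print(result)
```

For UserA the reversed sum is 2 on rows 0 to 2 and drops to 1 after the first 1, so only rows 0 to 2 are kept. For UserB, which has no 1, the sum is 0 everywhere and all rows are kept.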