Pandas groupby and get the row with the maximum value as the result

炎炎设计 2022-07-26 20:59:22
I have a pandas DataFrame with a datetime index. I want to group by second and, for each group, get the row that has the maximum value in column "a_ABS", but I only get the maximum of each column.

import pandas as pd

data = {'lat': [4.2471, 4.2646, 4.2945, 4.2819, 4.2635, 4.2616, 4.2731, 4.2555],
        'lng': [-76.7504, -76.7198, -76.7069, -76.7251, -76.726, -76.7196, -76.715, -767.118],
        'a': [208.999, -894.0, -171.0, 108.999, -162.0, -29.0, -143.999, -133.0],
        'e': [0.105, 0.209, 0.934, 0.150, 0.158, 0.347, 0.333, 0.089]}

df = pd.DataFrame(data, index=['2020-01-01 16:32:14.105000-05:00', '2020-01-01 16:32:14.112000-05:00',
                               '2020-01-01 16:32:14.175000-05:00', '2020-01-01 16:32:14.176000-05:00',
                               '2020-01-01 16:32:14.211000-05:00', '2020-01-01 16:32:14.220000-05:00',
                               '2020-01-01 16:32:14.310000-05:00', '2020-01-01 16:32:14.327000-05:00'])
df.index = pd.to_datetime(df.index)

a = df
a['a_ABS'] = a['a'].abs()
aa = a.groupby([a.index.floor('s')], as_index=True).max()

2 Answers

牛魔王的故事

Contributed 1830 experience points · received more than 3 likes

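You're almost there. Sort by a_ABS in descending order, then select the first row with a.iloc[:1]. This returns the single row with the largest a_ABS overall, which matches the per-second result here because every sample timestamp falls within the same second. Full code: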


import pandas as pd

data = {'lat': [4.2471, 4.2646, 4.2945, 4.2819, 4.2635, 4.2616, 4.2731, 4.2555],
        'lng': [-76.7504, -76.7198, -76.7069, -76.7251, -76.726, -76.7196, -76.715, -767.118],
        'a': [208.999, -894.0, -171.0, 108.999, -162.0, -29.0, -143.999, -133.0],
        'e': [0.105, 0.209, 0.934, 0.150, 0.158, 0.347, 0.333, 0.089]}

# Build the frame with the timestamp strings as the index, then parse them.
df = pd.DataFrame(data, index=['2020-01-01 16:32:14.105000-05:00', '2020-01-01 16:32:14.112000-05:00',
                               '2020-01-01 16:32:14.175000-05:00', '2020-01-01 16:32:14.176000-05:00',
                               '2020-01-01 16:32:14.211000-05:00', '2020-01-01 16:32:14.220000-05:00',
                               '2020-01-01 16:32:14.310000-05:00', '2020-01-01 16:32:14.327000-05:00'])
df.index = pd.to_datetime(df.index)

# Absolute value of column 'a'.
a = df
a['a_ABS'] = a['a'].abs()

# Sort so the largest a_ABS comes first, then keep only that row.
a = a.sort_values(by="a_ABS", ascending=False)
first_df = a.iloc[:1]

print(first_df)
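
If the data spans more than one second, the same sort-then-take-first idea can be applied within each group. The following is a minimal sketch rather than the answerer's code: it assumes the sample values from the question (with the `lng` and `e` columns dropped for brevity) and moves two timestamps to second 15, purely for illustration, so that two groups exist. Sorting by `a_ABS` in descending order and then calling `groupby(...).head(1)` keeps the first, and therefore largest, row of every one-second bucket.

```python
import pandas as pd

# Sample values from the question; the two ":15." timestamps are an
# assumption made for illustration so that there is more than one group.
df = pd.DataFrame(
    {
        "lat": [4.2471, 4.2646, 4.2945, 4.2819, 4.2635, 4.2616, 4.2731, 4.2555],
        "a": [208.999, -894.0, -171.0, 108.999, -162.0, -29.0, -143.999, -133.0],
    },
    index=pd.to_datetime([
        "2020-01-01 16:32:14.105000-05:00", "2020-01-01 16:32:14.112000-05:00",
        "2020-01-01 16:32:14.175000-05:00", "2020-01-01 16:32:14.176000-05:00",
        "2020-01-01 16:32:14.211000-05:00", "2020-01-01 16:32:15.220000-05:00",
        "2020-01-01 16:32:14.310000-05:00", "2020-01-01 16:32:15.327000-05:00",
    ]),
)
df["a_ABS"] = df["a"].abs()

# Largest a_ABS first, so the first row seen in each group is its maximum.
ordered = df.sort_values("a_ABS", ascending=False)

# Group on the timestamp floored to the second and keep one row per group,
# then restore chronological order.
top_per_second = ordered.groupby(ordered.index.floor("s")).head(1).sort_index()

print(top_per_second)
```

Unlike `.max()`, `head(1)` does not aggregate, so every column in the result comes from the same original row.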


Answered 2022-07-26
繁华开满天机

Contributed 1816 experience points · received more than 4 likes

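Something like this should work. Note that it returns the maximum a_ABS for each second rather than the full row, and that two of the timestamps have been moved to second 15 so that there is more than one group: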


import pandas as pd

# create dataframe:
df = pd.DataFrame({
    'lat': [4.2471, 4.2646, 4.2945, 4.2819, 4.2635, 4.2616, 4.2731, 4.2555],
    'lng': [-76.7504, -76.7198, -76.7069, -76.7251, -76.726, -76.7196, -76.715, -767.118],
    'a': [208.999, -894.0, -171.0, 108.999, -162.0, -29.0, -143.999, -133.0],
    'e': [0.105, 0.209, 0.934, 0.150, 0.158, 0.347, 0.333, 0.089]
})

# set index:
df.index = pd.to_datetime([
    '2020-01-01 16:32:14.105000-05:00', '2020-01-01 16:32:14.112000-05:00',
    '2020-01-01 16:32:14.175000-05:00', '2020-01-01 16:32:14.176000-05:00',
    '2020-01-01 16:32:14.211000-05:00', '2020-01-01 16:32:15.220000-05:00',
    '2020-01-01 16:32:14.310000-05:00', '2020-01-01 16:32:15.327000-05:00',
])

# create absolute column:
df['a_ABS'] = df['a'].abs()

# create seconds column (second-of-minute, 0-59; rows from different
# minutes would share a group, which df.index.floor('s') would avoid):
df['seconds'] = df.index.second

# group by seconds and take the maximum of each column:
df_grouped = df.groupby(['seconds']).max()

# extract only the 'a_ABS' column:
df_grouped = df_grouped['a_ABS']

# reset index:
df_grouped = df_grouped.reset_index()
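
To get the whole row rather than only the `a_ABS` value, one option is `idxmax`, which returns the index label of the maximum within each group; those labels can then be passed to `df.loc`. The following is a minimal sketch, not part of the original answer. It assumes the timestamps are unique (with duplicate labels, `df.loc` would return every matching row), reuses this answer's timestamps, and drops the `lat` and `lng` columns for brevity.

```python
import pandas as pd

# Sample values and timestamps taken from the answer above.
df = pd.DataFrame(
    {
        "a": [208.999, -894.0, -171.0, 108.999, -162.0, -29.0, -143.999, -133.0],
        "e": [0.105, 0.209, 0.934, 0.150, 0.158, 0.347, 0.333, 0.089],
    },
    index=pd.to_datetime([
        "2020-01-01 16:32:14.105000-05:00", "2020-01-01 16:32:14.112000-05:00",
        "2020-01-01 16:32:14.175000-05:00", "2020-01-01 16:32:14.176000-05:00",
        "2020-01-01 16:32:14.211000-05:00", "2020-01-01 16:32:15.220000-05:00",
        "2020-01-01 16:32:14.310000-05:00", "2020-01-01 16:32:15.327000-05:00",
    ]),
)
df["a_ABS"] = df["a"].abs()

# For each one-second bucket, find the timestamp whose a_ABS is largest.
idx = df.groupby(df.index.floor("s"))["a_ABS"].idxmax()

# Look those timestamps up in the original frame to recover full rows.
rows = df.loc[idx]

print(rows)
```

Grouping on `df.index.floor("s")` instead of `df.index.second` keeps equal seconds from different minutes in separate groups.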


Answered 2022-07-26