3 Answers

Contributed 1875 experience points · received over 5 likes
Use Series.shift to move the values down by 365 rows:
performance_df['LY Revenue']=performance_df['Revenue'].shift(365)
print(performance_df)
Revenue LY Revenue
Date
2018-01-01 25891.8% nan%
2018-01-02 25851.6% nan%
2018-01-03 25037.7% nan%
2018-01-04 26715.8% nan%
2018-01-05 23988.4% nan%
... ... ...
2019-12-27 3539.6% 25744.1%
2019-12-28 3535.0% 27119.7%
2019-12-29 3527.7% 28894.6%
2019-12-30 3489.9% 30321.4%
2019-12-31 3287.5% 29665.6%
[730 rows x 2 columns]
Here you can see the start of 2019:
print(performance_df[364:366])
Revenue LY Revenue
Date
2018-12-31 29665.6% nan%
2019-01-01 28601.7% 25891.8%
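
Because shift(365) counts rows rather than dates, it lines up with the previous year only when the index has exactly one row per day and no gaps. The following is a minimal sketch of that check on synthetic data; the Revenue values here are placeholders (0, 1, 2, ...) and not the asker's figures, and `is_gap_free` is a name introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for performance_df: one row per day, 2018-01-01 .. 2019-12-31.
dates = pd.date_range("2018-01-01", "2019-12-31", freq="D", name="Date")
performance_df = pd.DataFrame(
    {"Revenue": np.arange(len(dates), dtype=float)}, index=dates
)

# shift(365) is positional, so first confirm consecutive rows are exactly one day apart.
steps = performance_df.index.to_series().diff().dropna()
is_gap_free = bool((steps == pd.Timedelta(days=1)).all())

# Move every value down by 365 rows; the first 365 rows have no prior-year value.
performance_df["LY Revenue"] = performance_df["Revenue"].shift(365)

print(is_gap_free)
print(performance_df.iloc[364:366])
```

If the check fails (missing days, duplicate dates, or a leap year in the range), a row-based shift will pair some dates with the wrong day of the previous year.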

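Contributed 1772 experience points · received over 8 likes
IIUC, this is what you need.
This only works if the index is a datetime index. We group by month and day taken from the index, so shift() returns the previous value within each (month, day) group, that is, the same calendar date one year earlier. This should work even when the dates span leap years and ordinary years.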
performance_df['LY_Revenue'] = performance_df.groupby([performance_df.index.month,performance_df.index.day])['Revenue'].shift()
print(performance_df)
Output:
Revenue LY_Revenue
Date
2018-01-01 25891.846787 NaN
2018-01-02 25851.615541 NaN
2018-01-03 25037.711900 NaN
2018-01-04 26715.764965 NaN
2018-01-05 23988.356950 NaN
2018-01-06 19029.057983 NaN
2018-01-07 16935.481705 NaN
2018-01-08 22756.072913 NaN
2018-01-09 30385.672829 NaN
2018-01-10 32970.132175 NaN
2018-01-11 31089.167075 NaN
2018-01-12 24262.972415 NaN
2018-01-13 18261.273832 NaN
2018-01-14 18304.754084 NaN
2018-01-15 26297.835665 NaN
2018-01-16 32619.669405 NaN
2018-01-17 35565.262225 NaN
2018-01-18 33229.971940 NaN
2018-01-19 25405.647136 NaN
2018-01-20 19980.890375 NaN
2018-01-21 20487.553161 NaN
2018-01-22 29709.032322 NaN
2018-01-23 38164.493648 NaN
2018-01-24 39050.801147 NaN
2018-01-25 36612.554433 NaN
2018-01-26 28169.782524 NaN
2018-01-27 22086.641618 NaN
2018-01-28 21631.662706 NaN
2018-01-29 28419.945290 NaN
2018-01-30 35644.617364 NaN
... ... ...
2019-12-02 2973.892113 28289.697207
2019-12-03 2674.316864 34737.317368
2019-12-04 2460.238549 40574.910348
2019-12-05 2800.034200 40556.066887
2019-12-06 3195.262337 39927.322507
2019-12-07 3107.693557 34634.748383
2019-12-08 2961.140812 27666.467364
2019-12-09 2340.478044 27774.363832
2019-12-10 1931.373925 33950.846875
2019-12-11 1847.123639 39518.061312
2019-12-12 2179.325333 39587.568701
2019-12-13 2438.035383 38832.660311
2019-12-14 2379.865127 32258.462222
2019-12-15 2255.598970 23343.008315
2019-12-16 1870.926018 23914.895775
2019-12-17 1620.608382 28173.094175
2019-12-18 1511.311007 30306.555827
2019-12-19 1685.967616 28284.310392
2019-12-20 2099.849763 24228.754426
2019-12-21 2430.507619 20495.999365
2019-12-22 2701.975519 19302.936445
2019-12-23 2997.630051 21391.090777
2019-12-24 2977.347247 21072.220129
2019-12-25 2893.576704 19770.681250
2019-12-26 3207.467022 22751.205447
2019-12-27 3539.618050 25744.075480
2019-12-28 3534.997476 27119.697589
2019-12-29 3527.721147 28894.626077
2019-12-30 3489.915430 30321.364425
2019-12-31 3287.543337 29665.558703
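
To see where the grouped shift and a plain shift(365) differ, here is a small sketch on synthetic data that spans the leap year 2020. The frame `df` and the column names `LY_by_rows` and `LY_by_date` are made up for this illustration, and Revenue is just the row position so the matched date is easy to read off.

```python
import numpy as np
import pandas as pd

# Synthetic daily data covering 2019 (365 days), 2020 (366 days) and 2021.
dates = pd.date_range("2019-01-01", "2021-12-31", freq="D")
df = pd.DataFrame({"Revenue": np.arange(len(dates), dtype=float)}, index=dates)

# Row-based shift: always 365 rows back, regardless of the calendar.
df["LY_by_rows"] = df["Revenue"].shift(365)

# Date-based shift: previous row that has the same month and day.
df["LY_by_date"] = df.groupby([df.index.month, df.index.day])["Revenue"].shift()

print(df.loc[[pd.Timestamp("2020-02-29"), pd.Timestamp("2020-03-01")]])
```

In this data, 2020-03-01 gets the value of 2019-03-01 (row 59) from the grouped shift, but the value of 2019-03-02 (row 60) from shift(365). 2020-02-29 has no earlier row with the same month and day, so the grouped shift leaves it as NaN.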

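Contributed 1817 experience points · received over 14 likes
Given that your data is time-indexed, you can use freq, which shifts the index by 365 days instead of moving the values by 365 rows; the assignment then aligns on the dates:
performance_df['LY Revenue'] = performance_df.Revenue.shift(freq='365D')
Output: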
Revenue LY Revenue
Date
2018-01-01 25891.8% nan%
2018-01-02 25851.6% nan%
2018-01-03 25037.7% nan%
2018-01-04 26715.8% nan%
2018-01-05 23988.4% nan%
2018-01-06 19029.1% nan%
2018-01-07 16935.5% nan%
2018-01-08 22756.1% nan%
2018-01-09 30385.7% nan%
...
2019-12-21 2430.5% 20496.0%
2019-12-22 2702.0% 19302.9%
2019-12-23 2997.6% 21391.1%
2019-12-24 2977.3% 21072.2%
2019-12-25 2893.6% 19770.7%
2019-12-26 3207.5% 22751.2%
2019-12-27 3539.6% 25744.1%
2019-12-28 3535.0% 27119.7%
2019-12-29 3527.7% 28894.6%
2019-12-30 3489.9% 30321.4%
2019-12-31 3287.5% 29665.6%
Note, however, that 365D is not necessarily one year in general: a leap year has 366 days.
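
One possible way around the 365-day caveat is to look up each date minus one calendar year with pd.DateOffset(years=1). This is a sketch of an alternative, not something proposed in the answers above; the frame `df` and the column names `LY 365D` and `LY calendar` are hypothetical, and Revenue is again the row position.

```python
import numpy as np
import pandas as pd

# Synthetic daily data that includes the leap year 2020.
dates = pd.date_range("2019-01-01", "2021-12-31", freq="D")
df = pd.DataFrame({"Revenue": np.arange(len(dates), dtype=float)}, index=dates)

# Fixed-length shift: each date is paired with the date exactly 365 days earlier.
df["LY 365D"] = df["Revenue"].shift(freq="365D")

# Calendar shift: compute "same date, one year earlier" and look it up.
# Feb 29 maps to Feb 28 of the prior year; dates with no prior-year row become NaN.
prior_dates = df.index - pd.DateOffset(years=1)
df["LY calendar"] = df["Revenue"].reindex(prior_dates).to_numpy()

print(df.loc[[pd.Timestamp("2020-02-29"), pd.Timestamp("2020-03-01")]])
```

Here 2020-03-01 is paired with 2019-03-02 under the 365-day shift but with 2019-03-01 under the calendar lookup. The lookup uses reindex rather than shift(freq=...) because 2020-02-28 and 2020-02-29 both map to the same date when moved by a year, which would produce duplicate index labels.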