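I coded up your logic and it all works, with a few remarks:
freq='S' makes no sense here: it generates as many rows as there are seconds between the start date and the end date.
After randomising the start times, use the current row and the next row as the lower and upper bounds passed to the random function for the end time. This is done in a list comprehension.
Be a bit smarter about getting the UTC seconds at the start and end of the range.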
import pandas as pd
import numpy as np
from datetime import datetime

# freq='S' would create one row per second over the whole range:
# date_rng = pd.date_range(start='5/18/2019', end='7/22/2020', freq='S')
date_rng = pd.date_range(start='5/18/2019', end='5/19/2019', freq='min')
# UTC epoch seconds at the start and end of the range, cast to int for randint
sec = [int((date_rng.min() - datetime(1970, 1, 1)).total_seconds()),
       int((date_rng.max() - datetime(1970, 1, 1)).total_seconds())]
df = pd.DataFrame(date_rng, columns=['start_timestamp'])
# random start times (epoch seconds), then sort ascending
df['start_timestamp'] = np.random.randint(sec[0], sec[1], size=len(date_rng))
df = df.sort_values(by="start_timestamp")
l = df["start_timestamp"].tolist()  # get randomised start times
l[-1] = sec[1]  # set last time to end of range
# randomise end time between two start times
df['end_timestamp'] = [np.random.randint(l[i], l[i+1]) if i < len(l)-1 and l[i] < l[i+1] else l[i] for i in range(len(l))]
# convert epoch seconds back to datetimes
df['start_timestamp'] = pd.to_datetime(df['start_timestamp'], unit='s')
df['end_timestamp'] = pd.to_datetime(df['end_timestamp'], unit='s')
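If you want this as a reusable function, the same idea can probably be vectorised. The sketch below is a variant, not the code above: the name random_intervals and its parameters are made up for illustration, it uses NumPy's Generator API instead of np.random.randint, and it draws each end time inclusively between a start and the next start (or the end of the range for the last row), so consecutive intervals may touch but should not overlap.

```python
import numpy as np
import pandas as pd


def random_intervals(start, end, n, seed=None):
    """Hypothetical helper: n random (start, end) timestamp pairs inside [start, end]."""
    rng = np.random.default_rng(seed)
    # Epoch seconds at both ends of the range (Timestamp.value is in nanoseconds).
    lo = pd.Timestamp(start).value // 10**9
    hi = pd.Timestamp(end).value // 10**9
    # Sorted random start times in [lo, hi).
    starts = np.sort(rng.integers(lo, hi, size=n))
    # Upper bound per row: the next start time, or the end of the range for the last row.
    upper = np.append(starts[1:], hi)
    # endpoint=True makes the upper bound inclusive, so equal bounds are valid.
    ends = rng.integers(starts, upper, endpoint=True)
    return pd.DataFrame({
        "start_timestamp": pd.to_datetime(starts, unit="s"),
        "end_timestamp": pd.to_datetime(ends, unit="s"),
    })


df = random_intervals("2019-05-18", "2019-05-19", n=1440, seed=0)
print(df.head())
```

Passing a seed makes the output reproducible; leave it as None for fresh random data on each call.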