If you run
frequency, bins = np.histogram(latlong['Lat'], bins=20)
print(frequency)
print(bins)
you get
[ 1 7 12 18 301 35831 504342 22081 1256 580
63 12 8 1 2 0 0 0 0 1]
[40.07 40.1725 40.275 40.3775 40.48 40.5825 40.685 40.7875 40.89
40.9925 41.095 41.1975 41.3 41.4025 41.505 41.6075 41.71 41.8125
41.915 42.0175 42.12 ]
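You can see that nearly all of the counts sit in a few adjacent bins, while the bins far from the mean hold only a handful of values or none at all.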
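As a rough check on how concentrated the data is, the printed counts can be summed directly. This is only a sketch: the `frequency` values are copied from the output above, and the choice of the three tallest bins (indices 5 to 7, covering latitudes 40.5825 to 40.89) is an assumption made for illustration.

```python
import numpy as np

# Counts copied from the printed np.histogram output above
frequency = np.array([1, 7, 12, 18, 301, 35831, 504342, 22081, 1256, 580,
                      63, 12, 8, 1, 2, 0, 0, 0, 0, 1])

total = frequency.sum()            # number of rows that were binned
central = frequency[5:8].sum()     # the three tallest bins (indices 5, 6, 7)
share = central / total            # fraction of rides inside those bins

print(total, central, round(share, 4))
```

Almost all rides fall between roughly 40.58 and 40.89, which is why restricting the plot to a window such as 40.6 to 40.9 loses very little.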
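You can stop those far-off bins from stretching the x-axis by clipping the variable of interest to a chosen minimum and maximum and then plotting the histogram, as follows. Note that np.clip does not drop the outliers: it moves them onto the nearest boundary, so they are counted in the first and last bins.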
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Load the Uber pickup data
url = 'https://raw.githubusercontent.com/diggledoot/dataset/master/uber-raw-data-apr14.csv'
latlong = pd.read_csv(url)
# Plot the latitudes, clipped to [40.6, 40.9] so outliers land in the edge bins
plt.figure(figsize=(8, 6))
plt.title('Rides based on latitude')
plt.hist(np.clip(latlong['Lat'], 40.6, 40.9), bins=50, color='cyan')
plt.xlabel('Latitude')
plt.ylabel('Frequency')
plt.show()
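If you would rather exclude the outliers than pile them into the edge bins, one alternative (not part of the answer above) is the `range` argument of `np.histogram`, which ignores values outside the given interval. The sketch below compares the two on synthetic data; the array `lat` is a hypothetical stand-in for `latlong['Lat']`, and the distribution parameters are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for latlong['Lat']: a tight cluster plus two far outliers
lat = np.concatenate([rng.normal(40.74, 0.04, 10_000), [40.07, 42.12]])

lo, hi = 40.6, 40.9

# Clipping keeps every value; outliers are moved onto lo or hi
clipped_counts, _ = np.histogram(np.clip(lat, lo, hi), bins=50)

# range= discards values outside [lo, hi] instead of moving them
ranged_counts, _ = np.histogram(lat, bins=50, range=(lo, hi))

n_inside = np.count_nonzero((lat >= lo) & (lat <= hi))
print(clipped_counts.sum(), ranged_counts.sum(), n_inside)
```

`plt.hist` accepts the same `range` argument, so `plt.hist(latlong['Lat'], bins=50, range=(40.6, 40.9))` should give a plot without the artificial spikes at the two ends.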
