Normalizing JSON API data into columns

Python

ITMISS 2022-11-29 14:47:27

I am trying to pull data from our Hubspot CRM database and convert it into a data frame with pandas. I am still a beginner in Python, and I cannot get json_normalize to work.

The output from the database is in a JSON format like this:

{'archived': False, 'archived_at': None, 'associations': None,
 'created_at': datetime.datetime(2019, 12, 21, 17, 56, 24, 739000, tzinfo=tzutc()),
 'id': 'xxx',
 'properties': {'createdate': '2019-12-21T17:56:24.739Z', 'email': 'xxxxx@xxxxx.com', 'firstname': 'John', 'hs_object_id': 'xxx', 'lastmodifieddate': '2020-04-22T04:37:40.274Z', 'lastname': 'Hansen'},
 'updated_at': datetime.datetime(2020, 4, 22, 4, 37, 40, 274000, tzinfo=tzutc())},
{'archived': False, 'archived_at': None, 'associations': None,
 'created_at': datetime.datetime(2019, 12, 21, 17, 52, 38, 485000, tzinfo=tzutc()),
 'id': 'bbb',
 'properties': {'createdate': '2019-12-21T17:52:38.485Z', 'email': 'bbb@bbb.dk', 'firstname': 'John2', 'hs_object_id': 'bbb', 'lastmodifieddate': '2020-05-19T07:18:28.384Z', 'lastname': 'Hansen2'},
 'updated_at': datetime.datetime(2020, 5, 19, 7, 18, 28, 384000, tzinfo=tzutc())},
{'archived': False, 'archived_at': None, 'associations': None, etc.

I tried to put it into a data frame with this code:

import hubspot
import pandas as pd
import json
from pandas.io.json import json_normalize
import os

client = hubspot.Client.create(api_key='################')
all_contacts = contacts_client = client.crm.contacts.get_all()

df=pd.io.json.json_normalize(all_contacts,'properties')
df.head
df.to_csv ('All contacts.csv')

But I keep getting errors that I cannot resolve. I also tried pd.dataframe(all_contacts) and pf.dataframe.from_dict(all_contacts).


1 Answer

宝慕林4294392


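The all_contacts variable is a list of dict-like elements. To create the data frame, I use a list comprehension to build a list that contains only the 'properties' of each dict-like element.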

import datetime

import pandas as pd
from dateutil.tz import tzutc

# Sample records shaped like the API output in the question
data = ({'archived': False,
         'archived_at': None,
         'associations': None,
         'created_at': datetime.datetime(2019, 12, 21, 17, 56, 24, 739000, tzinfo=tzutc()),
         'id': 'xxx',
         'properties': {'createdate': '2019-12-21T17:56:24.739Z',
                        'email': 'xxxxx@xxxxx.com',
                        'firstname': 'John',
                        'hs_object_id': 'xxx',
                        'lastmodifieddate': '2020-04-22T04:37:40.274Z',
                        'lastname': 'Hansen'},
         'updated_at': datetime.datetime(2020, 4, 22, 4, 37, 40, 274000, tzinfo=tzutc())},
        {'archived': False,
         'archived_at': None,
         'associations': None,
         'created_at': datetime.datetime(2019, 12, 21, 17, 52, 38, 485000, tzinfo=tzutc()),
         'id': 'bbb',
         'properties': {'createdate': '2019-12-21T17:52:38.485Z',
                        'email': 'bbb@bbb.dk',
                        'firstname': 'John2',
                        'hs_object_id': 'bbb',
                        'lastmodifieddate': '2020-05-19T07:18:28.384Z',
                        'lastname': 'Hansen2'},
         'updated_at': datetime.datetime(2020, 5, 19, 7, 18, 28, 384000, tzinfo=tzutc())})

# Keep only the nested 'properties' dict of each record; each dict becomes one row
df = pd.DataFrame([row['properties'] for row in data])

print(df)

Output:

                 createdate            email  ...          lastmodifieddate lastname
0  2019-12-21T17:56:24.739Z  xxxxx@xxxxx.com  ...  2020-04-22T04:37:40.274Z   Hansen
1  2019-12-21T17:52:38.485Z       bbb@bbb.dk  ...  2020-05-19T07:18:28.384Z  Hansen2

[2 rows x 6 columns]
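If you would rather stay with json_normalize, for example to keep the top-level 'id' next to the contact properties, a minimal sketch could look like the one below. It assumes the records are plain dicts like those in the question; the trimmed records list and the column renaming step are only illustrative. Note that the second positional argument of json_normalize is record_path, which is meant for nested lists of records, whereas 'properties' here is a single dict, so the sketch leaves it out.

```python
import pandas as pd

# Hypothetical sample records, trimmed from the data shown in the question.
records = [
    {"id": "xxx", "archived": False,
     "properties": {"email": "xxxxx@xxxxx.com", "firstname": "John", "lastname": "Hansen"}},
    {"id": "bbb", "archived": False,
     "properties": {"email": "bbb@bbb.dk", "firstname": "John2", "lastname": "Hansen2"}},
]

# With no record_path, each top-level dict becomes one row and nested
# dict keys are flattened into dotted names such as "properties.email".
flat = pd.json_normalize(records)

# Optionally drop the "properties." prefix from the column names.
flat = flat.rename(columns=lambda name: name.replace("properties.", "", 1))

print(flat[["id", "email", "firstname", "lastname"]])
```

The renaming step is safe here only because no property name collides with a top-level key; if that could happen in your data, keeping the dotted names is the simpler choice.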

Answered 2022-11-29
