
Convert a pandas DataFrame to nested JSON


鸿蒙传说 2023-02-07 16:33:55
I have a DataFrame like the one below, in which one column contains lists of already-nested dictionaries:

import pandas as pd

data = {'First':  ['First value', 'Second value'],
    'Second': ['First value', 'Second value'],
    'third': ['First value', 'Second value'],
    'forth': ['[{"values": "","entity": "datetime","","Turn":  [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]','[{"values": "","entity": "datetime","Turn": [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]'],
    }

df = pd.DataFrame (data, columns = ['First','second','third','forth'])

I want to convert it to the following JSON format and save it:

[
  {
    "first": "",
    "second": "",
    "third": "",
    "forth": [
      {
        "values": "",
        "entity": "",
        "TIMEX3": [
          {
            "expression": "",
            "tid": "",
            "type": "",
            "value": "",
            "mod": "",
            "anchorTimeID": "",
            "beginPoint": "",
            "endPoint": ""
          }
        ]
      }
    ]
  },
...

I tried the following, but the output is too messy and does not look like the output I want to save:

my_json = (df.groupby(['text','intent','domain'], as_index=False)
               .apply(lambda x: x[['entities']].to_dict('r'))
               .reset_index()
               .to_json(orient='records',indent= 2))

1 Answer

慕妹3242003

Contributed 1824 experience points, received more than 6 likes

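I believe you are not far from the format you want. The only problem is that the column forth holds its dictionaries as strings. One possible approach is to convert every row to a dictionary, use eval to turn each string back into a list of dictionaries, and then pretty-print the result with the json module: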


import pandas as pd
import json

data = {'First':  ['First value', 'Second value'],
    'Second': ['First value', 'Second value'],
    'third': ['First value', 'Second value'],
    'forth': ['[{"values": "","entity": "datetime","Turn":  [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]','[{"values": "","entity": "datetime","Turn": [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]'],
    }
df = pd.DataFrame(data, columns = ['First','Second','third','forth'])

# One dictionary per row; 'forth' is still a string at this point
my_dict = df.to_dict(orient='records')
for row in my_dict:
    # eval runs the string as Python code, so only use it on trusted data
    row['forth'] = eval(row['forth'])

# Serialize the list of row dictionaries with two-space indentation
my_json = json.dumps(my_dict, indent=2)
print(my_json)

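Note that I made two small corrections to your data: the key Second is now capitalized in the columns list so that it matches the key in data, and the invalid entry , "", has been removed from the first string in your forth column.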


Here is a copy of my output:


[
  {
    "First": "First value",
    "Second": "First value",
    "third": "First value",
    "forth": [
      {
        "values": "",
        "entity": "datetime",
        "Turn": [
          {
            "expression": "",
            "tid": "",
            "type": "",
            "value": "",
            "mod": "",
            "anchor": "",
            "beginPoint": "",
            "endPoint": ""
          }
        ]
      }
    ]
  },  ...
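Because the strings in forth are valid JSON once the stray entry is removed, the eval step could also be swapped for json.loads, which only parses data and never executes code. The following is a minimal sketch of that variant, not the approach used above; the shortened sample data and the names records, nested_json and bad are made up for illustration.

```python
import json

import pandas as pd

# Shortened, hypothetical sample data: 'forth' holds JSON text, not dictionaries.
data = {
    "First": ["First value", "Second value"],
    "forth": [
        '[{"values": "", "entity": "datetime", "Turn": [{"tid": "", "anchor": ""}]}]',
        '[{"values": "", "entity": "datetime", "Turn": [{"tid": "", "anchor": ""}]}]',
    ],
}
df = pd.DataFrame(data)

records = df.to_dict(orient="records")
for row in records:
    # json.loads parses the string into nested lists and dictionaries.
    row["forth"] = json.loads(row["forth"])

nested_json = json.dumps(records, indent=2)
print(nested_json)

# The stray `"",` from the question is not valid JSON, so parsing it fails loudly.
bad = '[{"entity": "datetime", "", "Turn": []}]'
try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    print("invalid JSON:", exc)
```

A parse error of this kind is a useful signal that the source strings need cleaning before conversion.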

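If the column forth already holds dictionaries in the DataFrame, you can call to_json directly and the output will have the format you want. For example, you can try converting the corrected data in my_dict back into a DataFrame: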


test_df = pd.DataFrame(my_dict)
print(test_df.to_json(orient='records', indent=2))
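Since the question also asks to save the result, the same idea can be sketched end to end: parse the forth column in place, then let to_json write the file. This is only a sketch under assumptions: the shortened sample data and the output path nested_output.json are hypothetical, and json.loads is used here in place of eval.

```python
import json

import pandas as pd

# Shortened, hypothetical sample data with JSON text in the 'forth' column.
df = pd.DataFrame(
    {
        "First": ["First value", "Second value"],
        "forth": [
            '[{"values": "", "entity": "datetime", "Turn": [{"tid": ""}]}]',
            '[{"values": "", "entity": "datetime", "Turn": [{"tid": ""}]}]',
        ],
    }
)

# Replace each JSON string with the parsed list of dictionaries.
df["forth"] = df["forth"].apply(json.loads)

# Hypothetical output path; to_json writes to it when a path is given.
out_path = "nested_output.json"
df.to_json(out_path, orient="records", indent=2)

# Read the file back to confirm that the nesting survived the round trip.
with open(out_path, encoding="utf-8") as fh:
    saved = json.load(fh)
print(saved[0]["forth"][0]["entity"])
```

Parsing the column first keeps the nested values from being written out as escaped strings.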


Answered 2023-02-07