
Creating a DataFrame by iterating over the n-th level values of a nested dictionary


FFIVE 2023-07-18 13:57:03
I downloaded a JSON file of the ICD-11 classification of human diseases from the web. The data is nested up to 8 levels deep, for example:

  "name":"br08403",
    "children":[
    {
        "name":"01 Certain infectious or parasitic diseases",
        "children":[
        {
            "name":"Gastroenteritis or colitis of infectious origin",
            "children":[
            {
                "name":"Bacterial intestinal infections",
                "children":[
                {
                    "name":"1A00  Cholera",
                    "children":[
                    {
                        "name":"H00110  Cholera"
                    }

I tried the following code:

def flatten_json(nested_json):
    """
        Flatten json object with nested keys into a single level.
        Args:
            nested_json: A nested json object.
        Returns:
            The flattened json object if successful, None otherwise.
    """
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out

df2 = pd.Series(flatten_json(dictionary)).to_frame()

The output I get is:

name    br08403
children_0_name    01 Certain infectious or parasitic diseases
children_0_children_0_name    Gastroenteritis or colitis of infectious origin
children_0_children_0_children_0_name    Bacterial intestinal infections
children_0_children_0_children_0_children_0_name    1A00 Cholera
...    ...
children_21_children_17_children_10_name    NF0A Certain early complications of trauma, n...
children_21_children_17_children_11_name    NF0Y Other specified effects of external causes
children_21_children_17_children_12_name    NF0Z Unspecified effects of external causes
children_21_children_18_name    NF2Y Other specified injury, poisoning or cer...
children_21_children_19_name    NF2Z Unspecified injury, poisoning or certain..

But the desired output is a DataFrame with 8 columns that holds the nested name keys down to the last depth, for example:

1 Answer

肥皂起泡泡


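A simple iterative approach with pandas: on each pass, explode the "children" column and normalize the result, so that one more nesting level becomes its own column.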


import pandas as pd
import requests

res = requests.get("https://www.genome.jp/kegg-bin/download_htext?htext=br08403.keg&format=json&filedir=")
js = res.json()

df = pd.json_normalize(js)
for i in range(20):  # upper bound on the nesting depth
    # one row per child, then flatten each child dict into columns
    df = pd.json_normalize(df.explode("children").to_dict(orient="records"))
    # leaf rows leave behind an all-NaN "children" column
    if "children" in df.columns:
        df = df.drop(columns="children")
    df = df.rename(columns={"children.name": f"level{i}", "children.children": "children"})
    # stop once there is nothing left to expand
    if df[f"level{i}"].isna().all() or "children" not in df.columns:
        break
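
As an alternative to expanding one level per loop pass, the same table can be built by walking the tree recursively and emitting one row per leaf. This is only a sketch of that swapped-in technique, not the method used in the answer. The names `iter_paths`, `pad_paths` and `to_frame` are hypothetical helpers. The sample is the excerpt from the question with its closing brackets filled in, plus one made-up placeholder node to show how shorter branches get padded.

```python
def iter_paths(node, prefix=()):
    """Yield one tuple of names per leaf, ordered from the root down to that leaf."""
    path = prefix + (node["name"],)
    children = node.get("children")
    if not children:  # no "children" key (or an empty list): this node is a leaf
        yield path
        return
    for child in children:
        yield from iter_paths(child, path)


def pad_paths(paths):
    """Right-pad every path with None so that all rows have the same width."""
    width = max(len(p) for p in paths)
    return [list(p) + [None] * (width - len(p)) for p in paths]


def to_frame(paths):
    """Build a DataFrame with one column per nesting level (requires pandas)."""
    import pandas as pd  # imported lazily; the walker above needs no third-party code

    rows = pad_paths(paths)
    columns = [f"level{i}" for i in range(len(rows[0]))]
    return pd.DataFrame(rows, columns=columns)


# Excerpt from the question, with the closing brackets filled in (assumption).
# The second top-level child is a made-up placeholder, not real ICD-11 data.
sample = {
    "name": "br08403",
    "children": [
        {
            "name": "01 Certain infectious or parasitic diseases",
            "children": [
                {
                    "name": "Gastroenteritis or colitis of infectious origin",
                    "children": [
                        {
                            "name": "Bacterial intestinal infections",
                            "children": [
                                {
                                    "name": "1A00  Cholera",
                                    "children": [{"name": "H00110  Cholera"}],
                                }
                            ],
                        }
                    ],
                }
            ],
        },
        {"name": "placeholder (not real data)"},
    ],
}

paths = list(iter_paths(sample))  # one path per leaf
rows = pad_paths(paths)           # equal-width rows, ready for a DataFrame
```

Calling `to_frame(paths)` on this sample should give a two-row frame with columns `level0` to `level5`. On the full file, the number of columns would follow the deepest branch, which the question puts at 8 levels.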


2023-07-18