How to convert a nested dictionary into a DataFrame

2022-03-02 00:00:00 python pandas dictionary json data-science

Question

Suppose I have an API response like this:

{
    "fact": {
        "UP": [{
            "SCODE": "CNB",
            "SNAME": "Kanpur Central"
        }, {
            "SCODE": "JHS",
            "SNAME": "Jhansi Junction"
        }],
        "MP": [{
            "SCODE": "BPL",
            "SNAME": "Bhopal Junction"
        }, {
            "SCODE": "JBP",
            "SNAME": "Jabalpur Junction"
        }]
    }
}

I need to convert it into a DataFrame like the following (expected output):

fact    SCODE   SNAME
UP      CNB     Kanpur Central
UP      JHS     Jhansi Junction
MP      BPL     Bhopal Junction
MP      JBP     Jabalpur Junction

My attempt: I tried pd.json_normalize(), but it did not produce the expected output:

pd.json_normalize(response).apply(pd.Series.explode)

Solution

One option is to reshape the data in pure Python before building the DataFrame:

import pandas as pd

# One flat record per station, tagged with its outer key ('UP', 'MP')
df = pd.DataFrame([{'fact': k, **item}
                   for k, lst in response['fact'].items()
                   for item in lst])

Output:
  fact SCODE              SNAME
0   UP   CNB     Kanpur Central
1   UP   JHS    Jhansi Junction
2   MP   BPL    Bhopal Junction
3   MP   JBP  Jabalpur Junction
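
To make the comprehension above concrete, here is a self-contained sketch that builds the intermediate list of flat records as a separate step. The `response` literal is copied from the question; the name `records` is introduced here purely for illustration.

```python
import pandas as pd

# Sample API response, copied from the question
response = {
    "fact": {
        "UP": [{"SCODE": "CNB", "SNAME": "Kanpur Central"},
               {"SCODE": "JHS", "SNAME": "Jhansi Junction"}],
        "MP": [{"SCODE": "BPL", "SNAME": "Bhopal Junction"},
               {"SCODE": "JBP", "SNAME": "Jabalpur Junction"}],
    }
}

# Step 1: flatten to a list of plain dicts, one per station.
# {'fact': k, **item} copies the station's fields and adds the outer key.
records = [{'fact': k, **item}
           for k, lst in response['fact'].items()
           for item in lst]

# Step 2: a list of flat dicts maps directly onto rows and columns.
df = pd.DataFrame(records)
print(df)
```

Because `'fact'` is the first key in each record, it ends up as the first column, and the rows follow the insertion order of the dictionary (UP before MP).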

A pandas option, using explode + apply(pd.Series):

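df = (
    pd.DataFrame(response)['fact']   # Series indexed by UP/MP; each value is a list of dicts
        .explode()                   # one row per dict, index label repeated
        .apply(pd.Series)            # expand each dict into SCODE/SNAME columns
        .rename_axis('fact')         # name the index so it becomes the 'fact' column
        .reset_index()
)

Output: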
  fact SCODE              SNAME
0   MP   BPL    Bhopal Junction
1   MP   JBP  Jabalpur Junction
2   UP   CNB     Kanpur Central
3   UP   JHS    Jhansi Junction
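
Since the question started from `pd.json_normalize()`, here is a hedged sketch of how that function could be made to work. The original call most likely returns a single row with list-valued columns `fact.UP` and `fact.MP`, because the state codes are dictionary keys rather than values. Wrapping each list under a named key lets `record_path` and `meta` do the flattening. The key name `stations` and the variable `wrapped` are hypothetical, chosen only for this example.

```python
import pandas as pd

# Sample API response, copied from the question
response = {
    "fact": {
        "UP": [{"SCODE": "CNB", "SNAME": "Kanpur Central"},
               {"SCODE": "JHS", "SNAME": "Jhansi Junction"}],
        "MP": [{"SCODE": "BPL", "SNAME": "Bhopal Junction"},
               {"SCODE": "JBP", "SNAME": "Jabalpur Junction"}],
    }
}

# Turn each outer key into a value so json_normalize can carry it along.
# 'stations' is a hypothetical key name, not part of the API response.
wrapped = [{'fact': k, 'stations': lst}
           for k, lst in response['fact'].items()]

# record_path selects the list to expand into rows;
# meta names the field to repeat on every expanded row.
df = pd.json_normalize(wrapped, record_path='stations', meta='fact')

# Meta columns are appended last, so reorder to match the expected output.
df = df[['fact', 'SCODE', 'SNAME']]
print(df)
```

This keeps the rows in the order of the original dictionary, at the cost of one extra reshaping step before the call.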
