
Preserving R data frame index values when converting to a Pandas DataFrame

Python

HUX布斯 2021-10-26 15:58:08

I am fitting a mixed-effects model with the R (base version 3.5.2) package lme4, run through rpy2 2.9.4 from Python 3.6. The random effects print as an indexed data frame whose index values are the values of the categorical variable used to define the groups (using the radon data):

import rpy2.robjects as ro
from rpy2.robjects import pandas2ri, default_converter
from rpy2.robjects.conversion import localconverter
from rpy2.robjects.packages import importr

lme4 = importr('lme4')
mod = lme4.lmer(**kwargs)  # Omitting arguments for brevity
r_ranef = ro.r['ranef']
re = r_ranef(mod)
print(re[1])

                    Uppm   (Intercept)         floor   (Intercept)
AITKIN     -0.0026783361 -2.588735e-03  1.742426e-09 -0.0052003670
ANOKA      -0.0056688495 -6.418760e-03 -4.482764e-09 -0.0128942943
BECKER      0.0021906431  1.190746e-03  1.211201e-09  0.0023920238
BELTRAMI    0.0093246041  8.190172e-03  5.135196e-09  0.0164527872
BENTON      0.0018747838  1.049496e-03  1.746748e-09  0.0021082742
BIG STONE  -0.0073756824 -2.430404e-03  0.000000e+00 -0.0048823057
BLUE EARTH  0.0112939204  4.176931e-03  5.507525e-09  0.0083908075
BROWN       0.0069223055  2.544912e-03  4.911563e-11  0.0051123339

When this is converted to a Pandas DataFrame, the categorical values are lost from the index and replaced with integers:

pandas2ri.ri2py_dataframe(re[1])  # re is a list of data frames

       Uppm  (Intercept)         floor  (Intercept)
0 -0.002678    -0.002589  1.742426e-09    -0.005200
1 -0.005669    -0.006419 -4.482764e-09    -0.012894
2  0.002191     0.001191  1.211201e-09     0.002392
3  0.009325     0.008190  5.135196e-09     0.016453
4  0.001875     0.001049  1.746748e-09     0.002108
5 -0.007376    -0.002430  0.000000e+00    -0.004882
6  0.011294     0.004177  5.507525e-09     0.008391
7  0.006922     0.002545  4.911563e-11     0.005112

How can I preserve the values of the original index? The documentation suggests that as.data.frame may include grp, which might be the value I am after, but I am struggling to call it through rpy2; for example, r_ranef = ro.r['ranef.as.data.frame'] does not work.


2 Answers

qq_遁去的一_1


Consider adding the row.names as a new column in the R data frame, then use that column with set_index on the Pandas DataFrame:

base = importr('base')

# Add the row names as a new column in the R data frame
re[1] = base.transform(re[1], index=base.row_names(re[1]))

# Set that column as the index of the Pandas DataFrame
py_df = (pandas2ri.ri2py_dataframe(re[1])
         .set_index('index')
         .rename_axis(None)
         )
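The Pandas half of this approach can be tried without R. The sketch below is only an illustration: `converted` is a hand-built stand-in for what the conversion would return once the row names have been copied into an `index` column, with values rounded from the question's output.

```python
import pandas as pd

# Hypothetical stand-in for the converted frame: the R row names now
# live in an ordinary column called 'index' (values taken from the question).
converted = pd.DataFrame({
    "Uppm": [-0.002678, -0.005669, 0.002191],
    "index": ["AITKIN", "ANOKA", "BECKER"],
})

# Promote the column to the index, then clear the leftover axis name
# so the result does not print an extra 'index' header row.
py_df = converted.set_index("index").rename_axis(None)

print(py_df)
```

After this step the county names are the row labels again, so a lookup such as `py_df.loc["ANOKA"]` works as it would on the R side.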

To do this for every data frame in the list, loop over it with R's lapply, then use a Python list comprehension to build a new list of indexed Pandas DataFrames.

base = importr('base')

mod = lme4.lmer(**kwargs)  # Omitting arguments for brevity
r_ranef = lme4.ranef(mod)

# R lapply: add the row names as a column to every data frame
new_r_ranef = base.lapply(r_ranef, lambda df:
                          base.transform(df, index=base.row_names(df)))

# Python list comprehension: convert each frame and set its index
py_df_list = [(pandas2ri.ri2py_dataframe(df)
               .set_index('index')
               .rename_axis(None)
               ) for df in new_r_ranef]
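A plain list drops the names of the grouping factors, so one possible variation is to keep the frames in a dict instead. This is a sketch under assumptions, not part of the answer's method: `restore_index` is a hypothetical helper, and `group_a` / `group_b` are made-up names and values standing in for the converted frames.

```python
import pandas as pd

def restore_index(df, column="index"):
    """Move `column` into the index and clear the axis name."""
    return df.set_index(column).rename_axis(None)

# Hypothetical stand-ins for the converted frames, one per grouping factor
converted = {
    "group_a": pd.DataFrame({"(Intercept)": [0.1, -0.2], "index": ["a", "b"]}),
    "group_b": pd.DataFrame({"(Intercept)": [0.3], "index": ["x"]}),
}

# Same idea as the list comprehension, but each frame stays
# addressable by the name of its grouping factor.
py_dfs = {name: restore_index(df) for name, df in converted.items()}
```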

2021-10-26

烙印99


import rpy2.robjects as ro
from rpy2.robjects import pandas2ri, default_converter
from rpy2.robjects.conversion import localconverter

r_dataf = ro.r("""
data.frame(
  Uppm = rnorm(5),
  row.names = letters[1:5]
)
""")

with localconverter(default_converter + pandas2ri.converter) as conv:
    pd_dataf = conv.rpy2py(r_dataf)

# row names are "a".."e"
print(r_dataf)

# row names / indexes are now 0..4
print(pd_dataf)

This may be a small bug or missing feature in rpy2, but the workaround is fairly simple:

with localconverter(default_converter + pandas2ri.converter) as conv:
    pd_dataf = conv.rpy2py(r_dataf)

# Copy the R row names onto the Pandas index
pd_dataf.index = r_dataf.rownames
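The index assignment in this workaround is ordinary Pandas and can be checked on its own. In the sketch below, `pd_dataf` and `row_names` are hand-built stand-ins (with made-up values) for the converted frame and for the row names read from the R object.

```python
import pandas as pd

# Hypothetical stand-ins: the frame as it comes back from the conversion
# (default integer index) and the row names taken from the R data frame.
pd_dataf = pd.DataFrame({"Uppm": [0.5, -1.2, 0.3, 0.9, -0.4]})
row_names = ["a", "b", "c", "d", "e"]

# Assigning to .index needs exactly one label per row;
# pandas raises ValueError if the lengths differ.
pd_dataf.index = list(row_names)

print(pd_dataf)
```

Because the labels are matched to rows purely by position, this relies on the conversion keeping the rows in their original order.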

2021-10-26
