Creating a new column by converting each row to a dictionary in a pandas DataFrame

I created a pandas DataFrame from the following csv:

id  age00   education  marital  gender  ethnic  industry  income00
0   51.965  17         0        1       0       5         76110
1   41.807  12         1        0       0       1         43216
2   36.331  12         1        0       1       3         52118
3   56.758  9          1        1       2       2         47770

My goal is to create a new column named future_income that takes each row and computes the future income using my model. This is done by the predictFinalIncome method of the class I created below:

    class myModel:
        def __init__(self, bias):
            # bias is a dictionary with info to set bias on the gender function and the ethnic function
            self.bias = bias

        def b_gender(self, gender):
            effect = 0
            if (self.bias["gender"]):  # if there is gender bias in this model/world (from the constructor)
                effect = -0.0005 if (gender < 1) else 0.0005  # This amounts to a 1.2% difference annually
            return self.scale * effect

        def b_ethnic(self, ethnic):
            effect = 0
            if (self.bias["ethnic"]):  # if there is ethnic bias in this model/world (from the constructor)
                effect = -0.0007 if (ethnic < 1) else -0.0003 if (ethnic < 2) else 0.0005
            return self.scale * effect

        # other methods/functions

        def predictGrowthFactor(self, person):  # edited
            factor = 1 + person['education'] + person['marital'] + person['income'] + person['industry']
            return factor

        def predictIncome(self, person):
            # predict the new income one MONTH later. (At least on average, each month the income grows.)
            return person['income'] * self.predictGrowthFactor(person)

        def predictFinalIncome(self, n, person):
            n_income = self.predictIncome(person)
            for i in range(n):
                n_income = n_income * i
            return n_income

In this case n is 120. In short: I want to take each row, pass it to the class method predictFinalIncome, and end up with a new column on my df called future_income, which is that person's income 120 months from now.


1 Answer

萧十郎


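I think you are making this more complicated than it needs to be. All of the calculations can be done in a single function, unless you need the intermediate results for something else.

You can create a function that is applied to each row of the DataFrame: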

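    def predictFinalIncome(row, n):
        # row is one DataFrame row (a Series), indexed by column name
        factor = 1 + row['education'] + row['marital'] + row['income'] + row['industry']
        n_income = row['income'] * factor
        # same loop as in the question: i starts at 0
        for i in range(n):
            n_income = n_income * i
        return n_income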

Then use df.apply:

df.apply(lambda r: predictFinalIncome(r, 120), axis=1)

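This returns 0 for every row: for i in range(n) starts at i = 0, so the first multiplication sets n_income to 0 and it stays 0 from then on. You need to fix that.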
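The question does not say what the loop is meant to compute, so the following is only a sketch under one possible reading: the income is multiplied by the same growth factor once per month. The name `monthly_growth`, its value of 1.001, and the two-row DataFrame with an `income` column are assumptions made for illustration, not part of the original model. The result is assigned back to the DataFrame to get the future_income column the question asks for.

```python
import pandas as pd

def predict_final_income(row, n, monthly_growth=1.001):
    # monthly_growth is a hypothetical stand-in for the model's growth factor
    income = row["income"]
    for _ in range(n):
        # the loop index is not used as a multiplier, so nothing is zeroed out
        income = income * monthly_growth
    return income

# Illustrative data only; the column name "income" is assumed
df = pd.DataFrame({"id": [0, 1], "income": [76110, 43216]})

# apply(..., axis=1) returns one value per row, which becomes the new column
df["future_income"] = df.apply(lambda r: predict_final_income(r, 120), axis=1)
```

Each row reaches the function as a Series, which already supports `row["income"]`-style lookups. If the model strictly needs a plain dictionary, `r.to_dict()` can be passed instead of `r`.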

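Update: making the function a method of the model class

From your post I see no obvious reason for this function to live in the model, especially since it does not use any of the other methods or the bias attribute you created, but here it is.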

    class myModel:
        def __init__(self, bias):
            self.bias = bias

        def predictFinalIncome(self, row, n):
            factor = 1 + row['education'] + row['marital'] + row['income'] + row['industry']
            n_income = row['income'] * factor
            # same loop as in the question: i starts at 0
            for i in range(n):
                n_income = n_income * i
            return n_income

    # to use (bias is the dictionary from your post):
    model = myModel(bias)
    df.apply(lambda r: model.predictFinalIncome(r, 120), axis=1)
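As a variation on the lambda above, `DataFrame.apply` can forward extra positional arguments to the function through its `args` parameter, so the bound method can be passed directly. This is a sketch only: the class `MyModel`, its `monthly_growth` attribute, and the one-column DataFrame are hypothetical and chosen so the example runs on its own.

```python
import pandas as pd

class MyModel:
    def __init__(self, bias, monthly_growth=1.001):
        self.bias = bias
        # hypothetical attribute, not part of the class in the question
        self.monthly_growth = monthly_growth

    def predict_final_income(self, row, n):
        # compound the assumed monthly growth over n months
        return row["income"] * self.monthly_growth ** n

# Illustrative data only; the column name "income" is assumed
df = pd.DataFrame({"income": [76110, 43216]})
model = MyModel(bias={"gender": False, "ethnic": False})

# args=(120,) is passed to the method after the row, i.e. as n
df["future_income"] = df.apply(model.predict_final_income, axis=1, args=(120,))
```

Both forms produce the same result; the lambda is easier to read when the row is not the first parameter, as with predictFinalIncome(self, n, person) in the question.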

2022-11-01
