Python - Searching for elements in a list within a DataFrame row

I am trying to capture elements from lists stored in a pandas DataFrame column. The code below keeps the whole list whenever the string is present. How can I keep, row by row, only the elements that contain a specific string and ignore the rest? Here is what I have tried:

import numpy as np
import pandas as pd

l1 = [1,2,3,4,5,6]
l2 = ['hello world \n my world','world is a great place \n we live in it','planet earth',np.NaN,'\n save the water','']
df = pd.DataFrame(list(zip(l1,l2)), columns=['id','sentence'])
df['sentence_split'] = df['sentence'].str.split('\n')
print(df)

The result of this code:

df[df.sentence_split.str.join(' ').str.contains('world', na=False)] # does the trick but still not exactly what I am looking for.

id  sentence                                 sentence_split
1   hello world \n my world                  [hello world , my world]
2   world is a great place \n we live in it  [world is a great place , we live in it]

But I am looking for:

id  sentence                                 sentence_split
1   hello world \n my world                  hello world; my world
2   world is a great place \n we live in it  world is a great place


1 Answer

UYOU


You want to search for a string inside a Series of lists. One way to do it is:

# Drop NaN rows

df = df.dropna(subset=["sentence_split"])
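To see why this step matters: `str.split` leaves a missing sentence as NaN rather than turning it into a list, and NaN is a float, so it cannot be iterated over in the next step. The following is a minimal sketch using a shortened, made-up version of the question's data; the names `demo` and `cleaned` are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative data: one normal row, one missing row, one more normal row.
demo = pd.DataFrame({
    "id": [1, 2, 3],
    "sentence": ["hello world \n my world", np.nan, "planet earth"],
})
demo["sentence_split"] = demo["sentence"].str.split("\n")

# The NaN sentence stays NaN after the split; it does not become a list.
print(demo["sentence_split"].isna().tolist())

# dropna removes that row and keeps the original index labels (0 and 2).
cleaned = demo.dropna(subset=["sentence_split"])
print(cleaned.index.tolist())
```

The gap in the index labels is expected; `reset_index(drop=True)` can be called afterwards if consecutive labels are needed.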

Then apply a function that keeps only the list elements you are looking for:

# Apply this lambda function

df["sentence_split"] = df["sentence_split"].apply(lambda x: [i for i in x if "world" in i])

   id  sentence                                 sentence_split
0   1  hello world \n my world                  [hello world , my world]
1   2  world is a great place \n we live in it  [world is a great place ]
2   3  planet earth                             []
4   5  \n save the water                        []
5   6                                           []
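The output the question asks for also drops rows without any match and joins the surviving pieces with "; ". The answer stops at the filtered lists, so what follows is only a sketch of one possible way to finish, assuming that surrounding whitespace should be stripped and that "; " is the wanted separator, as the question's example suggests. The names `needle`, `matches`, `has_match` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

l1 = [1, 2, 3, 4, 5, 6]
l2 = ['hello world \n my world', 'world is a great place \n we live in it',
      'planet earth', np.nan, '\n save the water', '']
df = pd.DataFrame({'id': l1, 'sentence': l2})

needle = 'world'  # the string to look for (assumed from the question)

# Split on newlines, then drop rows where the split produced NaN.
df['sentence_split'] = df['sentence'].str.split('\n')
df = df.dropna(subset=['sentence_split'])

# Keep only the pieces containing the search string, stripped of padding.
matches = df['sentence_split'].apply(
    lambda parts: [p.strip() for p in parts if needle in p]
)

# Keep rows with at least one match and join their pieces into one string.
has_match = matches.str.len() > 0
result = df[has_match].copy()
result['sentence_split'] = matches[has_match].str.join('; ')
print(result)
```

With the question's data, only the rows with id 1 and 2 should remain, with `sentence_split` holding a plain string instead of a list.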

2022-01-05
