The DataFrame looks like this:

col_a
Python PY is a general purpose PY language
Programming PY language in Python PY
Its easier to understand PY
The syntax of the language is clean PY

I am trying to implement this, but I cannot get the expected output. Any help is appreciated. This is the code I am using, which relies on a regular expression:

    df['col_a'].str.extract(r"([a-zA-Z'-]+\s+PY)\b")

Expected output:

col_a                                       col_b_PY
Python PY is a general purpose language     Python PY purpose PY
Programming PY language in Python PY        Python PY Programming PY
Its easier to understand PY                 understand PY
The syntax of the language is clean PY      clean PY
2 Answers
扬帆大鱼
Contributed 1799 experience points, received more than 9 likes
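import re

def app(row):
    # Join every "<word> PY" match found in this row's col_a into one string.
    return ' '.join(re.findall(r'\w+\s+PY', row.col_a))

# Apply the function row by row (axis=1) and store the result in a new column.
df['col_b_PY'] = df.apply(app, axis=1)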
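You need to join all the matches for each row inside the applied function. You could also do this with extractall, but I find this approach simpler and more direct.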
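For comparison, the extractall route mentioned above could look roughly like the sketch below. It is a minimal sketch under stated assumptions, not the answerer's code: the sample rows are copied from the question, and the names pattern, matches and alt are introduced here only for illustration. It also shows why the original attempt falls short: str.extract returns only the first match in each row, while str.extractall and str.findall return every match.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "col_a": [
        "Python PY is a general purpose PY language",
        "Programming PY language in Python PY",
        "Its easier to understand PY",
        "The syntax of the language is clean PY",
    ]
})

# One capture group: a word, whitespace, then the literal tag PY.
pattern = r"(\w+\s+PY)\b"

# str.extractall returns one row per match, indexed by
# (original row label, match number); the capture group is column 0.
matches = df["col_a"].str.extractall(pattern)

# Group by the original row label (index level 0) and join the matches.
# Rows with no match at all would end up as NaN after index alignment.
df["col_b_PY"] = matches[0].groupby(level=0).agg(" ".join)

# Alternative without apply: str.findall yields a list per row,
# and str.join concatenates each list with spaces.
alt = df["col_a"].str.findall(r"\w+\s+PY\b").str.join(" ")

print(df["col_b_PY"].tolist())
```

Both variants join the matches in the order they appear in the text, so the second row comes out as "Programming PY Python PY" rather than the "Python PY Programming PY" shown in the question's expected output.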
