How to use a regex to remove a specific portion of text after a regex match

Python

www说 2022-06-14 15:06:31

Suppose I have this list:

names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
         "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]

And suppose I have this text:

text = "What is your name? Well, uh it's John Smith. Thanks for asking. Anyway, I'd doing well."

How can I use a regex to find each element of names in the text and immediately replace the block of text that follows the element (say, 50 characters long) with "[name]"? The output I want is:

text = "What is your name [name] Anyway, I'd doing well."

So far I have the code below, but it only replaces the element itself with "[name]", not the text that follows the element.

def my_replace3(match):
    match = match.group()
    return " [name] "

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
             "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    regex = re.compile(r'\b(' + '|'.join(names) + r')\b', re.IGNORECASE)
    text = re.sub(regex, my_replace3, text)
    return text

I'm not much of a regex expert, so any help would be greatly appreciated.


1 Answer

三国纷争

Contributed 1804 experience points, received over 7 likes

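If you want to replace the 50 characters after the match, add .{50} to the regex. Note that .{50} matches exactly 50 characters and, by default, does not match newlines, so a name followed by fewer than 50 characters on the same line is left untouched; .{0,50} matches up to 50 instead.

Then use a backreference in the replacement string to copy the matched name into the replacement.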

import re

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
             "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    # Match any listed phrase (escaped, case-insensitive) plus the 50 characters after it
    regex = re.compile(r'\b(' + '|'.join(map(re.escape, names)) + r')\b.{50}', re.IGNORECASE)
    # \1 keeps the matched phrase; the 50 trailing characters become " [name]"
    text = re.sub(regex, r'\1 [name]', text)
    return text
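To see how the fixed-width quantifier behaves, here is a minimal sketch that compares .{50} with .{0,50}. The helper mask_after and its width and exact parameters are hypothetical names introduced only for this illustration; they are not part of the answer above.

```python
import re

# Hypothetical helper (not from the answer): `width` and `exact` are
# illustrative parameters for comparing .{50} with .{0,50}.
def mask_after(text, names, width=50, exact=False):
    # exact=True builds .{50}; exact=False builds .{0,50}
    quantifier = '{%d}' % width if exact else '{0,%d}' % width
    pattern = re.compile(
        r'\b(' + '|'.join(map(re.escape, names)) + r')\b.' + quantifier,
        re.IGNORECASE,
    )
    # \1 keeps the matched name; the text consumed after it becomes " [name]"
    return pattern.sub(r'\1 [name]', text)

short = "his name is Bob"
print(mask_after(short, ["his name"], exact=True))  # fewer than 50 chars follow
print(mask_after(short, ["his name"]))              # up to 50 chars follow
```

With exact=True the short string comes back unchanged, because fewer than 50 characters follow the name. A fixed character count can also cut through the middle of a word, so the right width depends on the data.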

You should also use re.escape() whenever you insert strings that are meant to match literally into a regex, in case any of them contains regex operators.
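As a small illustration of why escaping matters, the sketch below uses a made-up phrase, "first.name", that contains the regex operator "."; the phrase and the variable names loose and strict are assumptions for the example only.

```python
import re

# Hypothetical phrase containing a regex operator (the dot).
phrases = ["first.name"]

# Without escaping, "." matches any character.
loose = re.compile(r'\b(' + '|'.join(phrases) + r')\b', re.IGNORECASE)
# With re.escape, "." matches only a literal dot.
strict = re.compile(r'\b(' + '|'.join(map(re.escape, phrases)) + r')\b', re.IGNORECASE)

print(bool(loose.search("firstXname")))   # True: "." matched "X"
print(bool(strict.search("firstXname")))  # False: a literal dot is required
print(bool(strict.search("first.name")))  # True
```

None of the names in the question's list contain operators, so escaping changes nothing there; it only guards against phrases added later.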

2022-06-14
