Suppose I have this list:

    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name", "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]

And suppose I have this text:

    text = "What is your name? Well, uh it's John Smith. Thanks for asking. Anyway, I'd doing well."

How can I use a regular expression to find each element of the names list in the text and replace the block of text immediately after the element (say, 50 characters long) with "[name]"? My output would then be:

    text = "What is your name [name] Anyway, I'd doing well."

So far I have the code below, but it only replaces the element itself with "[name]", not the text that follows the element.

    def my_replace3(match):
        match = match.group()
        return " [name] "

    def no_name(text):
        names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name", "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
        regex = re.compile(r'\b(' + '|'.join(names) + r')\b', re.IGNORECASE)
        text = re.sub(regex, my_replace3, text)
        return text

I'm not much of a regex expert, so your help would be greatly appreciated.
1 Answer
三国纷争
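If you want to replace the 50 characters after the match, add .{50} to the regex.
Then use a backreference (\1) in the replacement string to copy the matched phrase into the replacement.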
import re

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
             "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    # Escape each phrase, then require exactly 50 characters after the matched phrase.
    regex = re.compile(r'\b(' + '|'.join(map(re.escape, names)) + r')\b.{50}', re.IGNORECASE)
    # \1 keeps the matched phrase; the 50 characters after it are replaced by " [name]".
    text = regex.sub(r'\1 [name]', text)
    return text
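One caveat with .{50}: it only matches when at least 50 characters follow the phrase, and by default . does not match a newline, so a phrase near the end of the text or just before a line break is left untouched. Below is a minimal sketch of a more forgiving variant, assuming you would rather replace up to 50 characters. The function name no_name_upto, the width parameter, and the shortened NAMES list are hypothetical, introduced only for illustration.

```python
import re

# Shortened list, for illustration only.
NAMES = ['your name', 'the name', "mother's name"]

def no_name_upto(text, width=50):
    # Hypothetical variant: replace UP TO `width` characters after the phrase.
    # re.DOTALL lets "." also consume newlines (an assumption about what you want).
    pattern = re.compile(
        r'\b(' + '|'.join(map(re.escape, NAMES)) + r')\b.{0,' + str(width) + r'}',
        re.IGNORECASE | re.DOTALL,
    )
    return pattern.sub(r'\1 [name]', text)

# Fewer than 50 characters follow the phrase, but the match still succeeds.
print(no_name_upto("What is your name? John."))
```

Note that a fixed character count cuts wherever the count happens to end: on the sample text from the question, 50 characters stop in the middle of "Anyway", so the result is "What is your name [name]way, I'd doing well." rather than the output shown in the question.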
You should also use re.escape() whenever you insert strings that are meant to be matched literally into a regex, in case any of them contains regex operators.
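To see why escaping matters, here is a small sketch using a made-up phrase, "mr. name", which is not in the original list. Unescaped, the dot acts as a wildcard and the pattern matches more than intended.

```python
import re

phrase = "mr. name"  # hypothetical phrase containing a regex operator (the dot)

# Without escaping, "." matches any character; with re.escape() it is a literal dot.
unescaped = re.compile(r'\b(' + phrase + r')\b', re.IGNORECASE)
escaped = re.compile(r'\b(' + re.escape(phrase) + r')\b', re.IGNORECASE)

sample = "mrs name"
print(bool(unescaped.search(sample)))  # True: "." matched the "s"
print(bool(escaped.search(sample)))    # False: a literal "." is required
```

None of the phrases in the question's list contains a regex operator (the apostrophe is not one), so escaping changes nothing for them today. It guards against phrases added later.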
