
Need to reformat text files: move each speaker's text up one line, onto the speaker label's line

凤凰求蛊 2023-02-22 16:46:15
I have many .txt files containing text that needs to be reformatted. Specifically, I have Speaker A and Speaker B, and each speaker's text sits on the line after the label.

A:
I can not believe the weather today .
B:
It is beautiful outside .
A:
Really nice .
B:
Okay , how are you doing ?
A:
I am good .
B:
Good to hear .
A:
Thank you .

There can be more speakers, but every speaker's label is followed by a colon (:). I would like the file output to be:

A: I can not believe the weather today .
B: It is beautiful outside .
A: Really nice .
B: Okay , how are you doing ?
A: I am good .
B: Good to hear .
A: Thank you .

Thanks.

Edit: Also, is there a solution if there are multiple lines of text between speaker labels? For example:

A:
Well hello .
Long time no see .
How are you doing ?
B:
Good .
How are you ?
A:
Really great .
B:
Good .

With the expected result:

A: Well hello . Long time no see . How are you doing ?
B: Good . How are you ?
A: Really great .
B: Good .

2 Answers

GCT1015

Contributed 1827 experience points, received more than 4 likes

A regular-expression substitution can handle this:


import re

text = """A:
I can not believe the weather today .
B:
It is beautiful outside ."""

# Replace each label and the whitespace after it (including the line break) with "label ".
text = re.sub(r"^(\w+:)\s*", r"\1 ", text, flags=re.MULTILINE)

print(text)
# A: I can not believe the weather today .
# B: It is beautiful outside .

Edit:

Based on the updated question, for multi-line dialogue:


import re

text = """A:
Well hello .
Long time no see .
How are you doing ?
B:
Good .
How are you ?"""

# Fold a line break into a space unless the next line starts with a speaker label.
text = re.sub(r"(.*?)\s*\n(?!\w+:)", r"\1 ", text, flags=re.MULTILINE)

print(text)
# A: Well hello . Long time no see . How are you doing ?
# B: Good . How are you ?
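
Since the question mentions many .txt files, here is a minimal sketch of how the same substitution might be applied to a whole folder. The function names `join_speaker_lines` and `reformat_folder`, and the idea of writing results to a separate output folder, are assumptions for illustration and are not part of the original answer. It also assumes the files are UTF-8 and that no ordinary text line begins with a word followed by a colon.

```python
import re
from pathlib import Path

# Same pattern as in the answer: a line break is folded into a space
# unless the next line starts with a speaker label such as "A:".
PATTERN = re.compile(r"(.*?)\s*\n(?!\w+:)")


def join_speaker_lines(text: str) -> str:
    """Put each speaker's text on the same line as the speaker label."""
    # Strip first so a trailing newline is not turned into a trailing space.
    return PATTERN.sub(r"\1 ", text.strip())


def reformat_folder(src_dir: str, dst_dir: str) -> int:
    """Reformat every .txt file in src_dir, write results to dst_dir, return the file count."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        (out / path.name).write_text(join_speaker_lines(text) + "\n", encoding="utf-8")
        count += 1
    return count
```

Calling something like `reformat_folder("transcripts", "transcripts_fixed")` (both folder names are placeholders) leaves the original files untouched, which makes it easy to compare before and after.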


Answered 2023-02-22
繁花不似锦

Contributed 1851 experience points, received more than 4 likes

If each speaker's text is on a single line, this should work:


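# "input.txt" is a placeholder file name.
with open("input.txt", encoding="utf-8") as file:
    lines = file.readlines()
# Pair each label line with the text line that follows it.
for ii in range(1, len(lines), 2):
    print(lines[ii - 1].rstrip() + " " + lines[ii].rstrip())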
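
The loop above relies on labels and text strictly alternating, one line each. A rough sketch that drops that assumption and also covers the multi-line case from the question's edit could look like the following. The helper name `merge_turns` is hypothetical, and it assumes a speaker label is a line consisting only of word characters followed by a colon.

```python
import re

# Assumption: a label line looks like "A:" or "Speaker1:" and holds nothing else.
LABEL = re.compile(r"\w+:")


def merge_turns(lines):
    """Return one 'Label: text' string per speaker turn."""
    turns = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if LABEL.fullmatch(line):
            turns.append([line])    # a label starts a new turn
        elif turns:
            turns[-1].append(line)  # text continues the current turn
        else:
            turns.append([line])    # text before any label is kept as its own entry
    return [" ".join(parts) for parts in turns]
```

Passing the result of `file.readlines()` to `merge_turns` and printing each returned string would give the layout requested in the question.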


Answered 2023-02-22