
How do I split strings and add all of the pieces to one long column?


慕标琳琳 2023-10-26 10:50:32
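I have a DataFrame with one column and multiple rows. Each row contains the lyrics to one song, with the lines separated by "\n". What I have so far is:

import json
import pandas as pd

with open('Lyrics_Pavement.json') as json_data:
    data = json.load(json_data)

df = pd.DataFrame(data['songs'])
df1 = df.lyrics.str.split(pat="\n")

df1 then contains a one-column DataFrame in which the lyrics have been split and are surrounded by "[]". This is a sample of row 1:

1    [It's the shouting, it's the shouting, It's the Dutchman, it's the Dutchman shout, Get it away, I don't need your shaft, It's the shouting, it's the shouting, It's the shouting, it's the Dutchman shout, Give it away, I don't need your shaft, (yes I do), It's the shouting, it's the shouting, It's the shouting, it's the Dutchman shout, Get it away, I don't need your shaft]

How do I get the data to display like this:

It's the shouting,
It's the shouting,
It's the dutchman

and so on, where each new line above is one row of the DataFrame? The lyrics from row 2 should then be appended to that same DataFrame in the same way. Thanks!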

3 Answers

GCT1015


Try:

df1 = df.lyrics.str.split(pat="\n").explode()
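To illustrate what this one-liner does, here is a minimal sketch that uses a small made-up DataFrame in place of the Lyrics_Pavement.json data (the two sample songs are assumptions for the demo). `Series.explode` needs pandas 0.25 or later, and the `reset_index(drop=True)` step is optional; it only renumbers the rows.

```python
import pandas as pd

# Hypothetical stand-in for pd.DataFrame(data['songs']); one song per row.
df = pd.DataFrame({
    "lyrics": [
        "It's the shouting\nIt's the Dutchman",
        "Get it away\nI don't need your shaft\n(yes I do)",
    ]
})

# Split each song into a list of lines, then give every line its own row.
lines = df["lyrics"].str.split("\n").explode()

# explode() repeats the original row label for each line (0, 0, 1, 1, 1),
# so renumber if a plain 0..n-1 index is wanted.
df1 = lines.reset_index(drop=True).to_frame(name="lyrics")

print(df1)
```

Keeping the repeated index instead of resetting it can be useful when you still need to know which song each line came from.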


Answered 2023-10-26
森林海


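I take it from your post that the lyrics in df1 are just one long string rather than an actual list? If that is the case, I would simply use the built-in string methods to split that string on commas and then reassemble the pieces into a DataFrame: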


import pandas as pd

# The bracketed lyrics as a single string
s = "[It's the shouting, it's the shouting, It's the Dutchman, it's the Dutchman shout, Get it away, I don't need your shaft, It's the shouting, it's the shouting, It's the shouting, it's the Dutchman shout, Give it away, I don't need your shaft, (yes I do), It's the shouting, it's the shouting, It's the shouting, it's the Dutchman shout, Get it away, I don't need your shaft]"

# Drop the brackets, split on commas, and trim whitespace from each piece
lines = [i.strip() for i in s[1:-1].split(',')]

# One row per lyric line
df = pd.DataFrame(lines)

Output:


                          0
0         It's the shouting
1         it's the shouting
2         It's the Dutchman
3   it's the Dutchman shout
4               Get it away
5   I don't need your shaft
6         It's the shouting
7         it's the shouting
8         It's the shouting
9   it's the Dutchman shout
10             Give it away
11  I don't need your shaft
12               (yes I do)
13        It's the shouting
14        it's the shouting
15        It's the shouting
16  it's the Dutchman shout
17              Get it away
18  I don't need your shaft

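s[1:-1] drops the enclosing brackets

.split(',') splits on commas

.strip() removes the extra whitespace around each piece

If you know there is always exactly a comma plus one space between lyric lines, you could also just do lines = s[1:-1].split(', ').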
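The difference between the two variants shows up when the spacing is irregular. The sketch below uses a short made-up string (its uneven spacing is an assumption for the demo, not taken from the question's data). Note that either variant will also break apart a lyric line that itself contains a comma.

```python
# Illustrative string; the irregular spacing is an assumption for the demo.
s = "[It's the shouting,  it's the shouting,It's the Dutchman]"

# Splitting on ',' and stripping copes with any amount of surrounding whitespace.
robust = [part.strip() for part in s[1:-1].split(',')]

# Splitting on ', ' only works when the separator is exactly comma + one space.
strict = s[1:-1].split(', ')

print(robust)
print(strict)
```

Here `robust` contains three clean lines, while `strict` ends up with only two pieces because the second separator has no space after the comma.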


If your full lyrics string is part of df1, you can use loc (or whatever you prefer) to access that string and then follow this answer.
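As a rough sketch of that last step, assuming df1 has a column named "lyrics" whose cells are bracketed strings (both the column name and the sample text are hypothetical):

```python
import pandas as pd

# Hypothetical frame where each cell is one bracketed string, not a list.
df1 = pd.DataFrame({"lyrics": ["[Get it away, I don't need your shaft, (yes I do)]"]})

# Pull the single string out of row 0, then split it as described above.
s = df1.loc[0, "lyrics"]
lines = [part.strip() for part in s[1:-1].split(",")]

# One row per lyric line.
result = pd.DataFrame(lines)
print(result)
```

If the cells turn out to be real Python lists rather than strings, the slicing step is unnecessary and `explode()` is the more direct route.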


Answered 2023-10-26
繁星淼淼


import pandas as pd

longstring = '''It's the shouting, it's the shouting, It's the Dutchman, it's the Dutchman shout, Get it away, I don't need your shaft, It's the shouting, it's the shouting, It's the shouting, it's the Dutchman shout, Give it away, I don't need your shaft, (yes I do), It's the shouting, it's the shouting, It's the shouting, it's the Dutchman shout, Get it away, I don't need your shaft'''

# Split on commas, trim each piece, and put the trailing comma back
splitstring = [e.strip() + "," for e in longstring.split(",")]

# The last line should not end with a comma
splitstring[-1] = splitstring[-1].replace(",", "")

df1 = pd.DataFrame(splitstring)
print(df1)

#                           0
#0         It's the shouting,
#1         it's the shouting,
#2         It's the Dutchman,
#3   it's the Dutchman shout,
#4               Get it away,
#5   I don't need your shaft,
#6         It's the shouting,
#7         it's the shouting,
#8         It's the shouting,
#9   it's the Dutchman shout,
#10             Give it away,
#11  I don't need your shaft,
#12               (yes I do),
#13        It's the shouting,
#14        it's the shouting,
#15        It's the shouting,
#16  it's the Dutchman shout,
#17              Get it away,
#18   I don't need your shaft


Answered 2023-10-26