
How can I find the lists containing the higher value in a nested list and return them?


jeck猫 2023-03-01 16:54:52
I have this nested list, which contains duplicate entries:

[['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

I want to filter the nested list by i[3] (the review count), keeping only the entry with the highest value for each app, so the final output would look like this:

[['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

I tried a for loop, but I can't figure out how to get the highest value among the duplicate lists.

3 answers

ITMISS

Contributed 1871 experience points, received over 8 likes

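This is the most Pythonic way I can think of. I first sort the list of lists by sublist[3] in descending order, so that when we iterate over it, the sublist with the highest review count for each app comes before its duplicates. That property is then used to build the final list.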


meta_list = [['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]


# Sort by review count, then app name, in descending order so the highest review count comes first
meta_list.sort(key=lambda x: (int(x[3]), x[0]), reverse=True)

# This is the list we'll use to store the final data in
final_list = []

# Go through all the items in the meta_list
for meta in meta_list:
    # If no entry with the same name (index 0) is already in final_list, add it
    if meta[0] not in [item[0] for item in final_list]:
        final_list.append(meta)

Output:

[['Instagram',
  'SOCIAL',
  '4.5',
  66577446,
  'Varies with device',
  '1,000,000,000+',
  'Free',
  '0',
  'Teen',
  'Social',
  'July 31, 2018',
  'Varies with device',
  'Varies with device'],
 ['Gmail',
  'COMMUNICATION',
  '4.3',
  4604483,
  'Varies with device',
  '1,000,000,000+',
  'Free',
  '0',
  'Everyone',
  'Communication',
  'August 2, 2018',
  'Varies with device',
  'Varies with device'],
 ['Coloring book moana',
  'FAMILY',
  '3.9',
  974,
  '14M',
  '500,000+',
  'Free',
  '0',
  'Everyone',
  'Art & Design;Pretend Play',
  'January 15, 2018',
  '2.0.0',
  '4.0.3 and up']]

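In short, it appends to final_list every entry whose name is not already there. Why does this work? Because the list is sorted in descending order of review count, the first entry encountered for each name is the one with the highest count. Once that one has been added, its duplicates can no longer be added, and we're done.

Note: this does not preserve the original order of the entries. It only guarantees that, among duplicates with the same name, only the entry with the highest review count is kept.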
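If the original order matters, one possible variation is to remember each row's position before sorting and restore it afterwards. The sketch below is only an illustration of that idea, not part of the answer above: the function name keep_highest_in_order, its parameters, and the shortened sample rows are all made up for the example. It also tracks seen names in a set instead of rebuilding a list of names on every iteration.

```python
def keep_highest_in_order(rows, name_index=0, count_index=3):
    """Keep the row with the highest count per name, in original input order."""
    # Pair each row with its original position so the order can be restored later.
    indexed = list(enumerate(rows))
    # Highest count first; list.sort is stable, so ties keep their input order.
    indexed.sort(key=lambda pair: int(pair[1][count_index]), reverse=True)

    seen = set()  # names already kept
    kept = []
    for position, row in indexed:
        if row[name_index] not in seen:
            seen.add(row[name_index])
            kept.append((position, row))

    # Put the surviving rows back in the order they had in the input.
    kept.sort(key=lambda pair: pair[0])
    return [row for _, row in kept]


# Shortened, hypothetical rows: name, category, rating, review count.
sample = [
    ["Gmail", "COMMUNICATION", "4.3", 4604324],
    ["Instagram", "SOCIAL", "4.5", 66577313],
    ["Gmail", "COMMUNICATION", "4.3", 4604483],
    ["Instagram", "SOCIAL", "4.5", 66509917],
]
result = keep_highest_in_order(sample)
print(result)
```

Here "original order" means the position of each kept row in the input, so the Instagram row (position 1) comes before the surviving Gmail row (position 2).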


Answered 2023-03-01
MMTTMM

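Contributed 1869 experience points, received over 4 likes

There may be more elegant or Pythonic solutions to this problem, but here is one possible approach: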


my_list = [...] # Nested list here

def compare_duplicates(nested_list, name_index=0, compare_index=3):
    max_values = {}     # Highest value seen so far for each name
    final_indexes = {}  # Index of the row holding that value (two dicts for readability)

    for i, item in enumerate(nested_list):
        name, value = item[name_index], item[compare_index]

        # Checking membership rather than defaulting to 0 also keeps rows whose value is 0 or negative
        if name not in max_values or value > max_values[name]:
            max_values[name] = value
            final_indexes[name] = i

    return [nested_list[i] for i in final_indexes.values()]

print(compare_duplicates(my_list))
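As one candidate for a more compact version, the same grouping can be sketched with itertools.groupby and max from the standard library. This is an alternative technique rather than the answer's own method; the function name highest_per_name and the shortened sample rows are assumptions made for the example. Note that groupby only merges adjacent rows, so the input has to be sorted by name first, which means the result comes back ordered by name.

```python
from itertools import groupby
from operator import itemgetter


def highest_per_name(rows, name_index=0, count_index=3):
    """Return, for each name, the row whose count is largest."""
    by_name = itemgetter(name_index)
    # groupby only groups consecutive rows, so sort by name first.
    ordered = sorted(rows, key=by_name)
    return [
        max(group, key=itemgetter(count_index))
        for _, group in groupby(ordered, key=by_name)
    ]


# Shortened, hypothetical rows: name, category, rating, review count.
sample = [
    ["Instagram", "SOCIAL", "4.5", 66577313],
    ["Gmail", "COMMUNICATION", "4.3", 4604324],
    ["Instagram", "SOCIAL", "4.5", 66577446],
    ["Gmail", "COMMUNICATION", "4.3", 4604483],
]
result = highest_per_name(sample)
print(result)
```

The sort makes this O(n log n), whereas the dictionary version above is a single O(n) pass, so the dictionary approach is probably the better default for large inputs.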


Answered 2023-03-01
忽然笑

Contributed 1806 experience points, received over 5 likes

Something like this:


_DATA = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
]



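def print_highest(data):
    # Maps every field except the review count (index 3) to the best row seen so far,
    # so rows that differ in any other field are treated as different apps
    list_map = {}
    for d in data:
        key = tuple(d[0:3] + d[4:])
        if key not in list_map:
            list_map[key] = d
            continue

        # Same app seen before: keep whichever row has more reviews
        if d[3] > list_map[key][3]:
            list_map[key] = d

    for row in list_map.values():
        print(row)


print_highest(_DATA)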

Output:

['Coloring book moana', 'ART_AND_DESIGN', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']
['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
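One thing worth noticing: this answer builds its key from every field except the review count, and its _DATA lists both 'Coloring book moana' rows as 'ART_AND_DESIGN', whereas the question's data has the second row as 'FAMILY'. With the question's data, the two rows would get different keys and both would be kept. The sketch below illustrates the difference between the two key choices; highest_by_key and the shortened rows are hypothetical names and data introduced only for this example.

```python
def highest_by_key(rows, key_func, count_index=3):
    """Keep the row with the largest count for each key produced by key_func."""
    best = {}
    for row in rows:
        key = key_func(row)
        if key not in best or row[count_index] > best[key][count_index]:
            best[key] = row
    return list(best.values())


# Shortened rows: name, category, rating, review count.
rows = [
    ["Coloring book moana", "ART_AND_DESIGN", "3.9", 967],
    ["Coloring book moana", "FAMILY", "3.9", 974],
    ["Gmail", "COMMUNICATION", "4.3", 4604324],
    ["Gmail", "COMMUNICATION", "4.3", 4604483],
]

# Key on every field except the count: the two categories count as different apps.
by_all_fields = highest_by_key(rows, lambda r: tuple(r[:3] + r[4:]))
# Key on the name only: one row per app name, as in the question's expected output.
by_name = highest_by_key(rows, lambda r: r[0])

print(len(by_all_fields))  # 3
print(len(by_name))        # 2
```

Which key is right depends on what counts as a duplicate. The question's expected output keeps a single 'Coloring book moana' row, which suggests keying on the name alone.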



Answered 2023-03-01