
Counting unique users per category in a nested Python list


千巷猫影 2022-06-22 15:48:40
I have a list of two-element sublists. It looks like this:

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

I want to count the unique users in each category.

Required:

required = [['referral',3],['affiliate',3],['cpc',4],['orgainic',2]]

The output I get:

{'referral': 3, 'affiliate': 2, 'cpc': 4, 'orgainic': 3}

The counts are wrong. Here is the code I tried:

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]
required = [['referral',3],['affiliate',3],['cpc',4],['orgainic',2]]
c = {}
visits = []
for i in a:
    # print(i)
    for j in i[1:]:
        if j not in c and i[0] not in visits:
            c[j] = 1
            visits.append(i[0])
        elif j in c and i[0] not in visits:
            c[j] = c[j]+1
print(c)

Please help me fix this.

4 Answers

牧羊人nacy

Contributed 1862 experience points, received over 7 likes

Here is one approach using collections.defaultdict.


Example:


from collections import defaultdict

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

result = defaultdict(int)  # category -> number of distinct users
seen = set()               # user/category combinations already counted

for k, v in a:             # k is the user, v is the category
    key = "{}_{}".format(k, v)
    if key not in seen:    # count each user only once per category
        result[v] += 1
        seen.add(key)

print(list(map(list, result.items())))

Output:


[['referral', 3], ['affiliate', 3], ['cpc', 4], ['orgainic', 2]]
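As a hedged variation on the code above, the sketch below tracks `(user, category)` tuples instead of joined strings. A joined key such as `"a_b_c"` could come from user `a_b` in category `c` or from user `a` in category `b_c`, while a tuple keeps the two apart. It also differs from the loop in the question, which records bare user names in one shared list, so a user already seen in one category is skipped in every other category. The function name `count_unique_users` and the `sample` data are made up for illustration.

```python
from collections import defaultdict


def count_unique_users(pairs):
    """Count distinct users per category, keeping first-seen category order."""
    counts = defaultdict(int)
    seen = set()  # holds (user, category) tuples, not bare user names
    for user, category in pairs:
        if (user, category) not in seen:
            seen.add((user, category))
            counts[category] += 1
    return [[category, n] for category, n in counts.items()]


# Hypothetical sample: user1 appears twice under 'referral' but counts once
sample = [['user1', 'referral'], ['user1', 'affiliate'],
          ['user1', 'referral'], ['user2', 'referral']]
print(count_unique_users(sample))  # [['referral', 2], ['affiliate', 1]]
```

Wrapping the loop in a function is only a convenience here; the counting logic is the same as in the answer.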


Answered 2022-06-22
白衣非少年

Contributed 1155 experience points, received over 0 likes

First, let's make the entries unique:


c = {tuple(sublist) for sublist in a}

Now we have a set of unique (user, type) pairs.


The count does not need the user, so let's reduce each pair to its second element:


c = [elem[1] for elem in c]

Now we can count it easily:


from collections import Counter

c = Counter(c)

Result: Counter({'cpc': 4, 'affiliate': 3, 'referral': 3, 'orgainic': 2})


Now putting it all together:


from collections import Counter


c = Counter(elem[1] for elem in {tuple(sublist) for sublist in a})
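The question asks for a list of `[category, count]` pairs rather than a `Counter`. The sketch below shows one possible conversion on a small made-up `sample`. Because a set has no guaranteed iteration order, categories with equal counts may come out of the `Counter` in a different order from run to run, so this sketch sorts by category name to get a reproducible result. That ordering is an assumption, not something the question requires.

```python
from collections import Counter

# Hypothetical sample data in the same shape as `a`
sample = [['user1', 'referral'], ['user2', 'referral'],
          ['user1', 'referral'], ['user4', 'cpc'], ['user4', 'cpc']]

# Deduplicate (user, category) pairs, then count the categories
c = Counter(category for _, category in {tuple(pair) for pair in sample})

# Sorting by category name makes the order reproducible
result = [[category, count] for category, count in sorted(c.items())]
print(result)  # [['cpc', 1], ['referral', 2]]
```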


Answered 2022-06-22
繁星coding

Contributed 1797 experience points, received over 4 likes

defaultdict and loop-based solution

This can be done with a defaultdict:


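from collections import defaultdict

d = defaultdict(set)  # category -> set of users seen in that category

for user, category in a:
    d[category].add(user)  # a set ignores repeated users

res = [[category, len(users)] for category, users in d.items()]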

Output (dicts keep insertion order on Python 3.7+):


# [['referral', 3], ['affiliate', 3], ['cpc', 4], ['orgainic', 2]]
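To make the intermediate structure visible, here is a small sketch of the same idea on made-up `sample` data. It uses `dict.setdefault` in place of `defaultdict`, which is just an alternative that needs no import, and prints the set of users collected for each category along with its size.

```python
# Hypothetical sample data in the same shape as `a`
sample = [['user1', 'referral'], ['user2', 'referral'],
          ['user1', 'referral'], ['user4', 'cpc']]

users_by_category = {}
for user, category in sample:
    # setdefault creates the empty set the first time a category appears
    users_by_category.setdefault(category, set()).add(user)

for category, users in users_by_category.items():
    print(category, sorted(users), len(users))
# referral ['user1', 'user2'] 2
# cpc ['user4'] 1
```

Keeping the sets around, rather than only the counts, also makes it possible to list which users belong to a category later.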

groupby-based solution

Alternatively, this can be done using groupby from itertools:


from itertools import groupby
from operator import itemgetter

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ...]

# Sort the items according to the category so groupby will collect the pairs accordingly
res = {category: len({user for user, _ in pairs}) for category, pairs in
       groupby(sorted(a, key=itemgetter(1)), key=itemgetter(1))}

res = [list(pair) for pair in res.items()]

Output:


# [['affiliate', 3], ['cpc', 4], ['orgainic', 2], ['referral', 3]]
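The sorting step matters because `groupby` only merges consecutive items that share a key. The sketch below, on made-up `sample` data, compares the counts with and without sorting. Without it, `referral` shows up as two separate groups and the dict comprehension keeps only the last one.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample: the two 'referral' rows are not adjacent
sample = [['user1', 'referral'], ['user4', 'cpc'], ['user2', 'referral']]
by_category = itemgetter(1)

# Without sorting, the second 'referral' group overwrites the first
unsorted_counts = {category: len({user for user, _ in pairs})
                   for category, pairs in groupby(sample, key=by_category)}

# With sorting, each category forms exactly one group
sorted_counts = {category: len({user for user, _ in pairs})
                 for category, pairs in groupby(sorted(sample, key=by_category),
                                                key=by_category)}

print(unsorted_counts)  # {'referral': 1, 'cpc': 1}
print(sorted_counts)    # {'cpc': 1, 'referral': 2}
```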


Answered 2022-06-22
撒科打诨

Contributed 1934 experience points, received over 2 likes

This sounds like a case for pandas, and your list is already in the right shape:


import pandas as pd

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

df = pd.DataFrame(a)
df.columns = ["user", "type"]

unique_per_type = df.groupby("type")["user"].unique()

Now unique_per_type is:


type

affiliate            [user1, user7, user9]

cpc          [user4, user14, user2, user8]

orgainic                    [user3, user2]

referral             [user1, user2, user4]

Name: user, dtype: object

You can then do the following:


# access length by key
len(unique_per_type["affiliate"])

# or use it like a dict
for key, val in unique_per_type.items():
    print(key, len(val))
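If only the counts are needed, pandas can also compute them directly with `nunique()` on the grouped column. The sketch below is a shortcut under that assumption, shown on made-up `sample` data, and it converts the result into the list-of-lists shape from the question. `groupby` sorts the group keys by default, so categories come out in alphabetical order.

```python
import pandas as pd

# Hypothetical sample data in the same shape as `a`
sample = [['user1', 'referral'], ['user2', 'referral'],
          ['user1', 'referral'], ['user4', 'cpc']]

df = pd.DataFrame(sample, columns=["user", "type"])

# nunique() counts the distinct users within each type
counts = df.groupby("type")["user"].nunique()

# int() turns the NumPy integers into plain Python ints
result = [[category, int(n)] for category, n in counts.items()]
print(result)  # [['cpc', 1], ['referral', 2]]
```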

This solution adds pandas, which is a large dependency. However, once your data is in a DataFrame, you can do a lot more with it:


df["user"].unique() # shows all unique users


df.query("user=='user1'") # shows all observations involving user1


Answered 2022-06-22