Calling combined = future.result() blocks until the result is ready, so you never submit the next request to the pool until the previous one has finished. In other words, you never have more than one subprocess running at a time. At a minimum, you should change the code to:
with ProcessPoolExecutor(max_workers=10) as pool:
    the_futures = []
    for samples in tqdm(sample_list):
        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
        the_futures.append(future)  # save it
    results = [f.result() for f in the_futures]  # all the results
Alternatively:
with ProcessPoolExecutor(max_workers=10) as pool:
    the_futures = []
    for samples in tqdm(sample_list):
        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
        the_futures.append(future)  # save it
    # you need: from concurrent.futures import as_completed
    for future in as_completed(the_futures):  # not necessarily the order of submission
        result = future.result()  # do something with this
Also, if you do not pass max_workers to the ProcessPoolExecutor constructor, it defaults to the number of processors on your machine. Specifying a value larger than the number of processors you actually have gains you nothing.
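You can check the default pool size yourself (a minimal sketch; it peeks at the private _max_workers attribute purely for illustration, which may differ across Python versions):

```python
import os
from concurrent.futures import ProcessPoolExecutor

# With max_workers omitted, the pool is sized from the machine's
# processor count (os.cpu_count() on current CPython versions).
with ProcessPoolExecutor() as pool:
    print(pool._max_workers)  # pool size actually chosen
print(os.cpu_count())
```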
Update
If you want to process each result as soon as it completes and need a way to tie a result back to the original request, you can store the futures as keys in a dictionary whose values are the arguments of the corresponding request. In that case:
with ProcessPoolExecutor(max_workers=10) as pool:
    the_futures = {}
    for samples in tqdm(sample_list):
        future = pool.submit(reduce, add, [item2item[s] for s in samples], Counter())
        the_futures[future] = samples  # map future to request
    # you need: from concurrent.futures import as_completed
    for future in as_completed(the_futures):  # not necessarily the order of submission
        samples = the_futures[future]  # the request
        result = future.result()  # the result
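Here is a self-contained toy version of that futures-as-dict pattern you can actually run (square is a hypothetical stand-in for the real reduce(add, ...) work):

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def square(n):
    # hypothetical stand-in for the real work submitted to the pool
    return n * n

def run():
    with ProcessPoolExecutor(max_workers=4) as pool:
        # map each future to the argument it was submitted with
        the_futures = {pool.submit(square, n): n for n in range(8)}
        results = {}
        for future in as_completed(the_futures):  # completion order, not submission order
            n = the_futures[future]        # the original request
            results[n] = future.result()   # its result
    return results

if __name__ == '__main__':
    print(run())
```

The main-module guard matters here: on platforms that start worker processes with spawn, each worker re-imports the main module, and unguarded pool creation at import time would fail.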