python - Counting total number of tasks executed in a multiprocessing.Pool during execution -




I'd like to give an indication of overall progress. I'm farming work out and would like to know how far along it is. If I send 100 jobs to 10 processors, how can I show the current number of jobs that have returned? I can get the process IDs, but how do I count the number of completed jobs returned from my map function?

I'm calling the function as follows:

op_list = pool.map(ppmdr_star, list(varg))

and inside the function I can print the current process name:

    current = multiprocessing.current_process()
    print('running:', current.name, current._identity)

If you use pool.map_async, you can pull this information out of the MapResult instance that gets returned. For example:

    import multiprocessing
    import time

    def worker(i):
        time.sleep(i)
        return i

    if __name__ == "__main__":
        pool = multiprocessing.Pool()
        result = pool.map_async(worker, range(15))
        # Poll once per second until every task has finished.
        while not result.ready():
            print("num left: {}".format(result._number_left))
            time.sleep(1)
        real_result = result.get()
        pool.close()
        pool.join()

Output:

    num left: 15
    num left: 14
    num left: 13
    num left: 12
    num left: 11
    num left: 10
    num left: 9
    num left: 9
    num left: 8
    num left: 8
    num left: 7
    num left: 7
    num left: 6
    num left: 6
    num left: 6
    num left: 5
    num left: 5
    num left: 5
    num left: 4
    num left: 4
    num left: 4
    num left: 3
    num left: 3
    num left: 3
    num left: 2
    num left: 2
    num left: 2
    num left: 2
    num left: 1
    num left: 1
    num left: 1
    num left: 1

multiprocessing internally breaks the iterable you pass to map into chunks, and passes each chunk to the child processes. So the _number_left attribute keeps track of the number of chunks remaining, not the individual elements in the iterable. Keep that in mind if you see odd-looking numbers when you use large iterables. Chunking is used to improve IPC performance, but if an accurate tally of completed results matters more to you than the added performance, you can pass the chunksize=1 keyword argument to map_async to make _number_left more accurate. (The chunksize usually only makes a noticeable performance difference for very large iterables. Try it yourself to see whether it matters for your use case.)
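To make the chunk arithmetic concrete, here is a rough sketch of how many chunks map_async would start with. The helper name expected_chunks is hypothetical, and the default heuristic (about four chunks per worker) mirrors what CPython's Pool does at the time of writing. It is an implementation detail, not a documented guarantee, so treat the numbers as an estimate.

```python
def expected_chunks(n_items, n_workers, chunksize=None):
    """Estimate the starting value of _number_left for map_async.

    Hypothetical helper: it copies the heuristic CPython's Pool uses
    internally, which is an assumption that may not hold in every version.
    """
    if n_items == 0:
        return 0
    if chunksize is None:
        # Default: aim for roughly four chunks per worker process.
        chunksize, extra = divmod(n_items, n_workers * 4)
        if extra:
            chunksize += 1
    # Number of chunks, rounding up for a final partial chunk.
    return -(-n_items // chunksize)


if __name__ == "__main__":
    print(expected_chunks(100, 10))               # 34
    print(expected_chunks(100, 10, chunksize=1))  # 100
```

Under that assumption, 100 jobs on 10 processes would start at "num left: 34" rather than 100, while chunksize=1 gives one chunk per job.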

As mentioned in the comments, because pool.map is blocking, you can't really do this with it unless you start a background thread that does the polling while the main thread is blocked in the map call, but I'm not sure there's any benefit to doing that over the above approach.

The other thing to keep in mind is that you're using an internal attribute of MapResult, so it's possible that this could break in future versions of Python.
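If relying on a private attribute is a concern, one alternative (not the approach shown above) is to count results as they arrive from pool.imap_unordered, which is part of the public Pool API. The sketch below is a minimal illustration; map_with_progress and its report parameter are hypothetical names, and worker is a stand-in task.

```python
import multiprocessing
import time


def worker(i):
    # Stand-in task (hypothetical): sleep briefly, then return the input.
    time.sleep(0.01 * i)
    return i


def map_with_progress(pool, func, items, report=print):
    """Run func over items on pool, reporting after each finished task."""
    items = list(items)
    total = len(items)
    results = []
    # imap_unordered yields each result as soon as it is ready. Its default
    # chunksize is 1, so each yielded value corresponds to one finished task.
    for done, value in enumerate(pool.imap_unordered(func, items), start=1):
        results.append(value)
        report("completed {} of {}".format(done, total))
    return results  # in completion order, not input order


if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        out = map_with_progress(pool, worker, range(15))
    print(sorted(out))
```

The trade-off is that results come back in completion order, so sort them or use pool.imap instead if the input order matters.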

python parallel-processing multiprocessing
