python - Counting total number of tasks executed in a multiprocessing.Pool during execution
I'd love to give an indication of overall progress. I'm farming work out and would like to know how far along it is. If I've sent 100 jobs to 10 processors, how can I show the current number of jobs that have returned? I can get the ids, but how do I keep a count of the completed jobs returned from my map function?
I'm calling my function as follows:
op_list = pool.map(ppmdr_star, list(varg))
And in the function I can print the current process name:

    current = multiprocessing.current_process()
    print('running:', current.name, current._identity)
If you use pool.map_async, you can pull this information out of the MapResult instance that gets returned. For example:
    import multiprocessing
    import time

    def worker(i):
        time.sleep(i)
        return i

    if __name__ == "__main__":
        pool = multiprocessing.Pool()
        result = pool.map_async(worker, range(15))
        # Poll once per second until every chunk has come back.
        while not result.ready():
            # _number_left is an internal attribute of MapResult.
            print("num left: {}".format(result._number_left))
            time.sleep(1)
        real_result = result.get()
        pool.close()
        pool.join()
Output:

    num left: 15
    num left: 14
    num left: 13
    num left: 12
    num left: 11
    num left: 10
    num left: 9
    num left: 9
    num left: 8
    num left: 8
    num left: 7
    num left: 7
    num left: 6
    num left: 6
    num left: 6
    num left: 5
    num left: 5
    num left: 5
    num left: 4
    num left: 4
    num left: 4
    num left: 3
    num left: 3
    num left: 3
    num left: 2
    num left: 2
    num left: 2
    num left: 2
    num left: 1
    num left: 1
    num left: 1
    num left: 1
multiprocessing internally breaks the iterable you pass to map into chunks, and passes each chunk to the child processes. So the _number_left attribute really keeps track of the number of chunks remaining, not the individual elements in the iterable. Keep that in mind if you see odd-looking numbers when you use large iterables. It uses chunking to improve IPC performance, but if seeing an accurate tally of completed results is more important to you than the added performance, you can pass the chunksize=1 keyword argument to map_async to make _number_left more accurate. (The chunksize usually only makes a noticeable performance difference for very large iterables. Try it for yourself to see if it really matters for your use case.)
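To make the chunking point concrete, here is a minimal sketch that passes chunksize=1 and turns the internal counter into a "completed" tally. The helper chunk_count is a hypothetical name of mine, and it assumes, as described above, that the counter starts at the number of chunks (a ceiling division of item count by chunksize). _number_left is still an internal attribute, so treat this as illustrative rather than guaranteed.

```python
import multiprocessing
import time


def worker(i):
    # Stand-in for real work; the short sleep just makes progress visible.
    time.sleep(0.01)
    return i * i


def chunk_count(n_items, chunksize):
    """Hypothetical helper: how many chunks n_items split into (ceiling division)."""
    return -(-n_items // chunksize)


if __name__ == "__main__":
    items = list(range(40))
    with multiprocessing.Pool(4) as pool:
        # chunksize=1 means one chunk per item, so the chunk counter
        # (internal attribute _number_left) tracks individual tasks.
        result = pool.map_async(worker, items, chunksize=1)
        while not result.ready():
            done = len(items) - result._number_left
            print("completed {}/{}".format(done, len(items)))
            time.sleep(0.02)
        squares = result.get()
    # With a larger chunksize the counter would start much lower:
    print("chunks for 100 items at chunksize=3:", chunk_count(100, 3))
```

With chunksize=3 and 100 items, for instance, the counter would start at 34 rather than 100, which is the kind of odd-looking number mentioned above.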
As you mentioned in the comments, because pool.map is blocking, you can't really get this information unless you start a background thread that does the polling while the main thread is blocked in the map call, but I'm not sure there's any benefit to doing that over the above approach.
The other thing to keep in mind is that you're using an internal attribute of MapResult, so it's possible that this could break in future versions of Python.
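If relying on an internal attribute is a concern, one alternative (not part of the approach above) is to count results as they arrive from pool.imap_unordered, which is a documented Pool method. A minimal sketch, where count_completed and its report parameter are hypothetical names chosen for this example:

```python
import multiprocessing
import time


def worker(i):
    time.sleep(0.01)  # stand-in for real work
    return i


def count_completed(results, total, report=print):
    """Consume an iterator of results, reporting a running tally."""
    collected = []
    for done, value in enumerate(results, start=1):
        collected.append(value)
        report("completed {}/{}".format(done, total))
    return collected


if __name__ == "__main__":
    items = list(range(20))
    with multiprocessing.Pool(4) as pool:
        # imap_unordered yields each result as soon as it is ready,
        # so the loop in count_completed sees tasks finish one by one.
        out = count_completed(pool.imap_unordered(worker, items), len(items))
    print("all results received:", sorted(out) == items)
```

Note that the results come back in completion order rather than input order, so sort them or use pool.imap instead if ordering matters.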
python parallel-processing multiprocessing