I had been playing with multithreading in Haskell recently, and I was impressed to see that it has a function, mapConcurrently, which lets you apply a function across a list of inputs concurrently. I thought I would implement this functionality in Python.
Multithreading is something I had played with before in Python, but my prior attempts were completely inelegant. My latest attempt is only a few lines of code, and is more loosely coupled. Here’s some generic code for map_concurrently, which requires a helper class:
```python
import threading

class Strand(threading.Thread):
    def __init__(self, func, arg):
        threading.Thread.__init__(self)
        self.func = func
        self.arg = arg

    def run(self):
        self.result = self.func(self.arg)

def map_concurrently(func, arg_list):
    strands = [Strand(func, el) for el in arg_list]
    for s in strands:
        s.start()
    for s in strands:
        s.join()
    results = [s.result for s in strands]
    return results
```
This now replicates the functionality of Haskell’s mapConcurrently. I don’t know of any way of avoiding the helper class (Strand), because I need a place to store the result of each computation for later collection.
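That said, if you are happy to lean on the standard library’s concurrent.futures module, a thread pool can store the results for you, which sidesteps the helper class. This is a sketch of that alternative, not a claim about the code above; map_concurrently_pool is my own name for it:

```python
from concurrent.futures import ThreadPoolExecutor

def map_concurrently_pool(func, arg_list):
    # Executor.map runs func over arg_list on worker threads and
    # yields results in the original argument order.
    # "or 1" guards against the empty list, since max_workers must be >= 1.
    with ThreadPoolExecutor(max_workers=len(arg_list) or 1) as pool:
        return list(pool.map(func, arg_list))
```

Like the Strand version, pool.map preserves the input order of the arguments, regardless of which thread finishes first.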
Stash the code above into a module somewhere, and call it using the following example:
```python
import time

def myfunc(x):
    time.sleep(x)
    print("Hello from myfunc. Arg is", x)
    return x + 10

print(map_concurrently(myfunc, [1.5, 1.4, 1.6]))
```
As you can see, map_concurrently takes a function and a list of arguments, applies the function to each argument in its own thread, and collects the results into a list.
In the example, myfunc sleeps for the number of seconds passed in as an argument. This allows us to simulate some latency in the function. It then notifies us that it is alive, and returns the argument passed in, incremented by 10. Here is the output of running the code:
```
Hello from myfunc. Arg is 1.4
Hello from myfunc. Arg is 1.5
Hello from myfunc. Arg is 1.6
[11.5, 11.4, 11.6]
```
You can see that the notifications appear in order of delay length, shortest first, but the results are collected in the original argument order.
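One caveat worth noting (my observation, not from the original code): if func raises an exception inside a thread, Strand.run never sets self.result, so the result-collection step fails with a confusing AttributeError. A hedged sketch of one way to surface the real exception instead; SafeStrand and map_concurrently_safe are hypothetical names:

```python
import threading

class SafeStrand(threading.Thread):
    # Variant of Strand that records an exception instead of losing it.
    def __init__(self, func, arg):
        threading.Thread.__init__(self)
        self.func = func
        self.arg = arg
        self.error = None

    def run(self):
        try:
            self.result = self.func(self.arg)
        except Exception as exc:
            self.error = exc

def map_concurrently_safe(func, arg_list):
    strands = [SafeStrand(func, el) for el in arg_list]
    for s in strands:
        s.start()
    for s in strands:
        s.join()
    for s in strands:
        if s.error is not None:
            raise s.error  # re-raise the first failure in the calling thread
    return [s.result for s in strands]
```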
You could use map_concurrently to download multiple URLs in parallel, which was the original motivation for my interest in multithreading.
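To make that concrete, here is a sketch of the URL use case. fetch_size is a hypothetical helper, the URLs are placeholders, and map_concurrently is repeated only so the snippet stands alone:

```python
import threading
import urllib.request

# Same map_concurrently as above, repeated so this snippet runs on its own.
class Strand(threading.Thread):
    def __init__(self, func, arg):
        threading.Thread.__init__(self)
        self.func = func
        self.arg = arg

    def run(self):
        self.result = self.func(self.arg)

def map_concurrently(func, arg_list):
    strands = [Strand(func, el) for el in arg_list]
    for s in strands:
        s.start()
    for s in strands:
        s.join()
    return [s.result for s in strands]

def fetch_size(url):
    # Hypothetical helper: download one URL and return its size in bytes.
    with urllib.request.urlopen(url, timeout=10) as resp:
        return len(resp.read())

# The downloads overlap, so the total time is roughly that of the slowest URL:
# sizes = map_concurrently(fetch_size, ["https://example.com", "https://example.org"])
```

Because the time is dominated by network I/O rather than computation, Python's global interpreter lock is not a bottleneck here.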