Multithreading, Python and passed arguments
Recently I had a project that required precompiling the firmware for a device so that the end user could program the device but not have the source code. We’re not talking about a few versions of the code, but almost 1000. This is something that no person would want to do by hand, especially since it would have to be redone every time the source code changes. Python to the rescue. It was simple enough to write a program that would copy the source code, change a bit of information in a header file, compile it, and save the binary to the appropriate location. Controlling other programs is pretty easy with the subprocess module. That’s great and all, but doing it single-threaded, that’s so 90s. Python makes multithreading pretty simple using its multiprocessing library. The trick is not stepping on any toes when you do it.
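As a rough sketch of what one of those build jobs looks like (the paths, header file, macro name, and make invocation here are hypothetical stand-ins, not the client’s project), each job copies the source tree, patches a #define in a header, and hands the build off to subprocess:

import os
import re
import shutil
import subprocess

def build_variant(src_dir, work_dir, device_id):
    #Copy the whole source tree into a scratch directory for this variant
    shutil.copytree(src_dir, work_dir)
    #Patch a value in a (hypothetical) header file
    header = os.path.join(work_dir, 'config.h')
    with open(header) as f:
        text = f.read()
    text = re.sub(r'#define DEVICE_ID \d+', '#define DEVICE_ID %d' % device_id, text)
    with open(header, 'w') as f:
        f.write(text)
    #Run the build tool and raise if it returns a non-zero exit code
    subprocess.check_call(['make', '-C', work_dir])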
For starters, I have eight (okay, four hyperthreaded) cores. Since I still have to use the system, I set up a pool with up to six workers. The objects returned from the pool.apply_async() calls were stored in a list to keep track of all the queued tasks, because in the time it takes a dozen of the compilations to finish, I’ve already queued up almost 1000 compilations. At this point the main Python process just sits there checking to see if a task has completed and stores the output information.
Gotchas
Timing
One of the problems that I encountered was making sure that I wasn’t copying the source code into the same place to compile. To create unique directories I tried to timestamp them down to the microsecond. That would have worked wonderfully… except that the timestamp I was getting out of the time module on Windows was only granular to about 10 milliseconds. That caused some of the threads to fail because they were trying to use the same directory. I found another way to uniquely name the directories and then we were good.
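The post doesn’t say which naming scheme I ended up with, but one simple option (a sketch, not the client code) is to let tempfile.mkdtemp() or a UUID pick the name instead of relying on timestamps:

import tempfile
import uuid

#tempfile.mkdtemp() creates the directory and guarantees the name is unused
build_dir = tempfile.mkdtemp(prefix='fw_build_')

#or bake a UUID into the path so collisions are effectively impossible
unique_dir = 'build_%s' % uuid.uuid4().hex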
Keeping passed data safe from other threads
You’re familiar with the terms shallow and deep copy, right? Do you know when each one is done in Python? Most of the time it is a shallow copy. Well, when you pass the same dictionary, with slightly different parameters, as the argument in each pool.apply_async() call… you can see where this is going. By the time the dictionary could be acted upon, it had been changed in the main thread… many times. Luckily, we have the copy.deepcopy() function to save the day. I just made sure we did a deep copy of the dictionary before passing it, and things went much better.
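Here is a minimal illustration of the failure mode, with no pool involved at all: every reference to the dictionary sees whatever the loop wrote last, while deep copies keep their own snapshot.

from copy import deepcopy

args = {}
shallow_refs = []
safe_copies = []
for ii in range(3):
    args['square'] = ii
    shallow_refs.append(args)            #every entry is the *same* dict object
    safe_copies.append(deepcopy(args))   #each entry is an independent snapshot

print([d['square'] for d in shallow_refs])  #[2, 2, 2] -- all see the last value
print([d['square'] for d in safe_copies])   #[0, 1, 2] -- what we actually wanted

In this flat example a shallow copy.copy() would work just as well; deepcopy() starts to matter once the dictionary’s values are themselves mutable objects.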
A skeletal example
The actual code was written for a client, so I will only show a prototype for you to use here.
import subprocess
from copy import deepcopy
from multiprocessing import Pool, TimeoutError

def ThreadFunction(*args, **kwargs):
    #This function is what the pool workers call
    #subprocess.call(['/path/to/something', 'arg1', 'arg2'])
    return kwargs['square']**2 #This gives us something to return

if __name__ == "__main__":
    #We have to wrap the main body in here for multiprocessing to work,
    #because the file is re-imported by each worker process.
    #If we didn't do this, each worker would start executing the whole program.
    pool = Pool(6) #This sets up a pool of up to 6 worker processes
    PoolResults = [] #This is where we will keep track of the results
    ArgDict = {} #Dictionary for arguments
    for ii in range(2000):
        ArgDict['square'] = ii
        #Deep-copy the dictionary so later iterations can't change it,
        #and pass it as the keyword arguments for ThreadFunction
        PoolResults.append(pool.apply_async(ThreadFunction, kwds=deepcopy(ArgDict)))
    #Now we've got all our tasks queued up; wait until they are done
    results = []
    while len(PoolResults) > 0:
        try:
            #Get the result if it is ready and append it to the results list
            results.append(PoolResults[0].get(timeout=0.01))
            PoolResults.pop(0)
        except TimeoutError:
            #Not ready yet, check again on the next pass
            pass
    #Now our tasks have completed and we can show the results
    print(results)
You may be asking why I didn’t use the pool’s map() function. The reason is that the calculations involved in each iteration were sufficiently complex that I didn’t want to wait until I had all the calculations done for every permutation of the code before starting compilation. In the first few versions of the code, the calculations needed to figure out the code changes took about as long as the compilation itself.
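For comparison, here is roughly what the map() version would look like (a sketch, with compute_changes() and compile_variant() as hypothetical stand-ins for the real calculation and build steps): nothing is handed to the pool until every argument in the list has been computed, whereas the apply_async() loop above starts the first compilation while later arguments are still being worked out.

from multiprocessing import Pool
import time

def compute_changes(ii):        #hypothetical stand-in for the slow calculation step
    time.sleep(0.01)
    return {'square': ii}

def compile_variant(arg_dict):  #hypothetical stand-in for the compile step
    return arg_dict['square']**2

if __name__ == "__main__":
    pool = Pool(6)
    #With map(), every compute_changes() call finishes before any
    #compile_variant() work is handed to the pool
    all_args = [compute_changes(ii) for ii in range(100)]
    results = pool.map(compile_variant, all_args)
    print(results)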
I like your implementation here. I sourced it for an API which manages a pool of non-CPU bound ETL processes. In my case, the children may complete at widely different times and are not dependent on one another.
I nested the try/except block inside an enumeration of the process pool. It didn’t make a difference in execution order or time, but it makes the humans watching logs happier to see child processes pop off the stack in real-time.
Thanks for sharing!
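For anyone curious about the variation the commenter describes, a minimal sketch (reusing the PoolResults list from the skeleton above) sweeps the whole list on each pass and collects whichever tasks have finished, instead of always waiting on the head of the list:

from multiprocessing import TimeoutError

results = []
while PoolResults:
    still_pending = []
    for res in PoolResults:
        try:
            results.append(res.get(timeout=0.01))  #finished: record it now
        except TimeoutError:
            still_pending.append(res)              #not done yet: check next pass
    PoolResults = still_pending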