Parallelizing single-threaded batch jobs using Python's multiprocessing library.

Suppose you have to run some program with 100 different sets of parameters. You might automate this job using a bash script like this:

ARGS=("-foo 123" "-bar 456" "-baz 789")
for a in "${ARGS[@]}"; do
  my-program $a
done

The problem with this loop is that only one process runs at a time. If your program isn’t already parallel, you can speed up the batch by running several jobs at once. This isn’t easy in bash, but fortunately Python’s multiprocessing library makes it quite simple.
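As a rough illustration of the idea, here is a minimal sketch of the same batch run through a worker pool. It is not necessarily the approach the full post takes. It uses `multiprocessing.pool.ThreadPool`, the thread-backed variant of `Pool`, on the assumption that each worker only waits on an external process; `multiprocessing.Pool` offers the same `map` interface if the work itself is Python code. The names `build_command`, `run_job`, and `run_all`, the default of 4 workers, and the 127 status for a missing executable are all choices made for this example.

```python
import shlex
import subprocess
from multiprocessing.pool import ThreadPool

# Same parameter sets as the bash ARGS array above.
ARGS = ["-foo 123", "-bar 456", "-baz 789"]


def build_command(arg_string, program="my-program"):
    """Turn one parameter string into an argv list for the program."""
    return [program] + shlex.split(arg_string)


def run_job(arg_string):
    """Run one job to completion and return (arg_string, exit_code)."""
    try:
        return arg_string, subprocess.run(build_command(arg_string)).returncode
    except FileNotFoundError:
        # Assumption: report a missing executable as 127, the shell's convention.
        return arg_string, 127


def run_all(arg_strings, workers=4, job=run_job):
    """Run `job` over every parameter set, at most `workers` at a time."""
    with ThreadPool(processes=workers) as pool:
        # map() blocks until every job is done and keeps the input order.
        return pool.map(job, arg_strings)


if __name__ == "__main__":
    for arg_string, code in run_all(ARGS):
        print(f"{arg_string!r} exited with status {code}")
```

Setting `workers` to roughly the number of CPU cores is a reasonable starting point when each job is single-threaded and CPU-bound.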

[Read More]

The fruits of some recent Arduino mischief.

I recently consulted on a project involving embedded devices. Like most early-stage embedded endeavors, it currently consists of an Arduino and a bunch of off-the-shelf peripherals. During the project, I developed two small libraries (unrelated to the main focus of the project) which I’m open-sourcing today.

[Read More]

Optimizing MySQL and Apache for a low-memory VPS.

Diagnosing the problem.

My last post included a plug for the migration of our WordPress instance to a new server. However, the migration didn’t go completely smoothly. The site went down a few times during the first day, with WordPress throwing “Error establishing a database connection.” Sure enough, MySQL had crashed. A simple restart of MySQL brought the site back up, but what caused the crash in the first place?

[Read More]

Information-centric networking for laymen.

The design of the current Internet is based on the concept of connections between “hosts”, or individual computers. For example, when you visit a website, your computer (a host) always connects to a particular server (another host) and retrieves content through a session-oriented pipe. However, the amount of content hosted on the Internet and the number of connected devices are both growing. This is a crisis scenario for the current Internet architecture – it won’t scale.

[Read More]