Recently, another student asked me to set up a PostgreSQL instance that they could use for some data mining. I initially put the instance on a HDD, but the dataset was quite large and the import was incredibly slow. I installed the only SSD I had available (120 GB), and it sped up the import for the first few tables. However, this turned out to not be enough space.
I did not want to move the database permanently back to the HDD, as this would mean slow I/O. I also was not about to go buy another SSD. I had heard of bcache, a Linux kernel module that lets a SSD act as a cache for a larger HDD. This seemed like the most appropriate solution – most of the data would fit in the SSD, but the backing HDD would be necessary for the rest of it. This article explains how to set up a bcache instance in this scenario. This tutorial is written for Ubuntu Desktop 16.04.1 (Xenial), but it likely applies to more recent versions as well as Ubuntu Server.
Suppose you have to run some program with 100 different sets of parameters. You might automate this job using a bash script like this:
ARGS=("-foo 123" "-bar 456" "-baz 789")
for a in "${ARGS[@]}"; do
my-program $a
done
The problem with this type of construction in bash is that only one process will run at a time. If your program isn’t already parallel, you can speed up execution by running multiple jobs at a time. This isn’t easy in bash, but fortunately Python’s multiprocessing library makes it quite simple.
I recently consulted on a project involving embedded devices. Like most early-stage embedded endeavors, it currently consists of an Arduino and a bunch of off-the-shelf peripherals. During the project, I developed two small libraries (unrelated to the main focus of the project) which I’m open-sourcing today.
Diagnosing the problem.
My last post had a plug about the migration of our Wordpress instance to a new server. However, it didn’t go completely smoothly. The site had gone down a few times in the first day after the migration, with Wordpress throwing “Error establishing a database connection.” Sure enough, MySQL had gone down. A simple restart of MySQL would bring the site back up, but what caused the crash in the first place?
The design of the current Internet is based on the concept of connections between “hosts”, or individual computers. For example, when you visit a website, your computer (a host) always connects to a particular server (another host) and retrieves content through a session-oriented pipe. However, the amount of content hosted on the Internet and the number of connected devices are both growing. This is a crisis scenario for the current Internet architecture – it won’t scale.