A graphical tool for configuring Alesis V-Series MIDI controllers on Linux.

In my last post, I explained how I reverse-engineered the Alesis SysEx protocol and detailed my findings. Now, two months later, I’ve finally decided to implement a tool for configuring these controllers on Linux.

The entirety of this work was done over the past two days, so it likely contains some hidden bugs, but as far as I can tell it is entirely usable.

Unfortunately, this is my first attempt at writing any kind of GUI, so it’s not beautiful. But hey, it works.

You can find the project on GitHub. Contributions are welcome.

Reverse engineering the Alesis V-series SysEx protocol.

I recently got back into music production and decided to order myself a MIDI controller. I got a few recommendations for the the Alesis V25, so I went ahead and ordered it. However, I was less than pleased to find that its configuration software wouldn’t run on Linux, even under Wine. Of course, this prompted me to reverse engineer the protocol that lets the software talk to the keyboard.

Continue reading Reverse engineering the Alesis V-series SysEx protocol.

Using black magic to make a fast circular buffer.

Yesterday, I took a glance at the Wikipedia page for the circular buffer and was intrigued by an alleged optimization technique that I was not familiar with:

A circular-buffer implementation may be optimized by mapping the underlying buffer to two contiguous regions of virtual memory. (Naturally, the underlying buffer‘s length must then equal some multiple of the system’s page size.) Reading from and writing to the circular buffer may then be carried out with greater efficiency by means of direct memory access; those accesses which fall beyond the end of the first virtual-memory region will automatically wrap around to the beginning of the underlying buffer. When the read offset is advanced into the second virtual-memory region, both offsets—read and write—are decremented by the length of the underlying buffer

When implementing a circular buffer, we need to handle the case where a message spans the “discontinuity” in the queue and wraps around. The naive circular buffer’s write routine might employ a byte-by-byte write and look something like this:

void put(queue_t *q, uint8_t *data, size_t size) {
    for(size_t i = 0; i < size; i++){
        q->buffer[(q->tail + i) % q->buffer_size] = data[i];
    }
    q->tail = (q->tail + size) % q->buffer_size;
}

The fact that a modulo operation is necessary to index into the array makes this function hard (if not impossible) to vectorize, and thus unnecessarily slow. Though there are other optimizations we can make, the technique offered in the above Wikipedia surpasses hardware-agnostic approaches by virtue of the fact that the memory management unit can handle most of the wrap-around logic on our behalf. I was so excited by this idea that I did no further research whatsoever, and implemented it based only on the brief description above.

Continue reading Using black magic to make a fast circular buffer.

Proving a mathematical curiosity.

Today, a thread full of cool math facts appeared on Reddit. As usual, someone mentioned the fact that 111111111 × 111111111 = 12345678987654321. In another reply, someone pointed out that this also works in other bases. For some reason, I decided that I needed to prove that it works in all bases.

To begin, I needed a general formula for values of the 111… terms. This was fairly straightforward: for a base b, we want b-1 base-b digits, all ones. To standardize the base, we multiply each digit by an increasing power of b and sum. Since each digit is one, we get a nice geometric series which can easily be solved.

    \[\sum\limits_{i=1}^{b-1} b^{i-1} = \frac{b-b^b}{b-b^2}\]

When we multiply this number by itself, we are squaring it, so we end up with \left((b-b^b)/(b-b^2)\right)^2.

The hard part was writing a general form for the 1234 \dots (b-1) \dots 4321 number. To deal with this, I broke it down into two parts, as illustrated below.

Digit value 1 2 \dots (b-2) (b-1) (b-2) \dots 2 1
Place multiplier b^{2b-3} b^{2b-4} b^{2b-(b+1)} b^{2b-(b+2)}
b^{b-2} b^{b-3} b^1 b^0

I calculated the values of the most-significant digits starting at the left, and the values of the least-significant digits starting at a right. To make the math come out nicely, I actually included the center digit in both formulas. That’s okay, since we can subtract it off once to make up for the duplicate. Now we have a summation formula for the value of the square.

    \[\left( \sum\limits_{i=1}^{b-1} ib^{2b-(i+3)} \right) + \left( \sum\limits_{i=1}^{b-1} ib^{i-1} \right) - b^{b-2}\]

With a little thinking (or the help of a computer algebra system), we can get a neat closed form.

    \[\left( \sum\limits_{i=1}^{b-1} ib^{2b-(i+3)} + ib^{i-1} \right) - b^{b-2} = \left(\frac{b-b^b}{b^2-b}\right)^2\]

We can see that this is quite similar to the expression we got for the square above; the only difference is that the b-b^2 denominator has changed to b^2-b. Fortunately, this negation goes away when squaring, so we can trivially prove that the two expressions are equal.

And there we have it: proof that this curiosity is true in any base of at least two.

Generating spectrograms the hard way with numpy.

A spectrogram is a convenient visualization of the frequencies present in an audio clip. Generating one involves obtaining the frequency components of each window of the audio via a Discrete Fourier Transform (DFT) of its waveform. While tools are available to both generate spectrograms and compute DFTs, I thought it would be fun to implement both myself in my language of choice, Python.

In the following, I will discuss computing a DFT (the hard way), processing a WAV file, and rendering a spectrogram (all in Python). If you’re impatient and just want to see the code, you can find it on GitHub.

Continue reading Generating spectrograms the hard way with numpy.

Integrating GitLab and Google Calendar.

Zeall, like many other software startups, uses GitLab for version control and issue management. We also use the ever-popular Google Calendar to handle meetings, reminders, and deadlines. For several months, we’ve been looking for a way to automatically push GitLab issue deadlines into Google Calendar, and until now it seemed impossible. Only after a recent migration from our own private mailserver to G Suite did we find a solution — or rather, figure out how to feasibly build one.

Continue reading Integrating GitLab and Google Calendar.

Adding custom fields to packets in ndnSIM 2.3 without forking the entire repository.

The recommended way to build something on top of ndnSIM is to fork its scenario template repository and work inside there. You still need to download and compile the actual framework, however you will simply install it into /usr/local and link to it instead of actually working inside the main repository.

It turns out that this workflow actually makes certain tasks a lot more difficult. You might think a network simulator would make it easy to add new header fields to packets. Well, think again.

Continue reading Adding custom fields to packets in ndnSIM 2.3 without forking the entire repository.

An idiot’s guide to fulltext search in PostgreSQL.

I love PostgreSQL. It’s probably the most powerful open-source database system out there. Recent features to handle JSON and geospatial data are allowing it to supplant specialized database systems and become closer to a one-DB-fits-all solution. One feature that I’ve recently been able to exploit is its fulltext search engine. It allowed me to easily move from a terrible search implementation (using regular expressions) to one that actually meets users’ expectations.

In this article, I will walk through a basic fulltext search configuration, as well as highlight a few potential improvements that can be made if you’re so inclined.

Many of the features discussed in this post are only available as of PostgreSQL 9.6. Earlier versions have some rudimentary fulltext functionality, but a lot of the more powerful tools we’ll be using are fairly new.

Continue reading An idiot’s guide to fulltext search in PostgreSQL.

Fun with integer division optimizations.

I recently stumbled across a post about some crazy optimization that clang does to divisions by a constant. If you aren’t interested in reading it yourself, the summary is as follows:

  • Arbitrary integer division is slow.
  • Division by powers of 2 is fast.
  • Given a divisor n, the compiler finds some a, b such that a/2b approximates 1/n.
  • This approximation gives exact results for any 32-bit integer.

I was interested in seeing just how much faster this seemingly-magic optimization was than the regular div instruction, so I set up a simple test framework:

Continue reading Fun with integer division optimizations.

The problem with Python’s datetime class.

This might sound like a strong opinion, but I’m just going to put it out there: Python should make tzinfo mandatory on all datetime objects.

To be fair, that’s just an overzealous suggestion prompted by my frustration after spending two full days debugging timestamp misbehaviors. There are plenty of practical reasons to keep timezone-agnostic datetimes around. Some projects will never need timestamp localization, and requiring them to use tzinfo everywhere will only needlessly complicate things. However, if you think you might ever need to deal with timezones in your application, then you must plan to deal with them from the start. My real proposition is that a team should assess its needs and set internal standards regarding the use of timestamps before beginning a project. That’s more reasonable, I think.

Continue reading The problem with Python’s datetime class.