Adding custom fields to packets in ndnSIM 2.3 without forking the entire repository.

The recommended way to build something on top of ndnSIM is to fork its scenario template repository and work inside that. You still need to download and compile the actual framework, but you install it into /usr/local and link against it instead of working inside the main repository.

It turns out that this workflow actually makes certain tasks a lot more difficult. You might think a network simulator would make it easy to add new header fields to packets. Well, think again.

First Steps

What do we want to do? Our goal is just to add one field to the Interest packet header. The ndn::Interest class inherits from an interface called ndn::TagHost, which allows you to attach arbitrary tags to it. Defining your own tag can be as simple as a single typedef, if the tag only needs to hold a single value:

typedef ndn::SimpleTag<uint64_t, 0x60000001> MyCustomTag;

You simply specify the type of the tag's value and make up an ID for it. However, you must pick an unused ID from the valid range given in the ndn-cxx wiki. My 0x60000001 is the first value in this range.

To attach a tag to an Interest, you simply call the setTag method:

interest.setTag<MyCustomTag>(std::make_shared<MyCustomTag>(54321));

To read a tag from an Interest, there is a corresponding getTag method:

std::shared_ptr<MyCustomTag> tag = interest.getTag<MyCustomTag>();

This gives you a pointer to the tag object, and you can get the value out of it quite easily… But first, check if it is null.

if (tag == nullptr) {
    // no tag attached
}
else {
    uint64_t tagValue = tag->get();
}

However, now is where we encounter our problem. Our tag will not actually be encoded and sent over the network. That’s right — we can attach a tag to the Interest, but when it arrives at the next hop it will be gone.

How can we fix this?

Investigation

Vanilla ndnSIM uses these sorts of tags itself in a few places. One obvious one is the HopCountTag, which you can use to figure out how far a packet has gone in the network. A grep through the ndnSIM source brings us to a class called GenericLinkService. This class is responsible for actually encoding packets and sending them out on the wire. In particular, we can find the bit responsible for encoding the HopCountTag in a method called encodeLpFields:

shared_ptr<lp::HopCountTag> hopCountTag = netPkt.getTag<lp::HopCountTag>();
if (hopCountTag != nullptr) {
    lpPacket.add<lp::HopCountTagField>(*hopCountTag);
}
else {
    lpPacket.add<lp::HopCountTagField>(0);
}

Clearly, we need to define a MyCustomTagField to be able to encode our new tag.

Declaring a Tag

This is actually pretty easy, but first you need to know what kind of witchcraft is going on. Let’s start with the actual code to define the field, then go on to analyze it:

enum {
    TlvMyCustomTag = 901
};

typedef ndn::lp::detail::FieldDecl<ndn::lp::field_location_tags::Header, uint64_t, TlvMyCustomTag> MyCustomTagField;

First, we define a constant for the TLV type ID… There are actually a few hidden constraints to what we can pick. If we don’t do this right, we get a packet parse error. Why?

Let’s look at ndn::lp::Packet’s wireDecode method:

for (const Block& element : wire.elements()) {
    detail::FieldInfo info(element.type());
    if (!info.isRecognized && !info.canIgnore) {
        BOOST_THROW_EXCEPTION(Error("unrecognized field cannot be ignored"));
    }
    ...
}

Apparently, this FieldInfo class tells the decoder whether the field is recognized, and whether it can ignore it if it isn’t. Let’s peek at the constructor:

FieldInfo::FieldInfo(uint64_t tlv)
  : ...
{
  boost::mpl::for_each<FieldSet>(boost::bind(ExtractFieldInfo(), this, _1));
  if (!isRecognized) {
    canIgnore = tlv::HEADER3_MIN <= tlvType
                && tlvType <= tlv::HEADER3_MAX
                && (tlvType & 0x01) == 0x01;
  }
}

Now this is interesting… To figure out what a TLV tag is, it iterates over FieldSet (which only contains the built-in tags, and we can’t override). However, if it doesn’t find a match, it determines if it is ignorable based on the value of the TLV type ID. We can’t make the field recognized without forking the actual ndnSIM core, but we can make it ignorable by choosing the right ID.

To save you from looking up tlv::HEADER3_MIN and tlv::HEADER3_MAX, they are 800 and 959, respectively. Also, don’t forget that the low bit has to be set. And don’t pick one of the types that is already used.

Moving on from the TLV ID nonsense, the rest of the FieldDecl is pretty straightforward. We pass a flag that says “this goes in the header,” followed by the type of the value and the TLV ID we just made up.

Note that the code won't compile if the value type is specified as anything other than uint64_t. I didn't care enough to track this down fully, but it appears to be because the only integer EncodeHelper specialization defined is for uint64_t.

Encoding the Tag

So far, we have defined our tag twice: once for the high-level Interest object, and once for the low-level TLV encoding. Now, we need to write code to convert between these two representations.

To do this, we need to create a new LinkService. Sounds intimidating, but really all we need to do is make a copy of GenericLinkService and change a few things. Yes, literally copy generic-link-service.hpp and generic-link-service.cpp out of ns3/ndnSIM/NFD/daemon/face/ and into your own project. Rename the files as you see fit, and carefully rename the class to something like CustomTagLinkService. Be careful here: the new class still needs to implement the GenericLinkServiceCounters interface if we don't want to break anything. We can also avoid redefining the nested Options class by using a typedef to import it from GenericLinkService into the new CustomTagLinkService class.

Now that we have an identical clone of the GenericLinkService, let’s fix it. To encode your new field, take a look at the encodeLpFields method. Follow the pattern used by the CongestionMarkTag field to implement your new custom one:

shared_ptr<MyCustomTag> myCustomTag = netPkt.getTag<MyCustomTag>();
if (myCustomTag != nullptr) {
    lpPacket.add<MyCustomTagField>(*myCustomTag);
}

Then, add the corresponding decoding logic to decodeInterest:

if (firstPkt.has<MyCustomTagField>()) {
    interest->setTag(make_shared<MyCustomTag>(firstPkt.get<MyCustomTagField>()));
}

Add the same code to the decodeData and decodeNack methods if you need them.

Using the LinkService

Specifying a custom LinkService isn’t going to do us any good if we don’t tell ndnSIM to use it. We’ll have to replace the callback that sets up a Face in order to do this. We’re going to focus on Faces for PointToPointNetDevices, but the following can be generalized for other types of links.

The call from our scenario file will look something like this:

stackHelper.UpdateFaceCreateCallback(
    PointToPointNetDevice::GetTypeId(),
    MakeCallback(&CustomTagNetDeviceCallback)
);

For context, this is a method of the StackHelper that you're probably already using to install the NDN stack on nodes. To write the callback, copy the logic from the PointToPointNetDeviceCallback in that same class. All you have to change is the instantiation of the LinkService: replace the GenericLinkService with your own. You will also need to copy the constructFaceUri method (verbatim), because your callback refers to it but it is not accessible from outside the original file.

Other Caveats

By default, the scenario template wants to compile your code in C++11 mode. However, the LinkService uses some C++14 features, so you’ll have to edit the flags in .waf-tools/default-compiler-flags.py. Note that you need to re-run ./waf configure if you edit these flags.

Conclusion

I think this is way too much effort just to add a field to a packet. We've duplicated a lot of logic in order to do something so small. The ndnSIM developers could have made this easier; at minimum, I'd expect a StackHelper call for registering new fields. It would likely be possible to write a LinkService generic enough to encode arbitrary custom fields, given mappings between the TLV field classes and the tag classes. I look forward to such a feature, because it would have made the middle part of my week go a lot more smoothly. Until then, I hope this post is useful to anyone else trying to do the same thing.

An idiot’s guide to fulltext search in PostgreSQL.

I love PostgreSQL. It’s probably the most powerful open-source database system out there. Recent features to handle JSON and geospatial data are allowing it to supplant specialized database systems and become closer to a one-DB-fits-all solution. One feature that I’ve recently been able to exploit is its fulltext search engine. It allowed me to easily move from a terrible search implementation (using regular expressions) to one that actually meets users’ expectations.

In this article, I will walk through a basic fulltext search configuration, as well as highlight a few potential improvements that can be made if you’re so inclined.

Many of the features discussed in this post are only available as of PostgreSQL 9.6. Earlier versions have some rudimentary fulltext functionality, but a lot of the more powerful tools we’ll be using are fairly new.

Continue reading An idiot’s guide to fulltext search in PostgreSQL.

Fun with integer division optimizations.

I recently stumbled across a post about some crazy optimization that clang does to divisions by a constant. If you aren’t interested in reading it yourself, the summary is as follows:

  • Arbitrary integer division is slow.
  • Division by powers of 2 is fast.
  • Given a divisor n, the compiler finds some a, b such that a/2^b approximates 1/n.
  • This approximation gives exact results for any 32-bit integer.

I was interested in seeing just how much faster this seemingly-magic optimization was than the regular div instruction, so I set up a simple test framework:

Continue reading Fun with integer division optimizations.

The problem with Python’s datetime class.

This might sound like a strong opinion, but I’m just going to put it out there: Python should make tzinfo mandatory on all datetime objects.

To be fair, that’s just an overzealous suggestion prompted by my frustration after spending two full days debugging timestamp misbehaviors. There are plenty of practical reasons to keep timezone-agnostic datetimes around. Some projects will never need timestamp localization, and requiring them to use tzinfo everywhere will only needlessly complicate things. However, if you think you might ever need to deal with timezones in your application, then you must plan to deal with them from the start. My real proposition is that a team should assess its needs and set internal standards regarding the use of timestamps before beginning a project. That’s more reasonable, I think.

Continue reading The problem with Python’s datetime class.

Using bcache to back a SSD with a HDD on Ubuntu.

Recently, another student asked me to set up a PostgreSQL instance that they could use for some data mining. I initially put the instance on a HDD, but the dataset was quite large and the import was incredibly slow. I installed the only SSD I had available (120 GB), and it sped up the import for the first few tables. However, this turned out to not be enough space.

I did not want to move the database permanently back to the HDD, as this would mean slow I/O. I also was not about to go buy another SSD. I had heard of bcache, a Linux kernel module that lets a SSD act as a cache for a larger HDD. This seemed like the most appropriate solution — most of the data would fit in the SSD, but the backing HDD would be necessary for the rest of it. This article explains how to set up a bcache instance in this scenario. This tutorial is written for Ubuntu Desktop 16.04.1 (Xenial), but it likely applies to more recent versions as well as Ubuntu Server.

Continue reading Using bcache to back a SSD with a HDD on Ubuntu.

Parallelizing single-threaded batch jobs using Python’s multiprocessing library.

Suppose you have to run some program with 100 different sets of parameters. You might automate this job using a bash script like this:

ARGS=("-foo 123" "-bar 456" "-baz 789")
for a in "${ARGS[@]}"; do
  my-program $a
done

The problem with this type of construction in bash is that only one process will run at a time. If your program isn’t already parallel, you can speed up execution by running multiple jobs at a time. This isn’t easy in bash, but fortunately Python’s multiprocessing library makes it quite simple.

Continue reading Parallelizing single-threaded batch jobs using Python’s multiprocessing library.

The fruits of some recent Arduino mischief.

I recently consulted on a project involving embedded devices. Like most early-stage embedded endeavors, it currently consists of an Arduino and a bunch of off-the-shelf peripherals. During the project, I developed two small libraries (unrelated to the main focus of the project) which I’m open-sourcing today.

Continue reading The fruits of some recent Arduino mischief.

A simple recommender system in Python.

Inspired by this post I found about clustering analysis over a dataset of Scotch tasting notes, I decided to try my hand at writing a recommender that works with the same dataset. The dataset conveniently rates each whisky on a scale from 0 to 4 in each of 12 flavor categories.

Continue reading A simple recommender system in Python.

Optimizing MySQL and Apache for a low-memory VPS.

Diagnosing the problem.

My last post had a plug about the migration of our WordPress instance to a new server. However, it didn’t go completely smoothly. The site had gone down a few times in the first day after the migration, with WordPress throwing “Error establishing a database connection.” Sure enough, MySQL had gone down. A simple restart of MySQL would bring the site back up, but what caused the crash in the first place?

Continue reading Optimizing MySQL and Apache for a low-memory VPS.

This blog is illegal!

At Zeall, we offer our employees the courtesy of free hosting for their personal blogs, in hopes of furthering their professional image. Today, we completed the migration of the employee WordPress instance from a shared hosting provider to its own VPS, and simultaneously deployed TLS certificates (thanks, Let’s Encrypt!) for all domains hosted there (including this one).

Continue reading This blog is illegal!