Nerds do it better
When we nerds look for the best solution to a programming problem, we sometimes get distracted. We are asked to find a way to get from A to B, and along the way we discover that there are often (always) multiple solutions to a problem.
A lesser person would say: “Hey, if it brings me from A to B then it’s alright!”. Right? Well, not us nerds. At least not always. For instance, right now I am working on a pet project of mine: a GPS unit in my car sends positional and other data to a receiver service I am currently programming. This service should receive the raw bytes from my GPS unit, convert them into human-readable information and then forward this information to subscribers down the way. This post is about the conversion from raw bytes to readable information.
Roads from A to B
I was looking through a sample project created by the manufacturer of these GPS thingies and watched them doing some conversions. First, they did not want to receive a byte array. Instead they wanted to use a stream as input, which is not a good idea when you are receiving data from the network: when using a stream from the network you never know if the stream has completed. That is why you want to work with byte arrays.
So what I did first was convert the byte array I received in my application to a MemoryStream and then use the manufacturer’s software to parse it. This works, but the details are beyond the scope of this post, so I will not go into them further.
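Wrapping a received byte array in a stream looks roughly like the sketch below. This is not the manufacturer's code: `StreamWrappingDemo` and `ReadFirstInt32` are hypothetical names, the sample bytes are made up, and a plain `BinaryReader` stands in for their parser.

```csharp
using System;
using System.IO;

public static class StreamWrappingDemo
{
    // Wraps an already-received packet in a MemoryStream so that
    // stream-based parsing code can consume it. The MemoryStream
    // constructor wraps the array; it does not copy it.
    public static int ReadFirstInt32(byte[] packet)
    {
        using (var stream = new MemoryStream(packet, writable: false))
        using (var reader = new BinaryReader(stream))
        {
            // A plain BinaryReader always reads integers as little endian.
            return reader.ReadInt32();
        }
    }

    public static void Main()
    {
        byte[] packet = { 0x00, 0x00, 0x01, 0x02 }; // made-up sample bytes
        Console.WriteLine(ReadFirstInt32(packet)); // 33619968 (0x02010000)
    }
}
```

Note that the plain reader returns 33619968 rather than 258 for these bytes, which is exactly the byte-order problem the rest of this post deals with.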
Options
I googled a bit and I found out I had multiple options:
- Use the manufacturer’s code
- Write my own byte array reader
- Use the new API I read about: the BinaryPrimitives class (System.Buffers.Binary), described on Microsoft Docs
These options matter because one might perform significantly better than the others. A faster option uses fewer CPU cycles for the same work, which in turn postpones your buying of extra servers. More about that later.
Endianness
Before I continue: the main issue with the converters I am describing here is “endianness”. Endianness determines whether the most significant byte of a multi-byte number comes first or last in your byte array. You can google more about it if you want to; the core of the matter is that bytes coming in from network sources are often “big endian” (most significant byte first), while most processors nowadays are “little endian” (least significant byte first), which just means they read the same bytes in the opposite order. Btw, if you want to know where these terms come from, read Gulliver’s Travels.
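To make the difference concrete, here is a minimal sketch that interprets the same four bytes both ways. The class and method names (`EndiannessDemo`, `FromBigEndian`, `FromLittleEndian`) and the sample bytes are made up for illustration.

```csharp
using System;

public static class EndiannessDemo
{
    // Most significant byte first ("big endian").
    public static int FromBigEndian(byte[] b) =>
        (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];

    // Least significant byte first ("little endian").
    public static int FromLittleEndian(byte[] b) =>
        b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);

    public static void Main()
    {
        byte[] wire = { 0x00, 0x00, 0x01, 0x02 }; // made-up sample bytes

        Console.WriteLine(FromBigEndian(wire));    // 258
        Console.WriteLine(FromLittleEndian(wire)); // 33619968

        // Reports the byte order of the machine running this code.
        Console.WriteLine(BitConverter.IsLittleEndian);
    }
}
```

The same bytes give 258 or 33619968 depending on the byte order you assume, so a reader has to know which order the sender used.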
Coding
So I created a little benchmark project to find out which of my options would have the best performance. Let me show you an excerpt from each option:
The manufacturer’s choice
They are using a custom implementation built on the BinaryReader class:
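// In the reader class derived from BinaryReader:
public override int ReadInt32()
{
    // BinaryReader reads little endian, so swap to interpret the bytes as big endian.
    return BytesSwapper.Swap(base.ReadInt32());
}

// In BytesSwapper:
public static int Swap(int value)
{
    // Reverse the order of the four bytes.
    return (value >> 24 & 255) | (value >> 16 & 255) << 8 | (value >> 8 & 255) << 16 | (value & 255) << 24;
}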
Looks kinda complex but does the job.
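To see what the swap does, here is a small sketch that runs the same expression on a sample value and compares it with `BinaryPrimitives.ReverseEndianness`, a framework method that reverses the bytes of an integer. `SwapDemo` is a hypothetical name; the `Swap` body is the manufacturer's expression with extra parentheses to make the operator precedence explicit.

```csharp
using System;
using System.Buffers.Binary;

public static class SwapDemo
{
    // Each term isolates one byte and moves it to the mirrored position.
    public static int Swap(int value)
    {
        return ((value >> 24) & 255)            // byte 3 -> byte 0
             | (((value >> 16) & 255) << 8)     // byte 2 -> byte 1
             | (((value >> 8) & 255) << 16)     // byte 1 -> byte 2
             | ((value & 255) << 24);           // byte 0 -> byte 3
    }

    public static void Main()
    {
        int original = 0x12345678;

        Console.WriteLine(Swap(original).ToString("X8"));                               // 78563412
        Console.WriteLine(BinaryPrimitives.ReverseEndianness(original).ToString("X8")); // 78563412
    }
}
```

The `& 255` masks matter: `>>` on a signed int fills the upper bits with copies of the sign bit, and the mask throws those bits away.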
My own Byte array reader implementation of the same thing
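public int ReadInt32()
{
    // BitConverter reads in the machine's byte order (little endian on most hardware),
    // so the big-endian input still has to be swapped afterwards.
    int value = BitConverter.ToInt32(_input, (int)_position);
    _position += 4;
    return BytesSwapper.Swap(value);
}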
The new BinaryPrimitives method available since .NET Core 2.1
public int ReadInt32()
{
    // A span over the 4 bytes at the current position; no bytes are copied.
    ReadOnlySpan<byte> span = new ReadOnlySpan<byte>(_input, (int)_position, 4);
    int value = BinaryPrimitives.ReadInt32BigEndian(span);
    _position += 4;
    return value;
}
As you can see, the latter uses a ReadOnlySpan, which I understand to be a kind of wrapper, or metadata, delimiting the part of the array we are interested in. It does not move or copy any data, which would be a costly operation. Also, we no longer need to swap bytes around ourselves, since the method reads our big-endian input and returns the desired value directly.
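The "no copy" behaviour of spans can be checked with a small sketch. The names `SpanDemo`, `SpanSharesMemory` and `ReadInt32At`, and the sample packet, are made up for illustration and are not part of my reader.

```csharp
using System;
using System.Buffers.Binary;

public static class SpanDemo
{
    // Returns true when a write through the span shows up in the array,
    // meaning the span is a window over the same memory, not a copy.
    public static bool SpanSharesMemory()
    {
        byte[] buffer = new byte[8];
        Span<byte> window = new Span<byte>(buffer, 4, 4); // bytes 4..7
        window[0] = 0xFF;
        return buffer[4] == 0xFF;
    }

    // Reads a big-endian Int32 at the given offset without copying bytes.
    public static int ReadInt32At(byte[] input, int offset)
    {
        ReadOnlySpan<byte> span = new ReadOnlySpan<byte>(input, offset, 4);
        return BinaryPrimitives.ReadInt32BigEndian(span);
    }

    public static void Main()
    {
        byte[] packet = { 0xAA, 0xBB, 0x00, 0x00, 0x01, 0x02 }; // made-up sample bytes

        Console.WriteLine(SpanSharesMemory());     // True
        Console.WriteLine(ReadInt32At(packet, 2)); // 258
    }
}
```

Because the span only records where the window starts and how long it is, creating one per read stays cheap even for large input arrays.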
The Benchmark
Without further ado, let’s see what happens when we compare these options. For this I am using a hex string in my project as input, converted to a byte array. This makes sure we are comparing apples to apples while we are doing this test. So. Running….
Origin | Method | Mean | Error | StdDev
---|---|---|---|---
Manufacturer | ReadIntFromByteStreamAndReverse | 8.338 ns | 5.0010 ns | 0.2741 ns
My own byte reader | ReadIntFromByteArrayAndReverse | 3.420 ns | 2.0190 ns | 0.1107 ns
.NET Core Primitives Reader | ReadIntFromByteArrayBinaryPrimitivesMethod | 1.032 ns | 0.3370 ns | 0.0185 ns
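The hex-string input mentioned above can be turned into a byte array with a small helper along these lines. This is a sketch and not the exact code from my benchmark project: `HexInput`, `FromHex` and the sample string are hypothetical. On .NET 5 and later, `Convert.FromHexString` does the same job out of the box.

```csharp
using System;

public static class HexInput
{
    // Converts a hex string such as "00000102" into the bytes it describes.
    public static byte[] FromHex(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even length.", nameof(hex));

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Two hex characters describe one byte.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    public static void Main()
    {
        byte[] input = FromHex("00000102"); // made-up sample input
        Console.WriteLine(BitConverter.ToString(input)); // 00-00-01-02
    }
}
```

Doing this conversion once, outside the measured methods, keeps the parsing cost of the hex string out of the timings.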
Wrapping Up
It is amazing to see that a simple conversion from bytes to integers offers so many different options, and that one option outperforms them all: the manufacturer’s option takes roughly 8 times as long per read as BinaryPrimitives (8.338 ns versus 1.032 ns). This tells me my little hour of investigating will potentially save my company thousands of Euros/Dollars/Yen/Rubles, because if I had gone for the manufacturer’s option, and assuming parsing dominates the workload, I would have had to buy up to 8 times as many servers / scale to an 8 times more expensive App Service Plan in Azure to do the same amount of work.
If you want to know more about this, or about me, please put it in the comments below. I will answer them all, except the spam ones of course 😉