Comparing Sequence Numbers Responsibly
Contents
It’s very common when programming to want to give something a monotically increasing sequence number. Especially when writing netcode. It’s simple though, right? You just pick a integer that’s big enough, and you’re done!
|  |  | 
In my current project, a server-authoratative 2D RPG, input is captured and sent to the server 30 times per second. Let’s calculate how long it would take for a uint32_t sequence number to overflow:
4294967296 (UINT32) / 30 tps / 60 s/min / 60 min/hr / 24 hr/day ~= 1,657 days (~ 4.5 years)
Surely nobody would ever run your program for 4 and a half years, right?? Well.. that’s a discussion for another day. Let’s say we’re fine with the program crashing after 4 years for the purpose of a game (if you’re not fine with that, keep reading!).
Input Bandwidth
Great, so the program crashes if the player plays it for 4 years straight. Who cares. What about bandwidth? Let’s assume we’re already packing the bits to reduce bandwidth:
|  |  | 
That’s 37 bits per input command. This means that 86% of our input channel’s bandwidth is being eaten by sequence numbers!
Maybe this isn’t a problem, maybe it is. It’s impossible to say in general, but in my case, I really wanted to solve the wrapping problem, and after thinking about it for a bit, I realized that solving it for 32-bit integers would also solve it for integers of any other size. Considering that the input history buffer keeps 64 inputs at most, handling wraparound gracefully would mean that we could use 8-bit sequence numbers. Interesting…
Who Else Has This Problem?
It turns out, as with most things, this problem has already been solved. TCP uses sequence numbers for reliability, and of course they are finite in size and can overflow. It would be pretty terrible if a long-standing TCP connection just exploded after wrapping, so what does it do?
Most of the foundational technology that underlies the modern internet is formally documented in a set of freely available technical specification documents called RFCs (“Request For Comments”, named as such because they were used to collaborate on the design decisions in the early days of the internet). RFCs are maintained and hosted by the IETF (Internet Engineering Task Force). You can find them here: https://www.rfc-editor.org/
Protect Against Wrap Sequence (PAWS)
While the original RFC which specified TCP explains how to handle wrapping in a very verbose and abstract manner, there is a follow-up RFC, titled TCP Extensions for High Performance which provides a much more terse and tenable solution. Specifically, Section 4 of RFC 1323, aptly named PAWS – Protect Against Wrapped Sequence Numbers, tells us exactly how to do it:
In both the PAWS and the RTTM mechanism, the “timestamps” are 32-bit unsigned integers in a modular 32-bit space. Thus, “less than” is defined the same way it is for TCP sequence numbers, and the same implementation techniques apply. If s and t are timestamp values, s < t if 0 < (t - s) < 2**31, computed in unsigned 32-bit arithmetic.
Wow. That’s easy. All we have to do is calculate the difference in the context of unsigned arithmetic, then check that the difference is less than half the range of the unsigned type we’re using for our sequence number!
In the Real World
Let’s take a look at the server code which checks if a new input command has been received from the client:
|  |  | 
Wraparound bug, yuck. Now, let’s write a handy template that implements the PAWS fix for any unsigned type:
|  |  | 
Given this handy function, the fix for our code above is simply to replace:
|  |  | 
with:
|  |  | 
Now, when the sequence number overflows, the client and server continue to chug happily along.
Free Bandwidth
As a final win, let’s reduce our bandwidth by ~65% for free!
|  |  | 
256 (UINT8) / 30 tps ~= 8.5 seconds
Our input sequence number now wraps around every 8 seconds, but it doesn’t matter, because by then the whole input buffer will have been recycled at least 4 times (remember, it’s a ring buffer of the 64 most recently received inputs). The half-range of a uint8_t is 128 inputs, which is more than enough leeway to avoid any issues. Nice!
Conclusion
It’s very common for programmers to see a problem like this and just replace the uint32_t with a uint64_t. While that would have made it take countless human lifetimes for the sequence number to wraparound, it also would have nearly doubled our bandwidth usage.
You see a similar thing happen quite often when programmers run into floating point issues. Rather than using a fixed point number, they will just replace their floats with doubles and pretend that the problem has gone away, at the cost of using twice as much memory/storage/bandwidth.
Often it’s worth it to pause, step back, and think about the problem at hand and whether or not there’s a better solution. Especially when that problem is so ubiquitious that it surely has been solved before.