How unreliable is UDP?

Oct 16, 2014

I realized something recently: I know virtually nothing about UDP. Oh, I know it's connectionless, has no handshaking and thus doesn't provide any guarantees about delivery or ordering. But, in practice, what does that actually mean?

I set up 5 VPSes to send each other a few UDP packets over a 7-hour period. I didn't send much traffic (though a heavier load is certainly worth trying). Every 9-11 seconds, each server randomly picked a target and sent it 5-10 packets ranging from 16 to 1016 bytes.

Two servers were in the same data center in New Jersey (NJ 1 and NJ 2), with one each in LA, Amsterdam (NLD) and Tokyo (JPN).
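For illustration, the sending side of such a test could look roughly like the sketch below. It is not the code used for this experiment: the header layout (a sender id plus a per-pair counter, 8 bytes each), the function names and the `peers` list are all assumptions made up for the example.

```python
import os
import random
import socket
import struct
import time

# Hypothetical 16-byte header: 8-byte sender id + 8-byte per-pair counter.
HEADER = struct.Struct("!QQ")


def build_packet(sender_id, counter):
    """Return a 16-byte header followed by 0-1000 random payload bytes."""
    payload = os.urandom(random.randint(0, 1000))
    return HEADER.pack(sender_id, counter) + payload


def send_burst(sock, target, sender_id, counter):
    """Send 5-10 packets to target and return the updated counter."""
    for _ in range(random.randint(5, 10)):
        sock.sendto(build_packet(sender_id, counter), target)
        counter += 1
    return counter


def run(peers, sender_id):
    """Every 9-11 seconds, pick a random peer and send it a burst.

    peers is a list of (host, port) tuples; the values are up to the caller.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    counters = {peer: 0 for peer in peers}  # one counter per target
    while True:
        target = random.choice(peers)
        counters[target] = send_burst(sock, target, sender_id, counters[target])
        time.sleep(random.uniform(9, 11))
```

The per-target counter is what lets the receiver detect both missing packets (gaps in the sequence) and out-of-order packets (a counter lower than one already seen).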

[Un]Reliability

The first thing I wanted to know was how unreliable UDP was. Are we talking about a delivery rate of 25%? 50%? 75%?

Packets Received (received/sent)

Sender \ Receiver  NJ 1       NJ 2       LA         NLD        JPN
NJ 1               -          2981/2981  2888/2889  2964/2964  3053/3054
NJ 2               3016/3016  -          3100/3101  2734/2735  3054/3054
LA                 2901/2941  2932/2975  -          2938/2942  2712/2712
NLD                3038/3038  2771/2772  2724/2724  -          2791/2791
JPN                2551/2552  2886/2886  2836/2838  2887/2887  -

These numbers were better than I had expected. I was specifically thinking NLD <-> JPN would see above-normal loss, but there was none. Data sent out of LA, specifically to the two servers in NJ, seems to have struggled some: about 1.4% of packets were lost on each of those two paths, versus under 0.15% on every other path. Was there a pattern?

First, I thought maybe the size of the packet would be an issue. Admittedly, I kept them small (16-byte header, 0-1000 byte payload):

Packet Loss Per Size (bytes)

Size (bytes)  0-115  116-215  216-315  316-515  516-715  716-915
Packets lost  13     11       12       13       23       23

Nothing obvious there. Did the packet loss happen around the same time? Unfortunately, I didn't keep timestamps (why?!), but I did keep a counter per pair. Of the 43 packets that failed to make it from LA to NJ 2, 29 were lost during two periods of roughly one minute each. The loss from LA to NJ 1 also largely happened during two short periods.

Ordering

The other thing I was interested in was ordering.

The first way I looked at this was to count the inversions in the array of received counters. Essentially, that's the number of pairs that are out of order. If you have an array with the values 10, 8, 3, 7, 4, there are 8 inversions, so you end up having to do 8 swaps to sort it ((10, 8), (10, 3), (10, 7), (10, 4), (8, 3), (8, 7), (8, 4), (7, 4)).
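As a minimal sketch of that measurement (the function name is made up for this example, and the original analysis code may have worked differently), counting inversions is just counting every pair whose earlier element is larger than its later one:

```python
from itertools import combinations


def count_inversions(counters):
    """Count pairs (i, j) with i < j where counters[i] > counters[j].

    combinations() yields pairs in positional order, so the first element
    of each pair always arrived before the second.
    """
    return sum(1 for earlier, later in combinations(counters, 2) if earlier > later)


print(count_inversions([10, 8, 3, 7, 4]))  # prints 8
```

This checks every pair, which is quadratic in the number of packets. That is fine for a few thousand packets per pair; a merge-sort based count would be the usual choice for much larger inputs.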

Inversions

Sender \ Receiver  NJ 1  NJ 2  LA    NLD   JPN
NJ 1               -     0     2994  2581  4658
NJ 2               0     -     3147  2459  4645
LA                 3980  3861  -     3237  4010
NLD                3125  1826  3133  -     4189
JPN                3920  4417  4147  4425  -

Don't know about you, but I'm not sure I find that useful. It sure seems high. Of course, one of the reasons to use UDP is when you're able to discard some packets. If you send 10 000 packets, and they're all ordered, except that the last one is somehow first, you can just discard it rather than doing 9999 swaps.

What if we discard any packet that comes after a later packet we've already processed (later meaning its counter is greater)? For example, if we get 1, 5, 4, 3, 6, 7, we'd discard 4 and 3 since we've already seen 5. How many "good" packets would that leave?

# of ordered packets

Sender \ Receiver  NJ 1  NJ 2  LA    NLD   JPN
NJ 1               -     2981  1514  1658  1123
NJ 2               3016  -     1627  1483  1161
LA                 1227  1259  -     1485  1067
NLD                1407  1645  1220  -     1096
JPN                980   1083  1141  1087  -
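The discard rule behind this table can be sketched in a few lines. This is only an illustration with a made-up function name, not the original analysis code: keep a packet only if its counter is greater than every counter seen so far.

```python
def drop_late(counters):
    """Return the counters that survive the discard rule, in arrival order.

    A packet is kept only if its counter is greater than the highest
    counter already kept; anything else arrived too late and is dropped.
    """
    kept = []
    highest = None
    for counter in counters:
        if highest is None or counter > highest:
            kept.append(counter)
            highest = counter
    return kept


print(drop_late([1, 5, 4, 3, 6, 7]))  # prints [1, 5, 6, 7]
```

Note that this version also drops duplicates, since a repeated counter is not greater than the highest one seen.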

As a slight tweak, what if we group 5 packets together, sort them, then re-apply the above discarding code:

# of ordered packets (with grouping)

Sender \ Receiver  NJ 1  NJ 2  LA    NLD   JPN
NJ 1               -     2981  2061  2235  1807
NJ 2               3016  -     2214  2041  1889
LA                 1868  1873  -     2066  1720
NLD                2200  2273  1920  -     1712
JPN                1541  1804  1735  1732  -
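One way to read the grouping step, assuming fixed, non-overlapping groups of 5 taken in arrival order (the exact grouping isn't spelled out above, so treat this as a guess), is sketched below. The function names are made up for the example.

```python
def drop_late(counters):
    """Keep only counters greater than every counter seen before them."""
    kept = []
    highest = None
    for counter in counters:
        if highest is None or counter > highest:
            kept.append(counter)
            highest = counter
    return kept


def group_sort_then_drop(counters, group_size=5):
    """Sort each consecutive group of group_size packets, then drop late ones."""
    regrouped = []
    for start in range(0, len(counters), group_size):
        regrouped.extend(sorted(counters[start:start + group_size]))
    return drop_late(regrouped)


print(group_sort_then_drop([1, 5, 4, 3, 6, 7]))     # prints [1, 3, 4, 5, 6, 7]
print(group_sort_then_drop([6, 1, 2, 3, 4, 5, 7]))  # prints [1, 2, 3, 4, 6, 7]
```

In the second call, packet 5 lands in the next group after 6 has already been processed, so it is still discarded. Grouping only recovers packets that are displaced within a single group.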

Conclusion

It's hard to draw any conclusions without running this for longer and with more data. Still, it seems that UDP reliability is pretty good. Distance usually involves more hops, and each hop increases the risk of something going wrong, but if things are normally ok, then distance doesn't seem to be an issue.

What is an issue is ordering. Here, distance does appear to play a bigger role. By grouping the packets we see a substantial and expected improvement. In a lot of cases, ordering might not matter. Unless you're streaming, it's possible that simply keeping a timestamp and re-ordering on the receiving side would work.
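As a rough sketch of what receiver-side re-ordering could look like (this was not part of the experiment, and the `reorder` function and its `window` parameter are invented for the example), a small buffer can hold a few packets back and release them in timestamp order:

```python
import heapq


def reorder(packets, window=5):
    """Yield (timestamp, payload) pairs in timestamp order.

    Up to `window` packets are buffered in a min-heap. Once the buffer is
    over capacity, the oldest buffered packet is released. A packet that
    is not newer than the last one released is too late and is dropped.
    """
    heap = []
    last_released = None
    for timestamp, payload in packets:
        if last_released is not None and timestamp <= last_released:
            continue  # arrived too late to be slotted in
        heapq.heappush(heap, (timestamp, payload))
        if len(heap) > window:
            released = heapq.heappop(heap)
            last_released = released[0]
            yield released
    while heap:  # flush whatever is still buffered
        yield heapq.heappop(heap)


arrivals = [(1, "a"), (5, "e"), (4, "d"), (3, "c"), (6, "f"), (7, "g")]
print([ts for ts, _ in reorder(arrivals, window=3)])  # prints [1, 3, 4, 5, 6, 7]
```

The trade-off is latency: a larger window recovers more out-of-order packets but delays every packet by that many arrivals, which is why this fits batch-style workloads better than streaming.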

I'd like to test more things. More data for a longer period of time and more locations. I'd also like to compare the performance to TCP. But, overall, I feel that the better-than-I-expected reliability makes UDP something I should keep in my toolbox.