Potential Serialization Flaws¶
The structure has a couple immediately obvious shortcomings.
First, the maximum message size is 4,294,967,299 bytes (including compression and headers). It could well be that in the future there will be more data to send in a single message. But equally so, a present-day attacker could use this to halt sections of a network using this structure. A short-term solution would be to have a soft-defined limit, but as has been shown in other protocols, this can calcify over time and do damage. In the end, this is more of a governance problem than a technical one. The discussion on this can be found in issue #84.
Second, there is quite a lot of extra data being sent. Using the default parameters, if you want to send a 4 character message it will be expanded to 159 characters. That’s ~42x larger. If you want these differences to be negligble, you need to send messages on the order of 512 characters. Then there is only an increase of ~34% (0% with decent compression). This can be improved by reducing the size of the various IDs being sent, or making the packet headers shorter. Both of these have disadvantages, however.
Making a shorter ID space means that you will be more likely to get a conflict. This isn’t as much of a problem for node IDs as it is for message IDs, but it is certainly a problem you’d like to avoid.
Making shorter packet headers presents few immediate problems, but it makes it more difficult for debugging, and may make it more difficult to establish standard headers in the future.
Results using opportunistic compression look roughly as follows (last updated in 0.4.231):
For 4 characters…
original 4
plaintext 167 (4175%)
lzma 220 (5500%)
bz2 189 (4725%)
gzip 156 (3900%)
For 512 characters…
original 512
plaintext 677 (132.2%)
lzma 568 (110.9%)
bz2 555 (108.4%)
gzip 487 (95.1%)
Because the reference implementations support all of these (excepting environment variations), this means that the overhead will drop away after ~500 characters. Communications with other implementations may be slower than this, however.