Profiling node performance

This test focuses on the effect of group size and quorum size on upload time and network performance.

Summary

As usual, the results first:

Upload time increases dramatically as group size increases.

Upload time does not change significantly as quorum size increases.

Methodology

This test uses the same software versions as the last test, primarily centred around testing vault 0.11.0, only modified for additional logging.

The file uploaded for these tests is Hippocampus (downloadable from vimeo), and is 96.2 MiB.

  • Compile vault with custom group and quorum size
    • Deploy and start 28 vaults on 7 pine64s
    • Wait for network to be formed
    • Upload 96.2 MiB file using demo_app
    • Record the time to upload and the number of hop messages received by vaults
    • Stop vaults and remove old logs
    • Repeat 5 times with the same group/quorum size
  • Repeat with a new group/quorum size

The tests spanned group sizes from 4 to 14 and quorum sizes from 2 to 12.

The results below show the median from five repetitions.

Upload Times

How many minutes did it take to fully upload the 96.2 MiB file? (I’ll put a dynamic table here if the forum can allow <table> tags)

Hop Messages

How many million hop messages were received by all vaults? (I’ll put a dynamic table here if the forum can allow <table> tags)

Variation Due To Group Size

The effect of different group sizes can be observed by keeping the quorum size the same. There is a strong increase in upload time and the number of hop messages as the group size increases.

Variation Due To Quorum Size

There is relatively little increase in upload time as the quorum size increases, but hop count does seem to increase slightly (but not as much effect as changing group size).

Raw Data

The raw log data for all tests is available on the alpha network at http://www.mav.safenet/2016_10_10_group_size_test.7z. Download is 80 MiB, and when decompressed is about 4.8 GiB. For a more permanent link, it’s also available on mega.

Discussion

These results seem to match intuition, but the increase in upload time was more than I expected.

It’s reassuring to see quorum size only affects security / durability of data and is not a factor in network scaling (as expected).

I’m not sure how disjoint groups will perform, since there will be significant variation in group size. Not saying it’ll be bad, just saying I don’t know.

As @AndreasF pointed out in a previous post about expected number of hops, there is room for improvement on hops but security must come first. Will disjoint groups improve the hop messages situation?

Does the small size of my network (28 vaults) affect the results? I wouldn’t think so, since any node only sees part of the network anyhow, but I’m open to speculation about this.

Asides

  1. To compare with the alpha network (group 8 quorum 5), it took 8.5m to upload the same 96.2 MiB file to the alpha network. This is an average rate of 1.55 Mbps. speedtest.net shows my connection can reach 35.16 Mbps upload, so it wasn’t saturated. I reckon that’s pretty good performance, at least compared to the pine64 network which was about 20.5m for similar group and quorum sizes. I’d say this is due to two main differences: alpha has more nodes, and the nodes are (presumably) on more powerful machines than the pine64s I’m using.

    This result was a surprise to me. I expected alpha to be slower than my dedicated network. Shows that cpu performance matters when latency is low.

  2. Received hop message count includes those required for the network to start up and registering an account etc (the point being this was consistent across all tests).

  3. There was a huge variation within the results of each configuration, which is still very surprising to me. Hover over any of the cells in the tables above to see what I mean (hovering shows the data for all five tests).

  4. The last 1% (going from 99% to 100%) always takes much longer than any other. If a normal percent passes in 25s the last percent will pass in about 400s. What is happening in that last 1% that takes so long? It’s very frustrating!

  5. The first 10% (going from 0% to 10%) is almost instant (300ms). I don’t know whether this is because the progress bar is faulty in the demo app or if the vaults are really saving 10% of the file very quickly.

  6. The amount of time for the network to start (ie populate the routing table) increased as group size increased. I didn’t measure it, but subjectively it was quite noticable.

  7. There appears to be a loose correlation between upload time and hop message count. This wasn’t the point of the test, but anyway here’s the chart for anyone curious:

  8. The Disjoint Groups RFC has an interesting point in the drawbacks section related to these tests:

    The group message routing will involve more steps, either doubling the number of actual network hops, or making the number of direct messages per hop increase quadratically in the average group size!

    Although this seems to be addressed in the Test 11 update:

    we will implement the new group message routing mechanism. It will be slightly different from the one specified in the RFC, however, delivering the same level of security but without the huge increase in the number of hops or hop messages.

    It’s still not clear how this will perform relative to the existing hop mechanism but it is being considered.

  9. There were some tests that did not complete and were discarded. This was either due to very slow bootstrapping of the network (mostly with large group sizes), or the demo app showing a Request Timeout error.

  10. The safe network is amazing. Seeing it come together is a real privilege.

Main Point

My main takeaway from these tests is the underlying performance characteristics of the safe network arise as a result of being primarily a messaging platform, with data storage being ‘merely’ the end result of that. Messaging is the key. This conclusion is not surprising in hindsight, but the tests have shifted my balance of thought strongly toward messaging and away from data.

25 Likes