Not exactly - the prefix-map file recently evolved to contain not just a map of prefixes to SAPs, but also the complete sections keys DAG/tree, so this file is likely to be renamed soon. For a node joining a network, or a client bootstrapping to connect to a network, we use that same file since it contains the SAPs they can connect to, so from their perspective it's just a network contacts list/file. I'm not sure whether in the future clients and nodes will support other types of files/formats than the prefix-map file we currently produce; either way, they need a network contacts list.
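For context on where that lands on a client machine: it's the file the CLI treats as the default network contacts, which is what safe networks sections reads and prints. A quick illustration (the path below is just the default location shown later in this thread):
# the network contacts / prefix-map file the client reads for the default network
ls ~/.safe/network_contacts/default
# print what it contains: prefixes, section keys chain, elder SAPs
safe networks sections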
Chop chop MaidSafe, it’s nearly Pimm’s o’clock!
Anyone expecting anything exciting?
It’s been months
Sod yer poncy Pimms - used to think that was some obscure form of DRAM packaging btw - it’s weissbier and grilled bratwurst time out my back garden.
My laptop HDD is rubber-ducked so I will not be competing in the first-post race this time. I’ll read later when it finally drops and respond then.
I have to go out the back anyway, the agogometer is throbbing noisily and distracting me.
11 August Update has dropped!
The sectionTree!
No chains to see here, move along now.
It’s quiet on GitHub and the forum, too quiet. Maybe I have more influence than I thought…
Missed your calling. I know I’d fear if you had a Bobby stick!
There were fairly big changes last night when I did a git pull. Built that and fired up baby-fleming. Put a couple of 200-300Mb dirs without issue, then fed it my usual 3.2Gb of photos.
That put process crapped out after a couple of mins but the nodes kept running. Successfully stored another couple of small dirs, then fed it a dir of ~1000 small thumbnails, mostly <20kb. Nearly 12 hrs later, it’s still running. Tailing the logs shows a lot of msgs like:
[2022-08-18T12:27:32.624010Z DEBUG sn_node::node::flow_ctrl] checking for q data, qlen: 0
[2022-08-18T12:27:32.624014Z DEBUG sn_node::node::flow_ctrl] data found isssss: None
and one node has had nothing in the logs since this at 02:07 UTC:
[2022-08-18T02:07:58.898654Z DEBUG sn_dysfunction] Adding a new issue to 2198ea(00100001).. the dysfunction tracker: PendingRequestOperation(OpId-e24ab5..)
[2022-08-18T02:07:58.953496Z DEBUG sn_node::node::flow_ctrl] checking for q data, qlen: 0
[2022-08-18T02:07:58.953505Z DEBUG sn_node::node::flow_ctrl] data found isssss: None
[2022-08-18T02:07:58.986814Z DEBUG sn_node::comm] Cleanup peers , known section members: {NodeState { peer: Peer { name: 0fad97(00001111).., addr: 127.0.0.1:39126 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 2198ea(00100001).., addr: 127.0.0.1:41898 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 362cae(00110110).., addr: 127.0.0.1:45000 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 555e64(01010101).., addr: 127.0.0.1:33964 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 55bf5a(01010101).., addr: 127.0.0.1:37662 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: 75ac26(01110101).., addr: 127.0.0.1:36726 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: b42af9(10110100).., addr: 127.0.0.1:43950 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: b6afc9(10110110).., addr: 127.0.0.1:41838 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: d74072(11010111).., addr: 127.0.0.1:48182 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: f60630(11110110).., addr: 127.0.0.1:48768 }, state: Joined, previous_name: None }, NodeState { peer: Peer { name: f703af(11110111).., addr: 127.0.0.1:57683 }, state: Joined, previous_name: None }}
The agogometer is barely flickering this afternoon.
Now this is interesting…
Safe Network v0.9.0/v0.10.0/v0.70.0/v0.66.0/v0.68.0/v0.61.0
Repository: maidsafe/safe_network · Tag: 0.10.0-0.9.0-0.70.0-0.66.0-0.68.0-0.61.0 · Commit: 43fcc7c · Released by: github-actions[bot]
This release of Safe Network consists of:
Safe Node Dysfunction v0.9.0
Safe Network Interface v0.10.0
Safe Client v0.70.0
Safe Node v0.66.0
Safe API v0.68.0
Safe CLI v0.61.0
New Features
- remove ConnectivityCheck
Now we have periodic health checks and dysfunction, this check should not be needed, and can cause network strain with the frequent DKG we have now
- new cmd to display detailed information about a configured network
- include span in module path
- simplify log format to ’ [module] LEVEL ’
- make AntiEntropyProbe carry a current known section key for response
This is cool
willie@gagarin:~$ safe networks sections
Network sections information for default network:
Read from: /home/willie/.safe/network_contacts/default
Genesis Key: PublicKey(0aa9..928c)
Sections:
Prefix ''
----------------------------------
Section key: PublicKey(06dc..6260)
Section keys chain: PublicKey(0aa9..928c)->PublicKey(1211..cda6)->PublicKey(0854..1e80)->PublicKey(0b66..272a)->PublicKey(0bc3..4074)->PublicKey(06dc..6260)
Elders:
| XorName | Age | Address |
| d4a18f.. | 86 | 127.0.0.1:39521 |
| 370cfc.. | 88 | 127.0.0.1:57347 |
| b63782.. | 90 | 127.0.0.1:48025 |
| 75c634.. | 92 | 127.0.0.1:57335 |
| f56d64.. | 94 | 127.0.0.1:33643 |
| 0f704a.. | 96 | 127.0.0.1:55864 |
| 15c8f2.. | 255 | 127.0.0.1:52482 |
Running stably?
Stable? yeees…
I didn’t crash it, but when I tried to load 3Gb of pics the box slowed to a crawl and peaked at 99% RAM and 80% swap, so I killed the process and could continue to add small files. I didn’t try anything over 100Mb after that.
It’s still hogging RAM and refusing to release it after the files are put.
However, I’m thinking the tests I am doing are perhaps irrelevant in the real world. In production I doubt any sane person would try to run 15 nodes simultaneously and try to put 3GB of files in one shot. Run one or two nodes and try to put 3Gb - that’s another story…
Any clown can run out of RAM if they hammer the box hard enough, whether it’s 15 sn_node processes or watching 30 YouTube vids. I’m only exploring the limits of my box, not the code.
Perhaps a more useful test - until we have a comnet - would be to run the 15 nodes (cos baby-fleming is the only option) and put a series of smaller files whilst keeping a very close eye on the RAM consumed after each put. It’s the fact that each sn_node process continues to hog memory after the job is done that I think is concerning.
Rather than safe files put ~/Pictures/2016/ ← ~3.2Gb, I will make a set of test dirs, each with say 1Gb of total content, and put them sequentially.
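Something like this is what I have in mind - put a few ~1Gb test dirs back to back and snapshot memory between each put (the dir names are made up, sn_node is the process to watch):
# put each test dir in turn, then snapshot overall and per-node memory
for dir in ~/safe-test/1gb-a ~/safe-test/1gb-b ~/safe-test/1gb-c; do
    safe files put -r "$dir"
    free -m                                    # overall RAM/swap after the put
    ps -C sn_node -o pid,rss,vsz --sort=-rss   # resident memory per sn_node process
    sleep 60                                   # give the nodes a chance to release memory
done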
I got busy with other stuff and have yet to make a structured set of dirs, so I just fed it some more ~/Pictures subdirs.
Putting larg(ish) dirs one after another showed me this:
RAM usage is up approx 10% BUT some of it appears to be getting released now.
The command was
willie@gagarin:~$ safe files put -r ~/Pictures/2011 && \
safe files put -r ~/Pictures/2008 && \
safe files put -r ~/Pictures/2019 && \
safe files put -r ~/Pictures/2018
2011 ← 1.4Gb
2008 ← 534Mb
2019 ← 354Mb
2018 ← 265Mb
It’s almost like there is a certain threshold of put size, above which the RAM is not released. This sounds crazy to me but it’s all I can deduce from what I have seen over the past couple of weeks. With small uploads it tends to work as expected, but go above some limit (1Gb?) and the sn_node processes do not release the memory once the put is complete.
@joshuef @chriso @dirvine @qi_ma do you want logs for this?
PS @mods can we get a @devs address that will send to all the devs?
Repro case is probs more useful than the logs atm. I haven’t tried this locally yet, but I will be soon. All being well.
@southside perhaps you can have a go at running the churn example and tweaking these values: safe_network/churn.rs at main · maidsafe/safe_network · GitHub
Right now it does 400mb and 27 nodes in total. If you have time/want, perhaps dial in a quantity of data there that’s causing this - that’d be awesome. No worries if not, I’ll be aiming to do this to try and get a test case up before the end of the week (hopefully).
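I haven’t dug into how churn.rs is wired up in the repo, so treat this as a sketch of the workflow rather than the actual commands - clone, find the file, bump its data-size/node-count constants, then drive it through cargo:
# clone the repo and locate the churn example mentioned above (exact path not shown in the link text)
git clone https://github.com/maidsafe/safe_network && cd safe_network
find . -name churn.rs               # edit the data-size / node-count constants in here
# assumption: it runs as a cargo test named "churn"...
cargo test --release churn -- --nocapture
# ...or, if it's built as an example binary instead:
cargo run --release --example churn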
Note current usage of RAM and swap
Fire up baby-fleming with latest release
RUST_LOG=DEBUG safe node run-baby-fleming --nodes 15
Put a largish (>2Gb) file or dir - watch the RAM and swap usage as it starts and after the chunks are stored.
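Same steps as a copy-pasteable sketch, if that’s any use (the Pictures dir is just a stand-in for any >2Gb payload):
free -m && swapon --show                              # note current RAM and swap usage
RUST_LOG=DEBUG safe node run-baby-fleming --nodes 15  # fire up baby-fleming with the latest release
safe files put -r ~/Pictures/2016 &                   # largish put, backgrounded so we can watch it
watch -n 10 'free -m; ps -C sn_node -o pid,rss --sort=-rss | head'   # RAM during and after the put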
UPDATE: I ran my grab_logs script that copies baby-fleming nodes to /tmp then does a find -exec to extract the log files. This was a daft move as I only have 16Gb allocated for /tmp, so I quickly ran out of space. I tried clearing /tmp and trying again but that also ran out of space. However, now my RAM/swap usage looks like this…
Could my issue be related to sn_node process caching some logging in RAM?
I can’t remember ever seeing swap > used RAM.
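One way I could sanity-check that without filling /tmp again: tar the logs straight into $HOME and compare the on-disk log size per node with each sn_node’s resident memory. The node-dir path below is a guess - point it at wherever grab_logs was copying from:
NODE_DIR=~/.safe/node/baby-fleming-nodes           # assumption: adjust to your actual baby-fleming node dir
du -sh "$NODE_DIR"/*/                              # on-disk size per node, logs included
ps -C sn_node -o pid,rss --sort=-rss               # resident memory per sn_node, for comparison
tar czf ~/baby-fleming-logs.tgz -C "$NODE_DIR" .   # grab the logs without going through /tmp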
Since you are talking about RAM usage in nodes, it should not matter much if you run the upload on a separate PC - the RAM usage of the client is a small fraction of the RAM usage of 15 nodes.
Then you can assume that 1 node will require 1/15 of that RAM amount. That will be the real-world case.
Now look at how much RAM it uses and consider whether such usage is adequate to the task the node performs.
Since I know that torrent clients can download tens of GB of data while using much less than 1GB of RAM, I expect that an SN node can do the same. If someone can show that a node for some reason needs more, please show such proof.
Just to note, my torrent client is configured to use 192MB of RAM for read/write cache - experimentally I found that this is enough for it to constantly serve 10MB/s uploads. The total amount of RAM it uses now is ~400MB.
Looks like it…
I cleared out /tmp again before grabbing some logs and the RAM usage went down again.
Here’s the screenshot.
Running a baby-fleming is something only a few users are ever likely to do, so best not to get hung up about this.
I’ll look at this tonight.
There is a lossy option… it’s not enabled, nor is there an option to enable it from the sn_node bin… but I do wonder. Can you try with logs disabled and see how you go?
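Not sure what the cleanest way to silence logging is, but since the earlier run used RUST_LOG=DEBUG, a first pass could just be dropping the filter right down and re-running the same put (whether "off" is honoured by the env filter is an assumption - try "error" otherwise):
RUST_LOG=off safe node run-baby-fleming --nodes 15   # same launch, logging filtered out
safe files put -r ~/Pictures/2016                    # same payload as before
free -m && ps -C sn_node -o pid,rss --sort=-rss      # compare RAM against the DEBUG run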