I’m just about to go and pick up a package: I’ve invested in a ThinkPad mobile workstation (something I’ve been meaning to do for a long time), and I’m also going to get Linux up and running, but it’s all going to take time…
For the time being I have to run on the old computer. I have added safe.exe and safenode.exe to the exceptions list, but I still can’t run nodes, and when I try to update again I keep getting the same virus error.
Should I make an exception for some other file?
I will add that before this update, ever since the beginning of beta testing, I had not had a single virus message; Windows has always run safenode.
At the moment, after a break, I restarted safenode-manager and the upgrade command works, so I guess the exceptions added to the list did the trick, but the nodes still do not start:
PS C:\Users\gggg> safenode-manager upgrade --interval 10000
╔═══════════════════════════╗
║ Upgrade Safenode Services ║
╚═══════════════════════════╝
Retrieving latest version of safenode...
Latest version is 0.110.0
Using cached safenode version 0.110.0...
Download completed: C:\ProgramData\safenode-manager\downloads\safenode.exe
Refreshing the node registry...
✓ All nodes are at the latest version
PS C:\Users\gggg> safenode-manager start
╔═════════════════════════╗
║ Start Safenode Services ║
╚═════════════════════════╝
Refreshing the node registry...
Attempting to start safenode1...
Attempting to start safenode2...
Attempting to start safenode3...
Attempting to start safenode4...
Attempting to start safenode5...
Failed to start 5 service(s):
✕ safenode1: Service 'safenode1 (safenode1)' was refreshed successfully.
Starting service 'safenode1 (safenode1)'...
Failed to start the service.
✕ safenode2: Service 'safenode2 (safenode2)' was refreshed successfully.
Starting service 'safenode2 (safenode2)'...
Failed to start the service.
✕ safenode3: Service 'safenode3 (safenode3)' was refreshed successfully.
Starting service 'safenode3 (safenode3)'...
Failed to start the service.
✕ safenode4: Service 'safenode4 (safenode4)' was refreshed successfully.
Starting service 'safenode4 (safenode4)'...
Failed to start the service.
✕ safenode5: Service 'safenode5 (safenode5)' was refreshed successfully.
Starting service 'safenode5 (safenode5)'...
Failed to start the service.
Error:
0: Failed to start one or more services
Location:
sn_node_manager\src\cmd\node.rs:759
Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
I’d rip it up and start again - others with more Windows experience may have other suggestions:
1. Do safenode-manager reset and confirm with ‘y’.
2. Manually delete all traces of anything called safenode apart from your binaries - get rid of logs, data and service definitions (sorry, I haven’t a clue how Windows does that).
3. Then start again with safenode-manager add and follow that with safenode-manager start --interval 300000.
Yes of course, but you have to weigh the consequences: a network that can be owned by the biggest player, versus elevated bandwidth usage while nodes upgrade whenever their owners get around to it. In reality, upgrading will occur over a decent period of time, weeks to months. Yes, some will upgrade within a day, but most won’t.
And the 2GB maximum node chunk storage is a small, if not tiny, portion of the bandwidth a node uses in a day, let alone a week. The network is also designed to handle people only having their nodes on for 12, 18, 22 or 24 hours a day. An upgrade every few months is going to pale in comparison to normal bandwidth usage.
We are not going to have the concentration of eager beavers that makes up the beta in the live network.
I’d rather have increased bandwidth than a situation where people could hold the network to ransom.
Thanks for the response. I think Jim has clarified above that there’s not going to be a change in the actual protocol, so the debate here is moot. Still, some things to keep in mind:
I’m pretty sure the final network is going to need a node size much greater than 2GB. The experience from the test network has backed up pretty convincingly the point I’ve made previously.
We don’t know how frequent upgrades will be. And there will be a lot of private data and unpopular public data that is accessed very infrequently; imagine something like S3 Glacier. It’s better for the network to aim for bandwidth efficiency from the outset as much as possible.
Yes, this is not a change to the protocol at all. It is just a change in how the node starts up: basically the removal of the xor address being tied to the peerID, with a new one generated on startup. This also solves the shunning problem that comes with keeping the peerID and xor address, where a node is shunned by some nodes while it is in the process of restarting (a slow degradation of connectivity across the whole network).
I believe it will too. Although for decentralisation/distribution reasons it cannot be too high: maybe 8GB, but definitely no more than 20GB, and even at that size it might exclude some who cannot afford to run nodes, or at least not many of them.
Experience is a good teacher. It’s taken weeks even during fast, highly focused development cycles, and once the network is live that time is only going to increase. But even at once a week, the churn traffic is going to be quite a small proportion of the total traffic.
But if you allow the xor address to be engineered, then people will own portions of the network, and bye-bye decentralisation. Imagine a big player owning every chunk whose xor address falls within a certain range. They then have a number of options, such as simply turning off all their nodes: bye-bye to the chunks in the centre of that range; only the outlying ones have a chance, plus maybe a few in the middle held by the odd third-party node that happened to join in that area. One more reason for node sizes that are as small as possible.
Yes, for exabytes to be stored, the data is divided up at 2GB, or 8GB or 16GB if node size increases, spread over a huge number of internet connections/providers/backbone links. Also, we see that traffic between nodes, even without major churning, is orders of magnitude greater than the stored data, so an upgrade and churn of all nodes over weeks will not even be 1% of it; more like 0.1%.
The additional shunning that happens if the peerID/xor is kept will, on its own, have a worse effect on the network, without even considering the attack vectors.
If just two nodes shun each upgrading node while it is upgrading and restarting, then we see an overall degradation of the network. If nodes mostly stay online 24/7, then after a few upgrades we will start seeing the slow degradation of connectivity that makes the extra churn traffic from not keeping the peer/xor pale into insignificance.
Keeping the peerID/xor went against the design considerations that led us to this design. Just go back over the years: it has always been the case that if a node restarts, it gets a new xor/peer address. Also, all the people worried about ABCs tracking where chunks are stored have more to worry about, since the ABCs would have their job made many times easier if your node kept its xor/peer forever.
Also, with the near-exponential growth of data and information generation, old data (more than two years old, for instance) will always be a small portion of the total data, and the older the data, the smaller that portion becomes.
Did you read that David also wanted it changed too? See here: you wanted to hear from the team, so here you go.
The node manager at the moment stops the node, copies in the new node software (IIRC) and then starts the new node using the data directory of the old node that was stopped. While starting up, the node uses the keys in that directory to derive its peerID and xor address.
So step one in the changeover to an all-new peer/xor is for the node manager not to reuse the directory completely intact, but to remove the old keys so that the node starts with a new peer/xor.
The next step is to remove the reuse of old keys when starting up again, even after a computer restart, since the peerID would have been shunned by the other nodes anyhow. The only way a node restarting after being powered off for an hour will work properly is if new nodes exist that will talk to it even though all the other peers are shunning it.
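To make that concrete, here is a minimal sketch of what I mean (the key file name and directory layout are assumptions for illustration, not the actual sn_node_manager code): strip any persisted key from the node’s data dir before restart, so the relaunched node generates a fresh keypair and therefore a fresh PeerId/xor address.

```rust
// Hypothetical sketch, not the actual sn_node_manager code.
use std::{fs, path::Path};

use libp2p::identity::Keypair;
use libp2p::PeerId;

/// Remove any persisted identity key before restarting the node, so the
/// relaunched node generates a fresh keypair and therefore a fresh
/// PeerId / xor address.
fn prepare_fresh_identity(node_data_dir: &Path) -> std::io::Result<PeerId> {
    let key_file = node_data_dir.join("secret_key"); // assumed file name
    if key_file.exists() {
        fs::remove_file(&key_file)?; // drop the old identity
    }
    // With no persisted key, the node would generate a new one on startup.
    let fresh = Keypair::generate_ed25519();
    Ok(PeerId::from(fresh.public()))
}
```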
Storing data does cause bandwidth usage, but it’s a fraction of the bandwidth usage of running nodes. I have had 10 nodes running for 25 days (except that I had a few of them shut down for a few days, but anyway). The nodes are storing a total of 1731MB:
ls $HOME/.local/share/safe/node/ | grep safenode | while read f; do echo ${f} ; du -sh $HOME/.local/share/safe/node/$f/record_store ; echo ; done
downloads
du: cannot access '/home/safe/.local/share/safe/node/downloads/record_store': No such file or directory
node_registry.json
du: cannot access '/home/safe/.local/share/safe/node/node_registry.json/record_store': Not a directory
safenode1
129M /home/safe/.local/share/safe/node/safenode1/record_store
safenode10
206M /home/safe/.local/share/safe/node/safenode10/record_store
safenode2
265M /home/safe/.local/share/safe/node/safenode2/record_store
safenode3
127M /home/safe/.local/share/safe/node/safenode3/record_store
safenode4
142M /home/safe/.local/share/safe/node/safenode4/record_store
safenode5
137M /home/safe/.local/share/safe/node/safenode5/record_store
safenode6
255M /home/safe/.local/share/safe/node/safenode6/record_store
safenode7
140M /home/safe/.local/share/safe/node/safenode7/record_store
safenode8
82M /home/safe/.local/share/safe/node/safenode8/record_store
safenode9
248M /home/safe/.local/share/safe/node/safenode9/record_store
Let’s say they’ve had 5 times that due to churn. Actually, I’ll give your argument a hand and say it was 10 times that (which it wasn’t), so they’ve had about 17GB over the last 25 days.
These are the stats from the switch port the 10-node machine is on:
Way more GETs than PUTs. And the PUTs amount to only 10.6GB even if we assume every one is at the maximum record size of 512KB.
And all that includes the upgrade which downloaded everything again.
So the data being sent to a node because of PUTs - even when all the nodes have been upgraded - is a fraction of the total amount of bandwidth used for running nodes.
You’re worried about something that isn’t really an issue.
And as you say, refreshing nodes is a good thing. As with RAID 5 & 6, bit rot can creep in if you don’t scan the disks to detect single errors before another disk develops an error in the same area.
Autonomi is better in that it is like a mirroring system, but if all nodes keep the same chunks for years, even through power cycles and upgrades, bit rot will creep in. If all 5 nodes holding a chunk have errors in it, that chunk is dead, useless. Over 3 to 5 years it is reasonable to expect that some chunks, however few, would end up corrupted (or unreadable due to some disk error) on all 5 nodes if they were just left sitting there.
By reading and churning the chunks, errors are detected long before all 5 copies could become corrupted, and the churn process then makes sure that 5 good copies are made again.
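As a rough illustration of why churn catches this (a sketch only; the hash function and hex naming here are assumptions, not the real safenode internals): because chunks are content-addressed, any holder can re-hash a stored chunk and compare it with the name it is stored under, so a flipped bit shows up as soon as the chunk is read or replicated.

```rust
// Illustrative sketch: detect bit rot in a content-addressed chunk by
// re-hashing it and comparing against the name it is stored under.
// The blake3 crate and the hex naming are assumptions for the example.
use std::{fs, path::Path};

fn chunk_is_intact(chunk_path: &Path, expected_name_hex: &str) -> std::io::Result<bool> {
    let bytes = fs::read(chunk_path)?;  // read the stored chunk
    let digest = blake3::hash(&bytes);  // recompute its content hash
    Ok(digest.to_hex().as_str() == expected_name_hex) // mismatch => corruption
}
```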
The knee-jerk reaction was twofold: the first was to move away from the behaviour where a node that restarts (power off/on or upgrade) gets a new xor address, and the second was to think that going back to the original design was itself a knee-jerk reaction rather than the fixing of a mistake.
Think about when exabytes of data are being stored on the network, not a few gigs (assuming the network is successful). You design for the future, not right now.
It’s not a few GB being stored by the network just now; it’s probably 5-10TB, spread across ~40k nodes. By the time the network is storing 1EB+ there will obviously be a larger number of nodes. That would require 500 million nodes at the current node size of 2GB, so by then the node size will likely have been increased. But there will be no need to increase it if by then there are enough node runners to run 1 billion nodes of 2GB each.
But the bandwidth individual users have to cope with will still be low, because when they restart a node they will only have to download one 500-millionth of the data in the network per node, rather than roughly one 40,000th of it as now. And the bandwidth used for that will still be a fraction of the bandwidth used just to keep nodes communicating with each other to run the network.
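A quick back-of-envelope check of those figures (decimal units, and ignoring the extra replicas the network keeps):

```rust
fn main() {
    let network_bytes: f64 = 1e18; // 1 EB stored across the network
    let node_bytes: f64 = 2e9;     // 2 GB per node
    // Nodes needed to hold one copy of 1 EB at 2 GB each: 500 million.
    println!("nodes needed: {}", network_bytes / node_bytes);
    // A restarting node refills at most its own 2 GB, i.e. a 1/500,000,000
    // share of the stored data, versus roughly 1/40,000 on today's ~40k nodes.
    println!("share per restart: {:e}", node_bytes / network_bytes);
}
```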
You guys moved the goalposts here from (a) “there’ll be less bandwidth required if you delete data and have to redownload it” to (b) “the bandwidth required for deleting and redownloading data is a fraction of what’s needed for communication between nodes”. I’ve never argued against (b). And (b) can be further improved from its current state, so that hopefully it’ll take less bandwidth for nodes to communicate than to store data. The arguments above were for (a), which I don’t think makes sense. What’s your take on (a)?
And perhaps my biggest reason for retaining xor addresses is that I haven’t seen any proof of how data will be retained if every node deletes all its data every time it updates. The entire network is essentially deleting all of its data. You would have to start having update rules where only certain areas can update at a time to prevent critical data loss, and that can get messy very quickly.
On the other hand, if xor addresses are retained (i.e., data is retained), it’s intuitively clear that data is retained globally on the network. No complex updating rules needed to prevent data loss.
Might be an idea to understand the network before saying things won’t work.
the data isn’t being deleted just because the xor address changes
it is needed for the security of the network
nodes get shunned while they take time to restart - be it one shun or dozens. After a few upgrades the network’s connectivity has degraded, with so many nodes shunning so many other nodes, and it increases with every restart. Restarting afresh means the network heals itself
people will not and do not all upgrade in the same minute or hour or day, and even in beta, with everybody eagerly waiting on the upgrade, we saw it take 3 days before over half had upgraded
You keep ignoring the unintended consequences of keeping the xor address
And back to “no data loss”. This works because when the node starts afresh after an upgrade, it gets itself a new peer/xor and then offers up the chunks it is no longer responsible for, and any node close to any of those chunks will GET it.
The chunk isn’t deleted but is left as inactive and can be retrieved if needed (see the sketch below).
And all this without considering the attack vectors that open up if the peer/xor is kept.
And all this without considering the massive tracking that would be made super easy, since the data would be moving around far less over any period of time. People’s fears of being tracked would be greatly increased because of this.
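A minimal sketch of that hand-off as I understand it (hypothetical types and helper, not the real safenode code): after restarting under a new address, the node pushes out the records it is no longer closest to, and keeps them locally marked inactive rather than deleting them.

```rust
// Hypothetical types and helper, illustrating the re-offer behaviour only.
use std::collections::BTreeMap;

#[derive(Clone)]
struct Record {
    value: Vec<u8>,
    active: bool,
}

/// Partition the store: records this node is no longer responsible for under
/// its new address are returned for replication and kept locally as inactive.
fn reoffer_after_restart(
    store: &mut BTreeMap<[u8; 32], Record>,
    still_responsible: impl Fn(&[u8; 32]) -> bool,
) -> Vec<Record> {
    let mut to_replicate = Vec::new();
    for (key, record) in store.iter_mut() {
        if !still_responsible(key) {
            // Offer the chunk to the nodes now closest to it...
            to_replicate.push(record.clone());
            // ...but keep it locally, marked inactive, rather than deleting it.
            record.active = false;
        }
    }
    to_replicate
}
```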
Well, if you change what people say, then you can claim anything. And it’s just covering up the attempt to scaremonger with “oh, there will be exabytes”. That’s like saying there will be gigatons of flour: how will we distribute it to the people?
If you really have a problem with them returning to the security and self-healing design it had before, then raise a GitHub ticket and explain why.
LOL. Let me just say that one of us spends hundreds of thousands on compute costs at AWS and has to deal with transferring massive amounts of data between regions (and I’m sure it’s not you). So I might know what I’m talking about when I bring up future bandwidth efficiencies.
If you go back and read above, you’ll see that I’ve advocated for comparing pros and cons of each. No approach will be all gravy.
Good thing that Autonomi is designed to run on home computers with fixed costs they pay anyhow whether they run nodes or not.
But here you are waving away all the cons.
Well I know the team didn’t and I certainly didn’t. But you said it in the post I replied to.
Oh, and don’t assume what people have and have not done.
Anyhow, I’ll leave this here; you are not willing to understand further, given your superior position in moving data. You don’t seem willing to open a GitHub ticket to present your reasons why it’s bad, and I don’t see any point continuing when two people with experience in storage, computers and networking have their reasoning basically ignored and not seriously looked into.
There is an incoming improvement on XOR distance that ensures the network is Sybil-resistant. This is a simple but powerful mechanism; basically it’s what I have noted below.
In addition, though, I have 2 more issues:
1. Node size needs to be much larger, 20-50GB.
2. We need to get rid of logs altogether (replace them with a metrics-type API for those that are interested; @Shu has neat work there).
To me logs are purely a debug tool: they consume CPU, cause data bloat and are a terrible way to monitor things. They are unnatural and inefficient. Folk love them; I hate them in production.
Anyway, a summation of the XOR distance fix is below:
In a decentralised network based on Kademlia we have an issue with Sybil nodes. To overcome the effect of such nodes serving bad data, we use verifiable data or unique signed data. Signed data is either mutable and CRDT-based, or mutable, signed and unique; that last type is the issue. Let’s call this type transactions.
We introduce a distance metric to ensure Sybil-injected nodes cannot present double-spend-type transactions, because the distance measure must include honest nodes that will transmit the other transaction, meaning the malicious injected transaction is not unique and is therefore provably bad. The way this metric works is that we analyse each GetNode or GetClosestAddress call and check the distance of the close group. We then take the average over the last 100 calls and use that distance to get all nodes within that distance of an address. This means injected nodes cannot take over an address space at the expense of honest nodes, as the honest nodes are contained in the set of nodes returned from such network calls.
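My rough reading of that mechanism in code form (a sketch only; the names and the use of a plain integer in place of a 256-bit XOR distance are assumptions, not the real implementation):

```rust
// Sketch of the averaged close-group distance heuristic described above.
use std::collections::VecDeque;

const WINDOW: usize = 100; // average over the last 100 close-group calls

struct DistanceTracker {
    recent: VecDeque<u64>, // u64 stands in for the 256-bit XOR distance
}

impl DistanceTracker {
    fn new() -> Self {
        Self { recent: VecDeque::with_capacity(WINDOW) }
    }

    /// Record the close-group distance observed on a GetClosestAddress-style call.
    fn record(&mut self, close_group_distance: u64) {
        if self.recent.len() == WINDOW {
            self.recent.pop_front();
        }
        self.recent.push_back(close_group_distance);
    }

    fn average(&self) -> u64 {
        if self.recent.is_empty() {
            return u64::MAX; // no data yet: accept every candidate
        }
        (self.recent.iter().map(|&d| d as u128).sum::<u128>() / self.recent.len() as u128) as u64
    }

    /// All candidate (node, distance-to-target) pairs within the averaged
    /// distance, so injected nodes cannot crowd honest ones out of the set.
    fn nodes_within_average<'a, T>(&self, candidates: &'a [(T, u64)]) -> Vec<&'a T> {
        let cutoff = self.average();
        candidates.iter().filter(|(_, d)| *d <= cutoff).map(|(n, _)| n).collect()
    }
}
```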
I can’t wait to see what this does to bandwidth usage, to shunning on filling up (because it takes time to get the chunks the node is supposed to handle) and to my home node connectivity =)