NewYearNewNet [04/01/2024 Testnet] [Offline]

Good question, and I’m not sure either.

Was the issue that some nodes stopped connecting with the rest of the network, so it kind of splintered, leading to a failure to retrieve data and replicate properly?

Certainly great to see the improved stability and data retention on recent tests!

2 Likes

This one I guess:

4 Likes

I suspect a lot of us got caught by the spaces-in-filenames bug

5 Likes

I don’t know if I understood the answer correctly.

I don’t know to what extent we were caught by it.

Since the problem with spaces occurred when uploading files, uploading without errors just requires renaming the files without spaces and uploading them again; after that there is no problem downloading them.

3 Likes

I think you’re right that we won’t see it being economic for datacentre clients to run nodes. I think the earnings won’t be enough to justify the CPU and RAM requirements, but more importantly the data transfer costs. I’ve looked into it and it’s not looking healthy. The rack space, power and network connectivity costs are bad enough for a couple of beefy servers (well, the servers I have lying around were beefy a few years ago!) that could run a few thousand nodes, but the data transfer costs for the nodes would be murderous. I don’t have time to go into the maths now, but it was data transfer that killed the whole idea.
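To make the shape of that maths concrete, here is a rough sketch; the per-GB transfer price and the per-node traffic figure are placeholder assumptions for illustration, not real quotes:

```python
# Back-of-the-envelope: is metered data transfer viable for dense node hosting?
# All figures below are placeholder assumptions for illustration only.
NODES = 3000                 # "a few thousand nodes" on a couple of servers
GB_PER_NODE_PER_DAY = 2.0    # roughly what this testnet is showing per node
PRICE_PER_GB = 0.05          # assumed metered transfer price in $/GB (hypothetical)

monthly_gb = NODES * GB_PER_NODE_PER_DAY * 30
print(f"Monthly transfer: {monthly_gb:,.0f} GB")
print(f"Cost at ${PRICE_PER_GB}/GB: ${monthly_gb * PRICE_PER_GB:,.0f} per month")
# 3000 nodes x 2 GB/day x 30 days = 180,000 GB, i.e. $9,000/month at $0.05/GB,
# before rack space, power and connectivity.
```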

Big datacentres don’t pay for their connectivity by the TB; they just buy multiple redundant circuits with something like 100Gb/s capacity that they can run at any utilisation level they want. They don’t want to run them at full capacity, though, in case one of them fails. So they could run nodes to soak up some of that capacity and kill the safenodes if there’s a failure or degradation on a circuit, and they already have all the monitoring and tooling to do this sort of thing.

What the hyperscalers such as AWS and Azure do is provide ‘spot’ services to soak up spare capacity. If you have a workload that isn’t urgent, you can run it when instances of the right type become available at a price you’re happy with. You have to do all the programming and scheduling to make sure your jobs start up on the instances as they become available and spin down when done, and make sure the whole system can cope with fluctuating resources, but it does work. I can imagine that when there isn’t even enough demand for Spot Instances, AWS for example might spin up safenodes to make use of the resources and kill them off when demand picks up.

6 Likes

I am currently at about 12.5TB of traffic (6TB down, 6.5TB up) since the start of this testnet. That is on average 2.3 GB per node per day, quite a lot for 30-100 MB of stored data.
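To put “quite a lot” into a ratio, a quick sketch using only the figures above:

```python
# The ratio behind "quite a lot": daily traffic versus data actually stored.
GB_PER_NODE_PER_DAY = 2.3      # average reported above
for stored_mb in (30, 100):    # per-node stored data range reported above
    ratio = GB_PER_NODE_PER_DAY * 1024 / stored_mb
    print(f"daily traffic is ~{ratio:.0f}x the {stored_mb} MB a node stores")
# roughly 79x and 24x respectively
```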

12 Likes

Yeah, that’s the kind of thing I’m talking about. Not a great look for efficiency, but obviously it’s needed to make the data resilient and keep the network healthy. So that’s what all node runners have to accept as reality. And it makes it look like a shaky proposition on any kind of metered internet connection. I wonder if it will make it only economically feasible to run nodes at home? That would be kind of cool actually.

And that reminds me - I need to look at the stats for the AWS Instance I have up that has just 1 node running on it.

8 Likes

I hope this is not final; with this amount of overhead it would be impossible for the network to become the “new internet”.

5 Likes

I expect there’s lots of room for optimisation, but even if there wasn’t, why would that stop the network becoming the “new internet”? Wouldn’t it just need to remain sufficiently decentralised? Or at some scale would it overwhelm some of the existing low level Internet infrastructure?

4 Likes

Because, IIUC, the cost of bandwidth will be prohibitive for Joe User, and as a result we will not reach the required critical mass for World Domination (or something)

5 Likes

It’s odd, as I have 4 machines with 30 nodes each and they average ~33 GB per day per machine. Amazingly close on all of them, consistently.

So I am doing just over a GB per node per day.
I don’t think it is terrible.

Going to go out on a limb and say that it doesn’t seem correlated to network use. Is that just the nodes chatting?

9 Likes

Well, consider that non-tech people might only run a few nodes if quota and/or bandwidth is an issue. At 1 GB (@Josh) to 2.3 GB (@peca) per node per day, a home with bandwidth or quota limits might only run 1 to 5 nodes at any one time. Worldwide we’d still be looking at upwards of a billion or more nodes.
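A rough sketch of that reasoning; the quota, headroom and household figures are hypothetical assumptions, not data:

```python
# Rough sketch of the "billions of nodes" arithmetic.
# The quota, headroom and household figures are hypothetical assumptions.
MONTHLY_QUOTA_GB = 300   # assumed data cap on a metered home connection
HEADROOM = 0.5           # assume only half the quota is spared for nodes

for gb_per_node_per_day in (1.0, 2.3):   # the two per-node rates reported above
    nodes = (MONTHLY_QUOTA_GB * HEADROOM) / (gb_per_node_per_day * 30)
    print(f"{gb_per_node_per_day} GB/day -> about {nodes:.0f} nodes per capped home")

HOMES = 500_000_000      # hypothetical number of participating households
NODES_PER_HOME = 3       # somewhere in the 1-5 range above
print(f"{HOMES * NODES_PER_HOME:,} nodes worldwide under these assumptions")
```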

But I do wonder if there is a lot more churning (relocating chunks) to every node than would occur on a more mature network with 100 to 10,000 times the nodes.

4 Likes

Interesting. I have separate stats for only 2 of the 4 machines, but the averages are not the same, not even close.

I think so. It uses roughly the same bandwidth 24/7.

Maidsafe, me, and probably @aatonnomicc: that is the majority of nodes running 24/7 without reboots. I would think there shouldn’t be much churn, but I may be mistaken about how that works.

4 Likes

It was mentioned that churn also occurs when a node determines it is no longer in the close group for one or more of its chunks; it then churns those chunks to the new nodes that are closer.

If others are only running nodes for a limited time and restarting them again later, then your nodes may be experiencing churn because nodes going offline and new nodes coming online change the makeup of the “close groups” for many chunks.

I have no clue how much that’d be happening.
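For anyone curious what that check looks like, a minimal sketch assuming Kademlia-style XOR distance; the close group size and helper names here are hypothetical, not the actual safenode code:

```python
# Sketch of close-group membership with Kademlia-style XOR distance over node
# and chunk addresses. CLOSE_GROUP_SIZE and these helper names are hypothetical;
# the real safenode implementation will differ in detail.

CLOSE_GROUP_SIZE = 5  # assumed number of nodes responsible for each chunk


def close_group(chunk_addr: int, known_nodes: list[int]) -> list[int]:
    """The CLOSE_GROUP_SIZE known node addresses nearest to the chunk address."""
    return sorted(known_nodes, key=lambda node: node ^ chunk_addr)[:CLOSE_GROUP_SIZE]


def chunks_to_churn(my_addr: int, my_chunks: list[int], known_nodes: list[int]) -> list[int]:
    """Chunks this node holds but is no longer among the closest nodes for.

    known_nodes should include my_addr. As nodes join and leave, the close
    group for a chunk changes; chunks we fall out of the group for get
    replicated to the nodes that are now closer.
    """
    return [chunk for chunk in my_chunks if my_addr not in close_group(chunk, known_nodes)]
```

So every node that drops off or joins elsewhere in the address space can reshuffle which chunks a node is responsible for, which would account for some of that background traffic.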

4 Likes

You can see mine here; the first machine only has 13 nodes, the other 4 have 30 each.
Very close to each other.
(I am using vnstat.)

https://javages.github.io/mtracking.html
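In case it helps anyone reproduce these per-node averages, a rough helper sketch; it assumes vnstat 2.x’s `--json d` output (field names differ between vnstat versions), and the node count is whatever you run behind that interface:

```python
# Rough helper for a per-node daily average from vnstat.
# Assumes vnstat 2.x "--json d" output (rx/tx in bytes under traffic -> day);
# older versions use different field names, so adjust to taste.
import json
import subprocess


def avg_gb_per_node_per_day(interface: str, node_count: int) -> float:
    out = subprocess.run(
        ["vnstat", "-i", interface, "--json", "d"],
        capture_output=True, text=True, check=True,
    ).stdout
    days = json.loads(out)["interfaces"][0]["traffic"]["day"]
    total_bytes = sum(day["rx"] + day["tx"] for day in days)
    return total_bytes / len(days) / node_count / 1e9


# e.g. 30 nodes behind eth0:
# print(f"{avg_gb_per_node_per_day('eth0', 30):.2f} GB per node per day")
```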

3 Likes

The Internet is HUGE today. Considering 1 GB per node (50%-full 2 GB nodes), we would need 3,093,361,410 nodes just to replace what backblaze.com is doing (backup and cloud storage). That is only one company; the whole Internet is orders of magnitude bigger: Global data center storage capacity 2016-2021 | Statista
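Working that figure backwards to see the implied scale (using only the numbers above):

```python
# What the node count above implies about the data behind it.
NODES_NEEDED = 3_093_361_410   # figure quoted above
GB_PER_NODE = 1                # 50%-full 2 GB nodes

total_gb = NODES_NEEDED * GB_PER_NODE
print(f"~{total_gb / 1e9:.1f} exabytes (decimal) for that one company alone")
# ~3.1 EB, before counting the rest of the Internet's storage.
```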

4 Likes

Given that most fixed-line Internet packages have unlimited bandwidth, where a few GB per day is irrelevant, and most people would only run a few nodes, is it likely that this would be an issue?

2 Likes

What does the network need to store though?
Perhaps it can be optimized in the future and idle traffic reduced, allowing more nodes on limited lines.

But, size-wise, do we need the Matrix on there? Netflix/YouTube is perfectly suitable for that.
Are we really expecting to fill it with trivial data that could easily remain on legacy systems?

3 Likes

Unless you care about privacy, having ads forced on you, being manipulated, etc.

6 Likes

This is right on the button: we are not trying to create a huge new hard disk, but a new Internet infrastructure. Storing files/data is essential for testing, but I would hope that the future is fully decentralised applications with high-speed concurrent payments etc.

This, I hope, will be led by the applications in those areas that do require full decentralisation. So I think what the network will store is knowledge, not just any random junk and so on.

What is junk and what is knowledge will be decided by the users of the apps, or whatever interface we have (it could be AI-driven and app-free). They will speak with their cash, and that should be the way we aim.

I am no fan of how much data we store and so on, but I dream of a fully decentralised network for the benefit of us all. Just looking at robots/AI is interesting. They NEED to be free of the control of ANY centralised party and absolutely must exist in a truly decentralised network. So no node operators deciding what to store, no contracts between users and so on, but a truly decentralised, neutral network that allows humans to remain independent of ANY centralised control.

I realise we still have a centralised core dev team, but our work recently has been massively exciting: not because it’s the way it should have been, not because we have a highly probabilistic network now (thank God), and not because it’s working.

My excitement is how simple we have made this. How we fought to keep more functionality out of core (look at DNS), keeping it as simple as we possibly can. There are tweaks to make in terms of the amount of data sent, and that is cool, but the codebase being so simple allows vital open source engagement. Devs will engage on core work, and there will not be much to do (do we need to look at core dev rewards now? I suspect we may have to).

So the simple, very simple codebase allows decentralisation of the core development, and with that I hope we can remove any last remnants of centralised “control”.

22 Likes