Community-Test (oct6) Offline

Crap, missed it…well next time…
Nice one @Josh

5 Likes

Are you able to upload the logs here or in a DM? Or perhaps a wee WeTransfer or so?

The above error just looks like normal AE (anti-entropy), when one node perhaps doesn’t know about a split yet…

We’re trying to repro these timeout issues locally, but no joy yet.

5 Likes

Just FYI, I’ve received the logs.

I’d misunderstood w/r/t the timeouts. Those are the whole node logs, which means something went wrong at startup, I think. I’m trying to repro things here.

7 Likes

Trivia, and I expect not a factor… I wasn’t connected as a node for the up/down, expecting that get is just available to all as normal.

Should the node be creating logfiles in .safe/node/local-node with the command from the OP?

I was seeing the log messages in the console but no logfiles. Ubuntu 20 (I think).

1 Like

So looking at the tool, atm there’s no log level set, which may explain why there’s nothing interesting in the logs.

There was chat before about whether or not to remove -vvv, so I think we’ve committed that removal by mistake. I’m just re-adding it to the tool now.

ahh, and @happybeing the log files would be in <droplet>/logs

3 Likes

I’m not using a droplet. I tried this on both my laptop and a cloud instance of Ubuntu and it used to create logs in ~/.safe/node/local-node/.

I can’t find a logs directory.

Maybe the CLI needs a flag now? There’s nothing about this in safe node --help or safe node join --help though.

Ah sorry, I misunderstood. The Q is where do logs go for a locally started node?

The param log_dir exists on the node bin; you can specify it there. Otherwise the node will just log to stdout.


Also, a PR re-adding more verbose logs to the testnet tool: "fix: add logging to droplets once more" by joshuef (Pull Request #55, maidsafe/sn_testnet_tool on GitHub)

1 Like

This doesn’t seem to work. I tried --log_dir and --log-dir with both safe node --log-dir join and safe node join --log-dir and it doesn’t recognise the flag. Maybe that’s not what you mean?

Should we try again with better logging?
I will also use s-4vcpu-8gb instead of s-2vcpu-2gb.

4 Likes

Ahh, sorry, you’re another step away with the CLI. I’m not sure what the current status is there for specifying log dir (@chriso you’ve been in and about that recently, maybe you know if that’s possible?)

@happybeing you’d have to run the bin directly, I guess, to use that flag if it’s not working with the CLI. It should be installed at ~/.safe/node/sn_node, so ~/.safe/node/sn_node --log-dir ~/.safe/node/mylogsta should get it going, I think (assuming the bin is indeed in that location).
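A minimal sketch of that workaround, for anyone following along. The binary path, the --log-dir flag and the log directory are taken from the post above; the function name is hypothetical, and whether the node needs further arguments to join a network is not covered here.

```shell
# Sketch only: launch the node binary directly so a log directory can be passed.
# Defaults are the locations mentioned in the post; adjust them to your install.
start_node_with_logs() {
    node_bin="${1:-$HOME/.safe/node/sn_node}"
    log_dir="${2:-$HOME/.safe/node/mylogsta}"

    # Bail out early if the binary is not where we assumed it would be.
    if [ ! -x "$node_bin" ]; then
        echo "node binary not found at $node_bin" >&2
        return 1
    fi

    # Make sure the log directory exists before handing it to the node.
    mkdir -p "$log_dir"
    "$node_bin" --log-dir "$log_dir"
}

# Usage (uses the default locations):
#   start_node_with_logs
```

If the binary lives elsewhere, pass its path as the first argument and the desired log directory as the second.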

1 Like

Yeah so the node join command doesn’t have any arguments for specifying the logging directory, so you’ll end up with whatever the default logging behaviour is.

We could add this as a feature though.

For the time being, if you need control over the logging, you’d need to launch the node directly.

3 Likes

So I’ve been poking at this, and with more logs we can more clearly see nodes are working well.

I’ve tried using the same droplet size and can rule that out: the client tests are running for me.

Just FYIs, I normally check a testnet against client tests first and foremost.

So, running:

  • up.sh <sshdir> <node count>
    • (this gets you the latest node version installed)
  • ./scripts/use-network
  • cargo test --release --features=test-utils,always-joinable client_api (in the safe_network repo)

That’ll run the low level client tests against your newly formed network.
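The steps above can be strung together roughly like this. It is a sketch under assumptions: the function name is hypothetical, and it assumes up.sh and scripts/use-network both live in a local checkout of the testnet tool while the cargo command runs in a local safe_network checkout. Only the three commands themselves come from the post.

```shell
# Sketch only: bring up a testnet, point the local config at it, then run
# the client tests against it.
# Args: <sshdir> <node count> <testnet tool checkout> <safe_network checkout>
run_client_tests_against_testnet() {
    ssh_dir="$1"
    node_count="$2"
    tool_dir="$3"          # assumed location of up.sh and scripts/use-network
    safe_network_dir="$4"  # assumed location of the safe_network repo

    # Steps 1 and 2: start the droplets, then switch to the new network.
    (
        cd "$tool_dir" || exit 1
        ./up.sh "$ssh_dir" "$node_count" || exit 1
        ./scripts/use-network || exit 1
    ) || return 1

    # Step 3: run the low level client tests.
    (
        cd "$safe_network_dir" || exit 1
        cargo test --release --features=test-utils,always-joinable client_api
    )
}
```

Stopping at the first failing step keeps a broken testnet bring-up from being mistaken for failing client tests.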

6 Likes

Should we use latest or 0.32.0?

That’s a Q of compatibility. I think @chriso’s original suggestion of 0.32 etc. is still fine, yeah (and I don’t think there’s a CLI that’ll work with main atm). He’s actually looking at merging api/cli into the mono repo, so Qs of compatibility will go away soon enough.

Meantime: the env var SN_CLI_QUERY_TIMEOUT=90 may be worth setting on your CLI commands. It looks like the CLI sets the query timeout much lower than safe_network at the moment (we’ve tweaked how things are handled there, so it’s likely no longer enough).

That may be the issue w/ seeing timeouts (and that being the only log is likely due to the missing -vvv in the testnet tool; the fix for that is now merged into master there).
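One way to set that variable per command rather than exporting it for the whole shell is a small wrapper. This is a sketch: the wrapper name is hypothetical, and the safe subcommand in the usage comment is only an illustration. The variable name and the value 90 come from the post above.

```shell
# Sketch only: run any command with SN_CLI_QUERY_TIMEOUT set for that one call.
# Usage: with_query_timeout <seconds> <command> [args...]
with_query_timeout() {
    timeout_secs="$1"
    shift
    # env scopes the variable to the child process only.
    env SN_CLI_QUERY_TIMEOUT="$timeout_secs" "$@"
}

# Usage (the subcommand here is only an example):
#   with_query_timeout 90 safe files put ./some-file
```

Scoping the variable to a single call makes it easier to compare runs with different values side by side.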

6 Likes

A timeout is the symptom of a broken network.
Of course, it can happen with a working network too.
But in these tests I see a correlation.
Timeouts → the network can’t perform its functions anymore.


I mean that tweaking the timeout duration will most likely change nothing.

That’s not necessarily the case. There definitely is something going on with the CLI timeout. I get dramatically different results in the CLI test cases when I set SN_CLI_QUERY_TIMEOUT to 15 (note: the units of that value are seconds), as opposed to just using its default value, which is 10 minutes.

It could be a combination of both things though. These are quite tough things to pin down, and at this point we can’t definitively rule any aspect out.

5 Likes

Yeah, at the moment there’s no CLI/API combo that’s working with 0.33.x.

1 Like

Right. Might be a bit of a naive or overly simple comparison but if you have a living person in a coma (all motor systems still working) in front of you but your heart rate monitor is busted, and you don’t know any better, then you might think the dude is dead.

3 Likes

As far as I understand, problems in the CLI should not lead to problems in the nodes.
So a failing CLI command is a problem, of course, but failing nodes are a far worse problem.

If some CLI parameter makes the CLI misbehave, it may be useful to follow up with a CLI call using a better parameter value and see whether the previous call broke the nodes.
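That check could be sketched roughly as below. It assumes the parameter in question is SN_CLI_QUERY_TIMEOUT (the one discussed earlier in the thread). The function name and the wording of the verdicts are hypothetical, and a passing follow-up call only suggests, rather than proves, that the nodes are still healthy.

```shell
# Sketch only: run a command with a suspect timeout; if it fails, retry with
# a value believed to be good and report which side looks at fault.
# Usage: probe_then_verify <suspect secs> <better secs> <command> [args...]
probe_then_verify() {
    suspect="$1"
    better="$2"
    shift 2

    if env SN_CLI_QUERY_TIMEOUT="$suspect" "$@" >/dev/null 2>&1; then
        echo "suspect value worked"
        return 0
    fi

    # The first call failed: check whether the network still answers.
    if env SN_CLI_QUERY_TIMEOUT="$better" "$@" >/dev/null 2>&1; then
        echo "likely client-side: follow-up call succeeded"
        return 0
    fi

    echo "network may be affected: follow-up call also failed"
    return 1
}
```

Running the same command for both calls keeps the timeout value as the only variable that changes between them.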

1 Like