Vdash - dashboard for Autonomi nodes

storage_guy · December 20, 2024, 10:58pm

Super! Thank you. RAM is still missing. Not a disaster as it’s easy to see in ‘top’ etc.

happybeing · December 30, 2024, 10:49pm

Did someone, maybe you Rob (@neo), say that RAM was still in the logs? I can’t find it, so if it is there, could someone please post a copy of the logfile entry?

neo · December 30, 2024, 11:20pm

Actually that was what to search for.

It is still in the /metrics though

Shu · December 31, 2024, 12:04am

{"physical_cpu_threads":4,"system_cpu_usage_percent":6.2774954,"process":{"cpu_usage_percent":0.67681897,"memory_used_mb":115,"bytes_read":0,"bytes_written":8192,"total_mb_read":584,"total_mb_written":907}}

This line I believe has been dropped from being recorded on disk for production binaries.
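For anyone still running a build that writes this entry, here is a minimal Python sketch of pulling the RAM and CPU figures out of it. The field names come from the sample line above; the function name `parse_resource_entry` is made up for illustration, and the handling of a leading prefix (timestamp, log level) before the JSON is an assumption about the log layout, not something confirmed here.

```python
import json

# Sample entry copied from the log line quoted above.
SAMPLE = (
    '{"physical_cpu_threads":4,"system_cpu_usage_percent":6.2774954,'
    '"process":{"cpu_usage_percent":0.67681897,"memory_used_mb":115,'
    '"bytes_read":0,"bytes_written":8192,"total_mb_read":584,'
    '"total_mb_written":907}}'
)


def parse_resource_entry(line):
    """Return (memory_used_mb, cpu_usage_percent) or None if not parseable.

    Assumption: any prefix before the JSON object contains no '{', so the
    object starts at the first '{' and runs to the end of the line.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        data = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    process = data.get("process")
    if not isinstance(process, dict):
        return None
    return process.get("memory_used_mb"), process.get("cpu_usage_percent")


if __name__ == "__main__":
    print(parse_resource_entry(SAMPLE))
```

Lines that do not contain this JSON object return `None`, so the function can be run over every line of a logfile.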

The data for antnode pid (cpu & mem) & libp2p bandwidth metrics are available at /metrics endpoint.

In addition, future release will increase the decimal precision from 0 to 4 for the cpu & memory counters found at /metrics endpoint.

neo · December 31, 2024, 12:43am

It would be great (in my opinion) if these log entries can be kept in the code and available to be included with a custom build.

It will be a shame if this rich debug info is lost over time because it is not included in the production builds.

happybeing · December 31, 2024, 11:45am

Thanks @Shu, I appreciate you taking the time. The idea of vdash is to only use the logs as a simpler alternative to the endpoint.

I always anticipated it would eventually be superseded, but for the time being people are still finding vdash an easy way to monitor nodes outside launchpad, so I’m trying to keep it showing whatever it can from the logs.

If there’s a way to keep those performance metrics available, and ideally also the storage cost, that would I expect be appreciated by die-hard vdash users! Maybe gated by an environment setting on antnode?

Shu · December 31, 2024, 2:05pm

The particular logging line was consuming 20% of all logging activity produced per antnode, and with it CPU cycles.

That, along with it being a metric that really should be consumed from the /metrics endpoint, was why it was decided to remove it. I am not really sure how long the team wants to maintain backwards compatibility here, as the long-term goal was to reduce logging activity and shift certain data points into concrete, accessible metrics via /metrics.

@roland would need to comment on whether the log line still shows up when running non-production binaries, say debug builds. I haven’t confirmed this.

I think the team in general is trying to reduce the number of feature flags supported as well, and turn them into runtime settings, if still required.

happybeing · December 31, 2024, 2:58pm

There’s no need for it to be a frequent output so the system load shouldn’t be an issue if there’s a suitably infrequent output it can be coupled to.

I’m not suggesting a feature flag but an environment setting that ideally would work with production binary at least for the time being. The point here is an alternative to the metrics endpoint since that isn’t available easily outside launchpad. Being able to run a CLI like vdash still has a role and until a similar but metrics based alternative is created I think it is useful to maintain.

Someone could fork vdash to use the endpoint but I have other priorities so unlikely.

BTW Does/will metrics include storecost via antnode?

Shu · December 31, 2024, 6:52pm

I can’t find the post on the forum, but Qi gave @neo a reason why the storecost metric has disappeared (it is not in use), as neo asked me the same question a while back.

/metrics included storecost but it’s not populated as it’s no longer in use.

dirvine · December 31, 2024, 10:37pm

My own feeling is logs will die out in production, but I am more than happy to help with a switch here @happybeing and also logs can still be available via an env var, but mostly for debugging. So we can try and accommodate wherever possible, but my desire is we get dash to use metrics if possible and also in the python modules I was able to real time read the routing table entries and much more. So we can also make that available as well. We won’t leave dash behind, we should evolve it with the codebase.

It’s a busy time right now but we will get there. Vdash should have a long term lifetime IMO, it’s been great for folk here for sure. However the logs do need to go back into debug land, as you seem to be saying as well or at least expecting. Another thing I want to make happen is that the ant-node library is available. So it should be easier for dash to also run nodes direct, like antctl etc.

Anyway, we are getting close to being very focused on devs, better late than never, but I am very keen we get involved and help cut over not only dash, but to the new data types that will help other projects like awe/jams etc. We do need to get to that place quickly.

neo · December 31, 2024, 11:04pm

The logs still provide a lot of useful info. If all that was in /metrics then all good, but one thing the logs give is a historical record of what happened. Is the history needed? Well, only when I was doing some stats on things like contacted node xor addresses and other such things. Also, my script for contacting a node to see if it’s alive and could give a quote used logs from the client as well.

@Shu I can understand trying to reduce the feature flags and it’s a good thing. Although there are classes of features. Debug and logs are one sort, and node functional features are another: debug types vs functional types.

Debug types of “functions” are separate to functional types. Having them as a flag for build allows the code to be there only when required for testing or as a special binary used for network testing/monitoring.

@happybeing Reading /metrics is much like reading a file. You execute a curl to the node (127.0.0.1:metrics-port/metrics) and either store the results in a file and read that, or process the curl output directly. At the moment you continuously read the log file, whereas metrics means an extra step: execute the curl, then read the metrics output once. My script greps for lines matching ^sn_ (well, now ^ant_).

Although one downside is that the person must start the node with metrics enabled
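A rough Python equivalent of the curl-and-grep approach described above, offered as a sketch rather than the script neo actually uses. It assumes the endpoint returns Prometheus-style text (one `name value` pair per line, `#` for comments) and that the node was started with metrics enabled. The function names and the default `ant_` prefix handling are illustrative, and the port has to be whatever metrics port your node is using.

```python
import urllib.request


def fetch_metrics(port, host="127.0.0.1", timeout=5.0):
    """Fetch the raw text from a node's /metrics endpoint."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def filter_metrics(text, prefix="ant_"):
    """Keep lines starting with `prefix`, like `grep ^ant_`.

    Returns {metric name (including any labels): float value}.
    Assumption: the value is the last whitespace-separated field.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or not line.startswith(prefix):
            continue
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            result[name] = float(value)
        except ValueError:
            continue  # skip lines whose last field is not numeric
    return result
```

Usage would be along the lines of `filter_metrics(fetch_metrics(port))`, with the result either processed directly or written to a file.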

dirvine · December 31, 2024, 11:09pm

I have never disagreed with that, but I fear people do not understand the cost of logs. They are very expensive. So the issue is how much are we prepared to pay for logs, they do mean nodes are less efficient, disk space is used and possibly security information leaked.

As a debug tool then great, but as a production thing I really dislike them a lot. If folk want to use RUST_LOG and run nodes with it on then it’s fine, but I would imagine they get shunned much more quickly. That cost must be paid for sure.

neo · December 31, 2024, 11:10pm

Agree with that. Having a build flag solves that

Shu · December 31, 2024, 11:10pm

@neo - my initial concern is the type of data being logged vs going to /metrics. Some is actually metrics and measurement key/value pairs, and some is un-formatted and unstructured lines. I do not think we need to duplicate structured data in both places especially for metrics and measurements between logs and /metrics endpoints for production.

I only stated these are not found on production binaries, but a debug build will likely still have them (to be confirmed by Roland).

As for the time series, and historical needs, other tools should be used to process and capture the data from /metrics to store and forward it. Taking shortcuts to put all that data from /metrics and log it at certain frequency in logs for production binaries is definitely not the right route in my opinion.

I agree with David’s response above: while logs have their purpose, they should be used sparingly and appropriately.

neo · December 31, 2024, 11:18pm

And I also agree, which is why I only ask that they are not removed but put behind a “feature flag” so they can be built in. Just as long as the log entries are not removed completely unless they make no sense anymore.

Wasn’t there another “/” to get some unformatted data like port numbers etc? I forget the URL for it. Maybe the routing table could be included in that; it would be good for statistics and is something I get from the logs currently.

While I am not arguing for anything here, just noting that polling /metrics only gives snapshots unless one is polling with a very small interval. Logs give the event when it happens and thus all changes are captured. Also more processing to continually poll /metrics just to capture changes in case any changes happen.

The build flag to allow logs solves that anyhow if required.

The current routing table and couple of other things would be enough for my needs at the moment. I am sure I am forgetting something, but since this is a longer term thing then I can ask later.
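To make the snapshot-versus-event point above concrete, here is a small Python sketch that compares two polled snapshots (plain dicts of metric name to value). The function name and the metric name in the example are hypothetical. It only reports the net change between polls, so anything that happened and reverted in between is invisible, which is the gap a log line would have covered.

```python
def diff_snapshots(previous, current):
    """Return {name: (old, new)} for metrics that differ between two polls.

    A metric that appeared has old == None; one that vanished has new == None.
    """
    changes = {}
    for name in previous.keys() | current.keys():
        old = previous.get(name)
        new = current.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    # Hypothetical metric: suppose it went 10 -> 14 -> 11 between two polls.
    # Only the net 10 -> 11 change is visible to the poller.
    print(diff_snapshots({"example_peers": 10}, {"example_peers": 11}))
```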

dirvine · December 31, 2024, 11:21pm

I think we will have better ways of doing that, at least I hope so. i.e. real time query of the routing table.

As I see the discussion, it seems to me there are some things handy to expose for different reasons or apps. Good to see everyone poking though. The issue will always be that humans want more stats.

neo · December 31, 2024, 11:24pm

Engineers gobble up stats like it’s heaven’s manna

happybeing · January 1, 2025, 3:35pm

Short term that’s not the best approach - long term maybe.

Short term we don’t want people having to do custom builds for the convenience of using vdash, one cancels out the benefit of the other (hence my suggesting this be controlled by an environment variable).

Should vdash be migrated to support metrics then it would no longer be an issue (and it could still maintain the option of logs if that was deemed worthwhile).

neo · January 2, 2025, 12:18am

Sorry, yes I was thinking the future of it all.

And I agree that the writing is on the wall that vdash will need to include reading /metrics. Maybe a hybrid of /metrics and logs if need be. I’d add metrics and see if it matches the logs as an interim step, which will either show up bugs or confirm it’s right.
