[FeatureRequest/Idea] Make polkadot-introspector the ultimate debugging tool #764
Comments
I think some automation here would be very helpful; however, it seemed to be a PITA to integrate with Loki last time I tried.
These should be part of the
I don't think this is feasible for collators; for validators it should be easy to check on-chain data.
Do you have any idea on how to approach this. Long ago I was experimenting with exposing
https://github.com/ordian/kuddelmuddel already executes the block. Do we want to expand on that?
The data is already available in parachain tracer.
The parachain tracer already shows the relay chain blocks where para slots were missed.
This should be supported with the historical mode of parachain tracer. @AndreiEres please share some thoughts.
I think @lexnv has something that he is using for this.
Most parachains use Aura, so we should be able to query the Aura authorities and reverse engineer from that whose turn it was to generate a block.
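As a rough sketch of that idea: Aura assigns slots round-robin, so the expected author of a slot is the authority at index `slot % len(authorities)`. The snippet assumes the authority list and slot duration were already read from the parachain state at the relevant block; the function names and example values are hypothetical and not part of the introspector.

```python
from typing import Sequence


def slot_from_timestamp(timestamp_ms: int, slot_duration_ms: int) -> int:
    """Aura derives the slot number from the block timestamp."""
    return timestamp_ms // slot_duration_ms


def expected_aura_author(slot: int, authorities: Sequence[str]) -> str:
    """Aura is round-robin: the author index is the slot modulo the authority count."""
    if not authorities:
        raise ValueError("authority set is empty")
    return authorities[slot % len(authorities)]


# Hypothetical values, for illustration only.
authorities = ["collator-a", "collator-b", "collator-c"]
slot = slot_from_timestamp(timestamp_ms=72_000, slot_duration_ms=12_000)
print(slot, expected_aura_author(slot, authorities))
```

Comparing the expected author against the author of the block that actually landed (or the absence of one) would point at the collator that missed its turn.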
For now I was thinking of only using the data that is there and deducing from it. I was thinking more about things like: did the collator produce other blocks that ended up on chain? Did the backing validators back other parachains' blocks, or blocks from other collators? Did the author back things? Was the core ready for backing?
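To make these questions concrete, here is a minimal sketch that answers two of them from a list of already-collected records. The `IncludedCandidate` record and both helper functions are hypothetical; they only assume that, for each candidate that ended up on chain, we know the para id, the collator, and the validators that backed it.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class IncludedCandidate:
    """Hypothetical record of a candidate that ended up on chain."""
    relay_block: int
    para_id: int
    collator: str
    backers: FrozenSet[str]


def blocks_by_collator(candidates: Iterable[IncludedCandidate], collator: str) -> List[IncludedCandidate]:
    """Did this collator produce other blocks that ended up on chain?"""
    return [c for c in candidates if c.collator == collator]


def paras_backed_by(candidates: Iterable[IncludedCandidate], validator: str) -> Set[int]:
    """Which parachains did this validator back blocks for?"""
    return {c.para_id for c in candidates if validator in c.backers}


# Hypothetical sample data, for illustration only.
sample = [
    IncludedCandidate(100, 1000, "collator-a", frozenset({"val-1", "val-2"})),
    IncludedCandidate(101, 2000, "collator-b", frozenset({"val-2", "val-3"})),
    IncludedCandidate(102, 1000, "collator-a", frozenset({"val-1", "val-3"})),
]
print(len(blocks_by_collator(sample, "collator-a")), sorted(paras_backed_by(sample, "val-2")))
```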
I was thinking of just using it in here, so that we have everything in one place. Same thing with https://github.com/lexnv/subp2p-explorer, the goal being to make the polkadot-introspector the ultimate debugging tool :D.
It should be relatively straightforward to integrate with Grafana. I did a similar thing to automatically triage all warnings / errors from substrate: https://github.com/lexnv/sub-triage-logs/. This tool fetches the polkadot-sdk repo and converts any Then it groups warnings / errors from Grafana or a local file. Even more, you can add a closure to deduplicate further (like we do for the peer banned reason). A triage output looks like:
This can also be done with https://github.com/lexnv/subp2p-explorer after fetching a DHT record.
We have a tracking issue, paritytech/substrate-telemetry#588, to expose a friendly API for extracting a bit more information from substrate-telemetry. It would be beneficial to have this API, although it will not be sufficient for debugging. Most of the time we try to debug an issue that happened in the past, and by that time the information exposed by telemetry might be lost. For example, who is the peer that was banned 3 days ago? We can either introduce a new service to keep a history record of N days, or we can extend substrate-telemetry to keep the data around for us.
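A minimal sketch of the "history record of N days" idea, assuming events arrive in timestamp order and fit in memory. The `Event` fields and the `RollingHistory` class are hypothetical and do not reflect any existing substrate-telemetry API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since epoch
    kind: str         # e.g. "peer_banned" (hypothetical label)
    peer: str


class RollingHistory:
    """Keeps only the events from the last `retention_days` days."""

    def __init__(self, retention_days: float) -> None:
        self.retention_s = retention_days * SECONDS_PER_DAY
        self._events: Deque[Event] = deque()

    def record(self, event: Event) -> None:
        # Events are assumed to arrive in timestamp order.
        self._events.append(event)
        cutoff = event.timestamp - self.retention_s
        while self._events and self._events[0].timestamp < cutoff:
            self._events.popleft()

    def query(self, kind: str, since: float, until: float) -> List[Event]:
        return [e for e in self._events if e.kind == kind and since <= e.timestamp <= until]


history = RollingHistory(retention_days=7)
history.record(Event(3 * SECONDS_PER_DAY, "peer_banned", "peer-x"))
history.record(Event(6 * SECONDS_PER_DAY, "peer_connected", "peer-y"))
print(history.query("peer_banned", since=2 * SECONDS_PER_DAY, until=4 * SECONDS_PER_DAY))
```

Whether this lives in a new service or inside substrate-telemetry is the open question above; the sketch only illustrates the retention and lookup behaviour.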
IIUC, we'll have something similar to:
I would opt for a separate repo entirely to keep things simpler. It shouldn't matter that much here, since it would be a forwarding CLI.
So, a small part of these queries already exist or are easy to implement.
Actually, the introspector is already not a single tool but a toolset. I don't like the idea of making a meta toolset of a toolset :-) But I need to mull over the requests.
The list at the top is more or less already prioritised.
Polkadot introspector should be extended with logic that helps us understand the state of the Polkadot network at a given time, so that developers can quickly diagnose which component or entity is not functioning properly. For that, I think we should extend it to include some predefined logic that can answer specific protocol questions using configurable data sources.
Data sources:
Examples of queries/operations
This is a sample list of basic operations that I find useful during debugging; we should not limit ourselves to them. As a rule of thumb, with this tool we should be able to check any protocol invariant that we have data for.
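One possible shape for "predefined logic over configurable data sources" is sketched below. Everything here is an assumption made for illustration: the `DataSource` interface, the in-memory implementation, and the example check (which flags relay blocks where a given para had nothing included) are hypothetical and not existing introspector code.

```python
from typing import Callable, Dict, List, Protocol, Set


class DataSource(Protocol):
    """Hypothetical interface; a real source could be an RPC node, logs, or telemetry."""

    def included_paras(self, relay_block: int) -> Set[int]: ...


class InMemorySource:
    """Stand-in data source backed by a dict of relay block -> included para ids."""

    def __init__(self, data: Dict[int, Set[int]]) -> None:
        self._data = data

    def included_paras(self, relay_block: int) -> Set[int]:
        return self._data.get(relay_block, set())


# A check takes a data source and a block range and returns human-readable findings.
Check = Callable[[DataSource, range], List[str]]


def missed_inclusions(para_id: int) -> Check:
    def check(source: DataSource, blocks: range) -> List[str]:
        return [
            f"para {para_id}: nothing included at relay block {b}"
            for b in blocks
            if para_id not in source.included_paras(b)
        ]
    return check


def run_checks(source: DataSource, blocks: range, checks: List[Check]) -> List[str]:
    findings: List[str] = []
    for check in checks:
        findings.extend(check(source, blocks))
    return findings


source = InMemorySource({1: {1000}, 2: set(), 3: {1000, 2000}})
for finding in run_checks(source, range(1, 4), [missed_inclusions(1000)]):
    print(finding)
```

Keeping checks as plain functions over a small interface means the same check can run against live RPC data or a historical dump, which matches the goal of debugging issues that happened in the past.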