
spike: Protocol Analysis #383

Closed · 2 tasks done · Tracked by #9
theMultitude opened this issue Jun 28, 2024 · 5 comments
theMultitude (Contributor) commented Jun 28, 2024

Problem Statement

We currently don't have analytical systems in place to monitor the protocol's usage and performance.

In my opinion, we have a couple of near-term paths:

  • Perform analysis on data already maintained by the protocol: the state ledger and node interactions.
  • Implement an event-driven system to emit and store events for analysis in a centralized space, as outlined here.

Discussion points

There are trade-offs to both options, and they aren't mutually exclusive, but I believe the second is more robust while requiring less work on the protocol side, making it the quicker solution.

One critical pain point is that storing data at the granularity I'm interested in would, I believe, be prohibitive if it were kept in each block:

  • To accommodate more granular data without bloating the protocol, we have the option of storing only the current state of the network (aggregates and averages) on the state blockchain, having individual nodes store the more specific data about their own interactions, and then reconciling the two.
  • For example, an individual worker stores its own utility over time, but to interact with the network that history has to reconcile with its current utility, which is stored officially by a validator. The validator's record is a summarization (in this case an average) of the more granular data stored elsewhere (see the sketch after this list).
  • This can apply to any node-specific data, so we distribute the granularity while maintaining our ability to verify.
  • The main drawback of this approach for analysis is that the distributed nature makes the data much harder to aggregate and manipulate.
  • Furthermore, this path depends on technology still being developed for use within the protocol, potentially delaying its availability.
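To make the reconciliation idea concrete, here's a minimal Go sketch of the comparison, assuming a simple average as the validator's summary. All names (localUtility, reconcile, the tolerance) are illustrative, not protocol code:

package main

import (
	"fmt"
	"math"
)

// localUtility is the granular, per-interaction history a worker keeps itself.
type localUtility struct {
	samples []float64
}

// summarize reduces the local history to the aggregate a validator would store.
func (l localUtility) summarize() float64 {
	if len(l.samples) == 0 {
		return 0
	}
	var sum float64
	for _, s := range l.samples {
		sum += s
	}
	return sum / float64(len(l.samples))
}

// reconcile checks the worker's claimed summary against the validator's record.
func reconcile(w localUtility, validatorAvg, tolerance float64) bool {
	return math.Abs(w.summarize()-validatorAvg) <= tolerance
}

func main() {
	w := localUtility{samples: []float64{0.8, 0.9, 0.7}}
	fmt.Println(reconcile(w, 0.8, 1e-9)) // true: local history matches the official average
}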

The other main point is a question of separation of concerns and unnecessary data:

  • An event-driven system that feeds a centralized data store for analysis lets us avoid persisting granular data that is only used for tuning the protocol.
  • This approach is also more modular and easier to modify without cascading implications for the protocol's actual logic.

The main drawbacks to implementing an event-driven data layer are:

  • The costs associated with the cloud infrastructure.
  • The upfront engineering needed to stand up the main components (maintenance and extensibility are much easier afterwards).

Summary

I believe implementing an event-driven infrastructure should be a priority, as it would:

  • avoid the bloat of storing granular, analysis-only data on the protocol.
  • begin to decouple analytics by leaning on an SDK as the layer that extracts information from the protocol.
  • create an immediate pathway for saving protocol information in support of disaster recovery.

Acceptance Criteria:

  • A more detailed analysis of the implications of each decision, particularly regarding proximate work streams.
  • A roadmap for the growth of analytics in Q3 and Q4 that addresses the noted limitations, with a high-level view toward amelioration.
mudler changed the title from "Protocol Analysis" to "spike: Protocol Analysis" on Jul 1, 2024
theMultitude commented:

SDK Ticket

theMultitude commented Jul 8, 2024

The simplified flow to understand:

  • Oracles
    • request received/submitted
    • request sent/error
  • Worker(s)
    • request received
    • work completed/error
  • Validator
    • work received
    • work reviewed
    • utility/rewards parsed

Further, I'll outline a consistent structure to make parsing this data easier.

It'll follow something like:

{
  "timestamp": "2024-07-05T10:30:00Z",
  "node_id": "oracle_id",
  "event_type": "request_received",
  "details": {
    "cid": "...",
    "...": "additional event-specific key/value pairs"
  }
}

{
  "timestamp": "2024-07-05T10:30:05Z",
  "node_id": "worker_1",
  "event_type": "work_received",
  "details": {
    "cid": "...",
    "received_from": "oracle_id"
  }
}
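If we settle on this envelope, a Go representation could look like the sketch below; the Event type and the schemaless Details map are my assumptions, not agreed protocol code:

package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event mirrors the JSON envelope above; Details stays schemaless so each
// event type can attach its own fields (cid, received_from, etc.).
type Event struct {
	Timestamp time.Time              `json:"timestamp"`
	NodeID    string                 `json:"node_id"`
	EventType string                 `json:"event_type"`
	Details   map[string]interface{} `json:"details"`
}

func main() {
	e := Event{
		Timestamp: time.Date(2024, 7, 5, 10, 30, 5, 0, time.UTC),
		NodeID:    "worker_1",
		EventType: "work_received",
		Details: map[string]interface{}{
			"cid":           "...",
			"received_from": "oracle_id",
		},
	}
	b, _ := json.MarshalIndent(e, "", "  ")
	fmt.Println(string(b)) // reproduces the second example above
}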

Luka-Loncar removed the triage label on Jul 8, 2024
theMultitude self-assigned this on Jul 8, 2024
theMultitude commented Jul 8, 2024

@j2d3 please add your thoughts/hesitations here ASAP so we can dig into a resolution. I will add more specific data structures once we're straightened out and ready to proceed.

cc @teslashibe

jdutchak (Contributor) commented Jul 9, 2024

@theMultitude this is the code that ships a JSON payload to S3: 563c150

and this is how you would call it, where jsonPayload contains the data to send:

err = db.SendToS3(id, jsonPayload)
if err != nil {
    logrus.Errorf("[-] Failed to send oracle data: %v", err)
}
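For context, a hypothetical assembly of jsonPayload for one of the proposed events might look like the following. The payload shape follows the event structure sketched earlier in this thread; whether SendToS3 expects a map or marshaled bytes depends on the committed code in 563c150:

// Sketch only: build an oracle event in the proposed envelope shape.
jsonPayload := map[string]interface{}{
    "timestamp":  time.Now().UTC().Format(time.RFC3339),
    "node_id":    "oracle_id",
    "event_type": "request_received",
    "details":    map[string]interface{}{"cid": "..."},
}

err = db.SendToS3(id, jsonPayload)
if err != nil {
    logrus.Errorf("[-] Failed to send oracle data: %v", err)
}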

theMultitude commented:

Summary of Q3 Work Streams

From an analytics perspective, one of the most commonly encountered issues is realizing you haven't collected the data needed for the analysis you now want to run. As we go about fine-tuning an economic model and stabilizing the protocol during organic growth, we don't want to find ourselves in that situation. In contrast to periodic data pulls, which offer static glimpses, event-driven analytics gives visibility into critical state changes. The following is an outline of the data streams I see as essential to analytics work within the current quarter (Q3 2024) at Masa.

  1. Node State (vertices) - How does node state change over time? Node state at any point in time is encompassed within the nodeData structure as it currently exists. However, understanding how node state evolves is important for understanding the make-up of our network and how nodes mature over time.

  2. Node Relationships (edges) - How do nodes relate to one another? I want to understand which nodes interact with other nodes and how those patterns develop over time.

  3. Work Threads - How does a request for data from the protocol propagate and come to completion?

These data streams don't need to exist immediately, but taking the time to carve out their foundations now will make refining them far easier as we move forward.
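To ground the three streams, here is one possible set of event records in Go; all type and field names are hypothetical placeholders, not an agreed schema:

package events

import "time"

// NodeStateChanged captures a vertex: a snapshot of nodeData whenever it changes.
type NodeStateChanged struct {
	Timestamp time.Time   `json:"timestamp"`
	NodeID    string      `json:"node_id"`
	NodeData  interface{} `json:"node_data"` // current nodeData snapshot
}

// NodeInteraction captures an edge: one record per node-to-node interaction.
type NodeInteraction struct {
	Timestamp time.Time `json:"timestamp"`
	FromNode  string    `json:"from_node"`
	ToNode    string    `json:"to_node"`
	Kind      string    `json:"kind"` // e.g. "work_sent", "validation"
}

// WorkThreadEvent ties lifecycle events for one request together via ThreadID.
type WorkThreadEvent struct {
	Timestamp time.Time `json:"timestamp"`
	ThreadID  string    `json:"thread_id"`
	NodeID    string    `json:"node_id"`
	EventType string    `json:"event_type"` // request_received, work_completed, ...
}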

Associated Tickets
