Instrument the code with OpenTelemetry to improve monitoring abilities #256

Open
proffalken opened this issue Apr 25, 2024 · 4 comments · May be fixed by #257

Comments

@proffalken
Contributor

proffalken commented Apr 25, 2024

Is your feature request related to a problem? Please describe.

OpenTelemetry is rapidly becoming the standard for sending metrics, logs, and traces to platforms such as Grafana, Datadog, Honeycomb.io, and many more.

To have confidence that the platform is performing as expected, it would be good to instrument the existing code with OpenTelemetry. This removes any vendor lock-in around monitoring and observability whilst ensuring that administrators of the platform can see everything that is going on inside the system.

Describe the solution you'd like

Instrument at least the backend for major events such as user sign-up, door activation, and interlock communication.

The OpenTelemetry API defaults to a no-op when no SDK or exporter is configured, so even with the instrumentation in the code base, existing installations would not be affected. The new logging/metrics would be added alongside the existing outputs rather than replacing them.
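
As a rough illustration of the no-op point above, here is a minimal sketch of what instrumenting one backend event could look like. It is not code from MemberMatters: the `activate_door` handler, the `traced` helper, the tracer name, and the attribute names are all hypothetical, and the `ImportError` fallback is an extra assumption so the snippet also runs where the `opentelemetry-api` package is not installed.

```python
from contextlib import contextmanager

try:
    # Only the API package is used here. Without an SDK configured,
    # the tracer it returns produces non-recording (no-op) spans.
    from opentelemetry import trace

    _tracer = trace.get_tracer("membermatters.example")  # hypothetical name
except ImportError:
    # Assumption for this sketch: tolerate the package being absent.
    _tracer = None


@contextmanager
def traced(name, **attributes):
    """Open a span around a block if the OpenTelemetry API is available."""
    if _tracer is None:
        yield None
        return
    with _tracer.start_as_current_span(name) as span:
        for key, value in attributes.items():
            span.set_attribute(key, value)
        yield span


def activate_door(door_id, member_id):
    # Hypothetical handler; the real door logic is not shown in this issue.
    with traced("door.activation", door_id=door_id, member_id=member_id):
        return {"door_id": door_id, "member_id": member_id, "unlocked": True}
```

Calling `activate_door(3, 42)` returns the same dictionary whether or not an SDK is configured. Spans would only be exported once an installation opts in by setting up an SDK and exporter.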

Describe alternatives you've considered

We could continue down the Prometheus route, but that only covers metrics. It leaves out logs and traces, which can be incredibly helpful when troubleshooting a system split across multiple services, and that seems to be the way MemberMatters is headed.

Additional context

Full disclosure: I work for Grafana. However, OpenTelemetry would ensure that observability of MemberMatters remains vendor-agnostic.

This is something I'm happy to work on and contribute to for the backend. However, frontend observability within OpenTelemetry is not well advanced, so we would need to decide whether to leave the frontend side of things for now or look at a vendor-specific option such as Grafana Faro. Faro is open source (Apache License) but is tied to Grafana rather than being compatible with other vendors in the way that OpenTelemetry is.

@proffalken
Contributor Author

I've just remembered that there's support for Sentry in MemberMatters already.

Sentry are one of the few organisations who aren't involved with OpenTelemetry, but I'm not suggesting we should get rid of Sentry support here, just augment it with a platform that is more open :)

@jabelone
Member

Funnily enough, I’ve just gone through adding OpenTelemetry to a C# codebase for work. I think it’s a good idea! I also think Sentry and OpenTelemetry are best suited to different things. Sentry is great for catching unhandled errors, triage, and tracking, but OpenTelemetry is great for metrics etc. We also already have a Prometheus endpoint thanks to a Django add-on, but I don’t think it’s well (if at all) documented, and it’s only for Django-specific things.

Here are some of the metrics it currently exports.
[Image: IMG_6218]

I think I had to disable the Sentry feature in a previous release because it was causing issues with the configuration, and I definitely haven’t checked it in a long time.

@proffalken
Contributor Author

Oh awesome! I made a start on this last night. I'll continue on it over the next few days and see where it goes.

I'm talking specifically about getting up and running with OTEL at Monitorama this year, so I may well use MemberMatters as the demo app rather than the robot arm I've been trying to design and build in my spare time. I'll see how it goes!

proffalken added a commit to MakeMonmouth/MemberMatters that referenced this issue Apr 26, 2024
@proffalken proffalken linked a pull request Apr 26, 2024 that will close this issue
@proffalken
Contributor Author

Initial traces flowing into Grafana Cloud Tracing:
[Image]
