| Metadata | |
| --- | --- |
| Point of contact | @yaahc |
| Team(s) | compiler, infra |
| Goal document | 2025h1/metrics-initiative |
| Tracking Issue | rust-lang/rust#128914 |
Summary
Build out support for metrics within the Rust compiler, starting with a proof-of-concept dashboard for viewing unstable feature usage statistics over time.
Tasks and status
Activity
nikomatsakis commented on Feb 18, 2025
This issue is intended for status updates only.
For general questions or comments, please contact the owner(s) directly.
yaahc commented on Feb 25, 2025
I'm very excited to see that this got accepted as a project goal 🥰 🎉
Let me go ahead and start by giving an initial status update of where I'm at right now.
I've added a new tool, `rust-lang/rust/src/tools/features-status-dump`, which dumps the status information for all unstable, stable, and removed features as a JSON file.

To gather the data I build with `RUSTFLAGS_NOT_BOOTSTRAP="-Zmetrics-dir=$HOME/tmp/metrics" ./x build --stage 2` and run `./x run src/tools/features-status-dump/`, save the output to the filesystem, and convert the output to the line protocol with the aforementioned program.

From `unstable_feature_usage_metrics-rustc_hir-3bc1eef297abaa83.json`:

Snippet of unstable feature usage metrics post conversion to line protocol

Snippet of feature status metrics post conversion to line protocol

Run with `influxdb3 query --database=unstable-feature-metrics --file query.sql`.
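As a rough illustration of that conversion step (not the actual converter; the measurement, tag, and field names below are assumptions, not the real schema), a minimal Rust sketch of emitting one InfluxDB line-protocol record per feature usage count might look like this:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Format one InfluxDB line-protocol record:
///   measurement,tag=value field=value timestamp
/// The measurement, tag, and field names here are hypothetical.
fn to_line_protocol(feature: &str, crate_name: &str, count: i64, ts_ns: u128) -> String {
    format!("unstable_feature_usage,feature={feature},crate={crate_name} count={count}i {ts_ns}")
}

fn main() {
    let now_ns = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before unix epoch")
        .as_nanos();

    // In practice these records would be parsed out of the JSON dumps;
    // they are hard-coded here purely for illustration.
    let records = [("exact_div", "rustc_hir", 3_i64)];
    for (feature, krate, count) in records {
        println!("{}", to_line_protocol(feature, krate, count, now_ns));
    }
}
```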
yaahc commented on Feb 25, 2025
My next step is to revisit the output format, which is currently a direct JSON serialization of the data as it is represented internally within the compiler. This has already proven inadequate, both from personal experience, given the need for additional ad-hoc conversion into another format with faked timestamp data that wasn't present in the original dump, and from a conversation with @badboy (Jan-Erik), who recommended we explicitly avoid ad-hoc definitions of telemetry schemas, which can lead to difficult-to-manage chaos.
I'm currently evaluating what options are available to me, such as a custom system built around InfluxDB's line format, or OpenTelemetry's metrics API.
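To make the trade-off concrete, here is a hedged sketch of what a non-ad-hoc, explicitly versioned dump format could look like; the struct and field names, and the choice of `serde`/JSON, are assumptions for illustration rather than a settled design:

```rust
use serde::Serialize;

/// A versioned envelope so downstream consumers can detect schema changes.
/// All names here are hypothetical; they only illustrate the idea of an
/// explicit schema rather than an ad-hoc serialization of internal state.
#[derive(Serialize)]
struct MetricsDump {
    schema_version: u32,
    compiler_version: String,
    events: Vec<FeatureUsage>,
}

#[derive(Serialize)]
struct FeatureUsage {
    feature: String,
    /// Unix timestamp in seconds. The current dump carries no timestamp,
    /// which is what forced the faked timestamps mentioned above.
    timestamp: u64,
}

fn main() -> serde_json::Result<()> {
    let dump = MetricsDump {
        schema_version: 1,
        compiler_version: "rustc 1.87.0-nightly".to_string(),
        events: vec![FeatureUsage {
            feature: "exact_div".to_string(),
            timestamp: 1_740_000_000,
        }],
    };
    println!("{}", serde_json::to_string_pretty(&dump)?);
    Ok(())
}
```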
Either way, I want to use Firefox's telemetry system as inspiration and a basis for requirements when evaluating the output format options.
Relevant notes from my conversation w/ Jan-Erik
yaahc commented on Mar 3, 2025
After further review I've decided to limit scope initially and not get ahead of myself, so I can make sure the schemas I'm working with can support the kind of queries and charts we're eventually going to want in the final version of the unstable feature usage metric. I'm hoping that by limiting scope I can finish most of the items currently outlined in this project goal ahead of schedule, then move on to building the proper foundations based on the proof of concept and start designing more permanent components. As such I've opted for the following:

For the second item above I need to have more detailed conversations with both @rust-lang/libs-api and @rust-lang/lang.
yaahc commented on Apr 16, 2025
Small progress update:
Following the plan mentioned above plus some extra bits, I've implemented the following changes:

- Disambiguated the metrics file names with the same hash used for artifacts in the `.cargo` or `build` directories, which in the compiler is known as `extra_filename` and is configured by cargo, but it turns out this doesn't guarantee uniqueness.

Next Steps:

- `exact_div` (rust#139911 tracking the specific functions used)

yaahc commented on Apr 22, 2025
Posting this here so I can link to it in other places: I've set up the basic usage-over-time chart using some synthesized data that emulates quadratically increasing feature usage for my given feature over the course of a week (the generated data starts at 0 usages per day and ends at 1000 usages per day). The chart counts the usage over each day-long period and charts those counts over a week. The dip at the end is the difference between when I generated the data, after which there is zero usage data, and when I queried it.
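As a rough illustration of the shape of that synthesized data (not necessarily the actual script used), a minimal Rust sketch ramping daily usage counts quadratically from 0 to 1000 over one week:

```rust
/// Generate a week of synthetic daily usage counts that ramp quadratically
/// from 0 usages/day up to 1000 usages/day, mirroring the test data
/// described above. Purely illustrative; the real data was synthesized
/// separately and uploaded to InfluxDB.
fn main() {
    let days = 7u32;
    let max_per_day = 1000.0_f64;
    for day in 0..=days {
        // Quadratic ramp: count(t) = max * (t / days)^2
        let t = f64::from(day) / f64::from(days);
        let usages = (max_per_day * t * t).round() as u64;
        println!("day {day}: {usages} usages");
    }
}
```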
With this I should be ready to upload the data once we've gathered it from docs.rs. All I need to do is polish and export the dashboards I've made in Grafana to the rust-lang Grafana instance, connect that instance to the rust-lang InfluxDB instance, and upload the data to InfluxDB once it's gathered.
yaahc commented on May 26, 2025
Quick update: data is currently being gathered on docs.rs (and has been for almost 2 weeks now), and I should have it uploaded and accessible on the PoC dashboard within the next week or two (depending on how long I want to let the data gather).
yaahc commented on Jun 3, 2025
Bigger update:
I've done the initial integration with the data gathered so far since RustWeek. I have the data uploaded to the InfluxDB Cloud instance managed by the infra team, I've connected the infra team's Grafana instance to that InfluxDB server, and I've imported my dashboards, so we now have fancy graphs with real data on infra-managed servers 🎉
I'm now working with the infra team to see how we can open up access to the Grafana dashboard so that anyone can go and poke around and look at the data.
Another issue that came up is that the InfluxDB Cloud Serverless free instance we're currently using has a mandatory maximum 30-day retention policy, so either I have to figure out a way to get that disabled on our instance, or our data will get steadily deleted and will only be useful as a PoC demo dashboard for a short window of time.