Skip to content
This repository has been archived by the owner on Nov 7, 2022. It is now read-only.

Fix starttime and summary has no quantiles for prometheus receiver #597

Conversation

fivesheep
Copy link
Contributor

To address start time for counter/histogram/summary, and take delta as values for these types.
For issue: #588

It also fixes the issue when summary has no quantiles, allow it to produce a summary with nil snapshots
For issue: #589

cc @dinooliva @rghetia

…as values for these types. It also fixes the issue when summary has no quantiles, allow it to produce a summary with nil snapshots
@codecov
Copy link

codecov bot commented Jul 1, 2019

Codecov Report

Merging #597 into master will decrease coverage by 0.15%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #597      +/-   ##
==========================================
- Coverage   69.13%   68.97%   -0.16%     
==========================================
  Files          92       93       +1     
  Lines        6110     6125      +15     
==========================================
+ Hits         4224     4225       +1     
- Misses       1665     1679      +14     
  Partials      221      221
Impacted Files Coverage Δ
...ceiver/prometheusreceiver/internal/metricfamily.go 100% <100%> (ø)
...iver/prometheusreceiver/internal/metricsbuilder.go 100% <100%> (ø) ⬆️
...eceiver/prometheusreceiver/internal/transaction.go 84.84% <100%> (-4.22%) ⬇️
receiver/prometheusreceiver/internal/ocastore.go 100% <100%> (+27.58%) ⬆️
receiver/prometheusreceiver/internal/logger.go 0% <0%> (-77.78%) ⬇️
receiver/prometheusreceiver/internal/metadata.go 0% <0%> (-75%) ⬇️
receiver/prometheusreceiver/metrics_receiver.go 67.21% <0%> (-4.92%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7b57179...a837176. Read the comment docs.

Copy link
Contributor

@rghetia rghetia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round. Some minor comments. still reviewing..

receiver/prometheusreceiver/README.md Outdated Show resolved Hide resolved
receiver/prometheusreceiver/README.md Outdated Show resolved Hide resolved
receiver/prometheusreceiver/internal/metadata.go Outdated Show resolved Hide resolved
…receiver-starttime

# Conflicts:
#	receiver/prometheusreceiver/internal/metricsbuilder.go
#	receiver/prometheusreceiver/internal/metricsbuilder_test.go
#	receiver/prometheusreceiver/internal/ocastore_test.go
#	receiver/prometheusreceiver/internal/transaction.go
#	receiver/prometheusreceiver/metrics_receiver_test.go
@fivesheep
Copy link
Contributor Author

@rghetia @dinooliva updated as per #598
please do review the rewritten TestEndToEnd which integrated all the new things together, it's the best place to describe the receiver's behavior

@fivesheep
Copy link
Contributor Author

the test failure is not related to the PR:

=== RUN   TestReception
--- FAIL: TestReception (0.00s)
    trace_receiver_test.go:47: Starting
    trace_receiver_test.go:50: StartTraceReception failed: listen udp :6832: bind: address already in use
FAIL
coverage: 80.5% of statements
FAIL	github.com/census-instrumentation/opencensus-service/receiver/jaegerreceiver	1.173s

@@ -79,44 +105,205 @@ func TestNew(t *testing.T) {
}
}

// Test data for EndToEnd test

var target1Page1 = `
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivesheep - yes, the endtoend test does help clarify things. A couple of comments:

  1. Adding a some comments on the target pages about what's being tested would be helpful.
  2. It looks like resets aren't tested - adding that to a test page would also be worthwhile
  3. It also looks like empty histogragms/summaries aren't tested here either - I think they would be worthwhile to have too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure will do.
I just checked the reset cases, it seems like when it resets, the adjuster will discard the value (use it as initial) when a reset is detected. Is it an expected behavior? I thought it would use the timestamp from previous scrape as starttime, and value as it is (starting from 0). it's not right or wrong here, just a decision of best-effort. if this is what we want, I can keep the test case this way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a reset acts like an initial point since we don't know when the reset actually occurred. Using the timestamp from the previous scrape does give a much more constrained lower bound than the initial point and may be worth considering - I need to think through the implications.

@rghetia - thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to treat reset as initial point. Consider this
0. At time T0 a job reported data

  1. At time T1 a job stops reporting for whatever reason.
  2. At time T2 the job resets
  3. At time T3 the job resumes reporting.

If time lag between T0 and T3 is within scraping interval then it is okay to use T0 as a starttime. Otherwise it will provide incorrect data. I would always prefer no-data vs incorrect data.

Copy link
Contributor

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase and resolve the merge conflicts.

…iver-starttime

# Conflicts:
#	receiver/prometheusreceiver/internal/metrics_adjuster.go
@fivesheep
Copy link
Contributor Author

@songy23 merged
@dinooliva I have add a test for the resetting scenario, in case we might change the behavior of using the previous scrape timestamp as starttime, I have also restructure the end to end tests to make it easy to modify and adjust

@songy23
Copy link
Contributor

songy23 commented Jul 29, 2019

Will merge this PR by EOD tomorrow if there're no further concerns/comments.

@songy23 songy23 merged commit b5a7ae3 into census-instrumentation:master Jul 30, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants