Print traces for missed events in upgrade tests #6249

Merged
merged 10 commits into from
Apr 1, 2022

Contributor

@mgencur mgencur commented Mar 9, 2022

This builds on top of #6219
Fixes point 3. from #6145 (comment)

Proposed Changes

  • Add a new Tag called "target" to the Traces that helps differentiate traces from different tests running in parallel.
  • Search Zipkin endpoint for traces for Events that were not delivered. Store each trace in a json file separately.
  • Step events are stored in step-<num>.json. Finished event is stored in finished.json under traces/missed-events/
  • Readme with instructions on how to visualize saved traces
  • Increase memory limits for Zipkin Pod. Increase -Xmx settings for gathering more traces from upgrade tests

Pre-review Checklist

  • At least 80% unit test coverage
  • E2E tests for any new behavior
  • Docs PR for any user-facing impact
  • Spec PR for any new API feature
  • Conformance test for any change to the spec

Release Note


Docs

@knative-prow-robot knative-prow-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 9, 2022
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mgencur
To complete the pull request process, please assign aliok after the PR has been reviewed.
You can assign the PR to them by writing /assign @aliok in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the area/test-and-release Test infrastructure, tests or release label Mar 9, 2022
@mgencur mgencur force-pushed the wathola_tracing_main_printtrace branch from eceb1b3 to 9da3205 Compare March 9, 2022 15:19
@mgencur
Contributor Author

mgencur commented Mar 9, 2022

This log shows printing a Zipkin trace for an event that is missed (the link points exactly to the line where the trace is printed)

The problem is that I had to increase the interval between events from 10ms to 500ms to prevent trace spans from being dropped: the in-memory Zipkin database can't handle such a big load and runs out of memory several times during a regular test run. This was just to demonstrate the missed event and to get the trace from Zipkin. The change will not be part of this PR. We need to discuss possible options to resolve this.

@codecov

codecov bot commented Mar 9, 2022

Codecov Report

Merging #6249 (85836d5) into main (3890b39) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #6249   +/-   ##
=======================================
  Coverage   82.18%   82.18%           
=======================================
  Files         231      231           
  Lines        7787     7787           
=======================================
  Hits         6400     6400           
  Misses        937      937           
  Partials      450      450           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3890b39...85836d5. Read the comment docs.

@mgencur
Contributor Author

mgencur commented Mar 10, 2022

The output is meant to be readable from the console: I've converted the HTTP response from Zipkin into a "trace tree" and reused the tree API from the conformance tests. I'm not sure a GUI is necessary for reading the trace; it doesn't give you much beyond search capabilities, and this is already a single trace (no search needed).
Anyway, if we wanted to use a GUI to view the trace, it could be done this way:

  • Change the impl to print to console exactly the response from the Zipkin http endpoint (it is JSON - an array of Spans)

Then later, the user would do this:

  • Start zipkin container locally by podman run -p 9411:9411 ghcr.io/openzipkin/zipkin:2
  • Send the JSON to Zipkin container by this:
curl -X POST localhost:9411/api/v2/spans \
   -H 'Content-Type: application/json' \
   -d "<json_with_spans>"
  • Open the UI of Zipkin at http://localhost:9411
  • View the trace in UI
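For anyone who would rather script the upload than use curl, the same POST can be sketched in Go. This is a minimal sketch under assumptions: `newUploadRequest` is a hypothetical helper, the base URL assumes the local container started as in the steps above, and the input file is assumed to hold a JSON array of spans as returned by Zipkin.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
)

// newUploadRequest (hypothetical helper) builds the same request as the curl
// command above: POST <baseURL>/api/v2/spans with a JSON body.
func newUploadRequest(baseURL string, spans []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v2/spans", bytes.NewReader(spans))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: upload <file-with-spans.json>")
		return
	}
	spans, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("cannot read spans:", err)
		return
	}
	// Assumes Zipkin is listening locally on port 9411, as in the steps above.
	req, err := newUploadRequest("http://localhost:9411", spans)
	if err != nil {
		fmt.Println("cannot build request:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("upload failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("zipkin responded:", resp.Status)
}
```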

Member

@pierDipi pierDipi left a comment

I love this PR!

I'm wondering if we can make it a library that would work in any rekt tests out of the box

@mgencur
Contributor Author

mgencur commented Mar 10, 2022

@pierDipi thanks ! :) Making it a library would probably make sense. Let me think about it.

@cardil
Contributor

cardil commented Mar 10, 2022

I would like to have the span saved as an artifact, to later inspect, as you've described @mgencur. I think reading the span in text format is cumbersome. Loading it to GUI makes more sense to me.

@cardil
Contributor

cardil commented Mar 10, 2022

The message could be:

event #10 should be received once, but was received 0 times, trace saved as: /artifacts/traces/missed-events/event-10.json

@mgencur
Contributor Author

mgencur commented Mar 10, 2022

@cardil I guess the logic for getting the trace will need to move from the Receiver (in-cluster) to the Prober (localhost), so the message probably won't include anything about the trace. The Prober can later get traces for both Step events (based on report.Thrown.Missing) and the Finished event (if the state is still active) and store them in artifacts.

@mgencur
Contributor Author

mgencur commented Mar 11, 2022

Pushed changes for exporting traces into files.
This log shows exporting traces for step event #10. The trace is then available in artifacts

@mgencur
Contributor Author

mgencur commented Mar 14, 2022

I have figured out memory settings for Zipkin that work for both the upgrade tests and the non-upgrade GitHub tests. The Zipkin pod is now able to store all traces, and the upgrade tests properly get the trace at the end, as can be seen in this run.

@mgencur mgencur force-pushed the wathola_tracing_main_printtrace branch from 3d97325 to 95ca3c8 Compare March 14, 2022 11:26
@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2022
@mgencur mgencur force-pushed the wathola_tracing_main_printtrace branch from 68bca22 to 2fec7c7 Compare March 29, 2022 13:30
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 29, 2022
@knative-prow knative-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 29, 2022
@mgencur
Contributor Author

mgencur commented Mar 29, 2022

Latest run which shows exporting a trace for a missed event.

@mgencur mgencur changed the title [WIP] Print traces for missed events in upgrade tests Print traces for missed events in upgrade tests Mar 29, 2022
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 29, 2022
@mgencur
Contributor Author

mgencur commented Mar 30, 2022

The failure in downstream tests for eventing-rabbitmq seems to be unrelated to this PR.

Comment on lines 111 to 123
func (p *prober) getTraceForStepEvent(eventNo string) []byte {
	query := fmt.Sprintf("step=%s and cloudevents.type=%s and target=%s",
		eventNo, event.StepType, fmt.Sprintf(forwarderTargetFmt, p.client.Namespace))
	trace, err := event.FindTrace(query)
	if err != nil {
		p.log.Warn(err)
	}
	return trace
}

func (p *prober) getTraceForFinishedEvent() []byte {
	query := fmt.Sprintf("cloudevents.type=%s and target=%s",
		event.FinishedType, fmt.Sprintf(forwarderTargetFmt, p.client.Namespace))
Member

Do these queries work across source, broker, channel implementations?

Contributor Author

They should work because these Tags are added by the Wathola sender when generating the events: link.
They're basically added at the beginning of the Trace, and the queries return all traces that have these tags.
Right now it should work anywhere the HTTP sender is used because it properly passes the Context (propagates the trace). I still want to add it to the KafkaSender here (I have the code locally but am waiting for the knative/eventing changes to propagate to that repo).

Contributor

@mgencur keep in mind that wathola sender has retries implemented (see here). That means we could have multiple spans for a single step event, and we should report all of them.

Contributor Author

@cardil The query should return all traces for step/finished events with the given step/type, as long as the event was not delivered in the end. For a missed event there might be multiple traces included in the report if the sender tried to re-deliver it and failed. The Zipkin UI should be able to display them all.
I am not currently saving traces for duplicated events (since they were delivered in the end), so I'm sending another commit that should fix that. Testing it now...

@mgencur mgencur force-pushed the wathola_tracing_main_printtrace branch from 34f620d to c126521 Compare April 1, 2022 07:04
@mgencur
Contributor Author

mgencur commented Apr 1, 2022

@cardil Pushed some minor fixes. The previous run had both a missed event and duplicate events.
If there's a duplicate event, all of its traces are stored in a single file and can be displayed as below (the event was delivered 4 times):
step-10
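Without opening the UI, the number of traces inside such a file can be checked by grouping spans on their trace ID. The sketch below assumes the file is a flat JSON array of Zipkin v2 spans (the format accepted by `/api/v2/spans`); the `span` type and `countTraces` helper are hypothetical and decode only the fields needed here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// span holds the subset of Zipkin v2 span fields needed for grouping.
type span struct {
	TraceID string `json:"traceId"`
	ID      string `json:"id"`
	Name    string `json:"name"`
}

// countTraces (hypothetical helper) returns the number of spans per trace ID
// found in a flat JSON array of spans.
func countTraces(data []byte) (map[string]int, error) {
	var spans []span
	if err := json.Unmarshal(data, &spans); err != nil {
		return nil, err
	}
	perTrace := make(map[string]int)
	for _, s := range spans {
		perTrace[s.TraceID]++
	}
	return perTrace, nil
}

func main() {
	// Made-up sample data: two traces, as a duplicate delivery might produce.
	sample := []byte(`[
		{"traceId": "aaa", "id": "1", "name": "send"},
		{"traceId": "aaa", "id": "2", "name": "receive"},
		{"traceId": "bbb", "id": "3", "name": "send"}
	]`)
	perTrace, err := countTraces(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("distinct traces:", len(perTrace))
}
```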

Contributor

@cardil cardil left a comment

😍 Looks great!

/lgtm

Comment on lines +123 to +142
#### Inspecting Zipkin traces for undelivered events

When tracing is enabled in the `config-tracing` config map in the system namespace
the prober collects traces for undelivered events. The traces are exported as json files
under the artifacts dir. Traces for each event are stored in a separate file.
Step event traces are stored as `$ARTIFACTS/traces/missed-events/step-<step_number>.json`
The finished event traces are stored as `$ARTIFACTS/traces/missed-events/finished.json`

Traces can be viewed as follows:
- Start a Zipkin container on localhost:
```
$ docker run -d -p 9411:9411 ghcr.io/openzipkin/zipkin:2
```
- Send traces to the Zipkin endpoint:
```
$ curl -v -X POST localhost:9411/api/v2/spans \
-H 'Content-Type: application/json' \
-d @$ARTIFACTS/traces/missed-events/step-<step_number>.json
```
- View traces in Zipkin UI at `http://localhost:9411/zipkin`
Contributor

Awesome!

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Apr 1, 2022
@knative-prow

knative-prow bot commented Apr 1, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cardil, mgencur

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 1, 2022
@knative-prow knative-prow bot merged commit 380849f into knative:main Apr 1, 2022
mgencur added a commit to mgencur/eventing that referenced this pull request Apr 20, 2022
* Upgrade tests reporting Trace information for missed events

* TMP: Induce missed event

* Revert "TMP: Induce missed event"

This reverts commit 2fec7c7.

* Report trace also for Duplicated events

* TMP: Induce missed event

* TMP: Simulate duplicate events

* Fix readme

* Unify path for duplicate and missed events

* Revert "TMP: Simulate duplicate events"

This reverts commit c126521.

* Revert "TMP: Induce missed event"

This reverts commit fcd9185.
knative-prow bot pushed a commit that referenced this pull request Apr 20, 2022
* Wathola Tracing for upgrade tests (#6219)

* wathola exposing trace information

* Run update-deps.sh

* Fix license

* Fix import

* Ensure backwards compatibility

* Assert ParentID not nil in test

* Separate old and new events sender APIs

* Make loggingCfg in client private

* Wait only 1 second for flushing tracing info

The Reporter is created with a default batch interval 1 second. So, it
should be enough to wait just 1 second because the data is flushed every
1 second.

* Increase the sleep time to 1.5 seconds to be safe

* The ticker runs every 100ms so it could be 1100 ms until the buffer
really flushes.

* Use Log.Fatal when tracing is not set up properly

* Increase the sleep time to 5 seconds and reference knative/pkg issue

* Process empty tracing config in test images (#6289)

* Print traces for missed events in upgrade tests (#6249)

* Upgrade tests reporting Trace information for missed events

* TMP: Induce missed event

* Revert "TMP: Induce missed event"

This reverts commit 2fec7c7.

* Report trace also for Duplicated events

* TMP: Induce missed event

* TMP: Simulate duplicate events

* Fix readme

* Unify path for duplicate and missed events

* Revert "TMP: Simulate duplicate events"

This reverts commit c126521.

* Revert "TMP: Induce missed event"

This reverts commit fcd9185.

* Do not fail upgrade tests if tracing is not configured (#6299)

* Do not fail upgrade tests if tracing is not configured

* TMP: Do not deploy Knative Monitoring

* Revert "TMP: Do not deploy Knative Monitoring"

This reverts commit 086a8f9.

* Limit the number of exported traces (#6329)

Exporting traces for a large number of events can exceed the timeout of
the whole test suite, leading to all upgrade tests being reported as
failed.

* Cleanup Zipkin tracing only once in upgrade test suite (#6331)

* NPE fix (#6343)

Co-authored-by: Chris Suszynski <[email protected]>
knative-prow-robot pushed a commit to knative-prow-robot/eventing that referenced this pull request Apr 20, 2022
matzew pushed a commit to matzew/eventing that referenced this pull request Apr 20, 2022
knative-prow bot pushed a commit that referenced this pull request Apr 20, 2022
Co-authored-by: Martin Gencur <[email protected]>
Co-authored-by: Chris Suszynski <[email protected]>
openshift-merge-robot pushed a commit to openshift/knative-eventing that referenced this pull request Apr 21, 2022
Co-authored-by: Martin Gencur <[email protected]>
Co-authored-by: Chris Suszynski <[email protected]>
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/knative-eventing that referenced this pull request Apr 21, 2022
openshift-merge-robot pushed a commit to openshift/knative-eventing that referenced this pull request Apr 21, 2022
Co-authored-by: Martin Gencur <[email protected]>
Co-authored-by: Chris Suszynski <[email protected]>
hawkli-1994 pushed a commit to katanomi/knative-eventing that referenced this pull request Jul 24, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test-and-release Test infrastructure, tests or release lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants