PESDLC-1489 StreamVerifier service #19976

Merged
merged 2 commits into from
Jul 22, 2024

savex
Contributor

@savex savex commented Jun 24, 2024

This implements the service part of the StreamVerifier that is created in this PR.

The service supports all commands, including 'produce', 'consume' and 'atomic', as well as getting the current status and the active command.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@savex savex requested review from clee and bharathv June 24, 2024 21:13
Contributor

@bharathv bharathv left a comment

This is marked as draft, so I am unsure whether it is going in in its current form, but please see my comments.

tests/rptest/transactions/verifiers/stream_verifier.py Outdated
tests/rptest/transactions/verifiers/stream_verifier.py Outdated
tests/rptest/transactions/verifiers/stream_verifier.py Outdated
tests/rptest/transactions/verifiers/stream_verifier.py Outdated
tests/setup.py Outdated
tests/rptest/transactions/verifiers/stream_verifier.py Outdated
@savex savex force-pushed the 1489-stream-verifier-service branch 5 times, most recently from cda563c to e09ac04 Compare June 29, 2024 00:13
@savex
Contributor Author

savex commented Jun 29, 2024

The most recent commit shows what a very basic test structure might look like. One more thing to add is a message validation option on consumption via an externally pluggable validator.
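As a rough illustration of what an externally pluggable validator on consumption could look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Validator` signature, the `non_empty_value` check and the `consume_with_validation` helper are hypothetical names and are not taken from this PR.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical contract: a validator gets (key, value) and returns True if valid.
Validator = Callable[[bytes, bytes], bool]


def non_empty_value(key: bytes, value: bytes) -> bool:
    """Example validator (assumption): a message is valid if its value is non-empty."""
    return len(value) > 0


def consume_with_validation(
    messages: Iterable[Tuple[bytes, bytes]],
    validator: Validator,
) -> Tuple[int, List[bytes]]:
    """Run the validator over consumed messages.

    Returns the number of valid messages and the keys of the invalid ones.
    """
    valid = 0
    invalid_keys: List[bytes] = []
    for key, value in messages:
        if validator(key, value):
            valid += 1
        else:
            invalid_keys.append(key)
    return valid, invalid_keys
```

Passing the validator as a plain callable keeps the consume path unaware of what "valid" means, so a test could swap in its own check without changing the service.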

@savex savex force-pushed the 1489-stream-verifier-service branch 6 times, most recently from 12fa904 to e1d15ac Compare July 8, 2024 22:54
@savex savex marked this pull request as ready for review July 8, 2024 22:54
@savex savex requested a review from bharathv July 8, 2024 22:54
@vbotbuildovich
Collaborator

vbotbuildovich commented Jul 9, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/51238#019094e8-d681-417f-a7c3-8db584f7d721:

"rptest.tests.delete_records_test.DeleteRecordsTest.test_delete_records_segment_deletion.cloud_storage_enabled=True.truncate_point=at_high_watermark"
"rptest.transactions.tx_admin_api_test.TxAdminTest.test_mark_transaction_expired"

new failures in https://buildkite.com/redpanda/redpanda/builds/51587#0190bc79-b124-448a-a387-9c4f087b5635:

"rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node"

new failures in https://buildkite.com/redpanda/redpanda/builds/51658#0190c184-0228-4a99-9546-e8430d8b45dd:

"rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node"

new failures in https://buildkite.com/redpanda/redpanda/builds/51658#0190c184-fa72-4f55-bca5-85c2cdc49e69:

"rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node"

@savex savex force-pushed the 1489-stream-verifier-service branch from fc8c811 to d7efb91 Compare July 12, 2024 22:26
@savex
Contributor Author

savex commented Jul 12, 2024

@bharathv, updated to fit the simplification. I will continue on Monday to debug and polish.
Right now there is a strange count mismatch happening:

[INFO:2024-07-12 22:09:04,105]: RunnerClient: rptest.transactions.stream_verifier_test.StreamVerifierTest.test_streaming_upgrade: Data: None
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   1 minute 39.690 seconds
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_streaming_restart_unsafely_and_add
status:     IGNORE
run time:   0.000 seconds
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_streaming_restart_safely_and_add
status:     FAIL
run time:   3 minutes 10.183 seconds

...
AssertionError: Produced/Atomic message count mismatch: 1250/1251
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_streaming_upgrade
status:     FAIL
run time:   2 minutes 4.507 seconds

...
AssertionError: Produced/Atomic message count mismatch: 996/69

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
============================================================================================================================================================================================================================================================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.8.18
session_id:       2024-07-12--012
run time:         7 minutes 1.258 seconds
tests run:        4
passed:           1
flaky:            0
failed:           2
ignored:          1
opassed:          0
ofailed:          0
opassedfips:      0
ofailedfips:      0
============================================================================================================================================================================================================================================================================================================================

This is most probably due to the lack of a produce retry after KafkaErrors when a node is unavailable.
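A minimal sketch of the kind of retry that is missing, under stated assumptions: `TransientProduceError` is a hypothetical stand-in for whatever retriable client error is raised while a node is unavailable, and `send` is any zero-argument callable that produces one message. This is not the code from the PR.

```python
import time
from typing import Callable


class TransientProduceError(Exception):
    """Hypothetical stand-in for a retriable client error."""


def produce_with_retry(send: Callable[[], None],
                       retries: int = 3,
                       backoff_sec: float = 0.0) -> int:
    """Call send() until it succeeds and return the number of attempts used.

    One initial attempt plus up to `retries` retries are made; the last
    error is re-raised once the retries are exhausted.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            send()
            return attempt
        except TransientProduceError:
            if attempt > retries:
                raise
            # Linear backoff between attempts (assumption, tune as needed).
            time.sleep(backoff_sec * attempt)
```

A blind retry can also duplicate a message if the first attempt did reach the broker, so the produced and atomic counts would still need to be compared after such a change.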

@vbotbuildovich
Collaborator

vbotbuildovich commented Jul 13, 2024

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51451#0190a94f-5ebe-47b9-baed-e16504ea0f32:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51451#0190a965-6c5a-43db-9d72-64e6387d099f:
pandatriage cache was not found

@savex savex force-pushed the 1489-stream-verifier-service branch 2 times, most recently from c43066b to cf47568 Compare July 15, 2024 19:15
@savex
Contributor Author

savex commented Jul 15, 2024

Sample test runs using EC2

test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 21.611 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 41.324 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 28.480 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 16.879 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 18.292 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 24.543 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 26.633 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 18.435 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 27.722 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_id:    rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node
status:     PASS
run time:   2 minutes 17.179 seconds
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
=============================================================================================================================================================================================================================================================================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.8.18
session_id:       2024-07-15--010
run time:         24 minutes 17.711 seconds
tests run:        10
passed:           10
flaky:            0
failed:           0
ignored:          0
opassed:          0
ofailed:          0
opassedfips:      0
ofailedfips:      0
=============================================================================================================================================================================================================================================================================================================================================

Contributor

@bharathv bharathv left a comment

bunch of nits.


class ConsistencyViolationException(Exception):
    pass

Contributor

unused?

Contributor Author

Yes, missed that. Thanks.

self.logger.debug(f"[{action}] ...{count} messages")
return count

def ensure_progress(self, action, delta, timeout_sec):

Contributor

unused?

Contributor Author

In these tests, yes, but it is useful when the service is used alongside node operations. I would leave it.
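For context, a progress check of this kind could be sketched as below. This is an assumption about the intent, not the method from the PR: the real `ensure_progress(self, action, delta, timeout_sec)` takes an action name, while this standalone version takes a hypothetical `get_count` callable and a `poll_sec` parameter so that it can run on its own.

```python
import time
from typing import Callable


def ensure_progress(get_count: Callable[[], int],
                    delta: int,
                    timeout_sec: float,
                    poll_sec: float = 0.1) -> int:
    """Wait until the counter grows by at least `delta` from its starting value.

    Returns the counter value that satisfied the check, or raises
    TimeoutError if the deadline passes first.
    """
    start = get_count()
    deadline = time.monotonic() + timeout_sec
    while True:
        current = get_count()
        if current - start >= delta:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no progress: {current - start} < {delta} "
                f"after {timeout_sec}s")
        time.sleep(poll_sec)
```

Such a helper is handy around node operations (restart, add, upgrade), where a test wants to confirm that messages keep flowing rather than wait for a final total.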

def get_produce_status(self):
    return self._get(self._get_url("produce"), raise_on_fails=False)

def get_consume_status(self):

Contributor

nit on naming, make it consistent with "verify" semantics in the other class?

Contributor Author

yes, will do

self.target_topic = "stream_topic_dst"

# Calculated speed of producer is ~100k messages per minute, which is ~1500/sec
# Amtomic thread processed ~250 messages per second.

Contributor

nit: typo

Contributor Author

Done

self.verifier.remote_start_produce(self.source_topic,
                                   self.default_message_count,
                                   messages_per_sec=messages_per_sec)
self.logger.info(f"Waiting for {wait_msg_count} produces messages")

Contributor

Is this missing a call to ensure progress or am I misreading?

Contributor Author

@bharathv
Contributor

@savex Failure seems related.

@savex
Contributor Author

savex commented Jul 17, 2024

@bharathv, yes, I keep forgetting that CDT runs in docker and needs a different set of timings.

@savex savex force-pushed the 1489-stream-verifier-service branch from 3bc1cc1 to a2db278 Compare July 17, 2024 18:09
@savex
Contributor Author

savex commented Jul 17, 2024

/cdt
release
rp_version=build
skip-units
dt-repeat=20
tests/rptest/transactions/stream_verifier_test.py

@savex savex force-pushed the 1489-stream-verifier-service branch from a2db278 to d0b7563 Compare July 17, 2024 18:14
@savex
Contributor Author

savex commented Jul 17, 2024

/cdt
release
rp_version=build
skip-units
dt-repeat=20
tests/rptest/transactions/stream_verifier_test.py

@savex savex requested a review from bharathv July 18, 2024 18:24
    Based on a remote script that implements the stream verifier
    webservice, this service provides an easy-to-use interface for
    produce, atomic consume/produce and consume actions.

    Basic service workflow is
    verifier.start
    verifier.update_service_config
    verifier.remote_start_produce
    verifier.remote_start_atomic
    verifier.remote_wait_action('produce')
    verifier.remote_wait_action('atomic')
    verifier.remote_start_consume
    verifier.remote_wait_action('consume')
    Use verifier.get_produce_status to check on offsets and/or
    message count.
    The test checks that produce/atomic/consume actions can handle
    broker addition safely.

    Also, some internal unified routines are implemented to make the
    test itself look simpler.
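The workflow listed in the commit message can be sketched as a single function. The method names and their order come from the commit message, and the `remote_start_produce` argument shape follows the snippet quoted in the review. Everything else is an assumption: `RecordingVerifier` is a hypothetical stand-in that only records calls, and the config key and the arguments to `update_service_config`, `remote_start_atomic` and `remote_start_consume` are guesses.

```python
class RecordingVerifier:
    """Hypothetical stand-in for the service: it only records method names."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes not set in __init__,
        # i.e. for the service methods called by the workflow.
        def record(*args, **kwargs):
            self.calls.append(name)
            return {}
        return record


def run_workflow(verifier, source_topic, target_topic, message_count):
    """Drive produce -> atomic -> consume in the order from the commit message."""
    verifier.start()
    # Config key is an assumption; the real keys are not listed here.
    verifier.update_service_config({"msg_total": message_count})
    verifier.remote_start_produce(source_topic, message_count,
                                  messages_per_sec=100)
    # Argument lists below are assumptions for illustration.
    verifier.remote_start_atomic(source_topic, target_topic, message_count)
    verifier.remote_wait_action('produce')
    verifier.remote_wait_action('atomic')
    verifier.remote_start_consume(target_topic, message_count)
    verifier.remote_wait_action('consume')
    return verifier.get_produce_status()
```

Calling `run_workflow(RecordingVerifier(), "src", "dst", 100)` and inspecting `.calls` afterwards shows the call order without needing a cluster.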

Fixed typos and removed unused exception

Updated message count for non-dedicated nodes env
@savex savex force-pushed the 1489-stream-verifier-service branch from d0b7563 to 5aafe8c Compare July 19, 2024 19:08
@savex savex merged commit c04ae45 into dev Jul 22, 2024
25 checks passed
@savex savex deleted the 1489-stream-verifier-service branch July 22, 2024 14:07