Create Veracity repository #1

Merged
wp0pw merged 20 commits into main from dev/waldek/9479-move-veracity
May 28, 2024

Conversation

wp0pw
Contributor

@wp0pw wp0pw commented May 23, 2024

Create Veracity repository

Contributor

@robinbryce robinbryce left a comment

LGTM, Thanks!!

robinbryce and others added 20 commits May 28, 2024 11:33
* veracity demo tool for diagnosing and inspecting forestrie logs

AB#9304

feat: ability to recover the effects of bug 9308

tests: add tests for the diagnostic tool so it doesn't rot

review: use slices.Contains

* fix: default container name

* Add massif pre-computes tool

* review: add function comments

* review: happy lefting

* review: use a slightly more specific type

---------

Co-authored-by: Robin Bryce <[email protected]>
* CI experiment AB#9372

* Revert PR#127 changes AB#9372
* parameterise massif height

re AB#9232

* fixup! parameterise massif height

* fixup! fixup! parameterise massif height

---------

Co-authored-by: Henry Jewell <[email protected]>
Also driveby consistent naming of idTimestamp outside of the snowflakeid package.

re: AB#9421

Co-authored-by: jgough <[email protected]>
* Move the trie key format to H(domain || tenantid || eventid)

re: AB#9419

---------

Co-authored-by: jgough <[email protected]>
* Move the trie key format to H(domain || logid || appid)

re: AB#9419
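
A minimal sketch of deriving a trie key of the form H(domain || logid || appid). It is illustrative only: the PR text does not name the hash function or the encoding of the fields, so SHA-256, the one-byte `KeyDomain` type and the `NewTrieKey` helper are all assumptions rather than the repository's actual API.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// KeyDomain is a hypothetical one-byte domain separator.
// The real domain values are not given in this PR.
type KeyDomain byte

// NewTrieKey returns H(domain || logID || appID).
// Assumption: H is SHA-256; the source only writes "H".
func NewTrieKey(domain KeyDomain, logID, appID []byte) []byte {
	h := sha256.New()
	h.Write([]byte{byte(domain)}) // domain separation comes first
	h.Write(logID)
	h.Write(appID)
	return h.Sum(nil)
}

func main() {
	key := NewTrieKey(0, []byte("log-id"), []byte("app-id"))
	fmt.Printf("%x\n", key)
}
```

Plain concatenation is only unambiguous if the fields before the last one have a fixed length (or are length prefixed); otherwise two different (logid, appid) pairs could produce the same hash input.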

---------

Co-authored-by: jgough <[email protected]>
* add missing completeness dep

re AB#9191

* remove eth dependency

re AB#9191

---------

Co-authored-by: Henry Jewell <[email protected]>
…logs (#176)

* feat: single fixed cost query to find all the recently updated forestrie logs

This pr introduces the intended scheme for efficiently and cost
effectively discovering forestrie logs that have been updated since some
specific previous point.

Each time the blob is updated we set a tag containing the idtimestamp
value of the last leaf entry added to the log.

A demo tool in this pr provides a logwatch tool which issues a single
find tags query every second. The query takes 2-30 ms and returns all
the blobs for all the tenants in a single hit.

The service code will be updated in subsequent work to restore the
batch sealing model we originally intended. This PR just shows that the
support added to go-datatrails-common behaves as we expect.

Some conditioning is then required to pick the "latest" blob for each
tenant, but no further list queries are required.

In the event that we occasionally discover a blob that has completed
since we checked, we just get the next one. This also does not require a
subsequent list query.

The idtimestamp on the tag can always be (and will be) checked against the
value stored in the blob header record.
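
The "conditioning" step described above, picking the latest blob for each tenant from a single query result, could look roughly like the sketch below. The `TaggedBlob` type and `LatestPerTenant` function are hypothetical names, and the sketch assumes the tag value has already been decoded to an integer and that a larger idtimestamp means a later entry. It works purely in memory and issues no further list queries.

```go
package main

import "fmt"

// TaggedBlob is a hypothetical record for one result of the find-by-tags query.
type TaggedBlob struct {
	Tenant      string // tenant identity parsed from the blob path
	Path        string // full blob path
	IDTimestamp uint64 // last-leaf idtimestamp decoded from the blob tag (assumed ordering: larger is later)
}

// LatestPerTenant keeps, for each tenant, the blob with the greatest idtimestamp.
func LatestPerTenant(found []TaggedBlob) map[string]TaggedBlob {
	latest := make(map[string]TaggedBlob, len(found))
	for _, b := range found {
		cur, ok := latest[b.Tenant]
		if !ok || b.IDTimestamp > cur.IDTimestamp {
			latest[b.Tenant] = b
		}
	}
	return latest
}

func main() {
	found := []TaggedBlob{
		{Tenant: "tenant-a", Path: "tenant-a/0", IDTimestamp: 10},
		{Tenant: "tenant-a", Path: "tenant-a/1", IDTimestamp: 20},
		{Tenant: "tenant-b", Path: "tenant-b/0", IDTimestamp: 5},
	}
	for tenant, b := range LatestPerTenant(found) {
		fmt.Println(tenant, b.Path, b.IDTimestamp)
	}
}
```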

AB#9402

* fix: remove secret values from git (keys have been rolled & platform informed)

* remove the dev keys from settings too

* fix: only run the tags integration test if configured for real storage

because the azurite emulator does not support filter by tags

* use explicit var to indicate storage auth availability

* remove experimental change to unrelated test case

* remove half implemented test that can't be completed with azurite

---------

Co-authored-by: Robin Bryce <[email protected]>
* Rename forestrie mmrblobs to massifs

re: AB#9461

---------

Co-authored-by: jgough <[email protected]>
* identify the latest blobs for each tenant

Please note this is work in progress. The logwatcher_*.go files will
move once the repository shuffling is done

AB#9402

* review: consistently provide function comments

* Update go-forestrie/demos/veracity/logwatcher_pathparse.go

Co-authored-by: Joe Gough <[email protected]>
Signed-off-by: robinbryce <[email protected]>

---------

Signed-off-by: robinbryce <[email protected]>
Co-authored-by: Robin Bryce <[email protected]>
Co-authored-by: Joe Gough <[email protected]>
Ran smoke tests while watching the output of the veracity watch
subcommand

AB#9402

Co-authored-by: Robin Bryce <[email protected]>
AB#9402

A service to monitor the recently active tenant massifs and seals.

It will be responsible for posting batches to a topic. The sealer will
pick up the batches, with the assumption that the batch is *at least*
those tenants active since the last batch, but which MAY include
redundant entries.

In-cluster tests show the idle query costs < 5m. The CPU cost, both
empirically and from the azure docs, should be fairly constant, and no
worse than proportional to the number of *matched blobs*

The azure request cost is 1. And we issue the query once per service
instance per deployment per interval. The interval defaults to 2
seconds. We could probably make that faster.

The time horizon under consideration is currently set to 30 seconds. The
azure docs suggest that is safe. But super safe would be about a minute
or two.

Other notables:

snowflakeid was causing unnecessary dependencies on datatrails event
api consuming code. And it was particularly painful for this change. So
I've provisionally refactored and moved to
go-datatrails/massifs/snowflakeid

task helm:test:forestrie was broken

It now correctly infers the --values files and renders the forestrie
chart correctly

The only remaining gap with respect to skaffold.yml is how skaffold
images are specified

review: minors and rootpublisher -> batchpublisher

Co-authored-by: Robin Bryce <[email protected]>
* Use go-datatrails-merklelog as a dependency for mmr, massifs and mmrtesting

re: AB#9467

---------

Co-authored-by: jgough <[email protected]>
* create shell of new tools module

re AB#9465

* fixup! create shell of new tools module

* rename tools to logverification

re AB#9465

* fixup! rename tools to logverification

* fixup! fixup! rename tools to logverification

---------

Co-authored-by: Henry Jewell <[email protected]>
* Move forestrie taskfile inline with common task runes
* Rejig go work and go modules
* Also add a compile task that compiles go services locally like in avid

re: AB#9473

---------

Co-authored-by: jgough <[email protected]>
* fixes: deal with very out of date seal blobs

And generally clean up the changes in the log confirmer.

We currently have a single horizon of 15 minutes. The intent is to have
a second instance of the watcher which is configured for a much longer
range horizon, but which has a poll rate of hours rather than minutes.

Note that in the limit, for the message bus message per event model we
were talking in the order of a day for the maximum re-try. Configuring
the watcher can achieve the same but offer more control and performance.

fix: memory usage for large batches

just do one at a time, rather than a pre pass to build a pending list

feat: batching for the active set broadcast

The service configuration gets to specify a maximum number of tenants per
broadcast message

All active tenants in a watch cycle get broadcast, but they get broken into
fixed size pages, to make retries sane and make the load on the sealer
predictable no matter how many watchers we have and how long their
horizons are.

Also, randomly shuffle the tenants in each broadcast to avoid weird
effects with tenants due to lexical ordering of tenant uuids
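
The shuffle-then-page behaviour described above can be sketched as follows. This is not the service code: `PageTenants` and its parameters are hypothetical, and the shuffle function is passed in so that callers can use `rand.Shuffle` from `math/rand` in production and a deterministic substitute in tests.

```go
package main

import (
	"fmt"
	"math/rand"
)

// PageTenants shuffles a copy of tenants and splits it into pages of at most
// pageSize entries. The shuffle parameter has the same shape as rand.Shuffle.
func PageTenants(tenants []string, pageSize int, shuffle func(n int, swap func(i, j int))) [][]string {
	if pageSize <= 0 || len(tenants) == 0 {
		return nil
	}
	// Work on a copy so the caller's slice keeps its order.
	shuffled := make([]string, len(tenants))
	copy(shuffled, tenants)
	shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})

	pages := make([][]string, 0, (len(shuffled)+pageSize-1)/pageSize)
	for start := 0; start < len(shuffled); start += pageSize {
		end := start + pageSize
		if end > len(shuffled) {
			end = len(shuffled) // the last page may be short
		}
		pages = append(pages, shuffled[start:end])
	}
	return pages
}

func main() {
	tenants := []string{"t1", "t2", "t3", "t4", "t5"}
	for i, page := range PageTenants(tenants, 2, rand.Shuffle) {
		fmt.Println("page", i, page)
	}
}
```

Each page would then be sent as its own broadcast message, so a failed send only needs that page retried.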

* rebase to pick up go mod and task changes

* review: add missing function comment

* review: function comment

* fix: update the batch and page counters for the log messages

* attempt to deal with module rename

* build: bump to massifs v0.0.3

and restore accidentally deleted go.sum files

AB#9402

---------

Co-authored-by: Robin Bryce <[email protected]>
@wp0pw wp0pw force-pushed the dev/waldek/9479-move-veracity branch from b6ce4ea to 009d297 Compare May 28, 2024 10:37
@wp0pw wp0pw merged commit 9c355f4 into main May 28, 2024
1 check passed
@wp0pw wp0pw deleted the dev/waldek/9479-move-veracity branch May 28, 2024 10:45

5 participants