Support uploading blobs separately #3048

Open
Tracked by #1981
afck opened this issue Dec 17, 2024 · 1 comment


afck commented Dec 17, 2024

Blobs can be big, so they shouldn't have to share a gRPC message with each other or with a block. For downloads to the client, we can use gRPC subscriptions and stream them, but for uploads we cannot use them because of the web client.

When uploading a confirmed block certificate that uses blobs, the server cannot execute it if blobs are missing. It can, however, note in the blob state in storage that it now has seen proof for these blobs. So the client can now, in one message per blob, upload the blobs, and then re-upload the certificate.

When uploading a block proposal, the read blobs must already be available anyway (otherwise the client has to send them together with proof, which is already implemented). For the published blobs, validators could simply sign the proposal even if they don't have them yet. Once the client has the confirmed block certificate, it can use the above method to upload the certificate and the published blobs.
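The confirmed-certificate flow described above could look roughly like the following sketch. All types and method names here (`Validator`, `upload_blob`, `send_certificate`, the `NodeError` variants) are hypothetical stand-ins for illustration, not the actual Linera API; the point is only the order of operations: record proof, reject the certificate with a list of missing blobs, accept one blob per message, then retry.

```rust
use std::collections::{HashMap, HashSet};

// All names below are hypothetical stand-ins, not the real Linera types.
type BlobId = u64;

#[derive(Clone, Debug, PartialEq)]
struct Blob {
    id: BlobId,
    bytes: Vec<u8>,
}

struct Certificate {
    required_blobs: Vec<BlobId>,
}

#[derive(Debug, PartialEq)]
enum NodeError {
    BlobsNotFound(Vec<BlobId>),
    UnexpectedBlob(BlobId),
}

#[derive(Default)]
struct Validator {
    // Blob IDs for which a confirmed certificate (the "proof") has been seen.
    proven: HashSet<BlobId>,
    // Blob contents actually held in storage.
    blobs: HashMap<BlobId, Blob>,
    // Number of certificates executed so far.
    executed: usize,
}

impl Validator {
    fn handle_confirmed_certificate(&mut self, cert: &Certificate) -> Result<(), NodeError> {
        // Note the proof first, so that later blob uploads are accepted.
        self.proven.extend(cert.required_blobs.iter().copied());
        let missing: Vec<BlobId> = cert
            .required_blobs
            .iter()
            .copied()
            .filter(|id| !self.blobs.contains_key(id))
            .collect();
        if !missing.is_empty() {
            return Err(NodeError::BlobsNotFound(missing));
        }
        self.executed += 1;
        Ok(())
    }

    fn upload_blob(&mut self, blob: Blob) -> Result<(), NodeError> {
        // Only accept blobs that some confirmed certificate has vouched for.
        if !self.proven.contains(&blob.id) {
            return Err(NodeError::UnexpectedBlob(blob.id));
        }
        self.blobs.insert(blob.id, blob);
        Ok(())
    }
}

// Client side: try the certificate, upload missing blobs one per call, retry once.
fn send_certificate(
    validator: &mut Validator,
    cert: &Certificate,
    local: &HashMap<BlobId, Blob>,
) -> Result<(), NodeError> {
    match validator.handle_confirmed_certificate(cert) {
        Err(NodeError::BlobsNotFound(missing)) => {
            for id in missing {
                let blob = local
                    .get(&id)
                    .cloned()
                    .ok_or_else(|| NodeError::BlobsNotFound(vec![id]))?;
                validator.upload_blob(blob)?;
            }
            validator.handle_confirmed_certificate(cert)
        }
        other => other,
    }
}

fn main() {
    let local: HashMap<BlobId, Blob> = vec![1u64, 2]
        .into_iter()
        .map(|id| (id, Blob { id, bytes: vec![0; 8] }))
        .collect();
    let cert = Certificate { required_blobs: vec![1, 2] };
    let mut validator = Validator::default();
    let result = send_certificate(&mut validator, &cert, &local);
    let stored: usize = validator.blobs.values().map(|b| b.bytes.len()).sum();
    println!("{:?}, executed {}, stored {} bytes", result, validator.executed, stored);
}
```

In this sketch a blob uploaded before any certificate mentions it is rejected, which mirrors the idea that the validator needs proof before it stores anything.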

afck added this to the Testnet #2 milestone Dec 17, 2024
afck added a commit that referenced this issue Jan 7, 2025
## Motivation

For #3048 I will need
to add another RPC message, to _upload_ a blob. This means a
`BlobContent` can now be turned into both an
`RpcMessage::DownloadBlobContentResponse` and
`RpcMessage::UploadBlobContent`.

In general, it is dangerous to use `RpcMessage::from` to turn data types
into messages: It's easy to make a mistake when adding a new message
variant with a field that is identical to another message variant's
field that already has a `From` implementation, and to accidentally turn
the data into the unintended message.

## Proposal

Remove all `impl From<_> for RpcMessage` except for the `NodeError`.
Construct the messages explicitly.
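
To illustrate the hazard, here is a minimal sketch with simplified, hypothetical versions of the types (the real `RpcMessage` has many more variants, and its field types may differ). Only one variant can own the `From<BlobContent>` conversion, so a call site that means "upload" still compiles but silently produces the download response:

```rust
// Hypothetical, simplified stand-ins for the real types.
#[derive(Clone, Debug, PartialEq)]
struct BlobContent(Vec<u8>);

#[derive(Debug, PartialEq)]
enum RpcMessage {
    DownloadBlobContentResponse(BlobContent),
    UploadBlobContent(BlobContent),
}

// The pattern being removed: the conversion is tied to a single variant.
impl From<BlobContent> for RpcMessage {
    fn from(content: BlobContent) -> Self {
        RpcMessage::DownloadBlobContentResponse(content)
    }
}

// Intended as an upload, but `.into()` picks the download response.
fn upload_implicit(content: BlobContent) -> RpcMessage {
    content.into()
}

// Explicit construction states the intended variant at the call site.
fn upload_explicit(content: BlobContent) -> RpcMessage {
    RpcMessage::UploadBlobContent(content)
}

fn main() {
    let content = BlobContent(vec![1, 2, 3]);
    println!("implicit: {:?}", upload_implicit(content.clone()));
    println!("explicit: {:?}", upload_explicit(content));
}
```

With the `From` implementation gone, `upload_implicit` would no longer compile, which is the desired outcome.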

## Test Plan

This doesn't change the logic. CI should catch regressions.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Links

- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
afck added a commit that referenced this issue Jan 13, 2025
## Motivation

Including blobs with the gRPC message that contains a block proposal or
certificate severely limits the total size of the blobs. (See #3048.)

## Proposal

As a first step, remove the blobs from the
`handle_confirmed_certificate` functions and messages.

Instead, when a validator sees a fully signed confirmed block it creates
the blob states in its local storage even if it doesn't have the blobs
yet. The client can then upload the blobs one by one, and the validator
will accept them. Finally, the client can retry sending the certificate.

We don't do this for block proposals or validated blocks yet: These will
need to be handled differently, because in these cases the blob has not
necessarily been successfully published yet, so we should _not_ create a
blob state. Instead, we will put these blobs into a temporary cache.

## Test Plan

The existing tests are now using the new flow for confirmed block
certificates.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Links

- Part of #3048
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)

afck commented Jan 13, 2025

With #3108, this is solved for confirmed blocks.

For block proposals, we do want to have the published blobs available at validation time, to check their size. And for validated blocks, we want to avoid storing the blobs in the chain manager, but also can't write them to storage yet.

So we add to the chain state (as views; shown as simplified types here):

  • pending_validated_blobs: Option<(ValidatedBlockCertificate, Map<BlobId, Option<Blob>>)> with the validated block in the current round if it's still missing blobs.
  • pending_proposal_blobs: Map<Owner, (BlockProposal, Map<BlobId, Option<Blob>>)> with any pending proposals and their sets of published blobs. (Each owner is only allowed to make one proposal at a time. If the proposal is a re-proposal, all required blobs are included.)

Both fields are cleared whenever a new block height is reached and the chain manager is reset. The chain manager becomes a view with subviews, and the blobs belonging to the proposed and the locked block each become a MapView, too.

There is a new endpoint UploadPendingBlob(ChainId, BlobContent) that adds the blob to pending_proposal_blobs and pending_validated_blobs wherever it matches, handles any that now have their complete set of blobs, and then returns the ChainManagerInfo. So clients can now:

  • First upload their proposal/validated block.
  • Then upload all the blobs that belong to it.

Conversely, if a client receives a chain manager with a proposal or a locked block that they are missing blobs for, they request each missing blob with a new endpoint DownloadPendingBlob(ChainId, BlobId), which looks up the blob in locked_blobs and proposed_blobs. (Not in the "pending" fields.)

These changes should be done in two separate PRs:

  • Make locked_blobs and proposed_blobs separate MapViews, add a DownloadPendingBlob endpoint; do staged downloads.
  • Add pending_validated_blobs, pending_proposal_blobs, and an UploadPendingBlob endpoint; do staged uploads.
  • Check total published blob size already in the pending_proposal_blobs field. (Only for new proposals!)
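
As a rough sketch of the staged-upload side, assuming plain `BTreeMap`s instead of views and hypothetical placeholder types (`Owner` as an integer, proposals and certificates reduced to a round number), the two pending fields and the upload handler could look like this. The real endpoint would return the ChainManagerInfo; that is left out here, and "handling" a completed item is reduced to logging it.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins; the real fields would be views.
type BlobId = u64;
type Owner = u32;

#[derive(Clone, Debug, PartialEq)]
struct Blob {
    id: BlobId,
    bytes: Vec<u8>,
}

struct BlockProposal {
    round: u32,
}

struct ValidatedBlockCertificate {
    round: u32,
}

// One slot per required blob; `None` means "not uploaded yet".
type BlobSlots = BTreeMap<BlobId, Option<Blob>>;

fn empty_slots(ids: &[BlobId]) -> BlobSlots {
    ids.iter().map(|id| (*id, None)).collect()
}

// Stores the blob if it is wanted; returns true if the set is now complete.
fn fill(slots: &mut BlobSlots, blob: &Blob) -> bool {
    if let Some(slot) = slots.get_mut(&blob.id) {
        *slot = Some(blob.clone());
    } else {
        return false;
    }
    slots.values().all(Option::is_some)
}

#[derive(Default)]
struct ChainState {
    pending_validated_blobs: Option<(ValidatedBlockCertificate, BlobSlots)>,
    pending_proposal_blobs: BTreeMap<Owner, (BlockProposal, BlobSlots)>,
    // Stand-in for actually processing items whose blobs are complete.
    handled: Vec<String>,
}

impl ChainState {
    fn upload_pending_blob(&mut self, blob: Blob) {
        // Add the blob to the validated block, if it matches.
        let validated_complete = match &mut self.pending_validated_blobs {
            Some((_, slots)) => fill(slots, &blob),
            None => false,
        };
        if validated_complete {
            if let Some((cert, _blobs)) = self.pending_validated_blobs.take() {
                self.handled.push(format!("validated block, round {}", cert.round));
            }
        }
        // Add the blob to every proposal that is waiting for it.
        let mut ready = Vec::new();
        for (owner, (_, slots)) in self.pending_proposal_blobs.iter_mut() {
            if fill(slots, &blob) {
                ready.push(*owner);
            }
        }
        for owner in ready {
            if let Some((proposal, _blobs)) = self.pending_proposal_blobs.remove(&owner) {
                self.handled
                    .push(format!("proposal by owner {}, round {}", owner, proposal.round));
            }
        }
    }
}

fn main() {
    let mut state = ChainState::default();
    state.pending_validated_blobs =
        Some((ValidatedBlockCertificate { round: 3 }, empty_slots(&[1, 2])));
    state
        .pending_proposal_blobs
        .insert(7, (BlockProposal { round: 4 }, empty_slots(&[2])));
    for id in [2u64, 1] {
        let blob = Blob { id, bytes: vec![0; 4] };
        println!("uploading blob {} ({} bytes)", blob.id, blob.bytes.len());
        state.upload_pending_blob(blob);
    }
    println!("{:?}", state.handled);
}
```

Keying `pending_proposal_blobs` by owner reflects the rule that each owner has at most one pending proposal; a blob that matches nothing is simply ignored in this sketch.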

afck added a commit that referenced this issue Jan 13, 2025
## Motivation

Ultimately we want to transfer all blobs separately, rather than in a
single message.
(#3048)

This PR is one step towards that goal: When the client fetches the
locked block from a validator, it now requests the corresponding blobs
one by one, rather than all at once.

## Proposal

Add a `DownloadPendingBlob` endpoint; remove the locked blobs from the
`ChainManagerInfo`.
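
A minimal sketch of the staged download on the client side, under the assumption that the client learns the required blob IDs from the locked block and then issues one request per missing blob. The types, the `locked_block_blob_ids` field, and the `fetch_locked_blobs` helper are hypothetical; only the endpoint name `DownloadPendingBlob` comes from this PR.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real types.
type ChainId = u32;
type BlobId = u64;

#[derive(Clone, Debug, PartialEq)]
struct BlobContent(Vec<u8>);

// The info no longer carries the blobs, only enough to know which are needed.
struct ChainManagerInfo {
    locked_block_blob_ids: Vec<BlobId>,
}

#[derive(Default)]
struct ValidatorNode {
    locked_blobs: HashMap<(ChainId, BlobId), BlobContent>,
    // Counts requests, to show that only missing blobs are fetched.
    requests: usize,
}

impl ValidatorNode {
    // Stand-in for the `DownloadPendingBlob` endpoint: one blob per call.
    fn download_pending_blob(&mut self, chain_id: ChainId, blob_id: BlobId) -> Option<BlobContent> {
        self.requests += 1;
        self.locked_blobs.get(&(chain_id, blob_id)).cloned()
    }
}

// Fetches every blob the client does not have yet; returns the first ID
// that the validator could not provide.
fn fetch_locked_blobs(
    node: &mut ValidatorNode,
    chain_id: ChainId,
    info: &ChainManagerInfo,
    local: &mut HashMap<BlobId, BlobContent>,
) -> Result<(), BlobId> {
    for &blob_id in &info.locked_block_blob_ids {
        if local.contains_key(&blob_id) {
            continue;
        }
        let content = node.download_pending_blob(chain_id, blob_id).ok_or(blob_id)?;
        local.insert(blob_id, content);
    }
    Ok(())
}

fn main() {
    let mut node = ValidatorNode::default();
    node.locked_blobs.insert((0, 1), BlobContent(vec![1]));
    node.locked_blobs.insert((0, 2), BlobContent(vec![2]));
    let info = ChainManagerInfo { locked_block_blob_ids: vec![1, 2] };
    let mut local = HashMap::new();
    local.insert(1, BlobContent(vec![1]));
    let result = fetch_locked_blobs(&mut node, 0, &info, &mut local);
    println!("{:?} after {} request(s)", result, node.requests);
}
```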

## Test Plan

Existing tests exercise this scenario, e.g.
`test_finalize_locked_block_with_blobs`.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Links

- Part of #3048.
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
afck added a commit that referenced this issue Jan 15, 2025
## Motivation

Currently the chain manager is serialized as a whole and stored in a
`RegisterView` in the chain state view. Since it contains blobs it can
be very large, and the blobs are not needed every time the chain manager
is loaded.

## Proposal

Make the chain manager a `View`, and put the blobs in a `MapView`.

## Test Plan

This doesn't change any logic, so CI should catch regressions. (In fact,
it already did: #3133)

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Links

- In preparation for:
#3048
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)
afck added a commit that referenced this issue Jan 21, 2025
…3153)

## Motivation

Ultimately we want to transfer all blobs separately, rather than in a
single message.
(#3048)

This PR is another step towards that goal: The blobs required by a
validated block certificate are now uploaded separately, rather than in
the same message as the certificate itself.

## Proposal

Add a map of missing blobs for the highest-round validated block to the
chain state, and a `HandlePendingBlob` endpoint to populate that map.

## Test Plan

There are already tests covering different scenarios with validated
blocks' blobs. Where necessary, these were updated.

## Release Plan

- Nothing to do / These changes follow the usual release cycle.

## Links

- Closes #3152.
- [reviewer
checklist](https://github.com/linera-io/linera-protocol/blob/main/CONTRIBUTING.md#reviewer-checklist)