Option to back-fill events by re-execution of messages #11744

Closed
3 of 9 tasks
rvagg opened this issue Mar 19, 2024 · 5 comments · May be fixed by #12330
Assignees
Labels
area/events good first issue Good for newcomers kind/feature Kind: Feature

Comments

@rvagg
Member

rvagg commented Mar 19, 2024

Checklist

  • This is not brainstorming ideas. If you have an idea you'd like to discuss, please open a new discussion on the lotus forum and select the category as Ideas.
  • I have a specific, actionable, and well motivated feature request to propose.

Lotus component

  • lotus daemon - chain sync
  • lotus fvm/fevm - Lotus FVM and FEVM interactions
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt/WinningPoSt)
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

What is the motivation behind this feature request? Is your feature request related to a problem? Please describe.

lotus-shed indexes backfill-events will walk through specified epochs and extract events from the AMT referenced by receipt.EventsRoot and put them in the index for you. But, it can't do anything if you haven't been recording the events in your blockstore.

If both Fevm.EnableEthRPC (existing config option, defaults to false) and Events.EnableActorEventsAPI (new with 1.26.0, defaults to false) are set to false, then EnableStoringEvents isn't set, which prevents both decoding and storing of events on FVM execution.

  • lotus/chain/vm/fvm.go

    Lines 526 to 531 in 2e75f3b

    if vm.returnEvents && len(ret.EventsBytes) > 0 {
        applyRet.Events, err = decodeEvents(ret.EventsBytes)
        if err != nil {
            return nil, fmt.Errorf("failed to decode events returned by the FVM: %w", err)
        }
    }
    - events don't get decoded
  • if storingEvents {
        // Appends nil when no events are returned to preserve positional alignment.
        events = append(events, r.Events)
    }
    - events don't get populated
  • func (t *TipSetExecutor) StoreEventsAMT(ctx context.Context, cs *store.ChainStore, events []types.Event) (cid.Cid, error) {
        cst := cbor.NewCborStore(cs.ChainBlockstore())
        objs := make([]cbg.CBORMarshaler, len(events))
        for i := 0; i < len(events); i++ {
            objs[i] = &events[i]
        }
        return amt4.FromArray(ctx, cst, objs, amt4.UseTreeBitWidth(types.EventAMTBitwidth))
    }
    - event blocks and the corresponding AMT don't get stored in the blockstore

This matters because the current backfill operation relies on ChainGetEvents, which in turn relies on being able to load the event AMT from the blockstore using the root CID on the message receipt. That AMT won't be there if you haven't configured Fevm.EnableEthRPC=true or Events.EnableActorEventsAPI=true.

Describe the solution you'd like

We have a StateCompute API call which should be able to re-execute arbitrary tipsets as long as we have both the previous state and the messages. It can be seen in use in lotus-shed compute-state-range and lotus-shed mismatches.

In theory you should be able to do a two-step events index backfill by running this and then running lotus-shed indexes backfill-events. This needs investigation: it's not clear to me at the time of writing that the chain.consensus.TipSetExecutor#ApplyBlocks code I pointed to above gets involved at all in the StateCompute path, which may use an alternative executor. If so, we may need to either come up with a new StateCompute, or get it to turn on ReturnEvents by default and then persist events like TipSetExecutor does.

The ideal, however, is to provide an option to lotus-shed indexes backfill-events to be able to re-execute tipsets where the events were not collected.

Describe alternatives you've considered

Document a two-step process using two separate lotus-shed operations, though I'm not sure this will actually work as it is today.

Additional context

No response

@jennijuju jennijuju added the good first issue Good for newcomers label Mar 19, 2024
@jennijuju jennijuju added this to FilOz Mar 19, 2024
@jennijuju jennijuju moved this to 🐱Todo in FilOz Mar 19, 2024
@rjan90 rjan90 moved this from 🐱Todo to Triage in FilOz Mar 19, 2024
@Stebalien
Member

See

events, err = ca.ChainGetEvents(ctx, *rct.EventsRoot)
if err != nil {
    // Force-recompute, we must have enabled the Event APIs after computing this
    // tipset.
    if _, _, err := sa.StateManager.RecomputeTipSetState(ctx, ts); err != nil {
        return api.EthTxReceipt{}, xerrors.Errorf("failed get events: %w", err)
    }
    // Try again
    events, err = ca.ChainGetEvents(ctx, *rct.EventsRoot)
    if err != nil {
        return api.EthTxReceipt{}, xerrors.Errorf("failed get events: %w", err)
    }
}

for how I'm doing this in the Eth API.

@rjan90 rjan90 moved this from Triage to 🐱Todo in FilOz Mar 19, 2024
@Stebalien
Member

In terms of automatically backfilling, we can:

  1. Spin up N backfill goroutines (we can do this in parallel).
  2. Try to backfill based on the events we have.
  3. If we find that we don't have the events associated with a message receipt, re-execute that tipset and try again.
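
A rough sketch of those three steps in Go, under stated assumptions: the `Node` interface, `memNode`, `Backfill` and `ErrEventsMissing` are all hypothetical names invented for illustration. `LoadEvents` stands in for the events lookup and `Recompute` for re-executing a tipset; none of this is the Lotus API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrEventsMissing models "the events for this receipt are not stored".
var ErrEventsMissing = errors.New("events not in blockstore")

// Node is a hypothetical interface, not a Lotus type.
type Node interface {
	LoadEvents(epoch int) ([]string, error) // stands in for the events lookup
	Recompute(epoch int) error              // stands in for re-executing the tipset
}

// Backfill fans epochs out to worker goroutines. Each worker tries the stored
// events first and re-executes the epoch only when they are missing.
func Backfill(node Node, epochs []int, workers int) (map[int][]string, error) {
	if workers < 1 {
		workers = 1
	}
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
		index    = make(map[int][]string)
		jobs     = make(chan int)
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for epoch := range jobs {
				events, err := node.LoadEvents(epoch)
				if errors.Is(err, ErrEventsMissing) {
					// Re-execute, then try the lookup once more.
					if err = node.Recompute(epoch); err == nil {
						events, err = node.LoadEvents(epoch)
					}
				}
				mu.Lock()
				if err != nil {
					if firstErr == nil {
						firstErr = fmt.Errorf("epoch %d: %w", epoch, err)
					}
				} else {
					index[epoch] = events
				}
				mu.Unlock()
			}
		}()
	}
	for _, epoch := range epochs {
		jobs <- epoch
	}
	close(jobs)
	wg.Wait()
	return index, firstErr
}

// memNode is an in-memory Node used only to exercise Backfill.
type memNode struct {
	mu     sync.Mutex
	stored map[int][]string
}

func (n *memNode) LoadEvents(epoch int) ([]string, error) {
	n.mu.Lock()
	defer n.mu.Unlock()
	events, ok := n.stored[epoch]
	if !ok {
		return nil, ErrEventsMissing
	}
	return events, nil
}

func (n *memNode) Recompute(epoch int) error {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.stored[epoch] = []string{fmt.Sprintf("event@%d", epoch)}
	return nil
}

func main() {
	node := &memNode{stored: map[int][]string{1: {"event@1"}}}
	index, err := Backfill(node, []int{1, 2, 3}, 2)
	fmt.Println(len(index), err)
}
```

The sketch retries the lookup only once after re-execution and records the first error rather than aborting the other workers; a real implementation would need to decide both of those policies.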

@akaladarshi
Contributor

Hey @rjan90,

Is this issue available?
Let me know if the solution mentioned in the issue is feasible or if there are any other pointers.

@rvagg
Member Author

rvagg commented Jul 24, 2024

This issue is coupled with #11007; we've been discussing that we probably need to ditch the lotus-shed command entirely since it's broken (e.g. see this).

But it might not be a bad idea to bite this off as a separate piece of work that can be pulled into an automatic backfill (in #11007) later. So making it work in lotus-shed to start with would probably be helpful.

@akaladarshi the code pointer referenced above by stebalien is 👌 for doing this. You can reproduce this problem by syncing a lotus node from a recent snapshot (mainnet or calibnet), letting it run and sync, and then trying to run lotus-shed indexes backfill-events with arguments that point to your lotus node and a suitable range of epochs. Even if you turn on events and restart lotus so that it's collecting events from that point forward, you won't be able to backfill past events, because the events data structure for those earlier epochs was never persisted in the blockstore.

@BigLep
Member

BigLep commented Oct 4, 2024

This is being subsumed by work in #12453

@BigLep BigLep closed this as completed Oct 4, 2024
@github-project-automation github-project-automation bot moved this from ⌨️ In Progress to 🎉 Done in FilOz Oct 4, 2024
@github-project-automation github-project-automation bot moved this from In progress to Done in PLDG Cohort 0 Project Board Oct 4, 2024
@rjan90 rjan90 moved this from 🎉 Done to ☑️ Done (Archive) in FilOz Oct 14, 2024