boostd dagstore initialize-all crashes Boost - error coming from reader in lotus storage #9324

Open
9 of 18 tasks
rjan90 opened this issue Sep 16, 2022 · 3 comments · May be fixed by #12491

Comments

@rjan90
Contributor

rjan90 commented Sep 16, 2022

Checklist

  • This is not a security-related bug/issue. If it is, please follow the security policy.
  • This is not a question or a support request. If you have any lotus related questions, please ask in the lotus forum.
  • This is not a new feature request. If it is, please file a feature request instead.
  • This is not an enhancement request. If it is, please file an improvement suggestion instead.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

  • lotus daemon - chain sync
  • lotus miner - mining and block production
  • lotus miner/worker - sealing
  • lotus miner - proving(WindowPoSt)
  • lotus miner/market - storage deal
  • lotus miner/market - retrieval deal
  • lotus miner/market - data transfer
  • lotus client
  • lotus JSON-RPC API
  • lotus message management (mpool)
  • Other

Lotus Version

boostd version 1.4.0+git.810afec
Based on the boostd version, the SP is running either Lotus v1.17.0 or v1.17.1

Describe the Bug

This error was initially reported by @stuberman in the Boost repo, but the crash comes from the reader in the lotus storage code. Issue report here.

Running boostd dagstore initialize-all --concurrency=3, or even boostd dagstore initialize-all --concurrency=1, crashes boostd after a few minutes. See the log below:

Logging Information

2022-09-15T18:28:09.163Z	INFO	boost-storage-deal	logs/log.go:40	current sealing state	{"id": "dd917e35-d1ac-4f30-9b96-e55d00bf4799", "state": "Packing"}
panic: runtime error: slice bounds out of range [:16777216] with capacity 8388608

goroutine 599691 [running]:
github.com/filecoin-project/lotus/storage/sealer/fr32.(*unpadReader).Read(0x0, {0xc02eba5bfc, 0x140b154, 0x140b154})
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/storage/sealer/fr32/readers.go:62 +0x378
bufio.(*Reader).Read(0xc026924060, {0xc02eba5bfc, 0x140b154, 0x1})
	/usr/local/go/src/bufio/bufio.go:213 +0x106
bufio.(*Reader).Read(0xc0269240c0, {0xc02eba5bfc, 0x140b154, 0xa0a640})
	/usr/local/go/src/bufio/bufio.go:213 +0x106
io.ReadAtLeast({0x42ef300, 0xc0269240c0}, {0xc02eb86000, 0x142ad50, 0x142ad50}, 0x142ad50)
	/usr/local/go/src/io/io.go:328 +0x9a
io.ReadFull(...)
	/usr/local/go/src/io/io.go:347
github.com/filecoin-project/lotus/storage/sealer.(*pieceReader).readAtUnlocked(0xc021ceafc0, {0xc02eb86000, 0x64c349, 0x142ad50}, 0x4)
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/storage/sealer/piece_reader.go:187 +0xb17
github.com/filecoin-project/lotus/storage/sealer.(*pieceReader).Read(0xc021ceafc0, {0xc02eb86000, 0x142ad50, 0x142ad50})
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/storage/sealer/piece_reader.go:100 +0x155
io.ReadAtLeast({0x7f4dae370820, 0xc021ceafc0}, {0xc02eb86000, 0x142ad50, 0x142ad50}, 0x142ad50)
	/usr/local/go/src/io/io.go:328 +0x9a
io.ReadFull(...)
	/usr/local/go/src/io/io.go:347
github.com/ipld/go-car/v2/internal/carv1/util.LdRead({0x7f4dae370820, 0xc021ceafc0}, 0x0, 0x2000000)
	/home/stuart/go/pkg/mod/github.com/ipld/go-car/[email protected]/internal/carv1/util/util.go:85 +0x19e
github.com/ipld/go-car/v2/internal/carv1.ReadHeader({0x7f4dae370820, 0xc021ceafc0}, 0x401)
	/home/stuart/go/pkg/mod/github.com/ipld/go-car/[email protected]/internal/carv1/car.go:63 +0x32
github.com/ipld/go-car/v2.ReadVersion({0x7f4dae370820, 0xc021ceafc0}, {0xc00f2d7c98, 0x9749a5, 0xc011b88300})
	/home/stuart/go/pkg/mod/github.com/ipld/go-car/[email protected]/reader.go:364 +0x90
github.com/ipld/go-car/v2.ReadOrGenerateIndex({0x7f4dae3707f8, 0xc021ceafc0}, {0xc00f2d7c98, 0x2, 0x2})
	/home/stuart/go/pkg/mod/github.com/ipld/go-car/[email protected]/index_gen.go:191 +0x7a
github.com/filecoin-project/dagstore.(*DAGStore).initializeShard.func1({0xc00f2d7d80, 0xc00f2d7d50})
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/dagstore_async.go:123 +0xcb
github.com/filecoin-project/dagstore/throttle.(*throttler).Do(0xc000b6c280, {0x433abd8, 0xc0000520b8}, 0xc025a847b0)
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/throttle/throttler.go:38 +0x118
github.com/filecoin-project/dagstore.(*DAGStore).initializeShard(0xc000a99b80, {0x433abd8, 0xc0000520b8}, 0xc016d8a3f0, {0x43489f0, 0xc0129bdef0})
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/dagstore_async.go:121 +0x422
created by github.com/filecoin-project/dagstore.(*DAGStore).control
	/home/stuart/go/pkg/mod/github.com/filecoin-project/[email protected]/dagstore_control.go:101 +0x705
^C
[1]+  Exit 2                  nohup boostd run --pprof > /market/logs/boost.log 2>&1

Repo Steps

  1. Run boostd dagstore initialize-all --concurrency=3 on a Boost-node
  2. See error coming from the reader in the lotus storage
    ...
@rjan90
Contributor Author

rjan90 commented Sep 16, 2022

@LexLuthr found out that the crash is coming from the reader in the lotus storage:

func (r *unpadReader) Read(out []byte) (int, error) {
	if r.left == 0 {
		return 0, io.EOF
	}

	chunks := len(out) / 127

	outTwoPow := 1 << (63 - bits.LeadingZeros64(uint64(chunks*128)))

	if err := abi.PaddedPieceSize(outTwoPow).Validate(); err != nil {
		return 0, xerrors.Errorf("output must be of valid padded piece size: %w", err)
	}

	todo := abi.PaddedPieceSize(outTwoPow)
	if r.left < uint64(todo) {
		todo = abi.PaddedPieceSize(1 << (63 - bits.LeadingZeros64(r.left)))
	}

	r.left -= uint64(todo)

	n, err := io.ReadAtLeast(r.src, r.work[:todo], int(todo)) // <--- slice out of bounds?
	if err != nil && err != io.EOF {
		return n, err
	}
	if n < int(todo) {
		return 0, xerrors.Errorf("didn't read enough: %d / %d, left %d, out %d", n, todo, r.left, len(out))
	}

	Unpad(r.work[:todo], out[:todo.Unpadded()])

	return int(todo.Unpadded()), err
}

@jennijuju
Member

@nonsense - is there any reason why boost depends on lotus for this?

@dirkmc
Contributor

dirkmc commented Sep 19, 2022

The code for reading data from the workers lives in lotus, so boost needs to depend on lotus to use that code.

@rjan90 rjan90 added this to the LM-Tech-Debt-Legacy-Markets milestone Mar 24, 2023
@rjan90 rjan90 moved this to Dagstore in Lotus-Miner-V2 Mar 24, 2023
@rjan90 rjan90 moved this from Dagstore to Boost <> Lotus collab in Lotus-Miner-V2 Mar 24, 2023
@magik6k magik6k linked a pull request Sep 19, 2024 that will close this issue