Node go down after "too many open files" and won't recover nor crash #37

Open
rickmak opened this issue Sep 6, 2021 · 3 comments

@rickmak
Collaborator

rickmak commented Sep 6, 2021

Logs are attached. It appears my open-file limit was not well tuned, which led to a "too many open files" error and then a consensus failure. The node cannot recover by itself, which would be fine if it crashed. The problem is that the software neither crashes nor recovers. Therefore systemd does not regard it as down: the process stays up and running but does nothing.

Given that it does not crash, is there a suggested way to handle this, i.e. how to detect such a failure?

Sep 05 15:17:29 fotan-likecoin liked[522]: 3:17PM ERR CONSENSUS FAILURE!!! err="can't get node F845F93348CD781977344A4423DB29149C7FCE03F9183BBD6AFB2F25E809921F: open /home/fotan/.liked/data/application.db/8390493.ldb: too many open files" module=consensus stack="goroutine 601 [running]:
runtime/debug.Stack(0xc015937c08, 0x1994aa0, 0xc008041580)
        /usr/lib/go-1.16/src/runtime/debug/stack.go:24 +0x9f
github.com/tendermint/tendermint/consensus.(*State).receiveRoutine.func2(0xc000156380, 0x1ea6df8)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:726 +0x5b
panic(0x1994aa0, 0xc008041580)
        /usr/lib/go-1.16/src/runtime/panic.go:965 +0x1b9
github.com/cosmos/iavl.(*nodeDB).GetNode(0xc000c8bbc0, 0xc00b0f9920, 0x20, 0x20, 0x0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/nodedb.go:88 +0x779
github.com/cosmos/iavl.(*Node).getRightNode(...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:428
github.com/cosmos/iavl.(*Node).get(0xc00d7be820, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0x1, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:175 +0x2f6
github.com/cosmos/iavl.(*Node).get(0xc00d7be0a0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:176 +0x273
github.com/cosmos/iavl.(*Node).get(0xc010283b80, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc0102835e0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc010283540, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc0102834a0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0x1, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc010283400, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:176 +0x273
github.com/cosmos/iavl.(*Node).get(0xc010283360, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0x1, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc0102832c0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0x1, 0xc000cbcbd0, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:176 +0x273
github.com/cosmos/iavl.(*Node).get(0xc0102830e0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0x40946b, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:176 +0x273
github.com/cosmos/iavl.(*Node).get(0xc0102821e0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0xffffffffffffffff, 0x20, 0xc000c8bbc0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc00c3699a0, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x1d, 0x1, 0x28, 0x30)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:173 +0x1af
github.com/cosmos/iavl.(*Node).get(0xc00c369860, 0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x30, 0x19a2ce0, 0xc02128be01, 0xc00a23af00)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/node.go:176 +0x273
github.com/cosmos/iavl.(*ImmutableTree).Get(0xc009198198, 0xc00c904990, 0x1d, 0x30, 0x0, 0x0, 0x0, 0x0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/immutable_tree.go:152 +0x5a
github.com/cosmos/cosmos-sdk/store/iavl.(*Store).Get(0xc000bc1dd0, 0xc00c904990, 0x1d, 0x30, 0x0, 0x0, 0x0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/store/iavl/store.go:172 +0x166
github.com/cosmos/cosmos-sdk/store/cache.(*CommitKVStoreCache).Get(0xc0014c7320, 0xc00c904990, 0x1d, 0x30, 0x2d01620, 0x0, 0x30)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/store/cache/cache.go:111 +0x11e
github.com/cosmos/cosmos-sdk/store/cachekv.(*Store).Get(0xc0097f31d0, 0xc00c904990, 0x1d, 0x30, 0x0, 0x0, 0x0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/store/cachekv/store.go:64 +0x2b0
github.com/cosmos/cosmos-sdk/store/gaskv.(*Store).Get(0xc0043db020, 0xc00c904990, 0x1d, 0x30, 0x0, 0x0, 0x0)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/store/gaskv/store.go:41 +0x198
github.com/cosmos/cosmos-sdk/x/slashing/keeper.Keeper.GetValidatorMissedBlockBitArray(0x2097a48, 0xc000cfcb50, 0x20c6f18, 0xc000cc0e60, 0x20c1a58, 0xc000c80580, 0x20b55c0, 0xc000085080, 0x20b1a68, 0xc000040090, ...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/x/slashing/keeper/signing_info.go:60 +0x11d
github.com/cosmos/cosmos-sdk/x/slashing/keeper.Keeper.HandleValidatorSignature(0x2097a48, 0xc000cfcb50, 0x20c6f18, 0xc000cc0e60, 0x20c1a58, 0xc000c80580, 0x20b55c0, 0xc000085080, 0x20b1a68, 0xc000040090, ...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/x/slashing/keeper/infractions.go:36 +0x5b2
github.com/cosmos/cosmos-sdk/x/slashing.BeginBlocker(0x20b1a68, 0xc000040090, 0x20c6928, 0xc00a9effc0, 0xb, 0x0, 0xc00d1bed08, 0x12, 0x3da6b, 0x12350f6, ...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/x/slashing/abci.go:23 +0x265
github.com/cosmos/cosmos-sdk/x/slashing.AppModule.BeginBlock(...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/x/slashing/module.go:164
github.com/cosmos/cosmos-sdk/types/module.(*Manager).BeginBlock(0xc0000bf2d0, 0x20b1a68, 0xc000040090, 0x20c6928, 0xc00a9effc0, 0xb, 0x0, 0xc00d1bed08, 0x12, 0x3da6b, ...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/types/module/module.go:338 +0x1d8
github.com/likecoin/likechain/app.(*LikeApp).BeginBlocker(...)
        /home/fotan/likecoin-chain/app/app.go:417
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(0xc000d53040, 0xc011ef3760, 0x20, 0x20, 0xb, 0x0, 0xc00d1bed08, 0x12, 0x3da6b, 0x12350f6, ...)
        /home/fotan/go/pkg/mod/github.com/cosmos/[email protected]/baseapp/abci.go:191 +0x7b8
github.com/tendermint/tendermint/abci/client.(*localClient).BeginBlockSync(0xc000c38240, 0xc011ef3760, 0x20, 0x20, 0xb, 0x0, 0xc00d1bed08, 0x12, 0x3da6b, 0x12350f6, ...)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/abci/client/local_client.go:274 +0xfa
github.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(0xc000c70eb0, 0xc011ef3760, 0x20, 0x20, 0xb, 0x0, 0xc00d1bed08, 0x12, 0x3da6b, 0x12350f6, ...)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/proxy/app_conn.go:81 +0x75
github.com/tendermint/tendermint/state.execBlockOnProxyApp(0x20b2670, 0xc000f141e0, 0x20bdb28, 0xc000c70eb0, 0xc0039a21e0, 0x20c6ac8, 0xc000c117e0, 0x1, 0xc0022bdfc0, 0x20, ...)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/state/execution.go:307 +0x51b
github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc000ccb650, 0xb, 0x0, 0xc000c8efe0, 0x7, 0xc001d7fb60, 0x12, 0x1, 0x3da6a, 0xc0022bdfc0, ...)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/state/execution.go:140 +0x168
github.com/tendermint/tendermint/consensus.(*State).finalizeCommit(0xc000156380, 0x3da6b)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1635 +0xb48
github.com/tendermint/tendermint/consensus.(*State).tryFinalizeCommit(0xc000156380, 0x3da6b)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1546 +0x428
github.com/tendermint/tendermint/consensus.(*State).enterCommit.func1(0xc000156380, 0xc000000000, 0x3da6b)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1481 +0x8e
github.com/tendermint/tendermint/consensus.(*State).enterCommit(0xc000156380, 0x3da6b, 0x0)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1519 +0x6be
github.com/tendermint/tendermint/consensus.(*State).addVote(0xc000156380, 0xc00fd53e00, 0xc0014a5b30, 0x28, 0x1ea91e0, 0xc01593fc08, 0x1784839)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:2132 +0xde5
github.com/tendermint/tendermint/consensus.(*State).tryAddVote(0xc000156380, 0xc00fd53e00, 0xc0014a5b30, 0x28, 0xc0031da900, 0xc0031dc120, 0xc04555c25e439495)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:1930 +0x56
github.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc000156380, 0x2078640, 0xc00fb75520, 0xc0014a5b30, 0x28)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:838 +0x8cd
github.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc000156380, 0x0)
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:762 +0x3f2
created by github.com/tendermint/tendermint/consensus.(*State).OnStart
        /home/fotan/go/pkg/mod/github.com/tendermint/[email protected]/consensus/state.go:378 +0x8c5
@faddat
Contributor

faddat commented Dec 8, 2021

ulimit -n 500000

@faddat
Contributor

faddat commented Dec 8, 2021

In systemd, that is:

LimitNOFILE=500000

@rickmak
Collaborator Author

rickmak commented Dec 16, 2021

I think that is not related. I know that raising the ulimit will suppress the "too many open files" issue. However, if it does happen, the program should crash if it is no longer functional.
