
Liftbridge panics, as too many open files #83

Open
meghamnagar opened this issue Sep 30, 2020 · 8 comments

@meghamnagar

meghamnagar commented Sep 30, 2020

I am using Liftbridge in my application.
Multiple users create connections to Liftbridge and release them when no longer needed.
After around 330+ connections, Liftbridge panics.

Logs of liftbridge:

```
panic: failed to checkpoint high watermark: cannot create temp file: open /tmp/liftbridge/server/streams/sip:[email protected]/0/replication-offset-checkpoint302940319: too many open files

goroutine 796 [running]:
github.com/liftbridge-io/liftbridge/server/commitlog.(*commitLog).checkpointHWLoop(0xc000865d00)
	/home/nagarm/Projects/go/pkg/mod/github.com/liftbridge-io/[email protected]/server/commitlog/commitlog.go:702 +0x282
created by github.com/liftbridge-io/liftbridge/server/commitlog.New
	/home/nagarm/Projects/go/pkg/mod/github.com/liftbridge-io/[email protected]/server/commitlog/commitlog.go:144 +0x58a
```

```
nats-server -c ../configdata/nats-server.conf
```

@tylertreat
Member

Are those connections creating new streams? It sounds like potentially a large number of partitions are being created and the OS is hitting an open file limit.

@meghamnagar
Author

Multiple users are connecting to Liftbridge, and each user has a different stream name, so each user creates a new stream.
After around 300+ connections, Liftbridge panics.

@tylertreat
Member

What operating system are you running Liftbridge on?

@meghamnagar
Author

> What operating system are you running Liftbridge on?

Linux (Ubuntu)

@tylertreat
Member

It sounds like you are hitting the open file limit due to the volume of streams/partitions being created. A couple things you can try are:

  1. Increase the open file limit (I believe the default is 1024).
  2. Rather than creating a stream per user, create a single stream with a fixed number of partitions and map users to partitions. This is a common pattern since partitions are ordered. For example, you can set the user ID as a key on the message and partition by key so that all messages are ordered for each user. This will substantially reduce the number of files that are opened.

@meghamnagar
Author

meghamnagar commented Oct 7, 2020

I tried the second approach you suggested, in two ways.

1. Using `lift.ToPartition(1)`:
I created multiple keys, all publishing to the same partition.

Using this:

```go
// Publish to a specific partition.
client.Publish(context.Background(), "bar-stream", []byte("hello"),
	lift.Key([]byte("key")),
	lift.ToPartition(1),
)

// Subscribe to a specific partition.
client.Subscribe(ctx, "bar-stream", func(msg *lift.Message, err error) {
	fmt.Println(msg.Offset, string(msg.Value))
}, lift.Partition(1))
```

Now the subscriber receives all the data published to partition 1, for all keys.

2. Using `lift.PartitionByKey()`:
Publishing the data with `PartitionByKey`.

Using this:

```go
// Publish to a partition based on the message key hash.
client.Publish(context.Background(), "bar-stream", []byte("hello"),
	lift.Key([]byte("key")),
	lift.PartitionByKey(),
)

// Subscribe to a specific partition.
client.Subscribe(ctx, "bar-stream", func(msg *lift.Message, err error) {
	fmt.Println(msg.Offset, string(msg.Value))
}, lift.Partition(1))
```

In this case, since we publish with `PartitionByKey`, messages can land on any partition, but the subscriber only watches partition 1.

My expectation is, as you suggested, treating the user as the key:

Publish:
key 1 = "abcd" published to partition 1
key 2 = "wxyz" published to partition 1

Subscribe to `lift.Partition(1)`:
if the user is "abcd", it should receive the data for "abcd" and not the data for "wxyz".

We want data for only a specific key. Can we get that?

@meghamnagar
Author

As per liftbridge-io/liftbridge#81, the Subscribe API takes both a subject ("foo.bar") and a stream ("my-stream") as input parameters:

```go
client.Subscribe(ctx, "foo.bar", "my-stream", func(msg *proto.Message, err error) {
	// ...
}, lift.Partition(1))
```

But as per the docs at https://github.com/liftbridge-io/go-liftbridge, the Subscribe API doesn't take a subject as an input parameter.

Can you please tell me where I can find the first syntax?

@tylertreat
Member

> We want data for only a specific key. Can we get that?

If you are sharing messages amongst stream partitions, no. You would need to filter on the subscriber if you care only about certain keys. If you really need messages of the same key in their own streams then you will need a stream per key, but as I mentioned above, this isn't a common pattern for scalability reasons. If this is what you want, you will likely need to increase the open file limit in the OS.

> Can you please tell me where I can find the first syntax?

Subscribing to a stream using a subject is not supported since Subscribe operates on streams, which are intended to provide an abstraction over NATS subjects. If you want to subscribe directly to a NATS subject, use a NATS client.
