
Are you using this in production yourself yet? #3

Open
ulrikjohansson opened this issue Feb 23, 2019 · 2 comments

Comments

@ulrikjohansson

Hi!

I just stumbled across this schema-registry/serializer project because I was about to implement one myself =)

Where I work, we have a working serializer/deserializer talking to the Confluent Schema Registry, but it's a bit of a hack.
The schema fetching is done separately in a startup script before starting the consumer/producer, and the schemas are put into the aiohttp app container, so we have to lug that around everywhere.
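
To make that concrete, the pre-fetch hack looks roughly like this (a simplified sketch, not our real code; the registry URL and subject names are placeholders):

```python
import json

import aiohttp
from aiohttp import web

REGISTRY_URL = "http://schema-registry:8081"  # placeholder
SUBJECTS = ["orders-value", "payments-value"]  # placeholders


async def prefetch_schemas(app: web.Application) -> None:
    """Fetch the latest schema for each subject and stash it on the app."""
    schemas = {}
    async with aiohttp.ClientSession() as session:
        for subject in SUBJECTS:
            url = f"{REGISTRY_URL}/subjects/{subject}/versions/latest"
            async with session.get(url) as resp:
                resp.raise_for_status()
                body = await resp.json(content_type=None)
            # The registry returns the Avro schema as a JSON-encoded string.
            schemas[subject] = {
                "id": body["id"],
                "schema": json.loads(body["schema"]),
            }
    app["schemas"] = schemas


app = web.Application()
app.on_startup.append(prefetch_schemas)
```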

So I was wondering: is this library ready for use yet? I noticed it's very new (January of this year).

@jonathansick
Member

Hey @ulrikjohansson, I'm happy you found this! Yeah, it's super new, but we're starting to use it in apps that I'd characterize as prototypes/betas. I'm fairly certain Kafkit will become a core part of our infrastructure. We're building it at the same time as we're adopting Kafka, so we're still figuring out best practices.

An example of a producer is our Slack listener for chatops:
https://github.com/lsst-sqre/sqrbot-jr

The serializers are set up here: https://github.com/lsst-sqre/sqrbot-jr/blob/master/sqrbot/avroformat.py and the handler that converts an incoming HTTP event to a Kafka message is here: https://github.com/lsst-sqre/sqrbot-jr/blob/master/sqrbot/handlers/event.py
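
In rough outline, that handler does something like this (a simplified sketch, not the actual sqrbot-jr code; the topic name and app keys are placeholders, and it assumes a serializer callable, e.g. a Kafkit Serializer, plus an already-started aiokafka producer stored on the app):

```python
from aiohttp import web


async def handle_event(request: web.Request) -> web.Response:
    """Turn an incoming HTTP event into an Avro-encoded Kafka message."""
    event = await request.json()

    # Assumed to be set up at app startup: a serializer callable that returns
    # Confluent wire-format bytes, and an already-started AIOKafkaProducer.
    serializer = request.app["serializer"]
    producer = request.app["producer"]

    data = serializer({"channel": event["channel"], "text": event["text"]})
    await producer.send_and_wait("slack.events", data)  # placeholder topic

    return web.json_response({"status": "queued"})
```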

My thought so far is to keep the original schemas in the producer apps and make them responsible for registering those schemas. This way we don't have to bake schema IDs into the producer apps (although you could do that if you wanted).
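
The register-at-startup idea boils down to something like this (a sketch following the RegistryApi usage in the Kafkit docs; the schema and URL are placeholders, so double-check the names against https://kafkit.lsst.io):

```python
import aiohttp
from kafkit.registry.aiohttp import RegistryApi

# Placeholder schema owned by the producer app.
SCHEMA = {
    "type": "record",
    "name": "SlackEvent",
    "namespace": "example.events",
    "fields": [
        {"name": "channel", "type": "string"},
        {"name": "text", "type": "string"},
    ],
}


async def register_schema() -> int:
    """Register the producer's own schema and return the registry-assigned ID."""
    async with aiohttp.ClientSession() as session:
        registry = RegistryApi(session=session, url="http://schema-registry:8081")
        # Re-registering an identical schema is idempotent: the registry
        # returns the existing ID, so it's safe to do this on every startup.
        schema_id = await registry.register_schema(SCHEMA)
    return schema_id
```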

An example of a consumer app is https://github.com/lsst-sqre/templatebot and specifically https://github.com/lsst-sqre/templatebot/blob/master/templatebot/slacklistener.py

If you want to use it, I'd suggest pinning the version because it's so new. If you'd like to contribute code, that'd be great too! It might be good to open an issue before writing code to make sure it doesn't conflict with something we've got coming down the pipe. API docs are at https://kafkit.lsst.io

@ulrikjohansson
Author

ulrikjohansson commented Feb 25, 2019

I'll take a look at those links; thanks for the thorough introduction!

Our setup looks roughly like this at the moment (we have one big monolith repo, with a bunch of smaller service repos budding off it):

  1. Specify schemas for topics in a separate file in "the big monolith repo". The rule is one schema per topic.
  2. Load schemas into the schema registry at monolith deploy time. Registry compatibility is set to FULL_TRANSITIVE, so all schema versions should be backwards and forwards compatible (a rough sketch of these registry calls follows this list).
  3. Producers load the schema(s) from the registry for the topic(s) they want to produce to at producer startup.
  4. Consumers do the same.
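
For reference, the deploy-time step in item 2 is essentially two Schema Registry REST calls, roughly like this (URL, subject, and schema are placeholders):

```python
import json

import aiohttp

REGISTRY_URL = "http://schema-registry:8081"  # placeholder


async def push_schema(session: aiohttp.ClientSession, subject: str, schema: dict) -> int:
    """Set FULL_TRANSITIVE compatibility for a subject, then register its schema."""
    # Enforce full transitive compatibility for this subject.
    async with session.put(
        f"{REGISTRY_URL}/config/{subject}",
        json={"compatibility": "FULL_TRANSITIVE"},
    ) as resp:
        resp.raise_for_status()

    # Register the schema; the registry returns its ID (new or existing).
    async with session.post(
        f"{REGISTRY_URL}/subjects/{subject}/versions",
        json={"schema": json.dumps(schema)},
    ) as resp:
        resp.raise_for_status()
        body = await resp.json(content_type=None)
    return body["id"]
```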

This process is a result both of our Kafka journey starting in the monolith and of the fact that we began pre-fetching schemas because that was the quickest way to get things going.

This workflow is starting to hurt us, though: it slows down development of new services, and the services that own the topics/schemas don't have the topic or schema definitions in their own repos.

So that makes your workflow very appealing to me. As soon as I can find the time, I'll try this library out on one of our less critical services, and we'll see how it goes.
