Azure Function Python v2 #762

Merged
merged 12 commits into from
Jan 6, 2025

Conversation

karynzv
Contributor

@karynzv karynzv commented Dec 5, 2024

This is an example of an enrichment function that consumes events from Event Hub, processes them, and inserts them into CrateDB.

Summary of the changes / Why this is an improvement

Checklist

  • Link to issue this PR refers to (if applicable): Fixes #???

This is an example of an enrichment function that consumes events from Event Hub, processes them, and inserts them into CrateDB.
Added the details on how to deploy the Azure Function from VS Code
Member

@amotl amotl left a comment

Thanks. Didn't do a thorough review, but added one little suggestion. I'd also like to change the folder name and add a little software test case, but this can also happen after merging.

@hammerhead
Member

Thanks for making it work with the v2 model!

For this to be an example of basic interoperability between CrateDB/Azure Functions, I think we can trim this down quite a bit. The code originates from a specific implementation project that has all sorts of complexity and corner cases, which I feel we can skip here to keep this a concise example showing how easy it is to make the integration work.

Examples:

  • Duplicate storage raw/reading: Would it be our default recommendation to store both pre/post enrichment payloads? We could eliminate the raw table, try to insert directly into reading, and if that fails, iterate through the batch row by row and insert into error if needed. That would eliminate the whole ValueCache complexity of looking up rows by trace_id and moving them from one list to another.
  • Speaking of trace_id, that could then also go away if we don't need to correlate rows between raw and reading anymore.
  • Data model: I noticed reading doesn't use OBJECT but has the JSON attributes transformed to top-level columns. Using OBJECT(DYNAMIC) is one of the strengths of CrateDB, I suggest using it. That could also reduce the complexity of enrichment if we just pass the original payload through as-is. Or make a very lightweight example of how one can do simple transformations, such as renaming a JSON attribute before insertion, if needed.
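
To make the suggestions above concrete, here is a rough sketch of the control flow: rename one JSON attribute, try to insert the whole batch into reading, and fall back to row-by-row inserts that route failures into error. It is not the code from this PR. It uses `sqlite3` purely as a runnable stand-in for CrateDB, stores the payload as JSON text where CrateDB could use an `OBJECT(DYNAMIC)` column, and the helper names `rename_key` and `insert_batch` as well as the `payload` and `message` columns are hypothetical. The all-or-nothing rollback of the failed batch relies on SQLite transactions, so behaviour against CrateDB may differ.

```python
import json
import sqlite3


def rename_key(payload, old, new):
    """Return a copy of the payload with one attribute renamed, if present."""
    out = dict(payload)
    if old in out:
        out[new] = out.pop(old)
    return out


def insert_batch(conn, payloads):
    """Insert payloads into `reading`; on batch failure, retry row by row.

    Rows that still fail are written to `error` together with the error text.
    Returns a tuple (rows stored in reading, rows stored in error).
    """
    # Stand-in serialization: with CrateDB, `payload` could be OBJECT(DYNAMIC).
    rows = [(json.dumps(p),) for p in payloads]

    try:
        with conn:  # commits on success, rolls back the whole batch on error
            conn.executemany("INSERT INTO reading (payload) VALUES (?)", rows)
        return len(rows), 0
    except sqlite3.Error:
        pass  # fall through to the row-by-row path

    ok = failed = 0
    for row in rows:
        try:
            with conn:
                conn.execute("INSERT INTO reading (payload) VALUES (?)", row)
            ok += 1
        except sqlite3.Error as exc:
            with conn:
                conn.execute(
                    "INSERT INTO error (payload, message) VALUES (?, ?)",
                    (row[0], str(exc)),
                )
            failed += 1
    return ok, failed
```

With this shape there is no need for trace_id or a value cache: a row ends up either in reading or in error, and nothing has to be correlated afterwards.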

Removed much of the previous logic that only applies to more complex use cases with more transformations.
Submit updates to ReadMe to correspond to the latest simplified version
karynzv and others added 2 commits December 27, 2024 11:37
- Renamed `crate_writer` to `cratedb_writer`
- Simplified error handling by removing `error_empty`
- Ran `black .`
Member

@amotl amotl left a comment

Thank you. Two more suggestions from my side, feel free to amend.

  • Folder name: topic/serverless/azure-eventhub.
  • Squash all commits into a single one.

Restructure folder to comply with repository form.
@karynzv karynzv merged commit d1900c6 into main Jan 6, 2025
@karynzv karynzv deleted the AF-example branch January 6, 2025 09:36
4 participants