Improve schema of CosmosDB chat history to handle long conversations #2312

Merged on Jan 29, 2025 (14 commits)

@pamelafox (Collaborator) commented Jan 27, 2025

Purpose

The original approach to CosmosDB chat history stored the session information and all of its chat messages in a single document. However, we received feedback from the CosmosDB team that this architecture is risky: a single session could accumulate so many messages that the document exceeds the 2MB document size limit. The safer approach is to store the sessions and messages separately, each in its own document.

This PR migrates to that schema, which is based on the one used in this Cosmos DB sample:
https://github.com/AzureCosmosDB/cosmosdb-nosql-copilot

It should be just as performant, as we're using hierarchical partition keys (user_id/session_id) for all the operations.
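
To illustrate the split described above, here is a minimal sketch in plain Python (no Cosmos DB SDK calls). The field names (`type`, `title`, `index`, `content`), the `build_items` helper, and the choice of one item per message are assumptions for illustration, not the exact schema in this PR. Only the `user_id`/`session_id` partition key fields and the 2MB limit come from the description.

```python
import json

# The 2MB document size limit mentioned in the description.
MAX_ITEM_BYTES = 2 * 1024 * 1024


def build_items(user_id, session_id, title, messages):
    """Split one chat session into a session item plus one item per message.

    Hypothetical helper: every item carries user_id and session_id so that
    all items for a session share the same hierarchical partition key.
    """
    partition = {"user_id": user_id, "session_id": session_id}
    session_item = {"id": session_id, "type": "session", "title": title, **partition}
    message_items = [
        {
            "id": f"{session_id}-{index}",
            "type": "message",
            "index": index,
            "content": content,
            **partition,
        }
        for index, content in enumerate(messages)
    ]
    return [session_item, *message_items]


def item_size(item):
    """Approximate stored size of an item as UTF-8 encoded JSON."""
    return len(json.dumps(item).encode("utf-8"))


# Three large messages: too big for one document, fine as separate items.
long_messages = ["x" * 1_000_000] * 3
items = build_items("user-1", "session-1", "Long chat", long_messages)
single_document = {"id": "session-1", "user_id": "user-1", "messages": long_messages}
```

With this layout, `single_document` is larger than `MAX_ITEM_BYTES`, while each entry in `items` stays under it, and the size of any one item no longer grows with the length of the conversation.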

However, this PR is a breaking change: if a developer has already deployed the app with chat history enabled and then deploys the new version, users will lose their existing history and start over in a new container. If this is a problem for some developers, we could write a script to migrate the data from the old container to the new one. I'm not sure how many people are using history in production yet, since it's a fairly new feature.

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[X] Yes
[ ] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings, and usage instructions. If text or screenshots need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[X] No

Type of change

[X] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[X] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

@pamelafox marked this pull request as ready for review January 28, 2025 00:54
@pamelafox changed the title from "WIP: Improve schema of CosmosDB chat history to handle long conversations" to "Improve schema of CosmosDB chat history to handle long conversations" Jan 28, 2025
@pamelafox merged commit 7a2044a into Azure-Samples:main Jan 29, 2025
15 checks passed
@pamelafox requested a review from Copilot January 30, 2025 00:18


Copilot reviewed 5 out of 14 changed files in this pull request and generated no comments.

Files not reviewed (9)
  • app/backend/requirements.txt: Language not supported
  • infra/main.bicep: Language not supported
  • tests/snapshots/test_cosmosdb/test_chathistory_getitem/auth_public_documents_client0/result.json: Language not supported
  • tests/snapshots/test_cosmosdb/test_chathistory_query/auth_public_documents_client0/result.json: Language not supported
  • tests/snapshots/test_cosmosdb/test_chathistory_query_continuation/auth_public_documents_client0/result.json: Language not supported
  • .pre-commit-config.yaml: Evaluated as low risk
  • app/backend/chat_history/cosmosdb.py: Evaluated as low risk
  • app/backend/config.py: Evaluated as low risk
  • app/frontend/src/api/api.ts: Evaluated as low risk