
Move Case to Postgres #10502

Open
pixiwyn opened this issue Sep 24, 2024 · 6 comments
Assignees
Labels
5 (dx/ox) High Priority Devex/Opex Devex Yikes!

Comments

@pixiwyn
Collaborator

pixiwyn commented Sep 24, 2024

Branch off origin/10391-dynamo-migrations-kysely-aurora-rds-tokens (until it is merged in).

DynamoDB Entities to Replace:

  • case

OpenSearch Indexes to Keep:

  • efcms-case (used to search cases)

Convert the functions in web-api/src/persistence/dynamo/cases and web-api/src/persistence/dynamo/caseWorksheet to Postgres Kysely functions.

DynamoDB Functions to Move (unit tests can be removed):
(screenshot: 2024-09-25_11-24-03.png)
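For illustration, a converted persistence function might look like the following Kysely-style sketch. The `case` table name, the column names, and the `DbReader` interface are assumptions for the example, not the project's actual schema; the minimal interface stands in for the slice of the Kysely fluent API the function relies on.

```typescript
// Hypothetical row shape for the Postgres `case` table (assumed, not the real schema).
interface CaseRecord {
  docketNumber: string;
  caption: string;
}

// Minimal slice of the Kysely fluent API that this sketch relies on.
interface CaseQuery {
  selectAll(): CaseQuery;
  where(column: keyof CaseRecord, op: '=', value: string): CaseQuery;
  executeTakeFirst(): Promise<CaseRecord | undefined>;
}

interface DbReader {
  selectFrom(table: 'case'): CaseQuery;
}

// Replaces a DynamoDB GetItem on the case partition with a Postgres lookup.
export async function getCaseByDocketNumber(
  db: DbReader,
  docketNumber: string,
): Promise<CaseRecord | undefined> {
  return db
    .selectFrom('case')
    .selectAll()
    .where('docketNumber', '=', docketNumber)
    .executeTakeFirst();
}
```

In the real codebase the `db` handle would come from the Kysely instance wired to Aurora; injecting it as a parameter keeps the function easy to unit test.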

@pixiwyn pixiwyn added the 5 (dx/ox) High Priority Devex/Opex label Sep 24, 2024
@pixiwyn pixiwyn changed the title Move Case Worksheets to Postgres Move Case and Case Worksheets to Postgres Sep 25, 2024
@pixiwyn pixiwyn added the Devex label Sep 25, 2024
@TomElliottFlexion TomElliottFlexion moved this from New Issues to Devex/Opex in US Tax Court Board Oct 6, 2024
@pixiwyn pixiwyn changed the title Move Case and Case Worksheets to Postgres Move Case to Postgres Oct 11, 2024
@Mwindo Mwindo added the Yikes! label Nov 21, 2024
@Mwindo
Collaborator

Mwindo commented Nov 21, 2024

Since we'll need to test cases thoroughly anyway, we will also try to implement a CaseFactory as part of this ticket. See ustaxcourt#5557.

@Mwindo
Collaborator

Mwindo commented Nov 22, 2024

We have moved over all of the seed data and have the app "running" locally (imperfectly). The next step is to get the app fully working locally and refactor. After that, it would be good to look at real data and eliminate unnecessary fields to simplify our tables (or include any fields we are missing).

@Mwindo
Collaborator

Mwindo commented Dec 13, 2024

We have the app working locally after fixing various issues. Current work:

  • Investigating how to get database updates indexed in OpenSearch. The current thinking is something like: persistence method called --> Change Data Capture tool (probably AWS DMS or Data Activity Stream, although database plugins of various sorts are possible too) --> Kinesis --> Lambda --> OpenSearch
  • Deciding whether it is worth taking a detour to implement performance monitoring (which can be tweaked but is otherwise more or less ready) before deploying the change that migrates Case over to Postgres.
  • Fixing various TODOs and refining the code. Case touches everything, so we want to get things right.
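The Lambda stage of the pipeline sketched in the first bullet could look roughly like the following: decode the base64 Kinesis records and turn them into an OpenSearch `_bulk` request body. The change-event payload shape (`{ docketNumber, ...fields }`) is an assumption; the `efcms-case` index name comes from the issue description.

```typescript
// Sketch of the Lambda stage: CDC tool -> Kinesis -> this handler -> OpenSearch.
// Kinesis delivers each record's data as base64-encoded bytes.
interface KinesisEvent {
  Records: { kinesis: { data: string } }[];
}

// Convert a batch of change events into an OpenSearch _bulk request body
// (newline-delimited action/document pairs, trailing newline required).
export function toBulkBody(event: KinesisEvent): string {
  const lines: string[] = [];
  for (const record of event.Records) {
    const change = JSON.parse(
      Buffer.from(record.kinesis.data, 'base64').toString('utf8'),
    );
    // Index action targeting efcms-case, keyed by docket number (assumed id scheme).
    lines.push(
      JSON.stringify({ index: { _index: 'efcms-case', _id: change.docketNumber } }),
    );
    lines.push(JSON.stringify(change));
  }
  return lines.join('\n') + '\n';
}
```

The actual handler would then POST this body to the cluster's `_bulk` endpoint; building the body as a pure function keeps that part trivially testable.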

@Mwindo
Collaborator

Mwindo commented Dec 18, 2024

A few scattered notes:

  • We have decided that the logs in OpenSearch should be enough and that we do not need the performance monitoring for now.
  • We explored Data Activity Stream. Unfortunately, it only works with provisioned Aurora, not serverless Aurora.
  • I have been exploring Data Migration Service. The issue here is that the service is near real-time (NRT), which means we could experience delays of minutes. This is not an acceptable latency for certain tasks (e.g., searching by case docket number).
  • I have briefly explored other options. In general, and unsurprisingly, provisioned Aurora seems to play more nicely with CDC tooling (e.g., Debezium). I am still looking into these options.
  • A "last-resort" option is to move into Postgres those OpenSearch persistence queries requiring low latency and then use DMS for the rest. This is not a terrible option, considering we want to move these persistence queries into Postgres anyway.

@Mwindo
Collaborator

Mwindo commented Dec 18, 2024

After some further research, my findings are basically the following:

Serverless Postgres, while great for scaling up and down and for not having to worry about dependencies, etc., is not ideal for low-latency CDC. Most tools work better, and often exclusively, with provisioned Aurora. As far as I can tell, our options include:

  • DMS: This is the general approach people seem to use, but it isn't meant for near-real-time data ETL.
  • Postgres triggers that invoke a lambda to stream data into Kinesis: fast, but less robust (what if the lambda fails?).
  • Postgres triggers with pg_notify and a dedicated listener: a worse version of the above, since we would need to run a dedicated listener.
  • Handling it in application logic. The best example is probably something like a wrapper for getDbWriter that writes to Postgres and kicks off a Kinesis task. This is fast and keeps the two in sync, but we have to handle failures when publishing to Kinesis.
  • Some Postgres plugin (?) for CDC that plays nicely with a serverless setup.

Because of this, I discussed with Jim, and we are running an experiment to see how difficult it is to simply rewrite our Case OpenSearch queries in Postgres. If this is not too difficult, we can probably port most things over. The only thing that needs to be in OpenSearch, as far as I know, is Docket Entry text. If that's the case, the worries above are basically moot: latency is less of an issue, and we don't need to worry as much about having a highly scalable solution.
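The application-logic option above can be made concrete with a small sketch. The function names (`writeToPostgres`, `publishToKinesis`, `onPublishFailure`) are illustrative stand-ins, not real project helpers; the point is the ordering and the failure path called out in the bullet.

```typescript
// Sketch of the "handle it in application logic" option: commit to Postgres
// first, then emit a change event to the stream, parking the event for replay
// if the publish fails.
type ChangeEvent = { table: string; payload: unknown };

export async function writeWithChangeCapture(
  writeToPostgres: (e: ChangeEvent) => Promise<void>,
  publishToKinesis: (e: ChangeEvent) => Promise<void>,
  onPublishFailure: (e: ChangeEvent, err: unknown) => Promise<void>,
  event: ChangeEvent,
): Promise<void> {
  // Postgres remains the source of truth; if this throws, nothing is emitted.
  await writeToPostgres(event);
  try {
    await publishToKinesis(event);
  } catch (err) {
    // The failure mode the bullet worries about: the DB write succeeded but the
    // stream publish did not. Park the event (e.g., a dead-letter table) for replay.
    await onPublishFailure(event, err);
  }
}
```

Injecting the three functions keeps the wrapper independent of the AWS SDK and makes the failure path straightforward to unit test.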

@Mwindo
Collaborator

Mwindo commented Jan 29, 2025

General updates:

  • Most Case queries that used OpenSearch now use Postgres; however, caseAdvancedSearch, casePublicSearch (which is essentially the same minus sealed cases), and advancedDocumentSearch proved too difficult to replicate, and OpenSearch is a better tool for those scenarios. We have explored various ways of getting data into OpenSearch given the constraint that we need seconds of latency between a DB update and its indexing in OpenSearch. We have tentatively settled on the following: call to getDbWriter --> write to an SQS queue (in application code) --> trigger a lambda --> OpenSearch. The downsides are 1) slightly more complicated application code, 2) changes made outside of DAWSON will not automatically update OpenSearch, and 3) potentially more complex transactions in multithreaded scenarios. The advantages are 1) no separate infrastructure (e.g., a Fargate instance), 2) lower costs, and 3) speed. We have made good headway (e.g., we have a proof of concept working locally) and are also investigating how transactions will work, but we still need to set up some terraform in a deployed environment to fully test.
  • We have fixed many tests (and any errors they reveal) and are still fixing them.
  • We have a few TODOs to clean up, and then we need to test everything in a deployed environment.
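One wrinkle in the getDbWriter --> SQS --> Lambda --> OpenSearch pipeline above is that standard SQS queues can deliver messages out of order and more than once, so the consuming Lambda should collapse each batch to the newest change per case before indexing. The message shape below (`docketNumber`, `updatedAt`, `doc`) is an assumption for the sketch.

```typescript
// Sketch of batch dedup inside the SQS-consuming Lambda. Standard SQS offers
// at-least-once, unordered delivery, so keep only the newest change per case.
interface ChangeMessage {
  docketNumber: string;
  updatedAt: string; // ISO-8601 timestamp of the Postgres write
  doc: Record<string, unknown>;
}

export function latestPerCase(messages: ChangeMessage[]): ChangeMessage[] {
  const newest = new Map<string, ChangeMessage>();
  for (const m of messages) {
    const seen = newest.get(m.docketNumber);
    // ISO-8601 timestamps compare correctly as strings.
    if (!seen || m.updatedAt > seen.updatedAt) newest.set(m.docketNumber, m);
  }
  return [...newest.values()];
}
```

The surviving messages would then be turned into a bulk index request; a stale duplicate arriving later is harmless because it loses this comparison.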

This ticket has been, unsurprisingly, complex: the backend is very heavily coupled to DynamoDB.

Projects
Status: In Progress