Skip to content

Conversation

@rtibbles
Copy link
Member

@rtibbles rtibbles commented Sep 18, 2025

Summary

  • Abstracts our existing database routers into a base class and reuses it in all existing implementations - cleaning up unnecessary code in the process
  • Migrates Kolibri from file based sessions to database based sessions using a custom model and backend based on Django Database Session backend in order to track the user id on the session model
  • For SQLite backends, it adds an additional database to limit the additional bottleneck that using the database backend might add
  • Adds logic to the queryset and model for FacilityUser to ensure that any time a user is deleted or soft deleted, we also clear any associated sessions
  • Cleans up previous API specific logic in favour of this approach
  • Adds tests for the new model and queryset updates

References

FIxes #13542

Note that this reverts #3944 which was originally implemented for a very modest performance improvement. I think the combination of a wide range of other performance improvements, plus the use of a separate database in this implementation means that any performance degradation from changing back here should be negligible.

Reviewer guidance

I used Claude Code to do the initial model and queryset updates and write tests for it, and then heavily edited it to reduce code verbosity, fix errors and ensure the tests were asserting the right things. All other code was hand written.

Both the deletion flows in the issue itself, when deleting in an LoD and also when deleting using the new bulk user management flows should work here. Also, if a user is deleted due to a facility being deleted from a device that should also work. Additionally, it is important to test both the soft deletion that happens by default and the permanent deletion, to make sure both work.

The result here should be that the user is logged out identically if the user's session timed out.

@github-actions github-actions bot added DEV: backend Python, databases, networking, filesystem... SIZE: large labels Sep 18, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Sep 18, 2025

@nucleogenesis nucleogenesis self-assigned this Sep 18, 2025
@github-actions github-actions bot added APP: Device Re: Device App (content import/export, facility-syncing, user permissions, etc.) APP: Facility Re: Facility App (user/class management, facility settings, csv import/export, etc.) APP: Learn Re: Learn App (content, quizzes, lessons, etc.) APP: Coach Re: Coach App (lessons, quizzes, groups, reports, etc.) labels Sep 18, 2025
@rtibbles rtibbles force-pushed the unhandle_my_lod branch 2 times, most recently from b6dfa77 to 59e4106 Compare September 18, 2025 19:55
Copy link
Member

@nucleogenesis nucleogenesis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some questions / thoughts - nothing blocking or urgent to reply to.

Overall I think that I got a good sense of how the changes work. The Session* classes were a bit weird to grok at first but seems the crux is in customizing the SessionStore so you can override create_model_instance to handle the user_id business. The rest then seems like glue code w/ the new Session model and the DB.

I didn't spin it up yet but don't want to forget to submit this review so I'll come back for further review after some manual testing.

Comment on lines +34 to +38
def cleanup_legacy_file_sessions():
"""
Clean up legacy file-based sessions when upgrading to database-backed sessions.
Removes the sessions directory from KOLIBRI_HOME if it exists.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

@@ -0,0 +1,69 @@
from abc import ABC
from abc import abstractmethod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(non-blocking curiosity) The reason I'm seeing for using ABC/abstractmethod here is to ensure that classes inheriting this will be required to override specific methods or throw an error.

Are there any other benefits to using an abstract class or would it be functionally similar to:

def MODEL_CLASSES(self):
    raise "Not implemented"

I think this came to mind because that's how I've done abstract classes in Ruby

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love how this whole module DRYs things up

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ABC and abstractmethods are the more explicit way to do this.

user=self.full_name or self.username, facility=self.facility
)

def __init__(self, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(non blocking opinion) - __init__ feels like it should always be at the top of a class to me but as I type this I'm feeling like this is largely based on vibes 😆 -- I think maybe I just like to see overrides at the top of a class so I can first answer the question "what is the class doing in relation to it's parent's implementation" then the second question of "now what does this class do uniquely".

In any case - __str__ is overridden down here too so the rest fit in well enough 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I think I have a similar intuition - the only advantage to it being here is that its closer to where the intialized value is used.

Comment on lines -20 to -26
# Remove NotificationsRouter if sqlite is not being used:
if settings.DATABASES["default"]["ENGINE"] != "django.db.backends.sqlite3":
ROUTER_ID = "kolibri.core.notifications.models.NotificationsRouter"
if ROUTER_ID in settings.DATABASE_ROUTERS:
settings.DATABASE_ROUTERS = tuple(
filter(lambda x: x != ROUTER_ID, settings.DATABASE_ROUTERS)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see code like this in the red :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been redundant for quite some time, because of the router conditionality being in settings/base.py instead!

@pcenov
Copy link
Member

pcenov commented Sep 22, 2025

Hi @rtibbles,
Unfortunately the issue reported in #13542 is still present.
Also when on device A I created a server with 2 facilities, and then on device B I went through the LOD user flow and imported a user from the 2nd facility and then deleted the entire facility the user remains signed in with no indication whatsoever that the facility has been deleted other than syncing not working anymore.
The same scenario is valid also for a single device with multiple facilities, where if I add a new facility, create and assign a learner to a class, sign in as the learner and then delete the entire facility - the learner still remains signed in.

Logs:
server-logs.zip
lod-logs.zip

Update all implementations.
Remove redundant unregistering of NotificationsRouter.
To allow sessions to be cleared via user id.
Use separate sessions database to prevent bottlenecks.
@pcenov
Copy link
Member

pcenov commented Oct 31, 2025

Hi @rtibbles, LGTM in general, just noting the following:

  1. When I delete a signed in user the user does get signed out, but for a brief moment there's the generic "Sorry! Something went wrong!" error screen:
error.mp4

The error details:

{
  "data": [
    {
      "id": "NOT_AUTHENTICATED",
      "metadata": {
        "view": "Learner Classroom Viewset Instance"
      }
    }
  ],
  "status": 403,
  "statusText": "Forbidden",
  "headers": {
    "allow": "GET, HEAD, OPTIONS",
    "content-length": "85",
    "content-security-policy": "script-src 'self' blob:; style-src 'self' data: blob: 'unsafe-inline'; frame-src 'self' data: blob: http: https:; default-src 'self' data: blob:",
    "content-type": "application/json",
    "date": "Fri, 31 Oct 2025 16:03:53 GMT",
    "referrer-policy": "same-origin",
    "server": "Cheroot/10.0.1",
    "vary": "Accept, Cookie",
    "x-content-type-options": "nosniff",
    "x-frame-options": "DENY"
  },
  "config": {
    "transitional": {
      "silentJSONParsing": true,
      "forcedJSONParsing": true,
      "clarifyTimeoutError": false
    },
    "adapter": [
      "xhr",
      "http",
      "fetch"
    ],
    "transformRequest": [
      null
    ],
    "transformResponse": [
      null
    ],
    "timeout": 0,
    "xsrfCookieName": "kolibri_csrftoken",
    "xsrfHeaderName": "X-CSRFToken",
    "maxContentLength": -1,
    "maxBodyLength": -1,
    "env": {},
    "headers": {
      "Accept": "application/json, text/plain, */*",
      "X-Requested-With": "XMLHttpRequest"
    },
    "paramsSerializer": {},
    "url": "/learn/api/learnerclassroom/3c43d8f06a7bf5c8dbc3e1c058bf9ff7/",
    "params": {},
    "allowAbsoluteUrls": true,
    "method": "get"
  },
  "request": {}
}
  1. The points from my previous comment (based on the following part of your reviewer guidance: Also, if a user is deleted due to a facility being deleted from a device that should also work. ) are still valid - if the user of the deleted facility is on a LOD, then the app just stops syncing, while otherwise the user is seeing a 500 error.
server.error.mp4

Copy link
Member

@nucleogenesis nucleogenesis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code overall looks great and made sense as I read through. I left a few questions, mostly for my own understanding of the changes.

I'll leave approval to @pcenov 's testing

return obj

@classmethod
def delete_all_sessions(cls, user_ids):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nitpick) - If this isn't overriding an existing method, would be clearer to call it delete_sessions as delete_all_sessions doesn't read like it should take any params but just delete all of the sessions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it's called delete_all_sessions is because each user can have multiple active sessions, if they have signed in from different devices, so this is to emphasize that this is what is happening.

all_objects = AllObjectsFacilityUserModelManager()
objects = FacilityUserModelManager()

syncing_objects = BaseFacilityUserModelManager()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only see this referenced here an in a migration - is this used internally to morango or somewhere else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the manager used specifically by Morango. Practically, I'm not sure this is actually needed, because of the additional morango sync hooks we have, but it means if things are getting deleted via regular queryset methods in morango we have this happening for sure.

)

def save(self, *args, **kwargs):
# Call the parent save method first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question for my own understanding - this & the delete method below are very similar to the update and delete methods in the FacilityUserQuerySet up above. What are the primary differences in the use cases for the two classes' handling of clearing sessions of deleted users?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the question - models can be updated and deleted either via the queryset, or by model methods - we have to do this in both places to ensure that the proper clearing of session data is not conditional on how it happens.

One possible weakness with the current implementation is that we are not catching bulk_update calls - but I am not aware of anywhere we use that in Kolibri currently.

def DB_NAME(self):
pass

def db_for_read(self, model, **hints):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's **hints here and why isn't it used in the methods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a standard Django thing to provide additional context to the database router. We do not currently use it except in the allow_migrations method.

https://docs.djangoproject.com/en/3.2/topics/db/multi-db/#database-routers

@rtibbles
Copy link
Member Author

rtibbles commented Nov 4, 2025

@pcenov I think for both these cases, I'd rather file follow up issues - the use case of whole facility deletion is important, but I think significantly less common than individual user deletion, and I think we can push the minor 'error page' flash as a follow up in a patch release.

Copy link
Member

@pcenov pcenov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rtibbles - I've filled the notes from my comment as follow-up issues. This one is good to go!

Copy link
Member

@AlexVelezLl AlexVelezLl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if an extra pair of eyes is helpful, but just a note that I read this entire PR critically (I learned quite a bit) and didn't find anything unusual.

@rtibbles rtibbles merged commit 01dfa44 into learningequality:develop Nov 5, 2025
51 checks passed
@rtibbles rtibbles deleted the unhandle_my_lod branch November 5, 2025 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

APP: Coach Re: Coach App (lessons, quizzes, groups, reports, etc.) APP: Device Re: Device App (content import/export, facility-syncing, user permissions, etc.) APP: Facility Re: Facility App (user/class management, facility settings, csv import/export, etc.) APP: Learn Re: Learn App (content, quizzes, lessons, etc.) DEV: backend Python, databases, networking, filesystem... SIZE: large

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Manage users in LOD - Unhandled error when removing a signed-in user

4 participants