When LANGFLOW_LOAD_FLOWS_PATH is set and when docker_example is being used, when bringing up the service 2nd time, langflow crashes #6054

Open
shrikrishnaholla opened this issue Feb 1, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@shrikrishnaholla

Bug Description

I wanted to import the flow I had built in my dev environment and run it headlessly in a staging environment. LANGFLOW_LOAD_FLOWS_PATH seemed like the right fit. It worked as intended when I brought up the service the first time. However, on subsequent starts, Langflow crashes.

The specific error message when the crash happens is:

langflow-1  | sqlalchemy.exc.IntegrityError: (psycopg.errors.ForeignKeyViolation) insert or update on table "flow" violates foreign key constraint "flow_folder_id_fkey"
langflow-1  | DETAIL:  Key (folder_id)=(c9acccee-a2cc-479d-8731-e3671092b63f) is not present in table "folder".
langflow-1  | [SQL: UPDATE flow SET updated_at=%(updated_at)s::TIMESTAMP WITHOUT TIME ZONE, folder_id=%(folder_id)s::UUID WHERE flow.id = %(flow_id)s::UUID]
langflow-1  | [parameters: {'updated_at': datetime.datetime(2025, 2, 1, 5, 47, 6, 304320, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC')), 'folder_id': 'c9acccee-a2cc-479d-8731-e3671092b63f', 'flow_id': UUID('21b6f740-e1f2-4a24-a650-4379c7b52a66')}]
langflow-1  | (Background on this error at: https://sqlalche.me/e/20/gkpj)

I think either:

  1. Langflow has stored the name of the folder from my previous environment (which is just its default folder; I hadn't created any folder manually), and it's panicking because it can't find that folder in the newer environment. But this doesn't explain why it worked the first time I brought up the service.
  2. It's trying to create a new folder on each startup, and the folder's id doesn't match between startups. This also seems unlikely, but it's possible.

Reproduction

  1. In one environment, create a flow and export it as JSON.
  2. In another environment, create a directory called "flows" and move the JSON file there.
  3. In the 2nd environment, create this docker-compose.yml file:
version: "3.8"

services:
  langflow:
    image: langflowai/langflow:latest # or another version tag on https://hub.docker.com/r/langflowai/langflow
    pull_policy: always               # set to 'always' when using 'latest' image
    command: python -m langflow run --backend-only --host 0.0.0.0 --port 7860
    # user: root                           # uncomment this if you get permissions issues
    ports:
      - "7860:7860"
    depends_on:
      - postgres
    environment:
      - LANGFLOW_DATABASE_URL=postgresql://langflow:langflow@postgres:5432/langflow
      # This variable defines where the logs, file storage, monitor data and secret keys are stored.
      - LANGFLOW_CONFIG_DIR=/app/langflow
      - LANGFLOW_AUTO_SAVING=false
      - LANGFLOW_AUTO_LOGIN=true
      - LANGFLOW_LOAD_FLOWS_PATH=/app/langflow-flows
    volumes:
      - langflow-data:/app/langflow
      - ./flows:/app/langflow-flows

  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: langflow
      POSTGRES_PASSWORD: langflow
      POSTGRES_DB: langflow
    expose:
      - 5432
    volumes:
      - langflow-postgres:/var/lib/postgresql/data

volumes:
  langflow-postgres:
  langflow-data:
  4. Bring up the service. It will start up normally.
  5. Tweak any parameter in the docker-compose.yml that causes Docker to re-create the container (in my case, I had added the --backend-only flag to the command, which triggered the errors). Langflow will start throwing the error above.

Expected behavior

Langflow should start normally even on repeated restarts when flows are mounted through LANGFLOW_LOAD_FLOWS_PATH

Who can help?

@italojohnny @ogabrielluiz

Operating System

KDE Neon 5.65 (Linux, based on Debian). Docker version 27.4.1, build b9d17ea

Langflow Version

v1.1.3

Python Version

None

Flow File

The flow contains credentials, so I can't share it, but it should be possible to reproduce this with any basic flow.

@shrikrishnaholla shrikrishnaholla added the bug Something isn't working label Feb 1, 2025

dosubot bot commented Feb 1, 2025

The issue you're encountering seems to be related to a foreign key constraint violation between the flow and folder tables. This typically happens when a flow references a folder_id that does not exist in the folder table.

Here are a few potential reasons and solutions:

  1. Folder ID Mismatch: When you first import the flow, it might be associated with a folder ID that exists in the initial environment but not in the new one. Ensure that the folder referenced by the flow exists in the folder table of the new environment.

  2. Persistent Volume: Make sure that the PostgreSQL data volume (langflow-postgres) is correctly set up to persist data across container restarts. If the volume is not persisting data correctly, the folder information might be lost between restarts.

  3. Manual Folder Creation: Before importing flows, manually create the necessary folders in the new environment to ensure that the folder IDs match those expected by the flows.

  4. Database Initialization: Check if the database schema is correctly initialized on each startup. You might need to run migrations or initialization scripts to ensure the database is in the expected state.

  5. Flow Import Logic: Review the logic in Langflow that handles flow imports to ensure it correctly associates flows with existing folders or creates new folders as needed.

By addressing these points, you should be able to resolve the foreign key violation issue and ensure that Langflow starts normally on repeated restarts.

@shrikrishnaholla
Author

Somehow, on first start, the folder_id in the JSON is ignored in favour of attaching the flow to the default folder. But once the flow exists, on the next start, if LANGFLOW_LOAD_FLOWS_PATH still contains the same JSON file, the backend logic tries to update the flow to move it to the folder_id specified in the exported JSON. When it doesn't find that folder, the system panics.
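
To make the suspected sequence concrete, here is a minimal sketch that reproduces the same class of error outside Langflow. It uses an in-memory SQLite database instead of Postgres, and the two-table schema, IDs, and function name are simplified stand-ins I made up for illustration, not Langflow's real schema or code.

```python
import sqlite3


def reproduce_fk_violation() -> str:
    """Simulate 'first start' then 'second start' against a toy schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE folder (id TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE flow ("
        "id TEXT PRIMARY KEY, name TEXT, "
        "folder_id TEXT REFERENCES folder(id))"
    )
    try:
        # First start: the flow is attached to the local default folder,
        # and the folder_id from the exported JSON is ignored.
        conn.execute("INSERT INTO folder VALUES ('local-default', 'default')")
        conn.execute("INSERT INTO flow VALUES ('flow-1', 'demo', 'local-default')")
        conn.commit()
        # Second start: the flow already exists, so it is updated with the
        # folder_id from the JSON, which only exists in the dev environment.
        conn.execute(
            "UPDATE flow SET folder_id = ? WHERE id = ?",
            ("folder-from-dev-env", "flow-1"),
        )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return "no error"


print(reproduce_fk_violation())
```

The insert on "first start" succeeds because it points at a folder that exists locally; only the later update fails, which would match the "works once, crashes afterwards" behaviour.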

@shrikrishnaholla
Author

Found a temporary hack

-- replace the <...> placeholders with quoted string literals
insert into folder (id, name, description, parent_id, user_id)
values (
    <folder id from json>,
    <some name>,
    <some description>,
    null,
    (
      select id
      from public.user
      where username = 'langflow'
    )
  );

This needs to be executed in the Postgres shell (psql) in the 2nd environment. With the folder row in place, I'm not encountering any issues.
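
A possible alternative that avoids touching the database would be to drop the folder_id from the exported JSON before mounting it. The sketch below assumes folder_id is a top-level key in the exported file (I have not verified the export format), and strip_folder_id is a hypothetical helper name. Whether Langflow then leaves the flow in the default folder on later starts is also an assumption that would need testing.

```python
import json
from pathlib import Path


def strip_folder_id(flows_dir: str) -> list[str]:
    """Remove a top-level "folder_id" key from every *.json file in flows_dir.

    Returns the names of the files that were rewritten.
    """
    changed = []
    for path in sorted(Path(flows_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Only rewrite files that carry a non-null folder_id.
        if isinstance(data, dict) and data.pop("folder_id", None) is not None:
            path.write_text(json.dumps(data, indent=2), encoding="utf-8")
            changed.append(path.name)
    return changed


if __name__ == "__main__":
    # "./flows" is the host directory mounted at /app/langflow-flows above.
    print(strip_folder_id("./flows"))
```

Running it a second time should report no changed files, so it is safe to call before every `docker compose up`.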

@ogabrielluiz
Contributor

Hey @shrikrishnaholla

Thanks for this report! We'll look into it ASAP.

@bmmuc

bmmuc commented Feb 3, 2025

To contribute to this discussion: I have tested deploying Langflow (v1.1.3) using a Helm chart in Kubernetes (runtime 0.1.1) and noticed two key issues:

  • If I set workers > 2, a concurrency issue occurs. Langflow attempts to create a superuser with the same name for each worker, causing the database to throw an error. I resolved this by enabling auto_login and manually creating a superuser.

  • However, the issue mentioned by shrikrishnaholla also occurs when workers > 2. Langflow seems to attempt saving the flow to the database multiple times. I strongly suspect this is a concurrency issue, though I’m not entirely sure yet. I’m currently analyzing the code, but if any contributors have insights, I’d appreciate some guidance. Depending on the complexity of the issue, I might be able to assist in resolving it.
    sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)(sqlite3.IntegrityError) UNIQUE constraint failed: flow.user_id, flow.name
    [SQL: INSERT INTO flow (name, description, icon_bg_color, gradient, is_component, updated_at, webhook, endpoint_name, id, data, user_id, icon, tags, locked, folder_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
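
For anyone trying to picture the failure, here is a small sketch of the pattern I suspect: several workers each try to insert the same (user_id, name) pair, and every insert after the first hits the unique constraint. It runs the "workers" one after another against an in-memory SQLite table, so it is not a true concurrency test, and the table and function names are made up for illustration rather than taken from Langflow. Treating the IntegrityError as "already created" is just one possible way to make the startup step idempotent.

```python
import sqlite3


def create_flow_once(conn: sqlite3.Connection, user_id: str, name: str) -> bool:
    """Insert the flow; return False if another worker already created it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO flow (user_id, name) VALUES (?, ?)",
                (user_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        # UNIQUE (user_id, name) was violated: the flow already exists.
        return False


def simulate_workers(n_workers: int = 3) -> list[bool]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flow ("
        "id INTEGER PRIMARY KEY, "
        "user_id TEXT NOT NULL, "
        "name TEXT NOT NULL, "
        "UNIQUE (user_id, name))"
    )
    # Each "worker" loads the same flow file at startup.
    results = [create_flow_once(conn, "user-1", "demo flow") for _ in range(n_workers)]
    conn.close()
    return results


print(simulate_workers(3))
```

Only the first call reports that it created the row; the others see the constraint violation and carry on instead of crashing the worker.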

@ogabrielluiz
Contributor

@bmmuc The CLI has a superuser command which could help in this case.

I might add a setting for disabling the creation of a superuser automatically.

@ogabrielluiz
Contributor

So, we have a few improvements that should help with this, but I'm not sure they will fix it entirely.

Due to time constraints, we are going to limit the fix to this for now, but expect a more thorough pass on this in the next few days.
