ModuleNotFoundError: No module named 'sqlglot' error across many containers #26997

Closed
3 tasks done
andrekef opened this issue Feb 2, 2024 · 49 comments
Labels
P1 Priority item - Major

Comments

@andrekef

andrekef commented Feb 2, 2024

Bug description

Working on a 2023 Macbook Air M2

I followed the steps in the wiki here https://superset.apache.org/docs/installation/installing-superset-using-docker-compose/ (which I believe needs updating), and I am unable to get a stable build for many of the containers under the main superset container.

  • Cannot reach http://localhost:8088/. Page is blank.
  • Error I am getting below.
Skipping local overrides
Starting web app (using development server)...
Skipping local overrides
Starting web app (using development server)...
Skipping local overrides
Starting web app (using development server)...
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Usage: flask run [OPTIONS]
Try 'flask run --help' for help.

Error: While importing 'superset.app', an ImportError was raised:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/cli.py", line 218, in locate_app
    __import__(module_name)
  File "/app/superset/__init__.py", line 21, in <module>
    from superset.app import create_app
  File "/app/superset/app.py", line 24, in <module>
    from superset.initialization import SupersetAppInitializer
  File "/app/superset/initialization/__init__.py", line 35, in <module>
    from superset.extensions import (
  File "/app/superset/extensions/__init__.py", line 30, in <module>
    from superset.async_events.async_query_manager import AsyncQueryManager
  File "/app/superset/async_events/async_query_manager.py", line 26, in <module>
    from superset.utils.core import get_user_id
  File "/app/superset/utils/core.py", line 90, in <module>
    from superset.sql_parse import sanitize_clause
  File "/app/superset/sql_parse.py", line 29, in <module>
    from sqlglot import exp, parse, parse_one
ModuleNotFoundError: No module named 'sqlglot'

The error above loops endlessly.
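
One way to confirm the traceback's diagnosis is to ask the container's Python interpreter which modules it can resolve. The sketch below is not part of Superset: `missing_modules` is a hypothetical helper, and it only handles top-level package names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    # find_spec returns None when no importer can locate the module.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "sqlglot" is the module named in the traceback above.
    print(missing_modules(["flask", "sqlglot"]))
```

Run with the same interpreter that starts the web app, a non-empty result would indicate that the image was built without that dependency rather than that the import path is wrong.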

How to reproduce the bug

  1. git clone https://github.com/apache/superset.git
  2. cd superset
  3. open docker-compose.yml
  4. paste platform: linux/x86_64/v8 under each superset container - Please include in your wiki for arm64 users, as this will save folks a lot of time.

Eg

  superset:
    platform: linux/x86_64/v8
    ...
  superset-websocket:
    platform: linux/amd64
    ...
  superset-init:
    platform: linux/x86_64/v8
    ...
  superset-node:
    platform: linux/x86_64/v8
    ...
  superset-worker:
    platform: linux/x86_64/v8
    ...
  superset-worker-beat:
    platform: linux/x86_64/v8
    ...
  superset-tests-worker:
    platform: linux/x86_64/v8
  5. run docker compose up --build -d
  6. Notice the ModuleNotFoundError
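
Instead of editing `docker-compose.yml` by hand, the same `platform` keys could live in a Compose override file, which Docker Compose merges automatically when it is named `docker-compose.override.yml`. The sketch below only renders the YAML text. `platform_override` is a hypothetical helper, the service names are the ones listed in step 4, and `linux/amd64` is used because it is the value the reporter set for `superset-websocket`. Treat it as an illustration of the step, not as a recommended fix.

```python
def platform_override(services, platform="linux/amd64"):
    """Render a Compose override document that sets `platform` per service."""
    lines = ["services:"]
    for name in services:
        lines.append(f"  {name}:")
        lines.append(f"    platform: {platform}")
    return "\n".join(lines) + "\n"


# Service names copied from the example in step 4.
SERVICES = [
    "superset",
    "superset-websocket",
    "superset-init",
    "superset-node",
    "superset-worker",
    "superset-worker-beat",
    "superset-tests-worker",
]

if __name__ == "__main__":
    # Redirect this output to docker-compose.override.yml next to docker-compose.yml.
    print(platform_override(SERVICES), end="")
```

Keeping the change in an override file leaves the tracked compose file untouched, so pulling new commits does not conflict with the local edit.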

Screenshots/recordings

Screenshot 2024-02-02 at 1 15 27 PM

Superset version

master / latest-dev

Python version

3.9

Node version

18 or greater

Browser

Chrome

Additional context

No response

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
@vikramwalia

+1 having the same exact issue.

@louisenguyen2203

+1, also having the same problem

@BigDanTheOne

Same

@andrekef
Author

andrekef commented Feb 4, 2024

@michael-s-molina is there any version you would recommend rolling Superset back to so that arm64 machines are able to build Superset fully in Docker? FWIW, we are all building Superset for development, not production.

@stefanamaral

Same goes for me, and I followed exactly the same steps. I'm trying to push this in my company, but it's hard when the local setup just doesn't work ...

@MariaJSanchezD

I have the exact same issue after trying everything to solve the compatibility error with M1

@ogr-git

ogr-git commented Feb 5, 2024

same issue on Amazon x86-64 EC2 instance with Ubuntu

Virtualization: amazon
Operating System: Ubuntu 22.04.3 LTS
          Kernel: Linux 6.2.0-1018-aws
    Architecture: x86-64
 Hardware Vendor: Amazon EC2
  Hardware Model: t3.large

@rusackas
Member

rusackas commented Feb 6, 2024

Pinging @betodealmeida and @john-bodley since it sounds related to their recent consolidation efforts.

@xiaoshan1213

+1, is there any rollback branch or commit we can use for now?

@SbstnErhrdt

SbstnErhrdt commented Feb 7, 2024

I resolved the issues by checking out the latest stable release

My system:

Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-92-generic x86_64)

Steps

git clone https://github.com/apache/superset.git
git checkout tags/3.1.0
docker compose up

If all runs smoothly

docker compose up -d

@andrekef
Author

andrekef commented Feb 7, 2024

@SbstnErhrdt Thanks for your reply.
You did not specify what machine you are using.

  • Even if this is the case, did you have to specify platform: linux/x86_64/v8 in the docker file of each container?
  • What are your logs for superset-websocket now?
  • I also had to create a superset_config.py file and export its path to my superset dir so that the app reads it and overrides the secret key, which is weak by default.
  • If I check out tags/3.1.0, I get a new error in the superset-websocket log saying it's looking for a config.json file, when it's clearly using config.py for configuration:

[email protected] start
node dist/index.js start
config.json file not found
{"date":"Wed Feb 07 2024 16:00:22 GMT+0000 (Coordinated Universal Time)","error":{},"exception":true,"level":"error","message":"uncaughtException: Please provide a JWT secret at least 32 bytes long\nError: Please provide a JWT secret at least 32 bytes long\n at Object. (/home/superset-websocket/dist/index.js:76:11)\n at Module._compile (node:internal/modules/cjs/loader:1198:14)\n at Object.Module._extensions..js (node:internal/modules/cjs/loader:1252:10)\n at Module.load (node:internal/modules/cjs/loader:1076:32)\n at Function.Module._load (node:internal/modules/cjs/loader:911:12)\n at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)\n at node:internal/main/run_main_module:22:47","os":{"loadavg":[29.07,19.59,13.16],"uptime":87469.63},"process":{"argv":["/usr/local/bin/node","/home/superset-websocket/dist/index.js","start"],"cwd":"/home/superset-websocket","execPath":"/usr/local/bin/node","gid":1000,"memoryUsage":{"arrayBuffers":74962,"external":948746,"heapTotal":18386944,"heapUsed":15244176,"rss":0},"pid":22,"uid":1000,"version":"v16.20.2"},"stack":"Error: Please provide a JWT secret at least 32 bytes long\n at Object. 
(/home/superset-websocket/dist/index.js:76:11)\n at Module._compile (node:internal/modules/cjs/loader:1198:14)\n at Object.Module._extensions..js (node:internal/modules/cjs/loader:1252:10)\n at Module.load (node:internal/modules/cjs/loader:1076:32)\n at Function.Module._load (node:internal/modules/cjs/loader:911:12)\n at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)\n at node:internal/main/run_main_module:22:47","trace":[{"column":11,"file":"/home/superset-websocket/dist/index.js","function":null,"line":76,"method":null,"native":false},{"column":14,"file":"node:internal/modules/cjs/loader","function":"Module._compile","line":1198,"method":"_compile","native":false},{"column":10,"file":"node:internal/modules/cjs/loader","function":"Module._extensions..js","line":1252,"method":".js","native":false},{"column":32,"file":"node:internal/modules/cjs/loader","function":"Module.load","line":1076,"method":"load","native":false},{"column":12,"file":"node:internal/modules/cjs/loader","function":"Module._load","line":911,"method":"_load","native":false},{"column":12,"file":"node:internal/modules/run_main","function":"Function.executeUserEntryPoint [as runMain]","line":81,"method":"executeUserEntryPoint [as runMain]","native":false},{"column":47,"file":"node:internal/main/run_main_module","function":null,"line":22,"method":null,"native":false}]}
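
The websocket log is explicit about one requirement: the JWT secret must be at least 32 bytes long. A minimal sketch for generating such a value with the Python standard library follows. `make_jwt_secret` is a hypothetical helper, and where the value has to be placed (the `config.json` the log mentions, or an environment variable) is not shown in this thread, so check the websocket service's own documentation.

```python
import secrets


def make_jwt_secret(min_bytes=32):
    """Return a random hex string that is at least `min_bytes` bytes long."""
    # token_hex(n) returns 2 * n hex characters, so n = min_bytes is more than enough.
    secret = secrets.token_hex(min_bytes)
    assert len(secret.encode("utf-8")) >= min_bytes
    return secret


if __name__ == "__main__":
    secret = make_jwt_secret()
    print(len(secret.encode("utf-8")), "bytes:", secret)
```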

@ricokali96

Same problem here

@vikramwalia

Not resolved yet! What [SbstnErhrdt] said does not work for me.

@akshayjain3450

Can we expect an update on this soon?

@yashagv

yashagv commented Feb 8, 2024

The issue still persists. The Superset team really needs to sort it out, as everyone clones from master.

@SbstnErhrdt tags/3.1.0 didn't work for me.

Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal

@SbstnErhrdt

@yashagv @andrekef @vikramwalia

my system is

Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-92-generic x86_64)

@akshayjain3450

akshayjain3450 commented Feb 8, 2024

The config.json file issue with WebSocket is also present on master. It's not just related to tags/3.1.0.
The workaround for the ModuleNotFoundError, for now, is to create a requirements-local.txt file in the ./docker directory containing the line sqlglot==20.8.0. If you then run docker-compose again, you will no longer see this error.
@yashagv @vikramwalia @ricokali96 @andrekef

@rusackas
We are missing the dependency in the docker-image being used by the docker-compose. The image seems to be old and does not have the additional changes of base.txt. This needs to be fixed.
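
The workaround above amounts to adding one line to `docker/requirements-local.txt`. A small sketch that does this idempotently is shown here. `ensure_pin` is a hypothetical helper, and the path in the usage note assumes the script runs from the repository root.

```python
from pathlib import Path


def ensure_pin(path, requirement):
    """Add `requirement` to the file at `path` unless it is already listed.

    Returns True if the file was changed, False if the line was already there.
    """
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    if requirement in (line.strip() for line in text.splitlines()):
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Start on a fresh line in case the file lacks a trailing newline.
    prefix = "" if not text or text.endswith("\n") else "\n"
    path.write_text(text + prefix + requirement + "\n")
    return True
```

Calling `ensure_pin("docker/requirements-local.txt", "sqlglot==20.8.0")` from the repository root reproduces the manual step. Existing lines in the file, such as other locally added drivers, are left in place.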

@yashagv

yashagv commented Feb 8, 2024

Thanks @akshayjain3450
Resolved 'sqlglot' error after adding sqlglot==20.8.0.

All containers are up, but "http://localhost:8088/superset/welcome/" is continuously loading.
When I did port forwarding, it shows me favicon icon. http://localhost:8089/static/assets/images/favicon.png
Also not able to connect 5432 database.

Are you able to run, Can you suggest something?

@akshayjain3450

akshayjain3450 commented Feb 8, 2024

I found one more error with the postgres container, where it is looking for a test database which does not exist.
I added a CREATE DATABASE test command to ./docker/examples-init.sh and tried again.
I am still confirming whether the UI opens or not.

My two containers are down even after the module fix:

  1. superset_websocket with error: config.json file not found
  2. superset_tests_worker with error:
    2024-02-08 16:45:28 2024-02-08 11:15:28,136:ERROR:flask_appbuilder.security.sqla.manager:DB Creation and initialization failed: (psycopg2.OperationalError) connection to server at "localhost" (::1), port 5432 failed: Connection refused
    2024-02-08 16:45:28 Is the server running on that host and accepting TCP/IP connections?
    2024-02-08 16:45:28 connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL: database "test" does not exist
    Also, I am not able to login with username: admin and password: admin

Need help to make this working.
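
The psycopg2 message above mixes two different failures: a refused connection on `::1` and a reachable server on `127.0.0.1` that lacks a database named `test`. A plain TCP probe can help tell "nothing is listening" apart from "the server is up but the database is missing". This is a sketch with a hypothetical helper, `port_open`. Keep in mind that `localhost` inside a container refers to that container, not to the host or to a sibling service.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # 5432 is the PostgreSQL port from the error message above.
    print("localhost:5432 reachable:", port_open("localhost", 5432))
```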

@geido geido added the P1 Priority item - Major label Feb 8, 2024
@betodealmeida
Member

I'm not familiar with how we're building the Docker images, nor with Snarf — does it just mirror an image from Dockerhub?

I was able to repro, and the solution proposed by @akshayjain3450 worked for me, so that seems like an easy workaround for now.

@rusackas do you know how to build and publish a new docker image?

@rusackas
Member

rusackas commented Feb 8, 2024

I think @mistercrunch has the most relevant docker-fu here.

Regarding Scarf (Gateway), if that's what you mean, it's just a proxy, more so than a mirror. It just passes the request through directly to dockerhub and clicks a counter along the way ;)

@mistercrunch
Member

This should help at least with the confusion around having to change image targets for local arm64 development work on newer Apple silicon -> #27055

@rajdeepUOB

is this the same solution for windows 10 user?

@mistercrunch
Member

Screenshot 2024-02-12 at 6 21 56 PM

Can't recreate. Can someone who had the issue confirm whether it's fixed by now?

I tried:

docker-compose pull
docker-compose up

Also

docker compose -f docker-compose-non-dev.yml pull
docker compose -f docker-compose-non-dev.yml up

And also

git checkout 3.0.0
TAG=3.0.0 docker compose -f docker-compose-non-dev.yml pull
TAG=3.0.0 docker compose -f docker-compose-non-dev.yml up

All seemed to work on a recent Macbook M2. I hit localhost:8088 and things were snappy

@vikramwalia

vikramwalia commented Feb 13, 2024

Still running into issues, it is not even getting to a point where the containers are created. I am following documentation for a 2 min setup. I am going to test this out by creating an .env file at the location below and testing. This is x86 / Ubuntu 22.04.

docker compose up
WARN[0000] The "SCARF_ANALYTICS" variable is not set. Defaulting to a blank string.
WARN[0000] The "CYPRESS_CONFIG" variable is not set. Defaulting to a blank string.
WARN[0000] The "CYPRESS_CONFIG" variable is not set. Defaulting to a blank string.
env file /home/superset/docker/.env not found: stat /home/superset/docker/.env: no such file or directory

@rajdeepUOB

rajdeepUOB commented Feb 13, 2024 via email

@stefanamaral

@rajdeepUWE follow the answer from @akshayjain3450 about creating a requirements-local with the sqlglot.

@rajdeepUOB

I tried this. Let me check that I understand: I need to create a .txt file named requirements-local, write sqlglot==20.8.0 in it, and save it in .docker/ or in the Docker directory inside superset (superset/Docker/)? I tried both, and I am facing the same issue.

@mistercrunch
Member

mmmh, wondering why I can't recreate here on my local... Are you all on latest master?

sqlglot is properly referenced here -> https://github.com/apache/superset/blob/master/requirements/base.txt#L345
and this file is mounted/referenced in the Dockerfile -> https://github.com/apache/superset/blob/master/Dockerfile#L84-L86

There's a bit of a jumparoo here where requirements files point to one another; the chain goes requirements/local.txt -> requirements/development.txt -> requirements/base.txt (where sqlglot is referenced)
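
The chain described above can be walked mechanically to check whether a package is pinned anywhere along it. The following is a sketch under stated assumptions: `find_requirement` is a hypothetical helper, it only understands the short `-r other.txt` include form and `name==version` pins, and whether Superset's files use exactly that form is not confirmed in this thread.

```python
from pathlib import Path


def find_requirement(start, package, _seen=None):
    """Follow `-r` includes from `start`; return the first line that pins `package`."""
    seen = set() if _seen is None else _seen
    path = Path(start).resolve()
    if path in seen or not path.exists():
        return None
    seen.add(path)
    for raw in path.read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("-r "):
            # Includes are resolved relative to the file that contains them.
            hit = find_requirement(path.parent / line[3:].strip(), package, seen)
            if hit:
                return hit
        elif line.lower().startswith(package.lower() + "=="):
            return f"{path.name}: {line}"
    return None
```

Running `find_requirement("requirements/local.txt", "sqlglot")` on a checkout returns the file and line where the pin lives, or `None`. A `None` result on a given tag would be consistent with an image built from that tag lacking the module.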

@mistercrunch
Member

Wondering if docker caching could be an issue here, where, say, base.txt changed but the layer is cached because Docker doesn't think the file has changed. But from my understanding of how docker cache works, if any of the mounted files changed, it's part of the cache key and will invalidate the cache.

@akshayjain3450

https://github.com/apache/superset/blob/master/requirements/base.txt#L345

Did we have this dependency in the last stable release 3.1.0 @mistercrunch? After I check out that branch, I get an error building the image. So the module is used there, but I could not find this dependency in branch 3.1.0. Can you tell me what I am missing here? How is the released public docker image working when our custom images, built without any change, are not?

@mistercrunch
Member

I don't see any reference of sqlglot when I checkout 3.1.0

$ git checkout 3.1.0
HEAD is now at 0cd2431989 bringin latest from master Dockerfile to allow for multi-platform builds
$ git grep -i sqlglot

@Sajawalgujjar381

I am getting the same error. Has anyone resolved it?

@andrekef
Author

andrekef commented Feb 15, 2024

UPDATE:
I am still unable to get this to build in docker nice and clean for M2 machines. 2 other co-workers replicated my issue, also with M1 and M2 machines.

What worked for me and spun up superset again was creating a venv and following the steps here: https://superset.apache.org/docs/installation/installing-superset-from-scratch/, which, by the way, could use some extra clarification for new developers on things such as:

  • Running export SUPERSET_CONFIG_PATH=/Users/myuser/venv/lib/python3.9/site-packages/superset/superset_config.py so that config.py can locate the override file
  • Where to find the superset repo after I create a venv. I had to manually create a new folder, rename etc and run export SUPERSET_HOME=/path/to/superset_home_directory to avoid weird errors
  • Also, I had to manually allow mysql connectivity since these:
  1. # SQLALCHEMY_DATABASE_URI = 'mysql://myapp@localhost/myapp'
  2. # SQLALCHEMY_DATABASE_URI = 'postgresql://root:password@localhost/myapp'

were commented out in config.py by default, and I don't understand why

  • Also, there is no instruction on how to restart superset after changes to the config files so the app can pick it up and reflect the changes. Running the
  • Could PREVENT_UNSAFE_DB_CONNECTIONS = False be the default configuration in the config.py file? Many users have asked similar questions in many forums, and they had to tweak this and restart the app to establish any sqlalchemy connection at all.
  • There also seems to be a bug trying to save mysql tables into datasets. I will probably open a new issue here.
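
The override file mentioned in the first bullet is an ordinary Python module that Superset loads when `SUPERSET_CONFIG_PATH` points at it. A minimal sketch covering the settings named in this comment follows. The credentials are placeholders in the style of the commented-out examples quoted above, and the environment variable name `SUPERSET_SECRET_KEY` is an assumption, so substitute whatever your setup provides.

```python
# superset_config.py: a sketch of a local override file, not Superset's defaults.
import os
import secrets

# Prefer a key supplied through the environment. The variable name
# SUPERSET_SECRET_KEY is an assumption here. The random fallback changes on
# every start, which invalidates sessions, so it only suits throwaway
# development instances.
SECRET_KEY = os.environ.get("SUPERSET_SECRET_KEY") or secrets.token_urlsafe(42)

# Metadata database URI with placeholder credentials.
SQLALCHEMY_DATABASE_URI = "postgresql://root:password@localhost/myapp"

# The setting discussed above. False relaxes a safety check, so think twice
# before using it outside local development.
PREVENT_UNSAFE_DB_CONNECTIONS = False
```

Values defined in this module take precedence over the defaults in `config.py`, which is why the shipped defaults can stay commented out. The app has to be restarted before changes to this file take effect.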

@Shivangini-G

I faced the same issue while running docker compose up -d
So instead I did docker compose -f docker-compose-non-dev.yml up -d
and it worked for me.

@butuzov

butuzov commented Feb 16, 2024

Hi, You might want to install it yourself (by editing to .superset/docker/requirements-local.txt).

This dependency was cherry-picked and added to master; however, you are all still using outdated images. (So am I, for example: I use an image based on the 3.1.0 multi-platform docker image.)

+sqlglot==20.8.0

Cheers

@Sajawalgujjar381

Sajawalgujjar381 commented Feb 16, 2024 via email

@vikramwalia

vikramwalia commented Feb 17, 2024

This only solves the initial install. The moment you change anything (for example, pip install snowflake-sqlalchemy),
it stops working again with ModuleNotFoundError: No module named 'sqlglot'.

Error: While importing 'superset.app', an ImportError was raised:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/cli.py", line 218, in locate_app
    __import__(module_name)
  File "/app/superset/__init__.py", line 21, in <module>
    from superset.app import create_app
  File "/app/superset/app.py", line 24, in <module>
    from superset.initialization import SupersetAppInitializer
  File "/app/superset/initialization/__init__.py", line 35, in <module>
    from superset.extensions import (
  File "/app/superset/extensions/__init__.py", line 30, in <module>
    from superset.async_events.async_query_manager import AsyncQueryManager
  File "/app/superset/async_events/async_query_manager.py", line 26, in <module>
    from superset.utils.core import get_user_id
  File "/app/superset/utils/core.py", line 90, in <module>
    from superset.sql_parse import sanitize_clause
  File "/app/superset/sql_parse.py", line 29, in <module>
    from sqlglot import exp, parse, parse_one
ModuleNotFoundError: No module named 'sqlglot'
Installing local overrides at /app/docker/requirements-local.txt
Requirement already satisfied: snowflake-sqlalchemy in /usr/local/lib/python3.9/site-packages (from -r /app/docker/requirements-local.txt (line 1)) (1.5.1)

@mistercrunch
Member

mistercrunch commented Feb 19, 2024

Digging a bit into this, I understand some things, but I'm not 100% clear on how this is set up or exactly how things are supposed to work here. But here are a few things I know:

docker-compose.yml has this ->

x-superset-image: &superset-image apachesuperset.docker.scarf.sh/apache/superset:${TAG:-latest-dev}

Which points to latest-dev which right now is 3.1.0 which does not have sqlglot. So if you git checkout 3.1.0 you can docker-compose up and things line up.

Pointing to master-dev seems more appropriate and much more likely to work, but if you're on arm, we don't build that particular variation these days.

With this PR -> #27146 we'll be having multi-platform build for master, so that would make things work better.

Though I'm not sure what's a normal setup for docker-compose here and how the repo is supposed to line up with the images. My best bet is that recent master should work on top of a recent image built off of master (as in the master-dev tag), but that seems fragile: there's no guarantee that any particular SHA will match the latest image. I guess if you have a freshly rebased branch and we have a fresh image, things should line up most of the time.
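
The image line quoted in this comment relies on Compose's `${VAR:-default}` interpolation, which is why the tag falls back to `latest-dev` when `TAG` is unset. The sketch below imitates only that one substitution form to show which image name results. `resolve_defaults` is a hypothetical helper and not how Compose is implemented.

```python
import re

# Matches ${NAME:-default} and captures the variable name and the default.
_DEFAULT = re.compile(r"\$\{(\w+):-([^}]*)\}")


def resolve_defaults(template, env):
    """Expand `${VAR:-default}`: use env[VAR] unless it is unset or empty."""
    return _DEFAULT.sub(lambda m: env.get(m.group(1)) or m.group(2), template)


# Image reference copied from the docker-compose.yml line quoted above.
IMAGE = "apachesuperset.docker.scarf.sh/apache/superset:${TAG:-latest-dev}"

if __name__ == "__main__":
    print(resolve_defaults(IMAGE, {}))                     # no TAG set
    print(resolve_defaults(IMAGE, {"TAG": "master-dev"}))  # TAG overrides the default
```

This is the mechanism behind the `TAG=3.0.0 docker compose ... pull` invocations earlier in the thread: the checked-out code and the pulled image only line up when `TAG` names an image built from a matching commit.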

@sfirke
Member

sfirke commented Feb 21, 2024

I see several people here are running the command docker compose up -- my understanding is that if no compose file is specified in this command, Docker defaults to using a compose file called docker-compose.yml if one is present. On the Superset project this points to a potentially unstable master branch release that should be used for development, not production.

In the install docs we tell people to run docker compose -f docker-compose-non-dev.yml up so that they get a stable / official release image, not the cutting-edge master branch build.

I wonder if we renamed the two files like this:

  • docker-compose.yml -> docker-compose-dev.yml
  • docker-compose-non-dev.yml -> docker-compose.yml

Would it result in a better experience for new users, because then the default is a stable image? non-dev always sounded clunky to me anyway.

@rusackas
Member

Interesting question @sfirke - might be a good listserv or town hall question. As a contributor, I prefer to run "dev mode" so the default is sensible. You might be right that a "stable" version is the more sensible default for docker compose, but I'm not certain of it... the current behavior probably helps us find bugs on master a lot faster ;)

@sfirke
Member

sfirke commented Feb 21, 2024

the current behavior probably helps us find bugs on master a lot faster ;)

Quite true! You can tell when master breaks as people come flooding into Slack reporting the same problem. But I don't think using new users as test subjects is good for the long-term health of the project. I expect that for every person who reports a GitHub issue about a broken master branch, many more give up silently and say Superset is not ready for production.

I will put it on the town hall list, good idea.

@mistercrunch
Member

mistercrunch commented Feb 22, 2024

My expectation (and I think the common use) would be for docker-compose to build off the current branch using the local docker file while mounting the local files so that development can be done. Changing the python files in the repo should change the app, and it assumes you're running 'npm run dev' to build the JS assets.

Now for the other "non-dev" use case, you'd build the actual files, no mounts, respecting the current dockerfile.

So all of this being deterministic and respecting the local branch/dockerfile, no divergence from previous layers allowed as it is the case now.

@mistercrunch
Member

mistercrunch commented Feb 22, 2024

Screenshot_20240222_162159_ChatGPT.jpg

In terms of renaming the files. Here's what I would suggest:

  • docker-compose.yml: This would be the default Docker Compose file intended for general use, focusing on building immutable environments. This works without having to run / mount local assets. Run docker-compose up and it just works, assuming local branch is sound. Targets the "dev" later

  • docker-compose-interactive.yml: This file is specifically for interactive development, clearly indicating its purpose for development environments with live code mounting and other development conveniences. This requires running "npm run dev" also targets "dev". In theory there are steps like building JS that are not really required here. It could have intricate docker composition to support it, or be a little wasteful and throw away the npm builds in favor of local.

@sfirke
Member

sfirke commented Feb 23, 2024

It sounds like both of those require git and target the master branch/latest commit. Yes we need to meet developer needs there.

But IMO to make it as easy as possible to deploy Superset this way for new users, there should be a docker compose workflow that does not require cloning the repo and points at a stable release. That is how Airflow instructs people to deploy with docker compose, their steps are:

  • curl a docker-compose.yml file that points to the latest stable release
  • set a few local configs
  • run docker compose up

@mistercrunch
Member

But IMO to make it as easy as possible to deploy Superset this way for new users, there should be a docker compose workflow that does not require cloning the repo and points at a stable release.

From my understanding, docker-compose shouldn't be used to productionize applications like Superset and exists primarily to support developer workflows. Helm is the tech that would provide more of the guarantees needed there. Helm should absolutely point to latest-type images by default, but it feels like docker-compose needs to build deterministically off the current branch. I guess we could have a docker-compose-latest.yml that points to that image, but then it wouldn't mount anything or be related to the current branch in any other way than that file itself.

@Sajawalgujjar381

I faced the same issue while running docker compose up -d So instead I did docker compose -f docker-compose-non-dev.yml up -d and it worked for me.

Yes, I tried your method and it really worked for me also.

@geido
Member

geido commented Mar 6, 2024

Thanks everybody. Closing this one for now as we continue validating ways to improve our docker compose setup.

@geido geido closed this as completed Mar 6, 2024
@mistercrunch
Member

Quick note that we met with the devx sub-team and I said I'd pick up doing a set of improvements for the docker-compose workflows addressing the core issues here. Main idea is referencing the branch's Dockerfile as opposed to a baked image like we do now.

This issue was closed.