Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(docs): reorder, clarify PyPI installation #28221

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

sfirke
Copy link
Member

@sfirke sfirke commented Apr 25, 2024

SUMMARY

I tried installing from PyPI and found that some steps were missing or out of order. I removed irrelevant text, added other instructions, and reordered things.

Any concerns about suggesting that people install into a directory /superset and set up their virtual environment there?

There's now some redundancy with Configuring Superset docs, as they also talk about creating a superset_config.py file. Because the proper time to create one is in the install phase, when you are specifying a SECRET_KEY, I feel this belongs in this page. I think we should add the corresponding creating-a-config content to the install methods, then remove from Configuring Superset as everyone will have done it by that point.

TESTING INSTRUCTIONS

Follow these steps on a fresh Linux machine and see if they work. Please test!

@sfirke sfirke marked this pull request as draft April 25, 2024 19:33
@github-actions github-actions bot added the doc Namespace | Anything related to documentation label Apr 25, 2024
Copy link

@surapuramakhil surapuramakhil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I am not an official review/maintainer. As you are requested for feedback added my comments

@sfirke
Copy link
Member Author

sfirke commented Apr 26, 2024

Those are great additions, thanks @surapuramakhil !

@artofcomputing
Copy link
Contributor

I think we should add the corresponding creating-a-config content to the install methods, then remove from Configuring Superset as everyone will have done it by that point.

I agree, Configuring Superset should be about options available after Installation, much like day-to-day administration of a environment.

@sfirke sfirke marked this pull request as ready for review April 26, 2024 13:49
Had accidentally removed it in a prior commit instead of moving it
@sfirke sfirke requested a review from hughhhh April 26, 2024 14:11
@sfirke
Copy link
Member Author

sfirke commented Apr 26, 2024

I agree, Configuring Superset should be about options available after Installation, much like day-to-day administration of a environment.

@artofcomputing I have added this to the BugHerd task board.

# See https://docs.python.org/3.6/library/venv.html
python3 -m venv venv
. venv/bin/activate
mkdir superset
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. I'm a tad confused. Shouldn't the virtualenv be called venv—assuming it's inside the root superset folder?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the docs don't say to make a folder called superset, that's something I'm adding here. I believe what they say currently makes a virtualenv called venv in the root directory, which is not descriptive and potentially a problem if another tutorial says to do the same.

Now maybe mkdir superset is unnecessary if python3 -m venv superset below would create the directory.

I may be misunderstanding though, I'm new to this aspect.

```

Or with pyenv-virtualenv:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still want to keep pyenv documentation around, i.e., developers may want to switch which Python environment they want to use. In essence pyenv is a precursor to setting up Superset.

Copy link
Member Author

@sfirke sfirke Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I'm new to this but my newbie perspective was: the docs say:

You can create and activate a virtual environment using: [virtual env instructions] ... Or with pyenv-virtualenv:

I was unclear, which should I use? And when I read a little about it, it seemed like the venv was the most widely-used and so I eliminated an alternative.

If we should include both, can someone add a note to help a user decide which to use?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley circling back to this -- are you good with my changes or is there another committer you think could weigh in? Or if nothing else I can revert these parts, I'd rather just get the bulk of this merged.

@rusackas
Copy link
Member

rusackas commented Jun 3, 2024

Do we still need eyes on this @sfirke? Sorry it seems to have slipped under the collective radar.

@sfirke
Copy link
Member Author

sfirke commented Jun 3, 2024

@rusackas yes this still needs a review. It's tricky b/c I'm not enough of an expert to feel 100% I'm right but also I'm confident this improves on docs steps that are straight-up missing or out of order.

@olof-dev
Copy link

I'm a totally new Superset user (on linux) and ran into problems when following the PyPI installation instructions that are in the live docs. This PR is super useful, as it shows what the missing configuration steps are. The instructions clarified things a lot and worked well, but in the end I called the venv folder "venv" instead, as that's what I'm used to. I'm also used to these virtual environment folders being totally rebuildable by pip (and so not containing any custom files), so I put the Superset config file in the main folder instead. That is, I went with the following structure:

superset/ # Project folder
superset/venv/ # Virtual environment with python and all packages and dependencies
superset/superset_config.py # Configuration file

Specifically:

mkdir superset
cd superset
python3 -m venv venv
. venv/bin/activate
python3 -m pip install apache-superset
touch superset_config.py
export SUPERSET_CONFIG_PATH=superset_config.py
echo "SECRET_KEY='$(openssl rand -base64 42)'" | tee -a $SUPERSET_CONFIG_PATH

Setting up with example data and running development server:

export FLASK_APP=superset
superset db upgrade
superset fab create-admin
superset load_examples
superset init
superset run -p 8088 --with-threads --reload --debugger

(The SUPERSET_CONFIG_PATH environment variable still needs to be set for the above to work.)

It might be helpful to add a comment about it not in general being a good idea to set FLASK_APP globally, as one might want to run several Flask apps on the same machine.

@sfirke
Copy link
Member Author

sfirke commented Jun 27, 2024

@olof-dev that is very helpful feedback, thank you! I will incorporate that, including naming the venv venv -- that is standard as you point out, and having it inside the superset directory addresses my concern about collision with another project. Adding your feedback should unblock this and make it merge-able. Thanks for letting me know that it was an improvement, too.

@sfirke
Copy link
Member Author

sfirke commented Jul 5, 2024

A commenter in Slack shares this feedback:

I have the step pip install --upgrade setuptools pip before installing apache-superset in my documentation. Also, after installing apache-superset, it could at least be mentioned that now would be a good time to install other packages which are needed for running superset (in our case it would be ldap-packages and some special packages that are needed to communicate with certain databases). Besides, I am missing a system requirements section, especially since the newer versions of superset require python3.9 (which took me half a day to find out that it was that requirement that stopped me from getting the newest version and I still didnt get 4.0.2 working on my linux 20.04-server).

@sfirke
Copy link
Member Author

sfirke commented Aug 27, 2024

Taking stock today: I need to incorporate feedback from @olof-dev above and the Slack comment, then let's merge this as an improvement on what we have now, even if it's imperfect.


Finish installing by running through the following commands:
In a production setup, you would change this to point to say, a PostgreSQL database that gets backed up. You change this by specifying a new value in `superset_config.py` for the variable `SQLALCHEMY_DATABASE_URI`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In a production setup, you would change this to point to say, a PostgreSQL database that gets backed up. You change this by specifying a new value in `superset_config.py` for the variable `SQLALCHEMY_DATABASE_URI`.
In a production setup, you would change this to point to, for example, a PostgreSQL database that gets backed up. You change this by specifying a new value in `superset_config.py` for the variable `SQLALCHEMY_DATABASE_URI`.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Namespace | Anything related to documentation size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants