command-line.md
Lines changed: 4 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -12,15 +12,15 @@ kernelspec:
12
12
(command-line)=
13
13
# The Command Line
14
14
15
-
In this chapter, you'll meet the *command line* and learn how to use it. Beyond a few key commands like `pip install <packagename>` you don't strictly need to know how to use the command line to follow the rest of this book. However, even a tiny bit of knowledge of the command line goes a long way in coding and will serve you well.
15
+
In this chapter, you'll meet the *command line* and learn how to use it. Beyond a few key commands like `uv add <packagename>` you don't strictly need to know how to use the command line to follow the rest of this book. However, even a tiny bit of knowledge of the command line goes a long way in coding and will serve you well.
16
16
17
17
To try out any of the commands in this chapter on your machine, you can select 'New Terminal' from the menu bar in Visual Studio Code (Mac and Linux), use the Windows Subsystem for Linux or git bash (Windows), or use a free [online terminal](https://cocalc.com/doc/terminal.html).
18
18
19
19
This chapter has benefited from numerous sources, including absolutely excellent notes by [Grant McDermott](https://grantmcdermott.com/), Melanie Walsh's [Introduction to Cultural Analytics & Python](https://melaniewalsh.github.io/Intro-Cultural-Analytics/welcome.html), [Data Science Bootstrap](https://ericmjl.github.io/data-science-bootstrap-notes/), [calmcode.io](https://calmcode.io/), and [Research Software Engineering with Python](https://merely-useful.tech/py-rse/). A promising resource that, at the time of writing, was still being compiled is [Data Science at the Command Line](https://www.datascienceatthecommandline.com/2e/).
20
20
21
21
## What is the command line?
22
22
23
-
The command line is a way to directly issue text-based commands to a computer one line at a time (as distinct from a graphical user interface, or GUI, that you navigate with a mouse). It goes under many names: shell, bash, terminal, CLI, and command line. These are actually different things but most people tend to use them to mean the same thing most of the time. The *shell* is the part of an operating system that you interact with but mostly people use shell to mean the command line. *bash* is the programming language that is used in the command line; it's actually a synonym for 'Born Again SHell'. The *terminal* is sometimes used to refer to the command line on Macs. Finally, a *CLI* is just an acronym for command line interface, and is often used in the context of an application; for example, pip has a command line interface because you run it on the command line to install packages (`pip install packagename`).
23
+
The command line is a way to directly issue text-based commands to a computer one line at a time (as distinct from a graphical user interface, or GUI, that you navigate with a mouse). It goes under many names: shell, bash, terminal, CLI, and command line. These are actually different things but most people tend to use them to mean the same thing most of the time. The *shell* is the part of an operating system that you interact with but mostly people use shell to mean the command line. *bash* is the programming language that is used in the command line; it's actually a synonym for 'Born Again SHell'. The *terminal* is sometimes used to refer to the command line on Macs. Finally, a *CLI* is just an acronym for command line interface, and is often used in the context of an application; for example, uv has a command line interface because you run it on the command line to install packages (`uv add packagename`).
24
24
25
25
It's worth mentioning that there's a big difference between the command line on UNIX based systems (MacOS and Linux), and on Windows systems. Here, we'll only address the UNIX version. There is a command line on Windows but it's not widely used for coding. If you're on a Windows machine, you can access a UNIX command line using the Windows Subsystem for Linux.
26
26
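To make the idea of a *CLI* concrete, here is a minimal sketch of a Python function that exposes its own command line interface using the standard library's `argparse`. The tool name `greet`, the `name` argument, and the `--shout` flag are hypothetical and chosen purely for illustration; they are not part of uv or of this book's code.

```python
import argparse


def build_parser():
    # Describe the arguments that the (hypothetical) tool accepts.
    parser = argparse.ArgumentParser(prog="greet", description="Print a greeting.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--shout", action="store_true", help="use upper case")
    return parser


def greet(argv=None):
    # With argv=None, argparse reads the real command line (sys.argv[1:]).
    args = build_parser().parse_args(argv)
    message = f"Hello, {args.name}!"
    return message.upper() if args.shout else message


# Simulate typing `greet Ada --shout` at the command line.
print(greet(["Ada", "--shout"]))  # prints "HELLO, ADA!"
```

The list passed to `greet` stands in for what you would type after the program name, in the same way that `uv add packagename` hands `add` and `packagename` to uv.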
@@ -127,13 +127,13 @@ There are several ways in which the command line is useful for Python (and these
127
127
Of course, packages are installed at the command line, for example to install Jupyter Lab (for running notebooks), the command is
128
128
129
129
```bash
130
-
pip install jupyterlab
130
+
uv add jupyterlab
131
131
```
132
132
133
133
Say you have a script called `analysis.py`, you can run it with Python on the command line using
134
134
135
135
```bash
136
-
python analysis.py
136
+
uv run python analysis.py
137
137
```
138
138
139
139
which calls Python as a programme and gives it `analysis.py` as the argument. If you have multiple versions of Python, which you should do if you're following best practice and using a version per project, then you can see *which* version of Python is being used with
data-import.ipynb
Lines changed: 3 additions & 3 deletions
@@ -32,7 +32,7 @@
32
32
"id": "e29b7103",
33
33
"metadata": {},
34
34
"source": [
35
-
"If this command fails, you don't have **pandas** installed. Open up the terminal in Visual Studio Code (Terminal -> New Terminal)and type in `conda install pandas`. Note that once **pandas** is installed, the convention is to import it into your Python session under the name `pd` by putting `import pandas as pd` at the top of your script."
35
+
"If this command fails, you don't have **pandas** installed. Open up the terminal in Visual Studio Code (Terminal -> New Terminal), `cd` to the folder you are working in, and type in `uv add pandas`. Note that once **pandas** is installed, the convention is to import it into your Python session under the name `pd` by putting `import pandas as pd` at the top of your script."
36
36
]
37
37
},
38
38
{
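The advice above hinges on whether `import pandas` fails. As a hedged sketch, the check below uses the standard library's `importlib.util.find_spec` to test whether a top-level module can be found in the current environment without importing it. The helper name `is_installed` is hypothetical, and the message it prints simply mirrors the `uv add` command used in this diff.

```python
from importlib.util import find_spec


def is_installed(module_name):
    # find_spec returns None when a top-level module cannot be found.
    return find_spec(module_name) is not None


for name in ["pandas", "skimpy"]:
    if is_installed(name):
        print(f"{name} is available")
    else:
        print(f"{name} is missing; try running `uv add {name}` in the terminal")
```

Note that the name you import is not always the name you install: for example, the **lets-plot** package is imported as `lets_plot`, so the check needs the import name while `uv add` needs the package name.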
@@ -129,7 +129,7 @@
129
129
"\n",
130
130
"Once you read data in, the first step usually involves transforming it in some way to make it easier to work with in the rest of your analysis. For example, the column names in the `students` file we read in are formatted in non-standard ways.\n",
131
131
"\n",
132
-
"You might consider renaming them one by one with `.rename()` or you might use a convenience function from another package to clean them and turn them all into snake case at once. We will make use of the **skimpy** package to do this. **skimpy** is a smaller package so isn't available to install via conda; instead, install it by running `pip install skimpy` in the terminal.\n",
132
+
"You might consider renaming them one by one with `.rename()` or you might use a convenience function from another package to clean them and turn them all into snake case at once. We will make use of the **skimpy** package to do this. Install it by running `uv add skimpy` in the terminal.\n",
133
133
"\n",
134
134
"From **skimpy**, we will use the `clean_columns()` function; this takes in a data frame and returns a data frame with variable names converted to snake case."
135
135
]
@@ -349,7 +349,7 @@
349
349
"\n",
350
350
"If you want to save data in a file and have it remember the data types, you'll need to use a different data format. For temporary storage, we recommend using the *feather* format as it is very fast and interoperable with other programming languages. Interoperability is a good reason to avoid language-specific file formats such as Stata's .dta, R's .rds, and Python's .pickle.\n",
351
351
"\n",
352
-
"Note that the feather format has an additional dependency in the form of a package called **pyarrow**. To install it, run `pip install pyarrow` in a terminal window.\n",
352
+
"Note that the feather format has an additional dependency in the form of a package called **pyarrow**. To install it, run `uv add pyarrow` in a terminal window.\n",
data-transform.ipynb
Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
38
38
"id": "438cc0a4",
39
39
"metadata": {},
40
40
"source": [
41
-
"If this command fails, you don't have **pandas** installed. Open up the terminal in Visual Studio Code (Terminal -> New Terminal)and type in `conda install pandas`.\n",
41
+
"If this command fails, you don't have **pandas** installed. Open up the terminal in Visual Studio Code (Terminal -> New Terminal), `cd` to the folder you are working in, and type in `uv add pandas`.\n",
42
42
"\n",
43
43
"Furthermore, if you wish to check which version of **pandas** you're using, it's"
data-visualise.ipynb
Lines changed: 3 additions & 3 deletions
@@ -28,7 +28,7 @@
28
28
"source": [
29
29
"### Prerequisites\n",
30
30
"\n",
31
-
"You will need to install the **letsplot** package for this chapter. To do this, open up the command line of your computer, type in `pip install lets-plot`, and hit enter."
31
+
"You will need to install the **letsplot** package for this chapter. To do this, open up the command line of your computer, type in `uv add lets-plot`, and hit enter."
32
32
]
33
33
},
34
34
{
@@ -48,9 +48,9 @@
48
48
"id": "e0ad70c8",
49
49
"metadata": {},
50
50
"source": [
51
-
"We'll also need to have the **pandas** package installed—this package, which we'll be seeing a lot of, is for data. You can similarly install it by running `pip install pandas` on the command line.\n",
51
+
"We'll also need to have the **pandas** package installed—this package, which we'll be seeing a lot of, is for data. You can similarly install it by running `uv add pandas` on the command line.\n",
52
52
"\n",
53
-
"Finally, we'll also need some data (you can't science without data). We'll be using the Palmer penguins dataset. Unusually, this can also be installed as a package—normally you would load data from a file, but these data are so popular for tutorials they've found their way into an installable package. Run `pip install palmerpenguins` to get these data."
53
+
"Finally, we'll also need some data (you can't science without data). We'll be using the Palmer penguins dataset. Unusually, this can also be installed as a package—normally you would load data from a file, but these data are so popular for tutorials they've found their way into an installable package. Run `uv add palmerpenguins` to get these data."
databases.ipynb
Lines changed: 3 additions & 3 deletions
@@ -18,7 +18,7 @@
18
18
"\n",
19
19
"### Prerequisites\n",
20
20
"\n",
21
-
"You will need the **pandas**, **SQLModel**, and **ibis** packages for this chapter. You probably already have **pandas** installed; to install **SQLModel** and **ibis** respectively run `pip install sqlmodel` and `pip install ibis-framework` on your computer's command line. First, let's bring in some general packages and turn off verbose warnings."
21
+
"You will need the **pandas**, **SQLModel**, and **ibis** packages for this chapter. You probably already have **pandas** installed; to install **SQLModel** and **ibis** respectively run `uv add sqlmodel` and `uv add ibis-framework` on your computer's command line. First, let's bring in some general packages and turn off verbose warnings."
22
22
]
23
23
},
24
24
{
@@ -369,7 +369,7 @@
369
369
"- you can try out an online version (which has been hosted already on the cloud), for example [this database](https://global-power-plants.datasettes.com/global-power-plants/global-power-plants) of power stations\n",
370
370
"- you can use the online coding service glitch to run it. See an example [here](https://glitch.com/~datasette-csvs).\n",
371
371
"\n",
372
-
"**Datasette** comes as a Python package that you can install on the command line by running `pip install datasette`. Once you have it installed in a Python environment, run \n",
372
+
"**Datasette** comes as a Python package that you can install on the command line by running `uv tool install datasette`. Once you have it installed in a Python environment, run \n",
373
373
"\n",
374
374
"```bash\n",
375
375
"datasette path/to/database.db -o\n",
@@ -530,7 +530,7 @@
530
530
"\n",
531
531
"So a couple of key strengths of **sqlmodel** include fantastic auto-complete support and being very strict on datatypes (which will save time in the long run, especially if you are *creating* databases).\n",
532
532
"\n",
533
-
"First, make sure you have the package installed by running `pip install sqlmodel` on the command line."
533
+
"First, make sure you have the package installed by running `uv add sqlmodel` on the command line."
dates-and-times.ipynb
Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@
55
55
"You will need to install the **seaborn** package for this chapter. This chapter uses the next generation version of **seaborn**, which can be installed by running the following on the command line (aka in the terminal): \n",
56
56
"\n",
57
57
"```bash\n",
58
-
"pip install --pre seaborn\n",
58
+
"uv run pip install --pre seaborn\n",
59
59
"```\n",
60
60
"\n",
61
61
"We will also be using the **pandas** package and numerical package **numpy**."
exploratory-data-analysis.ipynb
Lines changed: 2 additions & 2 deletions
@@ -22,7 +22,7 @@
22
22
"\n",
23
23
"### Prerequisites\n",
24
24
"\n",
25
-
"For doing EDA, we'll use the **pandas**, **skimpy**, and **pandas-profiling** packages. We'll also need **lets-plot** for data visualisation. All of these can be installed via `pip install <packagename>`.\n",
25
+
"For doing EDA, we'll use the **pandas**, **skimpy**, and **pandas-profiling** packages. We'll also need **lets-plot** for data visualisation. All of these can be installed via `uv add <packagename>`.\n",
26
26
"\n",
27
27
"As ever, we begin by loading these packages that we'll use:"
28
28
]
@@ -1065,7 +1065,7 @@
1065
1065
"source": [
1066
1066
"### **skimpy** for summary statistics\n",
1067
1067
"\n",
1068
-
"The **skimpy** package is a light weight tool that provides summary statistics about variables in data frames in the console (rather than in a big HTML report, which is what the other EDA packages in the rest of this chapter too). Sometimes running `.summary()` on a data frame isn't enough, and **skimpy** fills this gap. It also comes with the `clean_columns()` function for cleaning column names that we saw in an earlier chapter. To install **skimpy**, run `pip install skimpy` in the terminal.\n",
1068
+
"The **skimpy** package is a light weight tool that provides summary statistics about variables in data frames in the console (rather than in a big HTML report, which is what the other EDA packages in the rest of this chapter too). Sometimes running `.summary()` on a data frame isn't enough, and **skimpy** fills this gap. It also comes with the `clean_columns()` function for cleaning column names that we saw in an earlier chapter. To install **skimpy**, run `uv add skimpy` in the terminal.\n",
numbers.ipynb
Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
14
14
"\n",
15
15
"### Prerequisites\n",
16
16
"\n",
17
-
"This chapter mostly uses functions from **pandas**, which you are likely to already have installed bu you can install using `pip install pandas` in the terminal. We'll use real examples from nycflights13, as well as toy examples made with fake data.\n",
17
+
"This chapter mostly uses functions from **pandas**, which you are likely to already have installed but you can install using `uv add pandas` in the terminal. We'll use real examples from nycflights13, as well as toy examples made with fake data.\n",
prerequisites.ipynb
Lines changed: 1 addition & 1 deletion
@@ -278,7 +278,7 @@
278
278
"\n",
279
279
"As well as following this book using your own computer or on the cloud via Github Codespaces, you can run the code online through a few other options. The first is the easiest to get started with.\n",
280
280
"\n",
281
-
"1. [Google Colab notebooks](https://research.google.com/colaboratory/). Free for most use. You can launch most pages in this book interactively by using the 'Colab' button under the rocket symbol at the top of the page. It will be in the form of a notebook (which mixes code and text) rather than a script (.py file) but the code you write is the same. Note that you may need to update packages to the most recent versions. On Colab, you can do this by runnin `!pip install **packagename**` in a code cell—note the extra exclamation mark, which tells Colab that this is an instruction for the operating system rather than for Python.\n",
281
+
"1. [Google Colab notebooks](https://research.google.com/colaboratory/). Free for most use. You can launch most pages in this book interactively by using the 'Colab' button under the rocket symbol at the top of the page. It will be in the form of a notebook (which mixes code and text) rather than a script (.py file) but the code you write is the same. Note that you may need to update packages to the most recent versions. On Colab, you can do this by running `!pip install **packagename**` in a code cell—note the extra exclamation mark, which tells Colab that this is an instruction for the operating system rather than for Python.\n",
282
282
"2. [Gitpod Workspace](https://www.gitpod.io/). An alternative to Codespaces. This is a remote, cloud-based version of Visual Studio Code with Python installed and will run Python scripts. Note that the free tier covers 50 hours per month."
spreadsheets.ipynb
Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@
42
42
"source": [
43
43
"### Prerequisites\n",
44
44
"\n",
45
-
"You will need to install the **pandas** package for this chapter. You will also need to install the **openpyxl** package by running `pip install openpyxl` in the terminal."
45
+
"You will need to install the **pandas** package for this chapter. You will also need to install the **openpyxl** package by running `uv add openpyxl` in the terminal."