Commit 9313aba

Skeleton architecture documentation (#387)
Tries to explain a few concepts and expected outputs from papyri. There's a bunch of stuff to expand on but I thought this could be useful especially for new contributors. There's nothing really *new*, just some reorganization and highlighting.
2 parents d665532 + fc3cfcf commit 9313aba

1 file changed

+155
-66
lines changed

Readme.md

Lines changed: 155 additions & 66 deletions
@@ -77,14 +77,24 @@ papyri enabled (left) and disabled (right).
7777
![](assets/vs_math.png)
7878
</detail>
7979

80+
---
81+
82+
## Table of contents
83+
84+
- [Installation](#installation)
85+
- [Usage](#usage)
86+
- [Rendering](#rendering)
87+
- [Architecture](#architecture)
88+
8089
## Installation (not fully functional):
8190

82-
Some functionality is not yet available when installing from PyPI.
83-
For now you need a dev-install (see next section) to access all features.
91+
Some functionality is not yet available when installing from PyPI. For now you
92+
need a [Development installation](#development-installation) to access all
93+
features.
8494

8595
You'll need Python 3.8 or newer, otherwise pip will tell you it can't find any matching distribution.
8696

87-
Pip install from PyPI:
97+
Install from PyPI:
8898

8999
```bash
90100
$ pip install papyri
@@ -111,7 +121,7 @@ This will augment the `?` operator to show better documentation (when installed
111121
*Papyri does not completely build its own docs yet, but you might be able to view a static rendering of it
112122
[here](https://pydocs.github.io/). It is not yet automatically built, so might be out of date.*
113123
114-
### Development install
124+
### Development installation
115125
116126
You may need to get a modified version of numpydoc depending on the stage of development. You will need [pip >
117127
21.3](https://pip.pypa.io/en/stable/news/#v21-3-1) if you want to make editable installs.
@@ -149,19 +159,19 @@ $ pytest
149159
150160
## Usage
151161
152-
In the end there should be roughly 3 steps,
162+
Papyri relies on three steps:
153163
154-
- IR generation (package maintainers)
155-
- IR installation (end user or via pip/conda)
156-
- IR rendering (usually IDE, CLI/webserver)
164+
- IR generation (executed by package maintainers);
165+
- IR installation (executed by end users or via pip/conda);
166+
- IR rendering (usually executed by the IDE, CLI/webserver).
157167
158-
### IR Generation
168+
### IR Generation (`papyri gen`)
159169
160170
This is the step you want to trigger if you are building documentation using Papyri for a library you maintain. Most
161171
likely as an end user you will not have to issue this step and can install pre-published documentation bundles.
162172
This step is likely to occur only once per new release of a project.
163173
164-
Look at the Toml files in `examples`, this will give you example configurations from some existing libraries.
174+
The Toml files in `examples` will give you example configurations from some existing libraries.
165175
166176
```
167177
$ ls -1 examples/*.toml
@@ -177,8 +187,8 @@ examples/skimage.toml
177187
178188
Right now these files live in papyri but would likely be in relevant repositories under `docs/papyri.toml` later on.
179189
180-
It is _slow_ on full numpy/scipy; use `--no-infer` (see below) for a subpar but
181-
faster experience.
190+
> [!NOTE]
191+
> It is _slow_ on full numpy/scipy; use `--no-infer` (see below) for a subpar but faster experience.
182192
183193
Use `papyri gen <path to example file>`
184194
@@ -192,7 +202,16 @@ $ papyri gen examples/numpy.toml
192202
$ papyri gen examples/scipy.toml
193203
```
194204
195-
This will create intermediate docs files in in `~/.papyri/data/<library name>_<library_version>`
205+
This will create intermediate docs files in `~/.papyri/data/<library name>_<library_version>`. See [Generation](#generation-papyri-gen) for more details.
206+
207+
You can also generate intermediate docs files for a subset of objects using the `--only` flag. For example:
208+
209+
```
210+
$ papyri gen examples/numpy.toml --only numpy:einsum
211+
```
212+
213+
> [!IMPORTANT]
214+
> To avoid ambiguity, papyri uses [fully qualified names](#qualified-names) to refer to objects. This means that you need to use `numpy:einsum` instead of `einsum` or `numpy.einsum` to refer to the `einsum` function in the `numpy` module, for example.
196215
197216
198217
### Installation/ingestion
@@ -210,11 +229,11 @@ You can ingest local folders with the following command:
210229
$ papyri ingest ~/.papyri/data/<path to folder generated at previous step>
211230
```
212231
213-
This will crosslink the newly generate folder with the existing ones.
232+
This will crosslink the newly generated folder with the existing ones.
214233
Ingested data can be found in `~/.papyri/ingest/` but you are not supposed to
215234
interact with this folder with tools external to papyri.
216235
217-
There is currently a couple of pre-built documentation bundles that can be
236+
There are currently a couple of pre-built documentation bundles that can be
218237
pre-installed, but are likely to break with each new version of papyri. We
219238
suggest you use the developer installation and ingestion procedure for now.
220239
@@ -225,134 +244,204 @@ is of interest to you. This will likely be done by your favorite IDE, probably
225244
just in time when you explore documentation. Nonetheless, we've
226245
implemented a couple of external renderers to help debug issues.
227246
228-
WARNING:
229-
230-
Many rendering methods current require papyri's own docs to be built and ingested
231-
first.
247+
> [!WARNING]
248+
> Many rendering methods currently require papyri's own docs to be built and ingested first.
232249
233250
```
234251
$ papyri gen examples/papyri.toml
235252
$ papyri ingest ~/.papyri/data/papyri_0.0.7 # or any current version
236253
```
237254
238-
Or you can try to pre-install an old papyri doc bundle
255+
Or you can try to pre-install an old papyri doc bundle:
239256
240257
```
241258
$ papyri install papyri
242259
```
243260
244261
### Standalone HTML rendering
245262
263+
To see the rendered documentation for all packages previously ingested, run
246264
247265
```bash
248-
$ papyri render # render all the html pages statically in ~/.papyri/html
249-
$ papyri serve-static # start a http.server with the propoer root to serve above files.
266+
$ papyri serve
250267
```
251268
269+
This will start a live server that will render the pages on the fly.
270+
271+
If you need to render static versions of the pages, use either of the following
272+
commands:
273+
252274
```bash
253-
$ papyri serve # start a server that will render the pages on the fly (nice to debug or iterate on theme, rendering)
275+
$ papyri render # render all the html pages statically in ~/.papyri/html
276+
$ papyri serve-static # start a http.server with the proper root to serve above files.
254277
```
255278
256-
### Ascii terminal rendering (experimental)
279+
### Rich terminal rendering
257280
281+
To render the documentation for a single object on a terminal, use
258282
259283
```
260-
$ papyri ascii <fully qualified names> # try to render in the terminal.
284+
$ papyri rich <fully qualified name>
261285
```
262286
263-
For example,
287+
For example:
264288
265289
```
266-
$ papyri ascii numpy.linspace
290+
$ papyri rich numpy:einsum # note the colon for the fully qualified name.
267291
```
268292
269-
The next step uses urwid to provide a browsable interface in terminal.
293+
To use the experimental interactive Textual interface in the terminal, use
270294
271295
```
272-
$ papyri browse <fully qualified name> # urwid documentation browser.
296+
$ papyri textual <fully qualified name>
297+
```
298+
299+
### IPython extension
300+
301+
To run `papyri` as an IPython extension, run:
302+
273303
```
304+
$ ipython --ext papyri.ipython
305+
```
306+
307+
This will start an IPython session with an augmented `?` operator.
308+
309+
### Jupyter extension
310+
311+
In progress.
274312
275-
Hacking on scrapping libraries `papyri gen --no-infer [...]` will skip type
276-
inference of examples. `--exec` option need to be passed to try to execute examples.
313+
### More commands
314+
315+
You can run `papyri` without a command to see all currently available commands.
277316
278317
## Papyri - Name's meaning
279318
280319
See the legendary [Villa of Papyri](https://en.wikipedia.org/wiki/Villa_of_the_Papyri), which gets its name from its
281320
collection of many papyrus scrolls.
282321
322+
## Architecture
283323
284-
## Legacy (MISC/OLD) documentation (Inaccurate):
285-
286-
287-
#### Generation (`papyri gen`)
324+
### Generation (`papyri gen`)
288325
289-
Collects the documentation of a project into a DocBundle -- a number of
290-
DocBlobs (currently json files), with a defined semantic structure, and
326+
Collects the documentation of a project into a *DocBundle* -- a number of
327+
*DocBlobs* (currently json files), with a defined semantic structure, and
291328
some metadata (version of the project this documentation refers to, and
292329
potentially some other blobs).
293330
294-
During the generation a number of normalisation and inference can and should
295-
happen, for example
331+
During the generation a number of normalisation and inference steps can and
332+
should happen. For example:
296333
297-
- using type inference into the `Examples` sections of docstrings and storing
334+
- Using type inference into the `Examples` sections of docstrings and storing
298335
those as pairs (token, reference), so that you can later decide that
299336
clicking on `np.array` in an example brings you to numpy array
300-
documentation; whether or not we are currently in the numpy doc.
301-
- Parsing "See Also" into a well defined structure
302-
- running Example to generate images for docs with images (not implemented)
303-
- resolve package local references for example building numpy doc
304-
"`zeroes_like`" is non ambiguous and shoudl be Normalized to
305-
"`numpy.zeroes_like`", `~.pyplot.histogram`, normalized to
306-
`matplotlib.pyplot.histogram` as the **target** and `histogram` as the text
307-
...etc.
337+
documentation; whether or not we are currently in the numpy documentation;
338+
- Parsing "See Also" into a well defined structure;
339+
- Running examples to generate images for docs with images (partially
340+
implemented);
341+
- Resolve local references. For example, when building the NumPy docs,
342+
`zeros_like` is non-ambiguous and should be normalized to
343+
`numpy.zeros_like`. Similarly, `~.pyplot.histogram` should be normalized
344+
to `matplotlib.pyplot.histogram` as the **target** and `histogram` as the
345+
text.
308346
309347
The Generation step is likely project specific, as there might be import
310-
conventions that are per-project and should not need to be repeated (`import
311-
pandas as pd`, for example,)
348+
conventions that are defined per-project and should not need to be repeated
349+
(`import pandas as pd`, for example).
350+
351+
The generation step is likely to be the most time consuming, and for each
352+
project, results in the following outputs:
353+
354+
- A `papyri.json` file, which is a list of unique qualified names corresponding
355+
to the documented objects and some metadata;
356+
- A `toc.json` file, ?
357+
- An `assets` folder, containing all the images generated during the
358+
generation;
359+
- A `docs` folder, ?
360+
- An `examples` folder, ?
361+
- A `module` folder, containing one json file per documented object.
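
The output layout listed above can be checked with a few lines of Python. This is only a sketch: the entry names come from the list, but `missing_bundle_entries` is a hypothetical helper and not part of papyri, and the exact contents of several entries are still undocumented.

```python
from pathlib import Path
from typing import List

# Entry names are taken from the list above; the helper itself is hypothetical.
EXPECTED_FILES = ("papyri.json", "toc.json")
EXPECTED_DIRS = ("assets", "docs", "examples", "module")


def missing_bundle_entries(bundle: Path) -> List[str]:
    """Return the expected entries that are absent from the ``bundle`` folder."""
    # Files first, then folders, in the order they are declared above.
    missing = [name for name in EXPECTED_FILES if not (bundle / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (bundle / name).is_dir()]
    return missing
```

Pointing it at a folder under `~/.papyri/data/` after running `papyri gen` should return an empty list if the generation produced every entry.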
362+
363+
After the generation step, *what should have been processed*?
312364
313-
#### Ingestion (papyri ingest)
365+
### Ingestion (`papyri ingest`)
314366
315367
The ingestion step takes a DocBundle and/or DocBlobs and adds them into a graph
316368
of known items; the ingestion is critical to efficiently build the collection
317369
graph metadata and understand which items refer to which. This allows the
318370
following:
319371
320-
- Update the list of backreferences to a DocBundle
372+
- Update the list of backreferences to a *DocBundle*;
321373
- Update forward references metadata to know whether links are valid.
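
As a toy illustration of the two points above (not papyri's actual implementation or storage format), back-references can be derived by inverting a mapping of forward references, and a forward reference can be flagged when its target is not a known item. The function names and the sample data are made up for this sketch.

```python
from collections import defaultdict
from typing import Dict, Set


def build_backrefs(forward: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert ``source -> targets`` into ``target -> sources``."""
    backrefs = defaultdict(set)
    for source, targets in forward.items():
        for target in targets:
            backrefs[target].add(source)
    return dict(backrefs)


def dangling_refs(forward: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per source, the targets that are not known items."""
    known = set(forward)
    return {src: tgts - known for src, tgts in forward.items() if tgts - known}


# Made-up forward references, for illustration only.
forward = {
    "numpy:linspace": {"numpy:arange", "numpy:logspace"},
    "numpy:arange": {"numpy:linspace"},
}
print(build_backrefs(forward))
print(dangling_refs(forward))
```

Here `numpy:logspace` is reported as dangling because no item with that name has been ingested in the toy data.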
322374
323-
Currently the ingestion loads all in memory and update all the bundle in place
375+
Currently the ingestion loads everything in memory and updates all the bundles in place
324376
but this can likely be done more efficiently.
325377
326378
A lot more can likely be done at larger scale, like detecting if documentation
327-
have changed in previous version so infer for which versions of a library this
379+
has changed in previous versions to infer for which versions of a library this
328380
documentation is valid.
329381
330382
There is also likely some curating that might need to be done at that point, as
331-
for example, numpy.array have an extremely large number of back-references.
383+
objects such as `numpy.array` have an extremely large number of back-references.
332384
385+
### Qualified names
333386
334-
### tree sitter info.
387+
To avoid ambiguity when referring to objects, papyri uses the
388+
*fully qualified name* of the object for its operations. This means that instead
389+
of a dot (`.`), we use a colon (`:`) to separate the module part from the
390+
object's name and sub attributes.
335391
336-
https://tree-sitter.github.io/tree-sitter/creating-parsers
337-
338-
339-
### When things don't work !
392+
To understand why we need this, assume the following situation: a top level
393+
`__init__` imports a function from a submodule that has the same name as the
394+
submodule:
340395
396+
```
397+
# project/__init__.py
398+
from .sub import sub
399+
```
341400
342-
#### `SqlOperationalError`:
401+
This submodule defines a class (here we use lowercase for the example):
343402
344-
- The DB schema likely have changed, try: `rm -rf ~/.papyri/ingest/`.
403+
```
404+
# project/sub.py
405+
class sub:
406+
    attribute: str
407+
    attribute = 'hello'
408+
```
345409
346-
#### Can't build tree-sitter:
410+
and a second submodule is defined:
411+
```
412+
# project/attribute.py
413+
None
414+
```
347415
348-
An error occurred trying to build-tree-sitter with clang, you likely have a conda environment. Install all the compilers
349-
in the current conda env:
416+
Using qualified names only with dots (`.`) can make it difficult to find out
417+
which object we are referring to, or implement the logic to find the object.
418+
For example, to get the object `project.sub.attribute`, one would do:
350419
351420
```
352-
conda install compilers
421+
import project
422+
x = getattr(project, 'sub')
423+
getattr(x, 'attribute')
353424
```
354425
426+
But here, because of the `from .sub import sub`, we end up getting the class
427+
attribute instead of the module. This ambiguity is lifted with a `:` as we now
428+
explicitly know the module part, and `project.sub.attribute` is distinct from
429+
`project.sub:attribute`. Note that `project:sub.attribute` is also
430+
non-ambiguous, even if not the right fully qualified name for an object.
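
The shadowing described above can be reproduced without creating files. The sketch below rebuilds the hypothetical `project` layout in memory with `types.ModuleType`. It is only a demonstration of the ambiguity, not code from papyri.

```python
import importlib
import sys
import types

# Build the hypothetical layout in memory instead of on disk.
project = types.ModuleType("project")
sub_module = types.ModuleType("project.sub")


class sub:  # lowercase on purpose, as in the example above
    attribute = "hello"


sub_module.sub = sub
# Equivalent of ``from .sub import sub`` in project/__init__.py:
# the class now shadows the submodule on the package object.
project.sub = sub
sys.modules["project"] = project
sys.modules["project.sub"] = sub_module

via_getattr = getattr(project, "sub")  # the class, not the module
via_import = importlib.import_module("project.sub")  # the module itself

print(type(via_getattr).__name__)  # type
print(type(via_import).__name__)  # module
```

Attribute access on the package reaches the class, while importing by the explicit module path still reaches the submodule.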
431+
432+
Moreover, using `:` as a separator makes the implementation much easier, as
433+
in the case of `project.sub:attribute` it is possible to directly execute
434+
`importlib.import_module('project.sub')` to obtain a reference to the `sub`
435+
submodule, without try/except or recursive `getattr` checking for the type of an
436+
object.
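
A minimal sketch of such a lookup follows. It assumes names of the form `module.path:object.attr`. The `resolve` function is a hypothetical helper written for this example and is not papyri's API.

```python
import importlib
from functools import reduce


def resolve(qualified_name: str):
    """Resolve a ``module.path:object.attr`` name (hypothetical helper)."""
    # Everything before the colon is the module, everything after is attributes.
    module_path, _, object_path = qualified_name.partition(":")
    module = importlib.import_module(module_path)
    if not object_path:
        # No colon, or nothing after it: the name refers to the module itself.
        return module
    # Walk the attribute chain starting from the imported module.
    return reduce(getattr, object_path.split("."), module)


print(resolve("os.path:join"))
print(resolve("json:JSONDecoder.decode"))
```

Because the module part is explicit, a single `import_module` call is enough, and the remaining components are plain attribute lookups.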
355437
438+
### Tree sitter information
356439
440+
See https://tree-sitter.github.io/tree-sitter/creating-parsers
357441
358442
443+
### When things don't work !
444+
445+
#### `SqlOperationalError`:
446+
447+
- The DB schema has likely changed, try: `rm -rf ~/.papyri/ingest/`.
