From 31ca60da86ed0c7cf62c260a7079ff2272291698 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Le=C3=B3n=20Orell=20Valerian=20Liehr?= <me@fmease.dev>
Date: Thu, 20 Mar 2025 09:11:45 +0100
Subject: [PATCH] [WIP] rustdoc testing: Further improve chapters and sections

---
 src/rustdoc-internals.md                    | 21 ++-----
 src/rustdoc-internals/rustdoc-test-suite.md | 69 +++++++++++++++------
 src/rustdoc.md                              |  2 +
 3 files changed, 57 insertions(+), 35 deletions(-)

diff --git a/src/rustdoc-internals.md b/src/rustdoc-internals.md
index 7f1c83e00..9dacf37a2 100644
--- a/src/rustdoc-internals.md
+++ b/src/rustdoc-internals.md
@@ -272,19 +272,10 @@ Some extra reading about `make_test` can be found
 
 ## Dotting i's And Crossing t's
 
-So that's `rustdoc`'s code in a nutshell, but there's more things in the
-compiler that deal with it. Since we have the full `compiletest` suite at hand,
-there's a set of tests in `tests/rustdoc` that make sure the final `HTML` is
-what we expect in various situations. These tests also use a supplementary
-script, `src/etc/htmldocck.py`, that allows it to look through the final `HTML`
-using `XPath` notation to get a precise look at the output. The full
-description of all the commands available to `rustdoc` tests (e.g. [`@has`] and
-[`@matches`]) is in [`htmldocck.py`].
-
-To use multiple crates in a `rustdoc` test, add `// aux-build:filename.rs`
-to the top of the test file. `filename.rs` should be placed in an `auxiliary`
-directory relative to the test file with the comment. If you need to build
-docs for the auxiliary file, use `// build-aux-docs`.
+<!-- FIXME(fmease):
+* Make this section make sense again
+* Deduplicate with section in ../rustdoc.md#tests
+-->
 
 In addition, there are separate tests for the search index and `rustdoc`'s
 ability to query it. The files in `tests/rustdoc-js` each contain a
@@ -295,10 +286,6 @@ that features results in all tabs can be found in `basic.js`. The basic idea is
 that you match a given `QUERY` with a set of `EXPECTED` results, complete with
 the full item path of each item.
 
-[`@has`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py#L39
-[`@matches`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py#L44
-[`htmldocck.py`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py
-
 ## Testing Locally
 
 Some features of the generated `HTML` documentation might require local
diff --git a/src/rustdoc-internals/rustdoc-test-suite.md b/src/rustdoc-internals/rustdoc-test-suite.md
index c2a2e6495..18f61b307 100644
--- a/src/rustdoc-internals/rustdoc-test-suite.md
+++ b/src/rustdoc-internals/rustdoc-test-suite.md
@@ -4,21 +4,19 @@ This page is specifically about the test suite named `rustdoc`, for other test s
 
 The `rustdoc` test suite is specifically used to test the HTML output of rustdoc.
 
-This is achived by means of `htmldocck.py`, a custom checker script that leverages [XPath].
+This is achieved by means of [`htmldocck.py`],
+a supplementary checker script invoked by compiletest that leverages [XPath].
 
+[`htmldocck.py`]: https://github.com/rust-lang/rust/blob/master/src/etc/htmldocck.py
 [XPath]: https://en.wikipedia.org/wiki/XPath
 
-## Directives
-Directives to htmldocck are similar to those given to `compiletest` in that they take the form of `//@` comments.
+## Htmldocck-Specific Directives
 
-In addition to the directives listed here,
-`rustdoc` tests also support most
-[compiletest directives](../tests/directives.html).
+Directives to htmldocck are similar to those given to `compiletest` in that they take the form of `//@` comments.
 
-All `PATH`s in directives are relative to the the rustdoc output directory (`build/TARGET/test/rustdoc/TESTNAME`),
-so it is conventional to use a `#![crate_name = "foo"]` attribute to avoid
-having to write a long crate name multiple times.
 To avoid repetion, `-` can be used in any `PATH` argument to re-use the previous `PATH` argument.
+It is conventional to use a `#![crate_name = "foo"]` attribute to avoid
+having to write a long crate name multiple times.
 
 All arguments take the form of quoted strings
 (both single and double quotes are supported),
@@ -34,6 +32,8 @@ In this case, the start of the next line should be `//`, with no `@`.
 
 For example, `//@ !has 'foo/struct.Bar.html'` checks that crate `foo` does not have a page for a struct named `Bar` in the crate root.
 
+<!-- FIXME(fmease): Mention that the regexes match case-sensitively and in single-line mode? -->
+
 ### `has`
 
 Usage 1: `//@ has PATH`
@@ -41,26 +41,32 @@ Usage 2: `//@ has PATH XPATH PATTERN`
 
 In the first form, `has` checks that a given file exists.
 
-In the second form, `has` is an alias for `matches`,
+In the second form, `has` is the same as `matches`,
 except `PATTERN` is a whitespace-normalized[^1] string instead of a regex.
+<!-- FIXME(fmease): It's more important to note *here* that the file under test gets normalized too (PATTERN is in 99% of cases already normalized) -->
 
 ### `matches`
 
 Usage: `//@ matches PATH XPATH PATTERN`
 
-Checks that the text of each element selected by `XPATH` in `PATH` matches the python-flavored regex `PATTERN`.
+Checks that the text of each element selected by `XPATH` in `PATH` matches the Python-flavored regex `PATTERN`.
 
 ### `matchesraw`
 
-Usage: `//@ matchesraw PATH PATTERN`
+Usage: `//@ matchesraw PATH XPATH PATTERN`
 
 Checks that the contents of the file `PATH` matches the regex `PATTERN`.
 
+<!-- FIXME(fmease): This previously didn't mention XPATH, mention it in prose -->
+
 ### `hasraw`
 
-Usage: `//@ hasraw PATH PATTERN`
+Usage: `//@ hasraw PATH XPATH PATTERN`
 
 Same as `matchesraw`, except `PATTERN` is a whitespace-normalized[^1] string instead of a regex.
+<!-- FIXME(fmease): It's more important to note *here* that the file under test gets normalized too (PATTERN is in 99% of cases already normalized) -->
+
+<!-- FIXME(fmease): This previously didn't mention XPATH, mention it in prose -->
 
 ### `count`
 
@@ -78,34 +84,61 @@ determined by the XPath, and compares it to a pre-recorded value
 in a file. The file's name is the test's name with the `.rs` extension
 replaced with `.NAME.html`, where NAME is the snapshot's name.
 
-htmldocck supports the `--bless` option to accept the current subtree
+Htmldocck supports the `--bless` option to accept the current subtree
 as expected, saving it to the file determined by the snapshot's name.
 compiletest's `--bless` flag is forwarded to htmldocck.
 
+<!-- FIXME(fmease): Also mention that we normalize certain URLs
+both when checking and when normalizing
+-->
+
 ### `has-dir`
 
 Usage: `//@ has-dir PATH`
 
-Checks for the existance of directory `PATH`.
+Checks for the existence of directory `PATH`.
 
 ### `files`
 
 Usage: `//@ files PATH ENTRIES`
 
 Checks that the directory `PATH` contains exactly `ENTRIES`.
-`ENTRIES` is a python list of strings inside a quoted string,
+`ENTRIES` is a Python list of strings inside a quoted string,
 as if it were to be parsed by `eval`.
 (note that the list is actually parsed by `shlex.split`,
-so it cannot contain arbitrary python expressions).
+so it cannot contain arbitrary Python expressions).
 
 Example: `//@ files "foo/bar" '["index.html", "sidebar-items.js"]'`
 
 [^1]: Whitespace normalization means that all spans of consecutive whitespace are replaced with a single space.  The files themselves are also whitespace-normalized.
 
+## Compiletest Directives
+
+In addition to the htmldocck directives listed above,
+`rustdoc` tests also support most
+[compiletest directives](../tests/directives.html).
+
+<!-- FIXME(fmease):
+Should definitely also mention `//@ aux-crate` and `//@ proc-macro`
+UNLESS we nuke this paragraph entirely and refer to the compiletest section(s)?
+-->
+To use multiple crates in a `rustdoc` test, add `//@ aux-build:filename.rs`
+to the top of the test file. `filename.rs` should be placed in an `auxiliary`
+directory relative to the test file with the comment.
+
+<!-- FIXME(fmease): We might want to explain why this exists / what this actually means -->
+If you need to build docs for the auxiliary file, use `//@ build-aux-docs`.
+
+<!-- FIXME(fmease): Mention `//@ doc-flags`! -->
+
 ## Limitations
-`htmldocck.py` uses the xpath implementation from the standard library.
+
+Htmldocck uses the XPath implementation from the Python standard library.
 This leads to several limitations:
+
 * All `XPATH` arguments must start with `//` due to a flaw in the implemention.
 * Many XPath features (functions, axies, etc.) are not supported.
 * Only well-formed HTML can be parsed (hopefully rustdoc doesn't output mismatched tags).
 
+<!-- FIXME(fmease): Maybe link to revisions? -->
+Furthermore, compiletest revisions are not supported.
diff --git a/src/rustdoc.md b/src/rustdoc.md
index 320dc9d58..421682b5f 100644
--- a/src/rustdoc.md
+++ b/src/rustdoc.md
@@ -81,6 +81,8 @@ does is call the `main()` that's in this crate's `lib.rs`, though.)
 
 ## Tests
 
+<!-- FIXME(fmease): Also mention `tests/ui/rustdoc/`! -->
+
 * Tests on search engine and index are located in `tests/rustdoc-js` and `tests/rustdoc-js-std`.
   The format is specified
   [in the search guide](rustdoc-internals/search.md#testing-the-search-engine).