[Data Liberation] Add WXR import CLI script #2012
This is a great start Francesco, thank you! What would it take to expand this to run the actual PHPUnit tests in the context of WordPress, similarly to what the WordPress core tests do? I'm not saying this would actually be useful at this early stage, but I'm curious whether it might not be that heavy of a lift. There's some prior art in this repo if you search for PHPUnit.
I have added the possibility to run PHPUnit in Playground and updated the PR description with details. When you have time, please let me know what you think. Thanks.
return false;
}

$is_wp_cli = defined( 'WP_CLI' ) && WP_CLI;
At this point, a dedicated WP_CLI command might make sense. It would only be a thin wrapper. The website and the unit tests would use the same underlying import library with their own dedicated logging facilities.
Yes. I did it this way so that users who only want to use the plugin do not need WP-CLI at all times.
},
{
"step": "runPHP",
"code": "<?php require_once 'wordpress/wp-load.php'; $base = '/wordpress/wp-content/plugins/data-liberation/';\nrequire $base . 'vendor/autoload.php';\ntry {\n$arguments = [\n'--stderr',\n'--configuration', $base . 'phpunit.xml'\n];\n$res = (new PHPUnit\\TextUI\\Application())->run($arguments);\nif ( $res !== 0 ) {\ntrigger_error('PHPUnit failed', E_USER_ERROR);\n}\n} catch (Throwable $e) {\ntrigger_error('PHPUnit failed: ' . $e->getMessage(), E_USER_ERROR);\n};"
Cool idea! This will suffice for starters, but here's something if you'd like to take it to the next level. What would it take to go from this to something more like a typical CLI command, e.g. `cli --blueprint=... --mount=... run vendor/bin/phpunit --configuration phpunit.xml --stderr`?
Great idea. I didn't touch the CLI in this first phase.
Aside from my two notes, this looks good. Thank you Francesco! A useful next step would be adding a few actual assertions and running that test in the CI.
I have added an issue to track the new step for the CLI. Feel free to change it if you want.
…2058)

## Description

Adds the Data Liberation WXR importer as an option in the `importWxr` step. The new importer is turned on by including the `"importer": "data-liberation"` option:

```json
{
  "steps": [
    {
      "step": "importWxr",
      "file": {
        "resource": "url",
        "url": "https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml"
      },
      "importer": "data-liberation"
    }
  ]
}
```

When the `importer` option is missing or set to "default," nothing changes in the behavior of the step and it continues using the https://github.com/humanmade/WordPress-Importer importer.

The new importer:

* Rewrites links in the imported content
* Downloads assets through Playground's CORS proxy
* Parallelizes the downloads
* Communicates progress

This PR is a part of #1894

## Implementation details

This `importWxr` step fetches and includes the `data-liberation-core.phar` file. The phar file is built with [Box](https://box-project.github.io/box/configuration/) and contains the importer library with its dependencies, which is a subset of the Data Liberation library, a subset of the Blueprints library, and a few vendor libraries.

This, unfortunately, means that any changes in the PHP files require rebuilding the .phar file. Here's how you can do it:

```bash
nx build:phar playground-data-liberation
```

You can also build the entire Data Liberation package as a WordPress plugin complete with a wp-admin page:

```bash
nx build:plugin playground-data-liberation
```

Both commands will output the built files to `packages/playground/data-liberation/dist`

The progress updates are a first-class feature of the new importer. The updated `importer` step receives them in real-time via a `post_message_to_js()` call running after every import step. Then, it passes them on to the progress bar UI.
### Other changes

* **TLS traffic now goes through the CORS proxy.** Since the new importer uses `AsyncHTTP\Client` which deals with raw sockets, Playground's [TLS-based network bridge](#1926) runs the outbound traffic through a CORS proxy. Technically, `TCPOverFetchWebsocket` gets the `corsProxy` URL passed to the `playground.boot()` call.
* A few composer dependencies were forked, downgraded to PHP 7.2 using Rector, and bundled with this PR to keep the Data Liberation importer working.

## Remaining work

- [x] PHP 7.2 compatibility. Done by forking and Rector-downgrading dependencies that were incompatible with PHP 7.2.
- [x] Report the importer's progress on the overall Blueprint progress bar
- [x] Enqueue the data liberation plugin files for downloading at the blueprint compilation stage
- [x] Don't eagerly rewrite attachments URLs in `WP_Stream_Importer`. Exposing this information to the API consumer requires an explicit decision. Do we rewrite it? Or do we ignore it?
- [x] Fix the TLS errors at the intersection of Playground network transport and the async HTTP client library
- [x] Separate the markdown importer and its dependencies (md parser, frontmatter parser, Symfony libraries) from the core plugin
- [x] Ship the importer and its tree-shaken deps (URL parser) as a minified zip/phar

## Follow-up work

- [ ] Reconsider the `WP_Import_Session` API – do we need so many verbosely named methods? Can we achieve the same outcomes with fewer methods?
- [ ] Investigate why there's a significant delay before media downloads start on PHP 7.2 – 7.4. It's likely a PHP.wasm issue.
## Testing instructions

* Default importer – [Open this link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}) and confirm it does what the current `importWxr` step does, that is, it stays at "Importing content" for a moment, fails to fetch media files (CORS issues in network tools), but inserts posts and pages.
* Data Liberation – [Open this link](http://localhost:5400/website-server/#{%20%22plugins%22:%20[],%20%22steps%22:%20[%20{%20%22step%22:%20%22importWxr%22,%20%22importer%22:%20%22data-liberation%22,%20%22file%22:%20{%20%22resource%22:%20%22url%22,%20%22url%22:%20%22https://raw.githubusercontent.com/wpaccessibility/a11y-theme-unit-test/master/a11y-theme-unit-test-data.xml%22%20}%20}%20],%20%22preferredVersions%22:%20{%20%22php%22:%20%228.3%22,%20%22wp%22:%20%226.7%22%20},%20%22features%22:%20{%20%22networking%22:%20true%20},%20%22login%22:%20true%20}), confirm the import progress is visible and that the content and media indeed get imported:

![CleanShot 2024-12-08 at 14 54 49@2x](https://github.com/user-attachments/assets/a7da3244-a10f-43d2-8e94-43d305220a7e)

## Related issues

* #1211
* #2012
* #1477
* #1250
* #1780
Add a Data Liberation import script that lets you import a folder of WXR files into WordPress. Add the ability to run PHPUnit inside Playground.
cd packages/playground/data-liberation/bin/import
bash import-wxr.sh /a-folder/with-the/wxr-files-to-import-inside
cd packages/playground/data-liberation
nx run test:wp-phpunit
The import CLI is also registered as a WP-CLI command in the `init` action if WP-CLI is included, so it can also be run as `wp data-liberation your-wrx-file-you-want-to-import.xml`.
Motivation for the change, related issues
There's no good entry point to running that import right now; we use an ad-hoc code snippet inside the Data Liberation WordPress plugin. This new CLI command will make testing the import easy.
It should also be possible to run the PHPUnit tests in the context of WordPress.
Implementation details
This PR consists of six major parts.
The bin/import/import-wxr.sh bash script
This script accepts a folder path. You can create a folder and put all the WXR files you want to import inside it. The script starts the `cli.ts` server and mounts the specified folder at `/wordpress/wp-content/uploads/import-wxr`.
The bin/import/blueprint-import-wxr.json blueprint
The blueprint enables the Data Liberation plugin, enumerates all files with the .xml extension inside the mounted folder, and imports each of them using a new function.
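The enumeration step described above can be illustrated outside Playground with a small shell sketch. This is only an analogue: the actual blueprint does the walk in PHP inside a `runPHP` step, and the function name `list_wxr_files` is hypothetical, not part of this PR.

```shell
# Hypothetical helper (not part of the PR): print every .xml file found
# under the given folder, one path per line, in a stable sorted order.
list_wxr_files() {
  wxr_dir="$1"
  if [ ! -d "$wxr_dir" ]; then
    echo "Not a directory: $wxr_dir" >&2
    return 1
  fi
  # Only regular files whose name ends in .xml are candidates for import.
  find "$wxr_dir" -type f -name '*.xml' | LC_ALL=C sort
}
```

Calling `list_wxr_files /a-folder/with-the/wxr-files-to-import-inside` prints the files that would be handed to the import function, which can help confirm the folder contents before starting the server.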
The PHP snippet run in the `runPHP` step uses `wp_visit_file_tree`, which is provided by the plugin.
A new `data_liberation_import` function
The new import function in the plugin is simple: it runs `WP_Stream_Importer` and not much more.
The new tests/import/run.sh script
This script runs PHPUnit inside Playground. It reports an error if PHPUnit reports one.
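The error-propagation behavior described above can be sketched as a generic shell wrapper. This is a minimal illustration under assumptions, not the contents of `run.sh`: the function name `run_tests` and its messages are hypothetical, and the wrapped command is whatever test runner you pass in.

```shell
# Hypothetical wrapper (not the actual run.sh): run a test command and
# pass its failure on to the caller with an explicit message.
run_tests() {
  if "$@"; then
    echo "Tests passed"
    return 0
  else
    # In the else branch, $? still holds the exit status of the command.
    test_status=$?
    echo "Tests failed with exit status $test_status" >&2
    return "$test_status"
  fi
}
```

For example, `run_tests nx run test:wp-phpunit` would return a non-zero status to the calling shell or CI job whenever the underlying command fails.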
The new tests/import/blueprint-import.json blueprint
This blueprint runs all PHPUnit tests found in `tests` inside Playground. It returns success if every test passes and an error if one or more tests fail.
New unit test
The new `WPStreamImporterTests` class runs the first test using `WP_Stream_Importer::create_for_wxr_file`. It can only run inside WordPress, so `setUp()` checks whether it is in the right environment. Otherwise, the test is not run.
Testing Instructions (or ideally a Blueprint)
Import script
Example with one of the preexisting XML files:
Then check `http://127.0.0.1:9400/wp-admin/edit.php`. All the WXR posts should be there.
PHPUnit run inside Playground
Run the tests locally:
1188 tests should succeed.
Run PHPUnit on Playground:
All tests should succeed and output "Successfully ran target test:wp-phpunit for project playground-data-liberation".