Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to CLI hooks 1028 (watch scripts) #106

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Update to CLI hooks 1028 (watch scripts) #106

wants to merge 1 commit into from

Conversation

vpetrovykh
Copy link
Member

Add "watch" configuration in addition to "hooks". This is for monitoring changes to things like files or database config and triggering scripts in response.

@vpetrovykh vpetrovykh requested review from a team January 28, 2025 12:01
text/1028-cli-hooks.rst Outdated Show resolved Hide resolved
Comment on lines 157 to 160
We may want to setup a bit of a delay before triggering the scripts so as to
allow a sequence of changes to happen before the response script is triggered.
This is mostly to reduce unnecessary multiple triggers for the same watched
entity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering a bit about overlapping globs like **/*.edgeql and dbschema/*.edgeql for instance: We don't necessarily want to try to sequence the scripts in a way that they affect each other, so blind parallel execution seems reasonable. Users will need to take care that their scripts do not cause conflicts with each other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two problems:

  • IDEs or other tools modifying/moving multiple files in one after another. The RFC says (and I agree) that all the events should trigger the watch script only once. To make this happen, we'll use this: https://docs.rs/notify-debouncer-mini/latest/notify_debouncer_mini/

  • scripts modifying more files, which could recurse into the watch script again. This would be partially solved by a smart debouncer, but not entirely, since the watch script could first do some slow queries (say a second) and only then write the files. So I think it would be better to allow recursion, as you suggest @scotttrinh

entry in ``gel.toml``::

[watch.files]
"dbschema/**.schema"="gel migration try"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"dbschema/**.schema"="gel migration try"
"dbschema/**.gel"="gel migration try"

I kinda like the generalization idea but I think it's a step too far. IMO gel watch should just require either or both --schema / --files flags and do the right thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading the code, I see that, currently, gel watch is triggering crate::migrations::dev_mode::migrate, without having any state in-between invocations, apart of error reporting.

https://github.com/edgedb/edgedb-cli/blob/master/src/watch/main.rs#L154-L182

So I actually do think generalization is possible.

text/1028-cli-hooks.rst Outdated Show resolved Hide resolved
text/1028-cli-hooks.rst Outdated Show resolved Hide resolved
@aljazerzen aljazerzen changed the title Update to 1028 Update to CLI hooks 1028 (watch scripts) Jan 31, 2025

The paths should be valid in the underlying file system (therefore using ``/``
for Linux and ``\`` for Windows, etc.). The relative file paths are assumed to
start at the project root.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will watching files in non-project directories be supported?

Copy link
Contributor

@aljazerzen aljazerzen Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say no. I don't have a good reason why not, it just conflicts with my opinion of how code should be encapsulated in a repo, without any external paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one counter-argument to this is that sometimes the location of the gel.toml file will not be the root of the logical "project". Sometimes people really want to set up a monorepo with one of the sub-projects "owning" the database, so having the gel.toml in the true monorepo root feels less encapsulated. And in these cases, there might be valid use cases for reaching to a sibling package/module to trigger some script.

One possible example is having a monorepo with a Python backend (that owns the database) and having a separate single-page app and wanting to run some tests on the backend when changes to the frontend happen.


FWIW, I think in this particular scenario, you're now using the gel watch feature as a very generic file-watcher utility like watchman, so this might not be a use case we care about. We can stick to a stricter behavior now and gather feedback and use cases.

and we would need to implement all these interaction variants. Instead any
such complexity can be handled inside a singe script that the trigger
references.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a file is matched by more than 1 watch files entry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My interpretation is that all matching scripts are ran (sorted alphabetically, by the glob pattern).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the possible concern is that earlier it says:
The order in which hook *keys* appear does not impact their priority (we don't want people getting subtle bugs due to different key order).

Which means we could still be introducing subtle ordering bugs anyways.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want it:

  • being random (which would be caused by storing them in a HashMap),
  • depend on the order in the TOML table (because TOML tables are not supposed to have an order).

So I picked "an order" which might change with releases of the CLI, but will not change with each invocation of gel watch or with each file-change-event.

@vpetrovykh vpetrovykh force-pushed the watch branch 4 times, most recently from ac01775 to 59a306c Compare February 6, 2025 00:26
hooks in that relative order.
* ``gel branch wipe`` command triggers the ``branch.wipe.before``,
``schema.update.before``, ``branch.wipe.after``, and ``schema.update.after``
hooks in that relative order.
Copy link
Member

@1st1 1st1 Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edgedb branch wipe doesn't wipe the schema, only data, does it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

branch.wipe.before might abort the operation.

BTW, @aljazerzen, can we set some environment variables for the hook commands we run? Like for branch.wipe.before we can set

env GEL_HOOK_NAME=branch.wipe.before \
    GEL_HOOK_INSTANCE=1st1/production
    GEL_HOOK_BRANCH=main

and then people will be able to write scripts that protect/react to things.

We don't need to implement this now, but I'd like the RFC to have a "future extensions" section that mentions this. And we should discuss what happens if a "before" hook errors out (IMO the command needs to be aborted), and what happens if an "after" hook errors out (IMO there's not much we can do, so just face plant).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edgedb branch wipe doesn't wipe the schema, only data, does it?

It actually does reset schema to initial; which does wipe the schema itself along with the data. FWIW, it would be awesome if there was a data-only version of this command since manually "clearing" a branch is actually a bit annoying.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does wipe the schema, not just data.

* ``gel project init`` command triggers the ``project.init.before``
and ``project.init.after`` hook. If the migrations are applied at the end of
the initialization, then the ``migration.apply.before`` and
``migration.apply.before`` hooks are also triggered.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot schema.update

Copy link
Member

@1st1 1st1 Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like us to list possible use cases here.

project.init.before can check that you have the necessary tools installed in your system, like npx

project.init.after can print a welcome message with instructions and next steps.

------------

These hooks are a bit more abstract than the ones corresponding to specific
CLI commands. The ``schema.update.before`` and ``schema.update.after`` hooks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is a "meta" hook, it seems to me that it's unnecessary for it to have .before, it's basically a notification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just rename it to schema.update (or schema.updated'? schema.changed?`) and call it a day. Most likely this will be the most popular hook, so maybe let's make it high-level and shortly named.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems to me that it's unnecessary for it to have .before, it's basically a notification.

I think schema.update has just as much reason to have "before" hooks, especially if it can abort the update.

Most likely this will be the most popular hook, so maybe let's make it high-level and shortly named.

I think it would be very strange if one set of hooks was named in a totally different way than the rest to save a few keystrokes. If we want to just have a small improvement we can save a few characters by going with pre and post instead? Or do a similar imperative<>past-tense scheme for the other hooks (migration.apply and migration.applied etc)?

But my feeling is that most people will end up with:

[hooks]
"schema.update.after" = "npx generate edgeql-js"

And that doesn't seem much worse than

[hooks]
"schema.updated" = "npx generate edgeql-js"

And especially if they do the project.init.after hook, this feels really weird:

[hooks]
"schema.updated" = "npx generate edgeql-js"
"project.init.after" = "npx tsx ./initialize-db.ts"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use cases for some of the before hooks are more contrived than others, but excluding some hooks from having a before option seems to be too harsh. Checking stuff before proceeding with the command is a pretty general argument for any hook. "I want to check something before any schema change" and "I specifically want to check stuff before a migration" are both valid.

``migration.apply.before`` hooks are also triggered.
* ``gel branch switch`` command triggers ``branch.switch.before``,
``schema.update.before``, ``branch.switch.after``, and ``schema.update.after``
hooks in that relative order.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

branch.switch.before might check if .gel files have any uncommitted changes and abort the operation.
branch.switch.after might check your git branches and remind switching git branch too.

them, but we can generalize this mechanism and make it more flexible, much
like the hooks that respond to CLI commands.

The idea is to have a mechanism for specifying what changes you want to watch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The idea is to have a mechanism for specifying what changes you want to watch
The idea is to have a mechanism for specifying a file system path you want to watch for modifications

The watch configuration will be described in the ``gel.toml`` project manifest
file in the ``[watch]`` table. We will introduce a ``[watch.files]`` sub-table
for watching file system changes and triggering scripts. Potentially, we will
also later introduce a ``[watch.gel-config]`` sub-table for monitoring changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move [watch.gel-config] to "future extensions" or something. Probably should be just [watch.config]? Or [watch.server-config]?

--migrate`` will additionally monitor schema changes and perform real-time
migrations.

This is a backwards incompatible change compared to how ``gel watch`` operates
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This is a backwards incompatible change compared to how ``gel watch`` operates
This is a backwards incompatible change compared to how ``edgedb watch`` operates

future "before" hooks as well.
indicate which commands will trigger the hooks. The exception to that is the
``schema.update`` hooks that are intended to trigger in response to any kind
of schema change, regardless of which command causes it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth pointing out the two common ways that migrations get applied: via migration apply and with migration-watch-mode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do talk about the schema.update hooks in their own section, where I mention this stuff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this does not need to be mentioned here, because there are many ways in which migration might be applied:

  • gel migration apply
  • gel watch
  • gel project init
  • gel branch rebase
  • gel branch merge

Comment on lines +106 to +110
* ``gel migrateion apply`` (or ``gel migrate``) command triggers
``migration.apply.before``, ``schema.update.before``,
``migration.apply.after``, and ``schema.update.after`` hooks in that
relative order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these multi-hook cases should have the hooks wrapped rather than interleaved. In other words, for this one, change the order to:

  1. migration.apply.before
  2. schema.update.before
  3. schema.update.after
  4. migration.apply.after

This is mostly a vibes-based question, but it feels closer to how I imagine other similar things working, like resource management hooks like with, and things like middleware. 🤷

At the very least, if we keep both migration.apply and schema.update maybe we should talk about why we always chose to run one before the other (or "above", "outside") based on how we expect them to be used when both are defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a note about this into the Rejected Ideas section. The TLDR version is that "branch switch causes schema change" and "migration causes schema change" may need different things (like Postgres connection string) setup before the schema change. Hence why order is what it is.

Comment on lines 134 to 135
compatibility layer. It could also perform any project-specific custom
operation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
compatibility layer. It could also perform any project-specific custom
operation.
compatibility layer. It could also perform any project-specific custom
operation, like integration tests, type-checkers, or linters.

maybe?

Comment on lines 185 to 188
watch`` command. We may also consider setting up a logfile (because that
creates a record which survives closing of terminals or reboots) as an
additional convenience feature. This can be specified in the general
``[watch]`` section as ``logfile="<path-to-logfile>"``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should move to a "future extensions" section as well, I think.

Comment on lines 190 to 194
We may want to setup a debouncer so that we can delay before triggering the
scripts on a sequence of changes. This is mostly to reduce unnecessary
multiple triggers for the same watched entity. The debouncer delay may be
configured in the ``[watch]`` section as ``deboucer-delay=<integer>`` with the
value being the delay in milliseconds.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debouncing seems like a must-have feature, but being able to tweak the timing seems like a nice-to-have one.


The paths should be valid in the underlying file system (therefore using ``/``
for Linux and ``\`` for Windows, etc.). The relative file paths are assumed to
start at the project root.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one counter-argument to this is that sometimes the location of the gel.toml file will not be the root of the logical "project". Sometimes people really want to set up a monorepo with one of the sub-projects "owning" the database, so having the gel.toml in the true monorepo root feels less encapsulated. And in these cases, there might be valid use cases for reaching to a sibling package/module to trigger some script.

One possible example is having a monorepo with a Python backend (that owns the database) and having a separate single-page app and wanting to run some tests on the backend when changes to the frontend happen.


FWIW, I think in this particular scenario, you're now using the gel watch feature as a very generic file-watcher utility like watchman, so this might not be a use case we care about. We can stick to a stricter behavior now and gather feedback and use cases.

auto-migrating them as part of regular ``gel.toml`` watch spec. There is a bit
of special handling and safe-guards involved in that monitoring that make it a
little too special. Instead we offer ``--monitor`` flag to enable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
little too special. Instead we offer ``--monitor`` flag to enable
little too special. Instead we offer ``--migrate`` flag to enable

right?

Add "watch" configuration in addition to "hooks". This is for monitoring
changes to things like files or database config and triggering scripts
in response.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants