Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: add run_server to run eio servers #408

Merged
merged 2 commits into from
Jan 26, 2023

Conversation

bikallem
Copy link
Contributor

This PR adds run_server - to be used to spawn network socket server. The current version supports spawning both single domain and multi domain servers. The default is a single domain concurrent server.

Based on discussion in mirage/ocaml-cohttp#961 (comment).

/cc @talex5 @patricoferris

@bikallem
Copy link
Contributor Author

The ~additional_domains bit doesn't work with luv backend. Should this be reflected in the API doc?

lib_eio/net.ml Outdated Show resolved Hide resolved
lib_eio/net.ml Outdated Show resolved Hide resolved
@polytypic polytypic changed the title net: add run_sever to run eio servers net: add run_server to run eio servers Jan 23, 2023
lib_eio/net.ml Outdated Show resolved Hide resolved
@bikallem bikallem force-pushed the run_server_multicore branch from 959608b to 2aac3a1 Compare January 23, 2023 10:00
@bikallem
Copy link
Contributor Author

@polytypic Thank you for the review. I have pushed some changes that should address the raised issues.

@bikallem bikallem force-pushed the run_server_multicore branch 2 times, most recently from c0573c7 to 9423e7c Compare January 23, 2023 10:10
lib_eio/net.mli Outdated Show resolved Hide resolved
lib_eio/net.mli Outdated Show resolved Hide resolved
tests/network.md Outdated Show resolved Hide resolved
tests/network.md Outdated Show resolved Hide resolved
lib_eio/net.ml Outdated Show resolved Hide resolved
lib_eio/net.ml Outdated Show resolved Hide resolved
lib_eio/net.ml Outdated Show resolved Hide resolved
lib_eio/net.ml Outdated Show resolved Hide resolved
@bikallem bikallem force-pushed the run_server_multicore branch 6 times, most recently from c0c58b0 to cf3f440 Compare January 24, 2023 12:21
@bikallem
Copy link
Contributor Author

@talex5 I believe I have addressed all of your points. Could we include this PR in the next release please.

Copy link
Contributor

@polytypic polytypic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bikallem bikallem force-pushed the run_server_multicore branch from cf3f440 to ef0ad2d Compare January 25, 2023 11:20
Copy link
Collaborator

@talex5 talex5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you scaled back the tests since last time. I'm putting them back (using mocks) and fixing up some bugs that uncovered...

lib_eio/net.ml Outdated Show resolved Hide resolved
tests/network.md Outdated Show resolved Hide resolved
lib_eio/net.ml Outdated Show resolved Hide resolved
@bikallem bikallem force-pushed the run_server_multicore branch from ef0ad2d to c3c41f0 Compare January 25, 2023 12:18
@bikallem
Copy link
Contributor Author

I see you scaled back the tests since last time.

Yes, I was getting some error while trying to use the mocks. So I scaled it to testing just one client connection for now.

I'm putting them back (using mocks) and fixing up some bugs that uncovered..

That would be nice indeed.

@talex5
Copy link
Collaborator

talex5 commented Jan 25, 2023

I pushed my suggested changes here: https://github.com/talex5/eio/commits/run_server_multicore

  • Added mock tests for single and multi-domain cases, maximum connections, cancellation and graceful shutdown.
  • I brought back your stop parameter - I agree that it's a nicer way to do it.
  • The documentation said additional_domains gives the number of additional domains, but the code used it as the total. I changed the code to match.
  • I moved Semaphore.acquire outside of the protect. Otherwise failing to acquire it will do a release.
  • I got it to release the semaphore when the connection handler finishes, not when it starts.
  • Graceful shutdown previously only stopped the main domain. Now it stops all of them.
  • I left the test for max_connections allowing 0. It's not exactly wrong, so I guess we should allow it.

@bikallem
Copy link
Contributor Author

I pushed my suggested changes here: https://github.com/talex5/eio/commits/run_server_multicore

The changes look good to me, especially the tests are much more illustrative of the functionality of run_server.

One nit, I see that the documentation for @raise has been removed but we still raise Invalid_argument. Is this (the documentation) not required?

@bikallem
Copy link
Contributor Author

bikallem commented Jan 25, 2023

@talex5 I have cherry-picked your commit to this PR and re-added the @raise documentation.

I left the test for max_connections allowing 0. It's not exactly wrong, so I guess we should allow it.

max_connections value at <=0 raises Invalid_argument as you suggested here (#408 (comment)).

@bikallem
Copy link
Contributor Author

bikallem commented Jan 25, 2023

As an aside, Interestingly, Eio.Semaphore.make allows n=0. However, in such cases, how does Eio.Semaphore.acquire s work? It should never return right?

I just tried below in utop:

─( 13:55:02 )─< command 0 >─────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # let m = Eio.Semaphore.make 0;;
val m : Eio.Semaphore.t = <abstr>
─( 13:55:02 )─< command 1 >─────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # Eio.Semaphore.acquire;;
- : Eio.Semaphore.t -> unit = <fun>
─( 13:55:14 )─< command 2 >─────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # Eio.Semaphore.acquire m;;
Exception: Stdlib.Effect.Unhandled(Eio__core__Suspend.Suspend(_))

EDIT: See below comment, I seem to have missed the Eio_main.run bit.

@bikallem
Copy link
Contributor Author

Yes, it does hang. See below the correct utop session below:

utop # Eio_main.run @@ fun env ->
let m = Eio.Semaphore.make 0 in
Eio.Semaphore.acquire m;;
....

The .... bit denotes the Eio.Semaphore.acquire m never returns. Should Eio.Semaphore.make n require n > 0?

@bikallem
Copy link
Contributor Author

In order to not derail this PR and track the issue separately, I have opened an issue for Semaphore #415.

lib_eio/net.ml Outdated
Domain_manager.run domain_mgr (fun () -> Switch.run @@ fun sw -> accept sw);
done;
| None -> ());
Switch.run @@ fun sw -> accept sw
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I missed this during my review.

So, IIUC, this now creates a new switch that is different from the switch passed in.

I wonder whether there might be ways to make these kinds of programming errors with switches more difficult to make?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to somehow explicitly pass some (parent) context for a switch? @talex5

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was due to my (bad) suggestion in #408 (comment). In my branch I instead replaced this with a single switch and a promise (as @bikallem had it originally), which avoids the confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I was basically suggesting the same (use cancellation) in one of my comments.

@talex5
Copy link
Collaborator

talex5 commented Jan 25, 2023

One nit, I see that the documentation for @raise has been removed but we still raise Invalid_argument. Is this (the documentation) not required?

I often don't document Invalid_argument exceptions, as we don't (necessarily) want to promise to raise them, and we don't expect people to catch them. Usually I prefer to say that you have to ensure domains >= 0 when calling it, for example (but it seemed obvious in this case). But there's also no harm in documenting this here I think.

I left the test for max_connections allowing 0. It's not exactly wrong, so I guess we should allow it.

max_connections value at <=0 raises Invalid_argument as you suggested here (#408 (comment)).

OK; I don't have very strong opinions about this.

@polytypic
Copy link
Contributor

  • I brought back your stop parameter - I agree that it's a nicer way to do it.

@talex5 Hmm... I kind of find this unfortunate as being able to generally rely on cancellation sounds nice in principle. At any rate, it might be nice to have some "standard" way for servers to allow them to be stopped. One might have applications running multiple servers for different purposes/types of connections. So, some sort of standard shutdown mechanism could be nice.

@bikallem
Copy link
Contributor Author

I often don't document Invalid_argument exceptions, as we don't (necessarily) want to promise to raise them, and we don't expect people to catch them. Usually I prefer to say that you have to ensure domains >= 0 when calling it, for example (but it seemed obvious in this case). But there's also no harm in documenting this here I think.

@talex5 Yes, same here. IIUC Invalid_argument usually (mostly?) means programmer error rather than a runtime error that user should handle. However, stdlib seems to document it quite a bit (https://github.com/ocaml/ocaml/blob/trunk/stdlib/string.mli#L88) so I thought it was a good practice to follow. For future reference, do you prefer this not being documented? I don't mind either but want to be consistent.

@talex5
Copy link
Collaborator

talex5 commented Jan 25, 2023

I brought back your stop parameter - I agree that it's a nicer way to do it.

@talex5 Hmm... I kind of find this unfortunate as being able to generally rely on cancellation sounds nice in principle.

You can rely on cancellation to stop the server and all connections. That is useful if something failed and you want to stop everything and report the error. But graceful shutdown seems a bit different, as it's not an error. So you have to take the shutdown request, create an exception for it, use that to cancel, then catch the exception again and continue.

Everything needs to support cancellation by default because a crash might happen anywhere in a program and you don't want to be stuck waiting for an uncancellable operation before you can report the error. But graceful shutdown is something you need to support explicitly anyway.

At any rate, it might be nice to have some "standard" way for servers to allow them to be stopped.

I expect this ?stop argument will be the standard way. It's already used in some other projects, e.g. Dream.run has ?stop:unit promise. I used 'a Promise.t though to make it clear that the server only finishes by stopping, and if you don't pass ?stop then it never returns.

@talex5 talex5 force-pushed the run_server_multicore branch 2 times, most recently from 823eda1 to f15c2f8 Compare January 26, 2023 11:42
Runs an accept loop in one or more domains, with cancellation and
graceful shutdown, and an optional maximum number of concurrent
connections.

Co-authored-by: Thomas Leonard <[email protected]>
@talex5 talex5 force-pushed the run_server_multicore branch from f15c2f8 to 1b6bab5 Compare January 26, 2023 11:48
@talex5 talex5 merged commit 80f5352 into ocaml-multicore:main Jan 26, 2023
@bikallem
Copy link
Contributor Author

Thanks.

@bikallem bikallem deleted the run_server_multicore branch January 26, 2023 12:14
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
talex5 added a commit to talex5/opam-repository that referenced this pull request Feb 1, 2023
CHANGES:

New features:

- Add `Eio.Net.run_server` (@bikallem @talex5 ocaml-multicore/eio#408).
  Runs an accept loop in one or more domains, with cancellation and graceful shutdown,
  and an optional maximum number of concurrent connections.

- Add `Buf_read.BE` and `LE` parsers (@Cjen1 ocaml-multicore/eio#399).
  Parse numbers in various binary formats.

- Add `Eio.Buf_read.uint8` (@talex5 ocaml-multicore/eio#418).

Performance:

- Make `Eio.Condition` lock-free (@talex5 ocaml-multicore/eio#397 ocaml-multicore/eio#381).
  In addition to being faster, this allows using conditions in signal handlers.

- Make `Eio.Semaphore` lock-free (@talex5 @polytypic ocaml-multicore/eio#398).

- Make `Eio.Stream` lock-free when the capacity is zero (@talex5 ocaml-multicore/eio#413 ocaml-multicore/eio#411).

- Make `Eio.Promise` lock-free (@talex5 ocaml-multicore/eio#401).

Bug fixes:

- eio_linux: call `Uring.submit` as needed (@talex5 @bikallem ocaml-multicore/eio#428).
  Previously, we could fail to submit a job promptly because the SQE queue was full.

- Fix luv signals (@haesbaert ocaml-multicore/eio#412).
  `libuv` automatically retries polling if it gets `EINTR`, without giving OCaml signal handlers a chance to run.

- eio_luv: fix some resource leaks (@talex5 @patricoferris ocaml-multicore/eio#421).

- eio_luv: fix "unavailable signal" error on Windows (@talex5 ocaml-multicore/eio#420, reported by @nojb).

- Fix `Buf_write.BE.uint48` and `LE.uint48` (@adatario ocaml-multicore/eio#418).

Documentation:

- Add example programs (@talex5 ocaml-multicore/eio#389).

- Update network examples to use `run_server` (@talex5 ocaml-multicore/eio#417).

- Add a warning to the tutorial about `Fiber.first` (@talex5 ocaml-multicore/eio#394).

- Clarify the epoch used for `Eio.Time.now` (@bikallem ocaml-multicore/eio#395).

- Describe `secure_random` as an infinite source (@patricoferris ocaml-multicore/eio#426).

- Update README for OCaml 5 release (@talex5 ocaml-multicore/eio#384 ocaml-multicore/eio#391 ocaml-multicore/eio#393).

Other changes:

- Delay setting `SIGPIPE` handler until the `run` function is called (@talex5 ocaml-multicore/eio#420).

- Remove debug-level logging (@talex5 ocaml-multicore/eio#403).

- eio-luv: improve `process.md` test (@smondet ocaml-multicore/eio#414).

- Update to Dune 3 (@talex5 ocaml-multicore/eio#410).

- Remove test dependency on Astring (@talex5 ocaml-multicore/eio#402 ocaml-multicore/eio#404).

- Simplify cancellation logic (@talex5 ocaml-multicore/eio#396).

- time: `Mtime.Spand.to_s` has been deprecated in mtime 2.0.0 (@bikallem ocaml-multicore/eio#385).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants