Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upkeep for Playground deployment #311

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

shepmaster
Copy link
Member

These changes have been running in production for a while after I hand-applied them. The Ansible-specific changes are mostly to see that the managed changes match close enough to the manual changes, but hopefully will be helpful to others.

proxied:
- domain: "{{ vars_playground_domain }}"
to: "http://localhost:{{ vars_playground_env_ui_port }}"
to: "http://127.0.0.1:{{ vars_playground_env_ui_port }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this any different, I wonder? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out the commit message

@@ -25,3 +25,8 @@
src: after-ssl-renew.sh
dest: /etc/ssl/letsencrypt/after-renew.d
mode: 0750

- name: create systemd override file
template:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ansible-lint suggests writing the FQDN name of the task: ansible.builtin.template.

It's probably worth considering enforcing ansible-lint on the CI to have a consistent style everywhere.

@@ -0,0 +1,2 @@
[Service]
LimitNOFILE={{ (worker_connections * 2) + 32 }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question: what's the equation for? This multiplication and addition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added

Copy link
Member

@jdno jdno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me from the outside. The remark about FQDN is valid, but something we should fix for all Ansible roles at once.

The playground server only listens on an IPv4 address, not
IPv6. Sending requests to `localhost` routes to both the IPv4 **and**
IPv6, which results in a percentage of requests seemingly randomly
failing.
With the WebSocket functionality, we now have a number of latent
connections hanging out.
This allows us to influence the OOM killer's behavior, hopefully
killing our processes instead of something more important to the
system.
The default driver can be inefficient [1] as it reads / parses /
formats / writes a large JSON file over and over. Since all of the
playground's communication goes over stdin / stdout, that can be a lot
of junk logged!

The `local` driver should be more efficient.

[1]: docker/for-linux#641
This is expected to happen when the service restarts while a container
is running, as we don't have a graceful cleanup in the service. It can
also happen unexpectedly during a crash.

For some reason, this also happens when the service hasn't
restarted (naturally or unexpectedly) and I haven't had a chance to
hunt that down yet.
@shepmaster
Copy link
Member Author

Rebased now that #393 is merged and I added the last commit (the container GC script)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants