Serve images from ovh3 and ks1 #454
I looked at it a bit and I'm not sure what the best way to do it is. One way is to have two sites on ks1: images.openfoodfacts.org and (non DNS) images-internal.openfoodfacts.org on ks1 and ovh3, served as http (to avoid problems with certbot), both of them falling back on the off2 proxy if they don't have the image locally. But in this case the ks1 bandwidth is a bottleneck (and it might already be one). Another way I can imagine is emitting a redirect to one of the servers at random (we would have images.openfoodfacts.org redirecting to either ks1-images.openfoodfacts.org or ovh3-images.openfoodfacts.org). It might be a bit weird to do that, and it would imply a lot of redirects for the clients. |
I would do it at the application level. We create images1.off.org and images2, pointing to the two servers, and we have Product Opener shard the product image URLs so that the fraction we want goes to each server. This will work for all the images we serve on the web + all clients who request image URLs. That should be the bulk of the traffic we care about. That would also help to prioritize real user traffic from our website and app. |
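To make the application-level sharding idea concrete, here is a minimal sketch in Python (Product Opener itself is not written in Python, so this only illustrates the logic). The host names, the 50/50 split, and the helper names `image_host` and `image_url` are all assumptions made up for the example; nothing here is taken from the actual code base.

```python
import zlib

# Hypothetical hosts and traffic shares (assumptions for illustration only).
# The shares should sum to 1.0.
IMAGE_HOSTS = [
    ("images1.example.org", 0.5),
    ("images2.example.org", 0.5),
]


def image_host(product_code: str) -> str:
    """Pick a host deterministically from the product code."""
    # CRC32 is stable across processes and machines, unlike Python's
    # built-in hash(), so a given product always maps to the same host.
    bucket = (zlib.crc32(product_code.encode("utf-8")) % 1000) / 1000.0
    cumulative = 0.0
    for host, share in IMAGE_HOSTS:
        cumulative += share
        if bucket < cumulative:
            return host
    # Fallback in case the shares do not quite sum to 1.0.
    return IMAGE_HOSTS[-1][0]


def image_url(product_code: str, path: str) -> str:
    """Build an image URL on the host chosen for this product."""
    return f"https://{image_host(product_code)}/images/products/{path}"


print(image_url("3017800509105", "301/780/050/9105/front_fr.3.200.jpg"))
```

Because the choice depends only on the product code, every URL for a given product stays on the same host, which keeps client and proxy caches effective. Changing the shares in `IMAGE_HOSTS` would move part of the products to the other host.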
Wouldn't this be the ideal situation to deploy a load balancer? Set up an nginx and have the load balancer distribute them to either server. Or am I missing something? |
A load balancer means duplicating the images on both servers, which needs (too) much space. Wouldn't it be possible to split the images between odd and even numbers?
PROs:
CONs:
Here is some code that should do it:
# Map to check if the number is odd or even
# Eg. of requested URL: https://images.openfoodfacts.org/images/products/301/780/050/9105/front_fr.3.200.jpg
map $uri $is_odd {
    default 0; # Default to even
    # search for an odd number `[13579]`,
    # followed by a slash followed by any char except a slash until the end `(?=\/[^\/]*$)`
    ~[13579](?=\/[^\/]*$) 1; # Odd numbers end in 1, 3, 5, 7, 9
}
server {
    location / {
        # Conditional proxy pass based on the odd/even value
        if ($is_odd) {
            proxy_pass http://odd_server;
        }
        proxy_pass http://even_server;
    }
} |
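As a quick sanity check of the pattern used in the nginx `map` above, the same regular expression can be tried in Python, whose `re` module accepts this lookahead syntax. This is only a sketch for testing the regex; the helper name `is_odd` is made up here, and the second URI is a hypothetical variation of the example URL with an even last digit.

```python
import re

# Same pattern as in the nginx map: an odd digit immediately followed by
# the last "/" of the URI (i.e. the last digit of the final directory).
ODD_DIR = re.compile(r"[13579](?=/[^/]*$)")


def is_odd(uri: str) -> bool:
    """Return True when the last path directory ends with an odd digit."""
    return ODD_DIR.search(uri) is not None


# Example URI from the thread: the last directory is "9105".
print(is_odd("/images/products/301/780/050/9105/front_fr.3.200.jpg"))  # True
# Hypothetical variation with an even last digit ("9104").
print(is_odd("/images/products/301/780/050/9104/front_fr.3.200.jpg"))  # False
```

Note that the odd digits inside `front_fr.3.200.jpg` do not match, because the lookahead requires the digit to be directly followed by the final slash.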
ks1 saturation is at the I/O level (see https://www.computel.fr/munin/openfoodfacts/ks1.openfoodfacts/index.html). Disk I/O can grow up to 90% utilization and the ZFS cache is only hit around 60-70% of the time, while the network is using only around 40 Mbps. Almost no CPU is used. I would have a look at the types of images that are served, and why they are served. For example, low-res versions could be served directly by ks1, greatly reducing the latency on a lot of small requests, while higher resolutions are served from a reverse proxy on ovh3 or by redirecting the requests to ovh3. |
Quick check on yesterday's log:
Redirecting full.jpg to OVH3 will potentially reduce I/O by 67% on ks1 while still serving 82% of the images from ks1.
|
Yes! We can try to manually find the right balance. E.g., starting with the full images ending in 1, 3 or 5: |
@CharlesNepote to answer your comment above: rsync is not an option (way too much I/O), so yes, we need to replicate images on both servers. While it means we need a lot of disk space, thanks to ZFS replication it does not imply much I/O. Note that it can also be seen as a feature ;-) (more backups) |
ks1 is serving images, but it is a bit saturated because it gets a lot of requests.
ovh3 also has the images and could help serve them.