This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

Cannot do full build on Centos7 #27

Open

alanzablocki opened this issue Jun 20, 2019 · 28 comments
Labels
area/full-deployment (Related to the full deployment type, multiple containers in viya-visuals) · status/confirmed (A problem has been successfully reproduced by others) · type/bug (An unexpected result from a feature)

Comments

@alanzablocki

alanzablocki commented Jun 20, 2019

Describe the bug
I am unable to do a full build after successfully completing a programming-only single build. I see this message:

2019/06/20 18:54:24 Serving license and entitlement on sas-container-recipes-builder:1976 (172.17.0.2)
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x889372]

goroutine 23 [running]:
main.(*SoftwareOrder).TestRegistry(0xc0002c8000, 0xc00033c180, 0xc00033c120, 0xc00033c0c0)
/sas-container-recipes/order.go:1046 +0x1e2
created by main.NewSoftwareOrder
/sas-container-recipes/order.go:308 +0x7e3
exit status 2
sas-container-recipes-builder-19.05.0-20190620135121-no-git-sha

Is this complaining about the registry? Can it be an Azure Container Registry, or must it be an actual Docker Hub registry?

Thanks

To Reproduce
Steps to reproduce the behavior:
./build.sh --type full --docker-namespace sasfullxyz --docker-registry-url sasfullxyz.azurecr.io --zip /home/admin/sas-container-recipes-master/SAS_Viya_deployment_data.zip --addons "auth-demo ide-jupyter-python3"

Expected behavior
Get into the build step

Environment (please complete the applicable information):
[admin@RM-SAS-DOCKER-01 sas-container-recipes-master]$ docker version
Client:
Version: 18.09.6
API version: 1.39
Go version: go1.10.8
Git commit: 481bc77156
Built: Sat May 4 02:34:58 2019
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 18.09.6
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 481bc77
Built: Sat May 4 02:02:43 2019
OS/Arch: linux/amd64
Experimental: false

@alanzablocki
Author

alanzablocki commented Jun 20, 2019

After trying the command three more times, I get to this point:
2019/06/20 19:07:47 Serving license and entitlement on sas-container-recipes-builder:1976 (172.17.0.2)
2019/06/20 19:07:57 Get https://sasfullxyz.azurecr.io: dial tcp: lookup sasfullzyz.azurecr.io on 192.168.6.125:53: read udp 172.15.0.2:46819->192.168.6.125:53: i/o timeout

exit status 1

But running it again stops once more at the SoftwareOrder error.

@alanzablocki
Author

alanzablocki commented Jun 20, 2019

Update No. 2

After changing the registry to Docker Hub:

./build.sh --type full --docker-namespace myprivdockerregistryname --docker-registry-url index.docker.io --zip /home/admin/sas-container-recipes-master/SAS_Viya_deployment_data.zip --addons "auth-demo ide-jupyter-python3"

I got into the build process (32 builds), and the build fails with:

2019/06/20 20:03:33 Starting 32 build processes ... (this may take several minutes)
2019/06/20 20:03:33 [TIP] System resource utilization can be seen by using the docker stats command.
2019/06/20 20:03:33 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-reportservices:19.05.0-20190620150204-no-git-sha ...
2019/06/20 20:03:33 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-cognitivecomputingservices:19.05.0-20190620150204-no-git-sha ...
2019/06/20 20:03:33 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-modelservices:19.05.0-20190620150204-no-git-sha ...
2019/06/20 20:03:33 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-operations:19.05.0-20190620150204-no-git-sha ...
panic: runtime error: index out of range

goroutine 135 [running]:
main.buildWorker(0x1, 0xc000bee000, 0xc00044a4e0, 0xc00044a540, 0xc00044a480)
/sas-container-recipes/order.go:688 +0x846
created by main.(*SoftwareOrder).Build
/sas-container-recipes/order.go:831 +0x3a1
exit status 2
sas-container-recipes-builder-19.05.0-20190620150204-no-git-sha
[admin@RM-SAS-DOCKER-01 sas-container-recipes-master]$

On second attempt:

2019/06/20 20:29:17 Starting 32 build processes ... (this may take several minutes)

2019/06/20 20:29:17 [TIP] System resource utilization can be seen by using the docker stats command.
2019/06/20 20:29:17 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-dataservices:19.05.0-20190620152746-no-git-sha ...
2019/06/20 20:29:17 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-operations:19.05.0-20190620152746-no-git-sha ...
2019/06/20 20:29:17 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-workflowmanager:19.05.0-20190620152746-no-git-sha ...
2019/06/20 20:29:17 Starting Docker build: index.docker.io/sasviyasasdemo1/sas-viya-httpproxy:19.05.0-20190620152746-no-git-sha ...
panic: runtime error: index out of range

goroutine 130 [running]:
main.buildWorker(0x3, 0xc000202060, 0xc00033c360, 0xc00033c420, 0xc00033c240)
/sas-container-recipes/order.go:688 +0x846
created by main.(*SoftwareOrder).Build
/sas-container-recipes/order.go:831 +0x3a1
exit status 2
sas-container-recipes-builder-19.05.0-20190620152746-no-git-sha

Any idea what might be causing this?

I am going to keep forcing the builds. Each attempt seems to start the failing step on a different container, so it may be that I can build 31/32 and it will fail on just one, perhaps isolating the issue.

Am I right that it is worth pushing it through?

@alanzablocki
Author

Update 3.

Another failure, this time with:

panic: runtime error: index out of range

goroutine 92 [running]:
main.buildWorker(0x3, 0xc000666000, 0xc0004223c0, 0xc000422480, 0xc000422360)
/sas-container-recipes/order.go:688 +0x846
created by main.(*SoftwareOrder).Build
/sas-container-recipes/order.go:831 +0x3a1
exit status 2

Is this where it is having issues? Line 688 says
imageSize := imageInfo[0].Size
while line 831 is a loop over order.WorkerCount.
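
A hedged aside (the image name below is just one from my earlier logs): an "index out of range" on imageInfo[0] is exactly what you would get if Docker returned an empty list when the builder inspected an image, for example because that image's build or tag step silently failed. The empty case is easy to see from the shell:

# If the image does not exist, `docker image inspect` prints no entries and
# exits non-zero; Go code that indexes the first element of the result
# without a length check panics with "index out of range".
docker image inspect --format '{{.Size}}' \
    index.docker.io/sasviyasasdemo1/sas-viya-operations:19.05.0-20190620150204-no-git-sha \
    || echo "inspect returned no entries for that image"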

@Collinux added the area/full-deployment (Related to the full deployment type, multiple containers in viya-visuals), status/confirmed (A problem has been successfully reproduced by others), and type/bug (An unexpected result from a feature) labels on Jun 21, 2019
@Collinux
Contributor

Collinux commented Jun 21, 2019

@alanzablocki I'm working on an experimental branch with instructions on how to use ACR+AKS, along with fixes to some of the issues you've encountered. Want to try it out?

https://github.com/sassoftware/sas-container-recipes/tree/19m06-dev
https://github.com/sassoftware/sas-container-recipes/blob/19m06-dev/docs/wiki/Using-Kubernetes-Providers.md

We could work through the issues from there and I'll keep updating the branch.

@alanzablocki
Author

@Collinux Awesome, thanks for letting me know. I will try it over the weekend.
Cheers

@alanzablocki
Author

Should I run docker image prune -a to clean everything out, since I will use a new build branch?

@Collinux
Contributor

There's no need to do an image prune, since former image builds do not affect future image builds.

If you don't have much disk space then it helps to do an occasional image prune. I use docker rmi $(docker images -qa) --force; docker system prune --all --force to clear out my system every few weeks.

@alanzablocki
Author

Thanks, I actually started a build a few minutes ago to run in the background.

Error #1:
2019/06/21 16:03:49 Starting 32 build processes.
System resource utilization can be seen by using the docker stats command in a separate terminal.
Verbose logging for each container is inside the builds/full//log.txt file.

NAME STATUS LAYER
2019/06/21 16:03:49 microanalyticservice:19.05.0-20190621110125-4781c11 container build Error response from daemon: invalid reference format
exit status 1
sas-container-recipes-builder-19.05.0-20190621110125-4781c11

with command

[admin@RM-SAS-DOCKER-01 sas-container-recipes]$ ./build.sh --type full --docker-namespace sasfull --docker-registry-url https://sasfull.azurecr.io --zip /home/admin/sas-container-recipes/SAS_Viya_deployment_data.zip --addons "auth-demo ide-jupyter-python3"

Note: -n|--docker-namespace|--docker-registry-namespace has been deprecated. This function is no longer performed and its usage can be ignored.

===============================
Building Docker Build Container

@alanzablocki
Author

I changed the command and ran it again:
./build.sh --type full --docker-registry-url sasfull.azurecr.io --zip /home/admin/sas-container-recipes/SAS_Viya_deployment_data.zip --addons "auth-demo ide-jupyter-python3"

It started to build and then...

2019/06/21 16:09:23 Starting 32 build processes.
System resource utilization can be seen by using the docker stats command in a separate terminal.
Verbose logging for each container is inside the builds/full//log.txt file.

NAME STATUS LAYER
microanalyticservice Building [=============================>] 10/19
sasdatasvrc Building [=============================>] 10/19
reportservices Building [===============================>] 11/20
cognitivecomputingservices Building [===============================>] 11/20
2019/06/21 16:10:56 cognitivecomputingservices:19.05.0-20190621110735-4781c11 container build error: cognitivecomputingservices: map[code:2 message:The command '/bin/sh -c ansible-playbook -vv /ansible/playbook.yml --extra-vars layer=sas-prerequisites --extra-vars PLAYBOOK_SRV=${PLAYBOOK_SRV} --extra-vars container_name=cognitivecomputingservices' returned a non-zero code: 2]

For error details, review the following log files:
builds/full-2019-06-21-16-07-40/sas-viya-cognitivecomputingservices/log.txt
builds/full-2019-06-21-16-07-40//build.log
If you cannot resolve the problem, create an issue on GitHub at the link below and attach these log files.
https://github.com/sassoftware/sas-container-recipes/issues

exit status 1

Log file attached
build.log
log.txt

@g8sman
Contributor

g8sman commented Jun 21, 2019

Maybe a network blip? In log.txt:

"Failure talking to yum: Cannot find a valid baseurl for repo: base/7/x86_64"

@alanzablocki
Author

Hmm, I will try again. Any chance that I have run out of "hits" on my SAS order? I am not using a mirror URL.

@alanzablocki
Author

I think it was moving along but then stopped again. I can also see that the size of my Azure registry increased to 0.24 GB, so things are moving:

Here is the error now:
2019/06/21 17:13:14 Starting 32 build processes.
System resource utilization can be seen by using the docker stats command in a separate terminal.
Verbose logging for each container is inside the builds/full//log.txt file.

NAME STATUS LAYER
microanalyticservice Pushing [==========================================================] 19/19 (1.96 GB)
pgpoolc Building [==================================>] 12/20
reportservices Building [===========================================>] 15/20
cognitivecomputingservices Pushing [==========================================================] 20/20 (1.78 GB)
2019/06/21 17:30:16 pgpoolc:19.05.0-20190621121145-4781c11 container build error: pgpoolc: map[code:2 message:The command '/bin/sh -c ansible-playbook -vv /ansible/playbook.yml --extra-vars layer=sas-java --extra-vars PLAYBOOK_SRV=${PLAYBOOK_SRV} --extra-vars container_name=pgpoolc' returned a non-zero code: 2]

For error details, review the following log files:
builds/full-2019-06-21-17-11-50/sas-viya-pgpoolc/log.txt
builds/full-2019-06-21-17-11-50//build.log
If you cannot resolve the problem, create an issue on GitHub at the link below and attach these log files.
https://github.com/sassoftware/sas-container-recipes/issues

exit status 1
sas-container-recipes-builder-19.05.0-20190621121145-4781c11

This time I cannot see any obvious errors in the logs.

And the logs:
log.txt
build.log

@alanzablocki
Author

alanzablocki commented Jun 21, 2019

I ran the same command again twice, and both times I got registry errors, but they are different:
a)
2019/06/21 17:42:33 Curl of Docker registry TLS URL failed.
Ensure your --docker-registry-url argument was entered correctly.
Get https://sasfullpy3r3.azurecr.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

exit status 1

b)
2019/06/21 17:43:59 Reading config from /home/sas/.docker/config.json
2019/06/21 17:43:59 Serving license and entitlement on sas-container-recipes-builder:1976 (172.17.0.2)
2019/06/21 17:44:09 Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.6.123:53: read udp 192.168.6.108:53307->192.168.6.123:53: i/o timeout

exit status 1

I ran it a third time and so far it is building, so all of the above seem to be transient issues, but they do stop a build.
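
Both failures above are DNS/connection timeouts. A quick manual probe of the registry endpoint helps here (a sketch; /v2/ is the standard Docker Registry API root, and a 401 without credentials still proves the registry is reachable; the hostname is the one from my log):

# Expect something like "HTTP/1.1 401 Unauthorized" if the registry is reachable:
curl -sI https://sasfullpy3r3.azurecr.io/v2/ | head -n 1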

@alanzablocki
Author

Looks like I keep getting the "talking to yum" issue; here is another:
2019/06/21 17:48:36 Starting 32 build processes.
System resource utilization can be seen by using the docker stats command in a separate terminal.
Verbose logging for each container is inside the builds/full//log.txt file.

NAME STATUS LAYER
computeservices Building [==================================>] 12/20
pgpoolc Building [========================================>] 14/20
datamining Building [===============================>] 11/20
coreservices Building [==================================>] 12/20
2019/06/21 17:57:05 datamining:19.05.0-20190621124702-4781c11 container build error: datamining: map[code:2 message:The command '/bin/sh -c ansible-playbook -vv /ansible/playbook.yml --extra-vars layer=sas-prerequisites --extra-vars PLAYBOOK_SRV=${PLAYBOOK_SRV} --extra-vars container_name=datamining' returned a non-zero code: 2]

For error details, review the following log files:
builds/full-2019-06-21-17-47-06/sas-viya-datamining/log.txt
builds/full-2019-06-21-17-47-06//build.log
If you cannot resolve the problem, create an issue on GitHub at the link below and attach these log files.
https://github.com/sassoftware/sas-container-recipes/issues

exit status 1

I will keep re-running after each error to try to beat this, rather than posting more error messages.

log.txt

@alanzablocki
Author

@Collinux It seems that the build stops on a number of SAS base install steps, with repo errors, "failed: [127.0.0.1]" and "no more mirrors to try" messages. I am attaching the log from a run that connected to the SAS-hosted repositories.

We even tried using our HTTP-hosted sas_repos folder as a --mirror-url to rule out connections to SAS servers, but we see the same general errors.

log.txt

@erwangranger

Hello @alanzablocki, this may or may not help, but ...

In the log.txt, you have "HTTPS Error 410 - Gone".
That is exactly the error message that you get when you have exceeded the max download count on your SAS order.

I always recommend using a local YUM mirror to avoid this.

If you are using a local YUM mirror, I'm curious as to how you serve the HTTP content.
From the info you have, I see "Starting 32 build processes", which means you are building 32 container images in parallel, so your mirror will be hit more or less simultaneously by 32 processes.

In the past, when using a quick-and-dirty mirror served by Python's SimpleHTTPServer, I have seen it be overwhelmed by those requests (SimpleHTTPServer is single-threaded).
I now tend to use more robust solutions (nginx, Python ComplexHTTPServer), or to simply reduce the number of images being built in parallel to avoid overwhelming the web server.

Finally, when using a YUM mirror, beware of hostname resolution. I've been burned before by using http://localhost:9123, as that means something else when you are inside the container. Even some DNS names may not resolve from inside the container. I tend to use the IP of the mirror (http://10.0.1.23:9123/) as this has less variability.
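
For example (a sketch; the IP, port, and repo path are placeholders), you can verify the mirror is reachable from inside a container before kicking off a build:

# repomd.xml is the repo metadata index; if this fails from a container but
# works from the host, you have a name-resolution or routing issue.
docker container run --rm centos:7 \
    curl -sSf -o /dev/null http://10.0.1.23:9123/repodata/repomd.xml \
    && echo "mirror reachable from inside a container"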

I hope some of this helps.

@alanzablocki
Author

example_mirror_error.txt
@erwangranger Thanks, good to have confirmation that we reached a limit. I tried using the HTTP-served sas_repos folder on a single auth-demo build and an auth-demo build with the Jupyter Python add-on, and those completed fine. The full build fails here; I am attaching a redacted log. Can you let me know if you see anything obvious in there?

We will try your IP suggestion. How do you "reduce" the number of images being built?

@alanzablocki
Author

P.S. @erwangranger Do you know what the maximum number of downloads is?

@erwangranger

  1. To have your max # of downloads increased, you can contact SAS Tech Support. If you give them your order number, they can crank it up. By default, I think you get 5 downloads. Because of the way these containers are built, each of the 32 images is liable to count as 1. Hence my strong recommendation to use a mirror. :-)

  2. To reduce the number of images being built in parallel, you can add the --workers parameter to build.sh (see the sketch after this list).

  3. Based on your log, I find it strange that many RPMs install fine. The one that does not is called: sas-cpp-libstdc++6-6.0.95404-20180510.1525974525.x86_64.
    The error message is:
    "http://<our_address>/repos/shipped/va/104/va-104-x64_redhat_linux_6-yum/Packages/s/sas-cpp-libstdc%2B%2B6-6.0.95404-20180510.1525974525.x86_64.rpm: [Errno 14] HTTP Error 404
    This is just a theory, but I wonder if the character + is the source of the issue here.
    Here are the steps I'd suggest:

  • make sure that you can locate this file in the folder on the file system
  • make sure that you can navigate to this file with a web browser
  • in order to test that this file is accessible from the container, run something like:
    docker container run -it centos:7 curl http://<our_address>/repos/shipped/va/104/va-104-x64_redhat_linux_6-yum/Packages/s/sas-cpp-libstdc%2B%2B6-6.0.95404-20180510.1525974525.x86_64.rpm
    
  • if that does not work, then try with the + symbols instead of %2B :
    docker container run -it centos:7 curl http://<our_address>/repos/shipped/va/104/va-104-x64_redhat_linux_6-yum/Packages/s/sas-cpp-libstdc++6-6.0.95404-20180510.1525974525.x86_64.rpm
    
  • if that produces something different, I'd start looking at which software you are using to serve the HTTP YUM mirror; it may be having trouble with those characters.
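
A possible invocation for point 2 (the registry URL, zip path, and add-ons are copied from earlier in this thread; adjust to your setup):

# Limit the build to 4 parallel image builds:
./build.sh --type full \
    --docker-registry-url sasfull.azurecr.io \
    --zip /home/admin/sas-container-recipes/SAS_Viya_deployment_data.zip \
    --addons "auth-demo ide-jupyter-python3" \
    --workers 4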

I hope this helps.

@alanzablocki
Author

alanzablocki commented Jun 24, 2019

Aha! I cannot find the file using the browser; it gives a 404. It is the + sign then, and we solved it with https://helpx.adobe.com/experience-manager/kb/CannotOpenAFileHavingSpecialCharactersInTheFilenameOnIIS.html

Now we can download the file via browser, and so we will try the build again.
We had other file issues before, but those were due to MIME types missing on IIS, for both json and other odd file types.
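
For reference, the usual IIS fix here (an assumption based on the linked article, not verified in this thread) is to allow double escaping in request filtering, e.g. from an elevated command prompt:

REM allowDoubleEscaping lets IIS serve URLs containing "+" (and "%2B"):
%windir%\system32\inetsrv\appcmd set config "Default Web Site" -section:system.webServer/security/requestFiltering -allowDoubleEscaping:true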

@alanzablocki
Author

@Collinux

We hit this yum-related error:

[container.go:431] &{failed: [127.0.0.1] (item=[u'sas-envesntl', u'sas-bootstrap-config', u'sas-runjavasvc']) => {"changed": false, "item": ["sas-envesntl", "sas-bootstrap-config", "sas-runjavasvc"], "msg":

 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

 1. Contact the upstream for the repository and get them to fix the problem.

 2. Reconfigure the baseurl/etc. for the repository, to point to a working
    upstream. This is most often useful if you are using a newer
    distribution release than is supported by the repository (and the
    packages for the previous distribution release still work).

 3. Run the command with the repository temporarily disabled
        yum --disablerepo= ...

 4. Disable the repository permanently, so yum won't use it by default. Yum
    will then just ignore the repository until you permanently enable it
    again or use --enablerepo for temporary usage:

        yum-config-manager --disable
    or
        subscription-manager repos --disable=

 5. Configure the failing repository to be skipped, if it is unavailable.
    Note that yum will try to contact the repo. when it runs most commands,
    so will have to try and fail each time (and thus. yum will be be much
    slower). If it is a very temporary problem though, this is often a nice
    compromise:

        yum-config-manager --save --setopt=.skip_if_unavailable=true

Cannot find a valid baseurl for repo: base/7/x86_64
", "rc": 1, "results": []} }

I found a few posts that talk about it, but am unsure how to proceed. Can you advise?
post 1: https://unix.stackexchange.com/questions/345124/dont-work-yum-update-yum-doesnt-have-enough-cached-data-to-continue
post 2: https://community.hortonworks.com/questions/225203/yum-doesnt-have-enough-cached-data-to-continue.html

We are now using these settings when building to avoid some of the errors we talked about above: ./build.sh --type full --docker-registry-url sasfull.azurecr.io --mirror-url 'http:///sas_repos/' --zip /home/admin/sas-container-recipes/SAS_Viya_deployment_data.zip --addons "auth-demo ide-jupyter-python3" --workers 1

The error is from builds/full-2019-06-24-22-56-37/sas-viya-dataservices/log.txt

Thanks for your help

@Collinux
Contributor

Collinux commented Jun 26, 2019

@alanzablocki you provided the argument --workers 1 so it's only allowing 1 core to build your 30+ containers. It seems like yum is unable to cache data with only 1 core available, perhaps try --workers 2 or --workers 4?

@alanzablocki
Author

@Collinux we had 4 (the default) when we started the build, since I did not use the --workers flag initially. We saw this same error with 4. I will try --workers 2. We went all the way down to one to eliminate the possibility that too many HTTP requests were being placed to our sas_repos mirror URL. I am happy with one image at a time.
Whenever the build stops with "Failure talking to yum: Cannot find a valid baseurl for repo: base/7/x86_64" or the cache error, I run

yum clean metadata
yum clean all
yum update

Bizarrely, I was able to build as many as 3-4 images after doing that, rather than one. I am not convinced that is the solution, though. On another day, the build would also keep going for a couple of images and then stop with the same errors (cache or talking to yum). Although these look like network errors, our network has been up the whole time.

I am at 27 out of 32 images, so getting closer.

@alanzablocki
Author

@Collinux Our full build finished sometime over the weekend: 113 GB in /data for the 32 Docker images (we cut it close, as /data had 119 GB of space). This takes up 50 GB in Azure Container Registry. Next I will follow your AKS instructions to deploy to Kubernetes.

@alanzablocki
Author

alanzablocki commented Jul 2, 2019

@Collinux Did something change in how we access Jupyter? It is not available on port 8888 anymore, only on 8080, which is strange because the kernels usually die on anything other than 8888, like on 80 or 8080.

@TylerGillson

TylerGillson commented Jan 10, 2020

> I changed the command and ran it again:
> ./build.sh --type full --docker-registry-url sasfull.azurecr.io --zip /home/admin/sas-container-recipes/SAS_Viya_deployment_data.zip --addons "auth-demo ide-jupyter-python3"
> It started to build and then...
> 2019/06/21 16:10:56 cognitivecomputingservices:19.05.0-20190621110735-4781c11 container build error [...]

FWIW, I encountered this exact same error for a 27-image build and ultimately solved it by passing "--workers 4".

Each time I tried the full build, one of the build's worker processes would inevitably fail - sometimes on the same SAS RPM as the previous attempt, sometimes on a different one. None of the RPM names contained "+" signs. I'm serving my YUM mirror from an nginx Docker container on a separate server, as follows:

docker container run \
    --restart=always \
    --name nginx_mirror \
    -v  /mirrorlocation/:/usr/share/nginx/html:ro \
    -p 9123:80 \
    -d \
    nginx

Conclusion
Throttling the build script's concurrency worked, but a better solution might be to increase the default keepalive_timeout in the Docker container's NGINX configuration.

@erwangranger

erwangranger commented Jan 13, 2020

Hi @TylerGillson , thanks for the useful info.

A couple of things from me:

  1. The Build a YUM Mirror instructions are part of the Internal Workshop, so the link you posted will fail for most people. Could you edit your comment, and replace the link with the instruction:

    docker container run \
        --restart=always \
        --name nginx_mirror1 \
        -v  /mirrorlocation/:/usr/share/nginx/html:ro \
        -p 9123:80 \
        -d \
        nginx
  2. Do you know for a fact that increasing the keepalive_timeout value in NGINX is the key to success here? If so, what did you change it to, for it to work for you?

@TylerGillson

> Do you know for a fact that increasing the keepalive_timeout value in NGINX is the key to success here? If so, what did you change it to, for it to work for you?

Hi @erwangranger, I tested my hypothesis today using a tuned NGINX container with twice the keepalive_timeout (130 instead of 65) and no --workers argument. build.sh ran to completion in 47m4s. I chose to double the keepalive_timeout arbitrarily, but that seems to have worked!

Here's the NGINX Dockerfile I used:

# Start from the stock nginx image and double keepalive_timeout (65 -> 130)
# so slow, long-lived yum requests from parallel builds aren't cut off:
FROM nginx
RUN sed -i /etc/nginx/nginx.conf -e 's/keepalive_timeout  65/keepalive_timeout  130/g'
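
To use it (a sketch combining this Dockerfile with the docker run command from my earlier comment; the image and container names are illustrative):

# Build the tuned image and serve the mirror from it:
docker build -t nginx_mirror_tuned .
docker container run \
    --restart=always \
    --name nginx_mirror \
    -v /mirrorlocation/:/usr/share/nginx/html:ro \
    -p 9123:80 \
    -d \
    nginx_mirror_tuned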
