ngx_http_f5_metrics_module.so module missing from nginx controller agent docker build #68

Open
vincentmli opened this issue Nov 3, 2021 · 10 comments

@vincentmli

The controller can discover the controller agent, but when I add a gateway, the controller UI reports the following error:

Error: {"key": "instance:cilium-worker:unspecified::", "type": "failed", "Message": "Failed testing config before applying: nginx: [emerg] dlopen() \"/etc/nginx/modules/ngx_http_f5_metrics_module.so\" failed (/etc/nginx/modules/ngx_http_f5_metrics_module.so: cannot open shared object file: No such file or directory) in /etc/nginx/nginx.conf:7\nnginx: configuration file /etc/nginx/nginx.conf test failed\n", "contributions": ["gateway:test_gw::test_env:"]}

agent.log is empty, which is not helpful; we would expect something to be logged there.

And indeed, ngx_http_f5_metrics_module.so is missing from /usr/lib/nginx/modules:

# kubectl exec -it nginx-agent -- /bin/sh
# cd /etc/nginx
# ls
conf.d	fastcgi_params	mime.types  modules  nginx.conf  scgi_params  uwsgi_params

# ls -l modules
lrwxrwxrwx 1 root root 22 Sep  5 23:00 modules -> /usr/lib/nginx/modules
# ls -l /usr/lib/nginx/modules
total 3384
-rw-r--r-- 1 root root 874624 Oct 20 20:30 ngx_http_js_module-debug.so
-rw-r--r-- 1 root root 870528 Oct 20 20:30 ngx_http_js_module.so
-rw-r--r-- 1 root root 856288 Oct 20 20:30 ngx_stream_js_module-debug.so
-rw-r--r-- 1 root root 852192 Oct 20 20:30 ngx_stream_js_module.so
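
The manual check above can be scripted so that every `load_module` target in the config is verified at once. This is only a minimal sketch: `check_nginx_modules` is a hypothetical helper, not part of the agent or of NGINX, and it assumes relative module paths resolve against `/etc/nginx`, as the error message above suggests.

```shell
# Hypothetical helper: report whether each load_module target exists.
# $1 = path to nginx.conf, $2 = prefix for relative module paths
# (assumed to be /etc/nginx, matching the path in the dlopen() error).
check_nginx_modules() {
    conf="${1:-/etc/nginx/nginx.conf}"
    prefix="${2:-/etc/nginx}"
    missing=0
    # Pull the path argument out of every load_module directive.
    for mod in $(sed -n 's/^[[:space:]]*load_module[[:space:]]\{1,\}"\{0,1\}\([^";[:space:]]*\)"\{0,1\}[[:space:]]*;.*/\1/p' "$conf"); do
        case "$mod" in
            /*) path="$mod" ;;            # absolute path, use as-is
            *)  path="$prefix/$mod" ;;    # relative path, resolve against prefix
        esac
        if [ -e "$path" ]; then
            echo "ok      $path"
        else
            echo "MISSING $path"
            missing=1
        fi
    done
    # Non-zero exit status when at least one module file is absent.
    return "$missing"
}
```

Run inside the agent container, `check_nginx_modules` with no arguments would inspect `/etc/nginx/nginx.conf` and flag any module file that `dlopen()` would fail to find.
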
@brianehlert
Collaborator

The module did not install.
The image was not complete.
Use the IP of the Controller machine as your target not the DNS name.

@vincentmli
Author

PE Matthew mentioned this could be a versioning mismatch between the controller and the agent, which I agree with.

@vincentmli
Author

Upgrading the controller to 3.21.0 resolved the issue.

/opt/nginx-controller/helper.sh version
Installed version: 3.21.0
[sudo] password for admin: 
Running version: 3.21.0

kubectl logs nginx-agent | head -30

starting nginx ...
waiting for nginx workers ...
2021/11/04 00:13:37 [notice] 8#8: using the "epoll" event method
2021/11/04 00:13:37 [notice] 8#8: nginx/1.21.3 (nginx-plus-r25)
2021/11/04 00:13:37 [notice] 8#8: built by gcc 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 
2021/11/04 00:13:37 [notice] 8#8: OS: Linux 5.12.0-051200-generic
2021/11/04 00:13:37 [notice] 8#8: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2021/11/04 00:13:37 [notice] 8#8: start worker processes
2021/11/04 00:13:37 [notice] 8#8: start worker process 11
2021/11/04 00:13:37 [notice] 8#8: start worker process 12
2021/11/04 00:13:37 [notice] 8#8: start worker process 13
2021/11/04 00:13:37 [notice] 8#8: start worker process 14
2021/11/04 00:13:37 [notice] 8#8: start worker process 15
2021/11/04 00:13:37 [notice] 8#8: start worker process 16
2021/11/04 00:13:37 [notice] 8#8: start worker process 17
2021/11/04 00:13:37 [notice] 8#8: start worker process 18
updating /etc/controller-agent/agent.conf ...
 ---> using api_key = fad72ba0cde99e2fa62ef69e465912be
 ---> using instance_name = cilium-worker
starting controller-agent ...
time="Nov  4 2021 00:13:39.677" level="info" msg="Starting NGINX Controller (Go) Agent. Version: 3.21.8-392329409.release-3-21..." feature="main"
time="Nov  4 2021 00:13:39.683" level="info" msg="Number of NGINX instances discovered" count="1" feature="main"
time="Nov  4 2021 00:13:41.210" level="info" msg="Running 11 agent feature(s)"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading avrdmgmt"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading eventsmgr"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading cloudcfgmgmt"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading metrics-aggregator"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading meta"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading commander"
time="Nov  4 2021 00:13:41.210" level="info" msg="Loading nginxmgmt"

@mattdesmarais

> PE Matthew mentioned this could be a versioning mismatch between the controller and the agent, which I agree with.

N+ R25 support for the metrics module is in 3.20.1 and greater.
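
The constraint above lends itself to a pre-build check. The following is a rough sketch rather than an official tool: the function names are hypothetical, the 3.20.1 minimum is taken from this comment, and the comparison relies on `sort -V` (version sort, as provided by GNU coreutils) being available on the build host.

```shell
# version_ge A B: succeed when dotted version A is >= B.
# Assumes `sort -V` is available (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum controller version with R25 metrics module support, per this thread.
R25_MIN_CONTROLLER="3.20.1"

# Hypothetical check: $1 = installed controller version.
check_controller_for_r25() {
    if version_ge "$1" "$R25_MIN_CONTROLLER"; then
        echo "controller $1: OK for NGINX Plus R25"
    else
        echo "controller $1: too old for NGINX Plus R25 (need >= $R25_MIN_CONTROLLER)"
        return 1
    fi
}
```

The installed version could be read from the output of `/opt/nginx-controller/helper.sh version` and passed in as the argument.
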

@vincentmli
Author

> > PE Matthew mentioned this could be a versioning mismatch between the controller and the agent, which I agree with.
>
> N+ R25 support for the metrics module is in 3.20.1 and greater.

@mattdesmarais @brianehlert
The problem is that the docker build always seems to pull the newest NGINX Plus version, in this case R25, but a customer could be running an older controller version (< 3.20.1), which results in a mismatch. Could the Dockerfile be modified so that the NGINX Plus version built into the image can be set to match the controller version? Alternatively, could the documentation state that the most recent controller version is required in order to use the docker build image?

@brianehlert
Collaborator

If there is a need for a documentation enhancement around versions (here, for example: https://github.com/nginxinc/docker-nginx-controller#21-building-an-nginx-controller-enabled-image-with-nginx-plus), please fork the repository and create a pull request.

I thought version pinning of NGINX Plus was an option at one time. Since you are internal, please reach out to the PM team with your request.
However, modifying your clone of the examples and setting a version on line 30 of https://github.com/nginxinc/docker-nginx-controller/blob/master/ubuntu/examples/nginx-plus/Dockerfile could satisfy your need to version-pin NGINX Plus.

@vincentmli
Author

@mattdesmarais Do we have some sort of matrix matching controller versions to NGINX Plus versions? Passing the version as a command-line variable to the docker build command should do it, as described in https://vsupalov.com/docker-build-pass-environment-variables/
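
For illustration, the build-time variable idea could look roughly like the sketch below. `--build-arg` is a standard `docker build` flag, but `NGINX_PLUS_VERSION` is an assumed ARG name and `nginx-agent` an assumed image tag; the Dockerfile in the repository may name things differently, and any other build arguments it requires are left out here. The helper only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print a docker build command that pins NGINX Plus.
# $1 = NGINX Plus release number (for example 25), $2 = image name (assumed default).
# NGINX_PLUS_VERSION is an assumed ARG name, not confirmed by the repository.
nginx_plus_build_cmd() {
    version="$1"
    image="${2:-nginx-agent}"
    printf 'docker build --build-arg NGINX_PLUS_VERSION=%s -t %s:r%s .\n' \
        "$version" "$image" "$version"
}

# Example: show the command for an R25 image.
nginx_plus_build_cmd 25
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line can be pasted into a shell once the ARG name has been checked against the Dockerfile.
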

@janibashamd

janibashamd commented Feb 21, 2022

I am facing this issue with the latest N+ R26 instance combined with controller 3.22.1.

It would be great if this issue could be prioritized.

@sorinboia

I am hitting the same thing when building with NAP. Even though the Dockerfile ARG says that version 25 should be installed, the part that installs NAP specifies no version, so by default it installs the latest and overrides anything previously installed for the agent and the NGINX Plus version.

@janibashamd
Copy link

janibashamd commented Mar 3, 2022

As far as I know, the latest controller release, 3.22.1, supports only up to R25. The dev team has advised that a new controller release, 3.22.2, is planned, which will support R26.


5 participants