Consul health check breaks webservice-deviceserver? #20

Open
mkowalczyk88 opened this issue Dec 30, 2016 · 5 comments

Comments

@mkowalczyk88

Maybe I missed something obvious in the configuration, but the Device Server doesn't work for me. I use Ubuntu 16.04 and configured everything as described in the Readme file of this project (running docker-compose). I then tried to run the tests from https://github.com/CreatorDev/creator-js-client and both failed.

Here's my investigation:
First, the assumption: Fabio only routes to services that are "passing" Consul's health check.
Accessing the webservice sometimes results in error 404 and sometimes in error 502. If the status of webservice-deviceserver in Consul is "not passing", Fabio reports "no route to host" and I get 404. When the route is found (webservice-deviceserver passes the health check), I get error 502 and Fabio reports the HTTP request error "EOF". Note that if Consul's health check fails, Fabio's route to the service is simply removed.
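The routing assumption above can be checked against Consul directly rather than inferred from Fabio's logs. Below is a minimal Python sketch, not something taken from this project. It assumes Consul's HTTP API is reachable at `http://localhost:8500` (Consul's default HTTP port) and that the service is registered under the name `webservice-deviceserver`; both are assumptions to adjust for the actual setup. It lists the instances whose checks all report "passing", which under the assumption above are the only ones Fabio would route to.

```python
import json
import urllib.request


def all_checks_passing(entry):
    """True if every check attached to this health entry is 'passing'."""
    # Note: an entry with an empty "Checks" list counts as passing here.
    return all(c.get("Status") == "passing" for c in entry.get("Checks", []))


def routable_instances(entries):
    """Return the service IDs of entries whose checks are all passing."""
    return [e["Service"]["ID"] for e in entries if all_checks_passing(e)]


def fetch_health(service, consul="http://localhost:8500"):
    """Fetch health entries for a service from Consul's health endpoint.

    The base URL and the service name are assumptions, not values
    confirmed by this thread.
    """
    url = "%s/v1/health/service/%s" % (consul, service)
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.load(resp)


# Example (needs a running Consul agent):
#   print(routable_instances(fetch_health("webservice-deviceserver")))
```

An empty result while the container is running would match the 404 case; a non-empty result together with a 502 would point at the connection to the webservice itself.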
Consul checks webservice-deviceserver every 2 seconds by issuing a GET request to "/". When sniffing with Wireshark I could see that some requests are accepted by the webservice and proper JSON with "links" is returned, but sometimes the connection itself is refused (the webservice responds with a TCP RST packet). In that case, the "EOF" (which is not really an EOF but simply a connection-refused error) shows up in the logs of Consul and Fabio.
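To reproduce this without Consul in the loop, the same kind of request can be issued in a loop and the outcomes tallied. The following is a rough Python sketch under stated assumptions: the host and port in the usage comment are hypothetical placeholders for wherever the webservice listens, and the labels are my own naming, not output of Consul or Fabio.

```python
import http.client
import socket
import time
from collections import Counter


def classify_error(exc):
    """Map a connection-level exception to a short label."""
    # RemoteDisconnected subclasses ConnectionResetError, so test it first.
    if isinstance(exc, http.client.RemoteDisconnected):
        return "closed-before-response"
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    if isinstance(exc, ConnectionResetError):
        return "reset"
    if isinstance(exc, socket.timeout):
        return "timeout"
    return "other:" + type(exc).__name__


def probe_once(host, port, path="/", timeout=2.0):
    """Issue one GET request and return 'http-<status>' or an error label."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        return "http-%d" % resp.status
    except (OSError, http.client.HTTPException) as exc:
        return classify_error(exc)
    finally:
        conn.close()


def probe_many(host, port, count=50, interval=2.0):
    """Probe repeatedly (default: every 2 seconds) and tally the outcomes."""
    tally = Counter()
    for _ in range(count):
        tally[probe_once(host, port)] += 1
        time.sleep(interval)
    return tally


# Example with a hypothetical address:
#   print(probe_many("localhost", 8080))
```

A tally that mixes `http-200` with `refused` or `reset` at a 2-second interval would reproduce the observation independently of the health checker.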
A similar problem is observed when issuing a POST request from the creator-js-client tests. If the route in Fabio is present, the connection to the webservice is closed just after it is established (a FIN packet is sent right after the SYN, SYN-ACK, ACK handshake). If the route is not present at all, I get error 404.
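The close-right-after-handshake behavior can also be checked without Wireshark. This is only a sketch, assuming a plain TCP connection to the webservice is possible from the test machine; `check_immediate_close` and its arguments are hypothetical helpers, not part of any tool mentioned here. It connects, sends nothing, and reports whether the peer closed the connection (FIN), reset it (RST), or left it open.

```python
import socket


def peer_close_kind(sock, timeout=1.0):
    """Wait up to `timeout` seconds and report what the peer did.

    Returns "fin" if the peer closed cleanly, "rst" if it reset the
    connection, "data" if it sent bytes, and "open" if nothing
    happened within the timeout.
    """
    sock.settimeout(timeout)
    try:
        data = sock.recv(1)
    except socket.timeout:
        return "open"
    except ConnectionResetError:
        return "rst"
    return "fin" if data == b"" else "data"


def check_immediate_close(host, port, timeout=1.0):
    """Connect without sending anything and classify the peer's reaction.

    Raises ConnectionRefusedError if the connection is refused outright.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return peer_close_kind(sock, timeout)


# Example with a hypothetical address:
#   print(check_immediate_close("localhost", 8080))
```

A healthy HTTP server would normally be expected to report "open" here, since it waits for a request; "fin" would correspond to the FIN seen in the capture.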
I did a little hack to verify the behavior. I modified the Registrar env variables in docker-compose.yml so that the health check no longer uses a request to "/" but a dummy script that always succeeds. When the webservice is not spammed with health check requests, the creator-js-client tests suddenly pass. However, the first run of the tests still fails (a FIN packet is sent from the webservice just after connection); all other requests are handled properly.
I have no idea how ASP.NET works internally, so please have a look and comment. My conclusion from these observations is that webservice-deviceserver sometimes refuses TCP connections, especially when it is spammed with Consul health checks.

@boyvinall
Member

Hi @mkmk88, this sounds a little strange as we (obviously) don't get this behaviour. Couple of questions:

@mkowalczyk88
Author

Hi @boyvinall. When I get 502 there is nothing special in the docker logs of webservice-deviceserver. Maybe I can enable some additional debug logs there? Regarding CPU and RAM utilization: I didn't check, but I run this on native Ubuntu on a PC with 16GB of RAM and an i7. Nothing else significant is running at the same time. Also, I'm pretty sure I did everything as described in https://github.com/CreatorDev/DeviceServer/blob/master/doc/devServerInstallation.md. Note that when I disable the health check I'm able to use Device Server, so I guess all certificates etc. are set correctly. One thing I didn't mention, and I'm not sure it's relevant: I use self-signed certificates for nginx's SSL.
Tomorrow I'll do everything again from scratch on the other machine and let you know if I still see this problem.

@mkowalczyk88
Author

Unfortunately, I observe the same behavior on the other machine (this time with Ubuntu 16.04 running as a VM). I did everything as in devServerInstallation.md; however, this time I also couldn't verify LWM2MServer.pem and LWM2MBootstrap.pem. On my native Ubuntu the "Verify the bootstrap and server certificates" step was OK. Anyway, I haven't run the LWM2M stuff yet. The webservice-deviceserver problem I'm describing here seems to be unrelated to those certificates.

@boyvinall
Member

Hi @mkmk88, I realised this morning that the 502 is probably coming from nginx. I was about to spend a little time going through the setup notes once again but I got hit with some other bits, sorry. I'll try to work through the notes in the next day or so and let you know. Otherwise, the only thing I can think of at the moment is maybe the hostname you're using to hit the API is not the same as nginx is configured for?

@mkowalczyk88
Author

The hostname is the same. Note that when I disable the health check, the whole thing works.
