Hello, I want to try out Yatai with BentoML.
I don't know whether this is a BentoML issue or a Yatai issue.
I `containerize` my bentofile and want to test the API.
After launching the container, everything at `localhost:3000` works fine.
But after calling `/readyz`, Docker Desktop crashed.
I also have the same problem on my Yatai instance after deploying a service.
Yatai Log
`[2023-06-14 13:40:01] [Pod] [governance-b5dcf944c-8rqdz] [Created] Created container main
[2023-06-14 13:40:01] [Pod] [governance-b5dcf944c-8rqdz] [Started] Started container main
[2023-06-14 13:40:06] [Pod] [governance-b5dcf944c-8rqdz] [Unhealthy] Liveness probe errored: rpc error: code = Unknown desc = container not running (b50e5f47871d15a73d1a10f593ffa07c42336a90aaf6406221c069d06a323250)
[2023-06-14 13:40:06] [Pod] [governance-b5dcf944c-8rqdz] [Unhealthy] Readiness probe errored: rpc error: code = Unknown desc = container not running (b50e5f47871d15a73d1a10f593ffa07c42336a90aaf6406221c069d06a323250)
[2023-06-14 13:40:17] [HorizontalPodAutoscaler] [governance] [FailedGetResourceMetric] failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
[2023-06-14 13:40:17] [HorizontalPodAutoscaler] [governance] [FailedComputeMetricsReplicas] invalid metrics (1 invalid out of 1), first error is: failed to get cpu resource metric value: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
[2023-06-14 13:40:21] [Pod] [governance-runner-0-85465c6b86-h2r8t] [Pulled] Container image "127.0.0.1:5000/yatai-bentos:yatai.governance_classifier.hcgdqdakukp2yaav" already present on machine
[2023-06-14 13:40:21] [Pod] [governance-runner-0-85465c6b86-h2r8t] [Created] Created container main
[2023-06-14 13:40:21] [Pod] [governance-runner-0-85465c6b86-h2r8t] [Started] Started container main
[2023-06-14 13:40:26] [Pod] [governance-runner-0-85465c6b86-h2r8t] [Unhealthy] Readiness probe failed: Get "http://10.244.0.37:3000/readyz": dial tcp 10.244.0.37:3000: connect: connection refused
[2023-06-14 13:40:27] [Pod] [governance-runner-0-85465c6b86-bnddg] [Pulled] Container image "127.0.0.1:5000/yatai-bentos:yatai.governance_classifier.hcgdqdakukp2yaav" already present on machine
[2023-06-14 13:40:27] [Pod] [governance-runner-0-85465c6b86-bnddg] [Created] Created container main
[2023-06-14 13:40:27] [Pod] [governance-runner-0-85465c6b86-bnddg] [Started] Started container main
[2023-06-14 13:42:21] [Pod] [governance-runner-0-85465c6b86-h2r8t] [BackOff] Back-off restarting failed container main in pod governance-runner-0-85465c6b86-h2r8t_yatai(9f6cb8f8-e592-4fbb-aea3-89e969bdfc72)
[2023-06-14 13:42:31] [Pod] [governance-b5dcf944c-8rqdz] [BackOff] Back-off restarting failed container main in pod governance-b5dcf944c-8rqdz_yatai(89e948b9-effb-4f82-82e4-4bcd426a6b88)
[2023-06-14 13:42:32] [HorizontalPodAutoscaler] [governance] [FailedGetResourceMetric] failed to get cpu utilization: did not receive metrics for any ready pods
[2023-06-14 13:42:41] [Pod] [governance-runner-0-85465c6b86-bnddg] [BackOff] Back-off restarting failed container main in pod governance-runner-0-85465c6b86-bnddg_yatai(196d4f8f-88a4-4241-a19e-c6836289e3de)
[2023-06-14 13:45:55] [BentoDeployment] [governance] [GetDeployment] Getting Deployment yatai/governance-runner-0`
Docker Log
```2023-06-14T11:53:19+0000 [ERROR] [api_server:10] Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/message_logger.py", line 86, in __call__
raise exc from None
File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/message_logger.py", line 82, in __call__
await self.app(scope, inner_receive, inner_send)
File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/server/http/instruments.py", line 176, in __call__
await self.app(scope, receive, wrapped_send)
File "/usr/local/lib/python3.9/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 579, in __call__
await self.app(scope, otel_receive, otel_send)
File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/server/http/access.py", line 126, in __call__
await self.app(scope, receive, wrapped_send)
File "/usr/local/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 57, in wrapped_app
raise exc
File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 46, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 727, in __call__
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 285, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 74, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 57, in wrapped_app
raise exc
File "/usr/local/lib/python3.9/site-packages/starlette/_exception_handler.py", line 46, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 69, in app
response = await func(request)
File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/server/http_app.py", line 286, in readyz
runners_ready = all(await asyncio.gather(*runner_statuses))
File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/runner/runner.py", line 156, in runner_handle_is_ready
return await self._runner_handle.is_ready(timeout)
File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 304, in is_ready
async with self._client.get(
File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 1141, in __aenter__
self._resp = await self._coro
File "/usr/local/lib/python3.9/site-packages/aiohttp/client.py", line 560, in _request
await resp.start(conn)
File "/usr/local/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 914, in start
self._continue = None
File "/usr/local/lib/python3.9/site-packages/aiohttp/helpers.py", line 721, in __exit__
raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError
2023-06-14T11:53:20+0000 [WARNING] [runner:governance:1] No training configuration found in save file, so the model was *not* compiled. Compile it manually.```
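In the traceback above, the `readyz` handler waits on each runner's `is_ready` check and the request to the runner ends in `asyncio.exceptions.TimeoutError`. To check the endpoints from outside the container with a hard client-side timeout, something like the following could be used. This is only a minimal sketch using the Python standard library: the `probe` helper is a hypothetical name, the 5-second timeout is arbitrary, and `/livez` is assumed to be exposed next to `/readyz` on the same port.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return (status_code, error_text).

    status_code is None when no HTTP response arrived at all
    (connection refused, timeout, DNS failure, ...).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, None
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return exc.code, str(exc)
    except (urllib.error.URLError, OSError) as exc:
        # No usable HTTP response: refused, timed out, unreachable.
        return None, str(exc)


if __name__ == "__main__":
    # Assumption: the containerized service listens on localhost:3000.
    for path in ("/livez", "/readyz"):
        print(path, probe("http://localhost:3000" + path))
```

If `/livez` answers quickly while `/readyz` returns an error status or no response, that would point at the runner readiness check rather than the API server itself.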