Error in model deployment #2

Open
Luismbpr opened this issue Mar 12, 2024 · 3 comments

Comments

@Luismbpr

Luismbpr commented Mar 12, 2024

—————
Python == 3.11.8

mlflow == 2.10.2
mlserver == 1.5.0
mlserver-mlflow == 1.5.0
MarkupSafe == 2.1.5
numpy == 1.26.4
pandas == 2.2.1
scikit-learn == 1.4.1.post1
tqdm == 4.66.2
zenml == 0.55.5
—————
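A version list like the one above can be collected from the active environment rather than typed by hand. The following is a minimal sketch using only the standard library; the package names are copied from the list above, and the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names copied from the list in this comment.
PACKAGES = [
    "mlflow", "mlserver", "mlserver-mlflow", "MarkupSafe",
    "numpy", "pandas", "scikit-learn", "tqdm", "zenml",
]


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name} == {ver if ver is not None else 'not installed'}")
```

Running it inside each virtual environment makes it easier to compare the Python 3.11 and Python 3.9 setups side by side.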
I have been following the code from the video lecture. Earlier versions of the pipeline ran well, until I tried to deploy the model.
I have created several virtual environments and tried different stacks: I deleted one stack, created another, and set it as active. The latest stack is mlflow_stack_customer_02, with model deployer mlflow_customer_02.
I still cannot get the deployment to work.

This is the main error:

ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8237): Max retries exceeded with 
url: /api/v1/runs/d1673d8a-89aa-42c0-a805-53d3fa8f99ac?hydrate=True (Caused by 
NewConnectionError('<urllib3.connection.HTTPConnection object at 0x28f196350>: Failed to 
establish a new connection: [Errno 61] Connection refused'))
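"Connection refused" on 127.0.0.1:8237 usually means nothing is listening on that port at the moment of the request. A minimal sketch for checking this independently of ZenML, assuming the host and port from the error above (the helper name `port_is_open` is hypothetical):

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    host, port = "127.0.0.1", 8237  # values taken from the ConnectionError
    state = "accepting connections" if port_is_open(host, port) else "not accepting connections"
    print(f"{host}:{port} is {state}")
```

Running this before and after the failing step can show whether the server process went away during the run, which would be consistent with the crash shown in the server logs further down.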

I also tried the following, which did not work:

% zenml down
% zenml disconnect
% zenml up
% python run_deployment.py --config deploy

—————
A summary of the run output, showing that the pipeline works until the deployment step:

% python run_deployment.py --config deploy
Initiating a new run for the pipeline: continuous_deployment_pipeline.
Reusing registered pipeline version: (version: 13).
Executing a new run.
Caching is disabled by default for continuous_deployment_pipeline.
Using user: default
Using stack: mlflow_stack_customer_02
  model_deployer: mlflow_customer_02
  experiment_tracker: mlflow_tracker_customer_02
  orchestrator: default
  artifact_store: default
Step ingest_df has started.
Ingesting data from /Users/luis/Documents/.../venv_0754_FCC_MLOPS_MLProd_Projects_311_01/data/olist_customers_dataset_copy01.csv
Step ingest_df has finished in 2.512s.
Step clean_df has started.
Data cleaning completed
Step clean_df has finished in 1.542s.
Step train_model has started.
Model training completed
Model Trained Successfully
Step train_model has finished in 3.099s.
Step evaluate_model has started.
Calculating MSE
MSE: 1.864077053397548
Calculating R2 Score
R2 Score: 0.017729030402295565
Calculating RMSE
RMSE: 1.3653120717980736
Step evaluate_model has finished in 0.683s.
Step deployment_trigger has started.
Step deployment_trigger has finished in 0.095s.
Caching disabled explicitly for mlflow_model_deployer_step.
Step mlflow_model_deployer_step has started.
Calling stop method...
stop method executed successfully.
Updating an existing MLflow deployment service: MLFlowDeploymentService[577b7471-9979-487c-94fb-cc6ede12b61d] (type: model-serving, flavor: mlflow)
Calling stop method...
stop method executed successfully.
Calling start method...
⠏ Starting service 'MLFlowDeploymentService[577b7471-9979-487c-94fb-cc6ede12b61d] (type: model-serving, flavor: mlflow)'.

File "/Users/luis/miniforge3/envs/venv_0754_FCC_MLOPS_MLProd_Projects_311_02/lib/python3.11/site-packages/zenml/services/service.py", line 461, in start
    raise RuntimeError(
RuntimeError: Failed to start service MLFlowDeploymentService[577b7471-9979-487c-94fb-cc6ede12b61d] (type: model-serving, flavor: mlflow)
  Administrative state: active
  Operational state: inactive
  Last status message: 'service daemon is not running'
For more information on the service status, please see the following log file: /Users/luis/Library/Application Support/zenml/local_stores/19914fc0-6d0d-41d4-bca6-4924211935c1/577b7471-9979-487c-94fb-cc6ede12b61d/service.log

Retrying (Retry(total=9, connect=5, read=None, redirect=None, status=None)) after connection broken by 'RemoteDisconnected('Remote end closed connection without response')': /api/v1/steps/feeec1ee-8f5e-41ae-87f2-d803fd045f31

(…)

Retrying (Retry(total=9, connect=5, read=None, redirect=None, status=None)) after connection broken by 'RemoteDisconnected('Remote end closed connection without response')': /api/v1/steps/feeec1ee-8f5e-41ae-87f2-d803fd045f31

ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8237): Max retries exceeded with 
url: /api/v1/runs/d1673d8a-89aa-42c0-a805-53d3fa8f99ac?hydrate=True (Caused by 
NewConnectionError('<urllib3.connection.HTTPConnection object at 0x28f196350>: Failed to 
establish a new connection: [Errno 61] Connection refused'))
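The RuntimeError above points to a service.log file for the failed deployment service. A minimal sketch for printing the last lines of that file, assuming the path from the error message (it will differ on other machines); the helper name `tail` is hypothetical:

```python
from collections import deque
from pathlib import Path


def tail(path, n=40):
    """Return the last n lines of a text file (fewer if the file is shorter)."""
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    # Path copied from the RuntimeError message above.
    log_path = Path(
        "/Users/luis/Library/Application Support/zenml/local_stores/"
        "19914fc0-6d0d-41d4-bca6-4924211935c1/"
        "577b7471-9979-487c-94fb-cc6ede12b61d/service.log"
    )
    if log_path.exists():
        print("\n".join(tail(log_path)))
    else:
        print(f"No log file at {log_path}")
```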

—————
Below is more stack information
—————

% zenml stack describe
COMPONENT_TYPE COMPONENT_NAME
MODEL_DEPLOYER mlflow_customer_02
EXPERIMENT_TRACKER mlflow_tracker_customer_02
ORCHESTRATOR default
ARTIFACT_STORE default

'mlflow_stack_customer_02' stack (ACTIVE)
Stack 'mlflow_stack_customer_02' with id 'c314644e-6abc-45a8-b8fa-271fff858b6c' is owned by user default.
Dashboard URL: http://127.0.0.1:8237/workspaces/default/stacks/c314644e-6abc-45a8-b8fa-271fff858b6c/configuration

—————

% zenml status

-----ZenML Server Status-----
Connected to a ZenML server: 'http://127.0.0.1:8237'
The active user is: 'default'
The active workspace is: 'default' (repository)
The active stack is: 'mlflow_stack_customer_02' (repository)
Active repository root: /Users/luis/Documents/.../venv_0754_FCC_MLOPS_MLProd_Projects_311_02
Using configuration from: '/Users/luis/Library/Application Support/zenml'
Local store files are located at: '/Users/luis/Library/Application Support/zenml/local_stores'
The status of the local dashboard:

| ZenML server 'local' | |
| URL | http://127.0.0.1:8237 |
| STATUS | ✅ |
| STATUS_MESSAGE | |
| CONNECTED | ✅ |

—————

% zenml stack list
| ACTIVE | STACK NAME | STACK ID | OWNER | MODEL_DEPLOYER | EXPERIMENT_TRACKER | ORCHESTRATOR | ARTIFACT_STORE |
| 👉 | mlflow_stack_customer_02 | c314644e-6abc-45a8-b8fa-271fff858b6c | default | mlflow_customer_02 | mlflow_tracker_customer_02 | default | default |
| | default | aeff7473-997f-47a9-87fd-9d771f7543b6 | - | | | default | default |
| | mlflow_stack_customer | 6a772157-30ec-463b-999f-10299ce3ec95 | default | mlflow_customer | mlflow_tracker_customer | default | default |

—————

% zenml logs
INFO:     127.0.0.1:50527 - "GET /api/v1/steps?hydrate=False&sort_by=created&logical_operator=and&page=1&size=20&scope_workspace=fd2a5d49-22cc-4dc8-a986-fa27bc93b88d&pipeline_run_id=d1673d8a-89aa-42c0-a805-53d3fa8f99ac HTTP/1.1" 200 OK

INFO:     127.0.0.1:50527 - "POST /api/v1/steps HTTP/1.1" 200 OK
objc[5368]: +[__NSCFConstantString initialize] may have been in progress in another thread 
when fork() was called.

objc[5368]: +[__NSCFConstantString initialize] may have been in progress in another thread 
when fork() was called. We cannot safely call it or ignore it in the fork() child process. 
Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

—————

@Luismbpr
Author

Luismbpr commented Mar 15, 2024

—————
Python == 3.9.18 -> Seems to be working

mlflow == 2.10.2
mlserver == 1.5.0
mlserver-mlflow == 1.5.0
MarkupSafe == 2.1.5
numpy == 1.26.4
pandas == 2.2.1
scikit-learn == 1.4.1.post1
tqdm == 4.66.2
zenml == 0.55.5

—————

1.1) I first tried to install those versions with pip install -r requirements.txt, which did not work.
1.2) I then tried installing them one by one, which also failed: pip would not let me install those versions.

  1. I ran zenml disconnect, zenml down, and zenml up many times and never got it to work.

  2. I created different stacks, experiment trackers, and model deployers and set them as the active ones. I tried this many times.

  3. Something that seemed to work, although I am not entirely sure, was using the two lines from this Stack Overflow post:
    https://stackoverflow.com/questions/52671926/rails-may-have-been-in-progress-in-another-thread-when-fork-was-called

I appended these two lines to my .zshrc file:

% vim ~/.zshrc 
appending those two lines of code:
 ## for MLOPS deployment
export DISABLE_SPRING=true
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
% source ~/.zshrc
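An alternative to editing .zshrc is to set the variable only for one command. This is a minimal sketch under assumptions: the helper name `run_with_fork_safety_disabled` is hypothetical, and this thread does not establish whether the variable needs to be set for zenml up, for the pipeline run, or for both, so the command to wrap is left to the caller.

```python
import os
import subprocess
import sys


def run_with_fork_safety_disabled(command, **run_kwargs):
    """Run `command` with OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES set
    only in the child process environment, not in the current shell."""
    env = dict(os.environ)
    env["OBJC_DISABLE_INITIALIZE_FORK_SAFETY"] = "YES"
    run_kwargs.setdefault("check", False)
    return subprocess.run(command, env=env, **run_kwargs)


if __name__ == "__main__":
    script = "run_deployment.py"  # script name taken from this issue
    if os.path.exists(script):
        completed = run_with_fork_safety_disabled(
            [sys.executable, script, "--config", "deploy"]
        )
        print(f"exit code: {completed.returncode}")
    else:
        print(f"{script} not found in the current directory")
```

Scoping the variable to one command keeps the macOS fork-safety check enabled for everything else started from the shell.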

I then created a new stack, experiment tracker, and model deployer and set them up.

I am still not sure which of these changes made it work. I have not finished the course (almost done now), but so far it seems to be working, or at least it is not displaying any errors.

Note: I found that Stack Overflow post because zenml logs was giving me an error similar to the one a user in that post had.

This is copied from that Stack Overflow post:
objc[81924]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called.
objc[81924]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called.

Side Note:

@typhonshambo

Hi there, I was facing the same issue. Make sure that:

  • you have set up the virtual environment correctly
  • you then run these commands in the virtual environment's terminal:
zenml disconnect
zenml down
zenml clear
  • After the ZenML server is cleared and down, close the terminal.
  • Open two separate terminals, one to run your Python file and another to run the ZenML server, and activate your virtual environment in both.
  • I highly recommend using an external terminal such as Command Prompt (Windows) or Terminal (Mac) to run the ZenML server. For the Python file you can use the VS Code terminal or the terminal of any other IDE you are using.
  • Once both terminals are set up, run the run_pipeline.py file from VS Code and wait until it has completely finished.
  • Then go to the external terminal and start the ZenML server with zenml up.
  • Don't run zenml up before the Python file.

Following these steps resolved my error:

ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8237): Max retries exceeded with 
url: /api/v1/runs/d1673d8a-89aa-42c0-a805-53d3fa8f99ac?hydrate=True (Caused by 
NewConnectionError('<urllib3.connection.HTTPConnection object at 0x28f196350>: Failed to 
establish a new connection: [Errno 61] Connection refused'))

Another thing

Make sure you don't have pandas and numpy in your requirements.txt. They already come with zenml, so reinstalling them might cause version issues.
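A quick way to check a requirements.txt for those two entries is sketched below. The set of "bundled" packages is taken from the comment above and is not verified here; the helper name `bundled_entries` is hypothetical.

```python
import re
from pathlib import Path

# Packages this comment says already come with zenml (assumption from the thread).
BUNDLED = {"pandas", "numpy"}


def bundled_entries(requirement_lines, bundled=BUNDLED):
    """Return the requirement lines whose package name is in `bundled`."""
    hits = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match and match.group(0).lower() in bundled:
            hits.append(line)
    return hits


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        for entry in bundled_entries(path.read_text(encoding="utf-8").splitlines()):
            print(f"consider removing: {entry}")
    else:
        print("requirements.txt not found in the current directory")
```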

@Luismbpr
Author

Thank you for the info. As mentioned above, I solved it, and I think it was due to appending these lines to the .zshrc file:

export DISABLE_SPRING=true
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

However, I am not entirely sure those lines were the fix, since I also did everything you mentioned.


2 participants