Adding more clients fails #4678
Comments
Hi Camille, Thanks for raising this. Could you please test this example using the newer Flower version? The code that you are using is no longer supported.
Hi, Thanks for your answer. The code you gave me runs perfectly (with Python 3.10). How can I adapt the PyTorch example to run these separately and see if my original bug reproduces?
Hi, You could test this example: https://github.com/adap/flower/tree/main/examples/app-pytorch. It depends on the newer Flower versions and does what I think you want to accomplish. Note that you should try to run it in "Deployment Mode" as specified in the README. Then, you can also launch this in a real-world setting if that is an end goal. However, it also works just to open up different terminals. If this solves your problem, please feel free to close this issue.
Thanks for your answer. And yes, my end goal is to deploy Flower for experiments on different machines or in different Docker containers, if you want more context.
Hm, that is strange. Let me test it later today and get back to you ASAP.
Hi Camille, I am still working on this; rest assured, I did not forget about this issue.
Hi William, Thanks! I'm still trying to solve it too; if I make any progress, I will share it.
Hi Camille, Thank you for your patience. Here are two alternative examples of how to run this.

**Steps (Quick Way)**

Ensure that you have the example's dependencies installed. Thereafter, you can start the SuperLink:

```bash
flower-superlink --insecure
```

Then, we launch the clients, which consist of the SuperNodes. Start the first one:

```bash
flower-supernode --insecure \
    --superlink="127.0.0.1:9092" \
    --clientappio-api-address="0.0.0.0:9094" \
    --node-config="num-partitions=2 partition-id=0"
```

Then, launch the second one using a different port for the ClientAppIO API:

```bash
flower-supernode --insecure \
    --superlink="127.0.0.1:9092" \
    --clientappio-api-address="0.0.0.0:9095" \
    --node-config="num-partitions=2 partition-id=1"
```

Then run the federation with `flwr run . <FEDERATION NAME> --stream`. This command initiates the federated learning run using the configuration specified in your pyproject.toml under the [tool.flwr.federations.embedded-federation](https://github.com/adap/flower/blob/47d7228b04b6c755860bdbe7e9276d5b500fc04a/examples/embedded-devices/pyproject.toml) section. This can be changed to:

```toml
[tool.flwr.federations.local-deployment]
address = "127.0.0.1:9093"
insecure = true
```

You will need to configure the data pipeline to make sure that it fits this deployment.
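Since the original question is about adding more clients: following the same pattern, a third SuperNode needs its own ClientAppIO port and a matching repartition of the data. The sketch below is illustrative only; the port and partition values are assumptions, not from this thread, and the first two SuperNodes would also need their `--node-config` updated to `num-partitions=3`:

```bash
# Hypothetical third client: same SuperLink, an unused ClientAppIO port,
# and the dataset split into three partitions instead of two.
flower-supernode --insecure \
    --superlink="127.0.0.1:9092" \
    --clientappio-api-address="0.0.0.0:9096" \
    --node-config="num-partitions=3 partition-id=2"
```

Note that the strategy's `min_fit_clients` / `min_available_clients` are lower bounds, so running more clients than the minimum should not require changing them.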
**Prerequisites (Docker)**

Make sure you have Docker installed and running.

**Steps (Docker Example)**

You can then create a new Flower project tailored for PyTorch:

```bash
flwr new quickstart-docker --framework PyTorch --username flower
# and navigate to the project directory
cd quickstart-docker
```

What you then want to do is establish a Docker bridge network named `flwr-network`:

```bash
docker network create --driver bridge flwr-network
```

When this is done, you can deploy the SuperLink, which coordinates communication between the server and clients:

```bash
docker run --rm \
    -p 9091:9091 -p 9092:9092 -p 9093:9093 \
    --network flwr-network \
    --name superlink \
    --detach \
    flwr/superlink:1.13.1 \
    --insecure \
    --isolation process
```

Then, what you want to do is initiate the SuperNodes, each representing a client in the federated learning setup. For the first SuperNode:

```bash
docker run --rm \
    -p 9094:9094 \
    --network flwr-network \
    --name supernode-1 \
    --detach \
    flwr/supernode:1.13.1 \
    --insecure \
    --superlink superlink:9092 \
    --node-config "partition-id=0 num-partitions=2" \
    --clientappio-api-address 0.0.0.0:9094 \
    --isolation process
```

For the second SuperNode:

```bash
docker run --rm \
    -p 9095:9095 \
    --network flwr-network \
    --name supernode-2 \
    --detach \
    flwr/supernode:1.13.1 \
    --insecure \
    --superlink superlink:9092 \
    --node-config "partition-id=1 num-partitions=2" \
    --clientappio-api-address 0.0.0.0:9095 \
    --isolation process
```

When that is complete, you can build and run a ServerApp. Create a Dockerfile named `serverapp.Dockerfile`:

```dockerfile
FROM flwr/serverapp:1.13.1

WORKDIR /app
COPY pyproject.toml .
RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
    && python -m pip install -U --no-cache-dir .

ENTRYPOINT ["flwr-serverapp"]
```

Then build the ServerApp image:

```bash
docker build -f serverapp.Dockerfile -t flwr_serverapp:0.0.1 .
```

Thereafter, run the ServerApp container:

```bash
docker run --rm \
    --network flwr-network \
    --name serverapp \
    --detach \
    flwr_serverapp:0.0.1 \
    --insecure \
    --serverappio-api-address superlink:9091
```

After you have built the ServerApp, you can create a Dockerfile named `clientapp.Dockerfile`:

```dockerfile
FROM flwr/clientapp:1.13.1

WORKDIR /app
COPY pyproject.toml .
RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
    && python -m pip install -U --no-cache-dir .

ENTRYPOINT ["flwr-clientapp"]
```

Build the ClientApp image:

```bash
docker build -f clientapp.Dockerfile -t flwr_clientapp:0.0.1 .
```

Run the ClientApp containers in other terminals, connecting them to the respective SuperNodes. For the first ClientApp:

```bash
docker run --rm \
    --network flwr-network \
    --detach \
    flwr_clientapp:0.0.1 \
    --insecure \
    --clientappio-api-address supernode-1:9094
```

For the second ClientApp:

```bash
docker run --rm \
    --network flwr-network \
    --detach \
    flwr_clientapp:0.0.1 \
    --insecure \
    --clientappio-api-address supernode-2:9095
```

Now, you want to execute the federated learning run. Add the following configuration to your `pyproject.toml`:

```toml
[tool.flwr.federations.local-deployment]
address = "127.0.0.1:9093"
insecure = true
```

Then, initiate the federated learning process:

```bash
flwr run . local-deployment --stream
```

To implement changes, update the Python files accordingly, then rebuild the Docker images and restart the services to apply them. Hope this helps! If so, please go ahead and close this issue.
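Not part of the original reply, but while following the Docker steps above it can help to confirm that every container is actually up before starting the run; the standard Docker CLI is enough for that:

```bash
# List the containers attached to the Flower network
docker ps --filter network=flwr-network

# Follow the SuperLink logs to confirm that both SuperNodes connected
docker logs -f superlink

# When finished, stop the named containers; the ClientApp containers were
# started without --name, so stop those via the IDs shown by docker ps.
docker stop serverapp supernode-1 supernode-2 superlink
docker network rm flwr-network
```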
Hi @Camille-Molinier, just checking in: did the steps above solve your problem? All the best
Hi William, I'll get back to you when I've completed this.
Describe the bug
Hello,
I'm trying to run a Flower server and some clients in different terminals. I use a FedAvg server with all default parameters and the client implemented in the tutorial from the docs.
If I run two clients, the process goes well, but if I run three clients, the last one fails.
The only way to avoid the error is to set min_available_clients or min_fit_clients to three, but then I can't run four clients.
Can someone help me understand this error?
Steps/Code to Reproduce
server.py:
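The original server.py snippet was not captured in this thread. Based purely on the description above (FedAvg with all defaults, using the legacy `flwr` server API from the older tutorial), a hypothetical minimal version might look like this; treat it as a sketch, not the reporter's actual file:

```python
import flwr as fl

# FedAvg with all defaults: fraction_fit=1.0, and min_fit_clients,
# min_evaluate_clients and min_available_clients all default to 2.
strategy = fl.server.strategy.FedAvg()

# Start the legacy Flower server and wait for clients to connect.
fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=3),
    strategy=strategy,
)
```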
client.py:
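The client.py snippet is missing as well. The description points to the client from the docs tutorial (a PyTorch NumPyClient); below is a hypothetical stand-in with the PyTorch model and data loading replaced by placeholders, just to show the shape of such a client:

```python
import numpy as np
import flwr as fl


class FlowerClient(fl.client.NumPyClient):
    """Placeholder for the tutorial's PyTorch client: the real version
    trains and evaluates a model on a local data partition."""

    def get_parameters(self, config):
        return [np.zeros((10, 10), dtype=np.float32)]  # dummy weights

    def fit(self, parameters, config):
        # Train the local model on local data here.
        return parameters, 1, {}

    def evaluate(self, parameters, config):
        # Evaluate the global parameters on local data here.
        return 0.0, 1, {"accuracy": 0.0}


# Connect this client to the running server (legacy API).
fl.client.start_numpy_client(
    server_address="127.0.0.1:8080",
    client=FlowerClient(),
)
```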
Dependencies:
Expected Results
I want to add an arbitrary number of clients, each in a separate terminal.
Actual Results