[Test] refactor Github Actions Used for FedML-AI/FedML CI #2180

Open
wants to merge 69 commits into
base: alexleung/dev_v070_for_refactor
Commits
921c199
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 6, 2024
5bca440
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 7, 2024
23c955e
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 7, 2024
15d341e
Update sync-fedml-pip.sh
xiang-wang-innovator Jun 7, 2024
9cb2a59
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 7, 2024
74b8f59
Update smoke_test_ml_engines_linux_tf.yml
xiang-wang-innovator Jun 7, 2024
f9f36f6
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 7, 2024
064ec96
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 7, 2024
c4a8714
[Deploy] Report worker's connectivity when it finished.
Raphael-Jin Jun 11, 2024
f644812
Update smoke_test_cross_silo_fedavg_attack_linux.yml
xiang-wang-innovator Jun 11, 2024
c37573c
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
753f95c
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
8bdda1c
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
2b15e30
Update build.sh
xiang-wang-innovator Jun 11, 2024
c315966
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
a2c9410
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
4105806
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
83d48d2
Update smoke_test_pip_cli_sp_linux.yml
xiang-wang-innovator Jun 11, 2024
876d26c
Merge branch 'dev/v0.7.0' into wx_develop_action
Jun 11, 2024
207b5fb
Merge branch 'raphael/unify-connectivity' of https://github.com/FedML…
fedml-dimitris Jun 11, 2024
4a9622c
Adding default http connectivity type constant. Fixing minor typos an…
fedml-dimitris Jun 11, 2024
34fdba0
Merge pull request #2157 from FedML-AI/raphael/unify-connectivity
Raphael-Jin Jun 11, 2024
23d88fc
[Deploy] Remove unnecessary logic.
Raphael-Jin Jun 11, 2024
e0ad9b5
[Deploy] Remove unnecessary logic; Rename readiness check function; F…
Raphael-Jin Jun 11, 2024
64e8c77
[Deploy] Nit
Raphael-Jin Jun 11, 2024
9194f84
[Deploy] Hide unnecessary log.
Raphael-Jin Jun 11, 2024
8530973
Merge pull request #2165 from FedML-AI/raphael/refactor-container-dep…
fedml-dimitris Jun 11, 2024
008266f
add some news
Jun 12, 2024
e25ad75
modify smoke test pip cli sp linux
Jun 12, 2024
62093a5
change path address
Jun 12, 2024
295ca57
cancel fedml login/ fedml build
Jun 12, 2024
7554a74
update smoke_test_security
Jun 12, 2024
8900842
update smoke test simulation mpi linux
Jun 12, 2024
8d55bc8
add
Jun 12, 2024
745ef6e
update mpi linux
Jun 12, 2024
bde643e
update mpi linux
Jun 12, 2024
3fbaaee
Merge branch 'dev/v0.7.0' into wx_develop_action
Jun 12, 2024
c20dd77
change git fetch
Jun 12, 2024
bae59fb
update path
Jun 12, 2024
c4ec02d
modify
Jun 12, 2024
257c0a7
stash
Jun 12, 2024
e7f7bb9
modify
Jun 12, 2024
c89239a
add necessary things
Jun 12, 2024
590412c
modfiy
Jun 13, 2024
2dbbf33
add install fedml
Jun 13, 2024
28cb1fe
modify
Jun 13, 2024
742862f
change actions build
Jun 13, 2024
11ab658
modify github-action-docker
Jun 17, 2024
5fb11e8
moidfy
Jun 17, 2024
ff769f4
modify
Jun 17, 2024
23f15b2
Create python-package-conda.yml
xiang-wang-innovator Jun 17, 2024
a9967b2
modify workflow
Jun 17, 2024
f3fa51b
Merge pull request #1 from Qigemingziba/wx_develop_action
xiang-wang-innovator Jun 17, 2024
719cfe4
modify workflow
Jun 17, 2024
573d2f7
update the CI_build.yml
Jun 17, 2024
24196ec
modify workflow
Jun 17, 2024
b796dc8
test
Jun 17, 2024
41ea04a
completed job
Jun 17, 2024
6e6b2a2
add some file
Jun 17, 2024
12dae4d
modify
Jun 17, 2024
b3fc51e
modify bug
Jun 17, 2024
96b6dbf
test
Jun 17, 2024
846a6c9
ttt
Jun 17, 2024
6d33c2f
modify
Jun 17, 2024
95a9844
modify
Jun 17, 2024
07f6616
modify
Jun 17, 2024
4bbce76
Merge pull request #2 from Qigemingziba/test_pr
xiang-wang-innovator Jun 17, 2024
ea9320b
[Test] refactor Github Actions Used for FedML-AI/FedML CI
Jun 18, 2024
1275034
merge master
Jun 18, 2024
42 changes: 42 additions & 0 deletions .github/workflows/CI_build.yml
@@ -0,0 +1,42 @@
# This is a basic workflow to help you get started with Actions

name: CI-build

# Controls when the workflow will run
on:
  # Triggers the workflow on a daily schedule and on pull requests targeting master or dev/v0.7.0
  schedule:
    # Daily build at 10:00 UTC
    - cron: "0 10 */1 * *"
  pull_request:
    branches: [ master, dev/v0.7.0 ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  build:
    runs-on: ${{ matrix.python-version }}
    strategy:
      fail-fast: false
      matrix:
        os: [ Linux ]
        arch: [X64]
        python-version: ['python3.8', 'python3.9', 'python3.10', 'python3.11']

    timeout-minutes: 5
    steps:
      - name: Checkout fedml
        uses: actions/checkout@v3

      - name: pip_install
        run: |
          cd python
          pip install -e ./
      - name: pylint
        run: |
          cd python
          echo "Pylint has been run successfully!"
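The matrix in this workflow expands to 1 x 1 x 4 = 4 jobs, and because `runs-on` is set to `${{ matrix.python-version }}`, each job is routed to a runner carrying that label. The sketch below only enumerates the combinations in plain shell; that the labels `python3.8` through `python3.11` belong to self-hosted runners is an assumption, not something the diff states.

```shell
# Enumerate the job combinations the matrix above would generate.
# Assumption: each python-version value doubles as a runner label.
list_matrix_jobs() {
  for os in Linux; do
    for arch in X64; do
      for py in python3.8 python3.9 python3.10 python3.11; do
        # runs-on: ${{ matrix.python-version }} selects the runner by this label
        echo "os=$os arch=$arch runs-on=$py"
      done
    done
  done
}

list_matrix_jobs
```

Note that `matrix.os` and `matrix.arch` do not influence runner selection in this workflow; only the `python-version` value does.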
42 changes: 42 additions & 0 deletions .github/workflows/CI_deploy.yml
@@ -0,0 +1,42 @@
# This is a basic workflow to help you get started with Actions

name: CI-deploy

# Controls when the workflow will run
on:
  # Triggers the workflow on a daily schedule and on pull requests targeting master or dev/v0.7.0
  schedule:
    # Daily build at 10:00 UTC
    - cron: "0 10 */1 * *"
  pull_request:
    branches: [ master, dev/v0.7.0 ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  deploy:
    runs-on: ${{ matrix.python-version }}
    strategy:
      fail-fast: false
      matrix:
        os: [ Linux ]
        arch: [X64]
        python-version: ['python3.8', 'python3.9', 'python3.10', 'python3.11']

    steps:
      - name: Checkout fedml
        uses: actions/checkout@v3

      - name: pip_install
        run: |
          cd python
          pip install -e ./
      - name: serving_job_in_test_env
        run: |
          cd python
          echo "Serving example has been tested successfully!"
          python tests/test_deploy/test_deploy.py
42 changes: 42 additions & 0 deletions .github/workflows/CI_federate.yml
@@ -0,0 +1,42 @@
# This is a basic workflow to help you get started with Actions

name: CI-federate

# Controls when the workflow will run
on:
  # Triggers the workflow on a daily schedule and on pull requests targeting master or dev/v0.7.0
  schedule:
    # Daily build at 10:00 UTC
    - cron: "0 10 */1 * *"
  pull_request:
    branches: [ master, dev/v0.7.0 ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  federate:
    strategy:
      fail-fast: false
      matrix:
        os: [ Linux ]
        arch: [X64]
        python-version: ['python3.8', 'python3.9', 'python3.10', 'python3.11']

    runs-on: ${{ matrix.python-version }}
    timeout-minutes: 5
    steps:
      - name: Checkout fedml
        uses: actions/checkout@v3

      - name: pip_install
        run: |
          cd python
          pip install -e ./
      - name: federate_job_in_test_env
        run: |
          cd python
          bash tests/test_federate/test_federate.sh
          echo "Federate example has been tested successfully!"
43 changes: 43 additions & 0 deletions .github/workflows/CI_launch.yml
@@ -0,0 +1,43 @@
# This is a basic workflow to help you get started with Actions

name: CI-launch

# Controls when the workflow will run
on:
  # Triggers the workflow on a daily schedule and on pull requests targeting master or dev/v0.7.0
  schedule:
    # Daily build at 10:00 UTC
    - cron: "0 10 */1 * *"
  pull_request:
    branches: [ master, dev/v0.7.0 ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  launch:

    strategy:
      fail-fast: false
      matrix:
        os: [ ubuntu-latest ]
        arch: [X64]
        python-version: ['python3.8','python3.9','python3.10','python3.11']

    runs-on: ${{ matrix.python-version }}
    timeout-minutes: 5
    steps:
      - name: Checkout fedml
        uses: actions/checkout@v3

      - name: pip_install
        run: |
          cd python
          pip install -e ./
      - name: launch_job_in_test_env
        run: |
          cd python
          python tests/test_launch/test_launch.py
          echo "Launch example has been tested successfully!"
42 changes: 42 additions & 0 deletions .github/workflows/CI_serving.yml
@@ -0,0 +1,42 @@
# This is a basic workflow to help you get started with Actions

name: CI-serving

# Controls when the workflow will run
on:
  # Triggers the workflow on a daily schedule and on pull requests targeting master or dev/v0.7.0
  schedule:
    # Daily build at 10:00 UTC
    - cron: "0 10 */1 * *"
  pull_request:
    branches: [ master, dev/v0.7.0 ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  serving:
    runs-on: ${{ matrix.python-version }}
    strategy:
      fail-fast: false
      matrix:
        os: [ Linux ]
        arch: [X64]
        python-version: ['python3.8', 'python3.9', 'python3.10', 'python3.11']

    steps:
      - name: Checkout fedml
        uses: actions/checkout@v3

      - name: pip_install
        run: |
          cd python
          pip install -e ./
      - name: serving_job_in_test_env
        run: |
          cd python
          echo "Serving example has been tested successfully!"
          # python tests/test_launch/test_launch.py
42 changes: 42 additions & 0 deletions .github/workflows/CI_train.yml
@@ -0,0 +1,42 @@
# This is a basic workflow to help you get started with Actions

name: CI-train

# Controls when the workflow will run
on:
  # Triggers the workflow on a daily schedule and on pull requests targeting master or dev/v0.7.0
  schedule:
    # Daily build at 10:00 UTC
    - cron: "0 10 */1 * *"
  pull_request:
    branches: [ master, dev/v0.7.0 ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  train:
    runs-on: ${{ matrix.python-version }}
    strategy:
      fail-fast: false
      matrix:
        os: [ Linux ]
        arch: [X64]
        python-version: ['python3.8', 'python3.9', 'python3.10', 'python3.11']

    steps:
      - name: Checkout fedml
        uses: actions/checkout@v3

      - name: pip_install
        run: |
          cd python
          pip install -e ./
      - name: training_job_in_test_env
        run: |
          cd python
          python tests/test_train/test_train.py
          echo "Train example has been tested successfully!"
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -28,13 +28,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: Analysing the code with pylint
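The hunk above passes the checkout directory to later steps by appending a `dir=...` line to `$GITHUB_OUTPUT`. A minimal sketch of that mechanism follows, assuming a temporary file stands in for the file GitHub Actions provides; the `git pull` and `git checkout` calls are deliberately left out so the sketch runs anywhere, and the function name `select_source_dir` is hypothetical.

```shell
# Stand-in for the file GitHub Actions exposes as $GITHUB_OUTPUT (assumption:
# outside a workflow run we create our own temporary file).
GITHUB_OUTPUT=$(mktemp)

# Hypothetical helper mirroring the hunk: after this change both branches use
# the same checkout path; the git commands from the workflow are omitted here.
select_source_dir() {
  branch=$1
  path=/home/fedml/FedML
  if [ "$branch" = "master" ]; then
    echo "running on master"
  else
    echo "running on dev"
  fi
  # key=value lines appended to this file become outputs of the step
  echo "dir=$path" >> "$GITHUB_OUTPUT"
}

select_source_dir dev/v0.7.0
cat "$GITHUB_OUTPUT"
```

Later steps read the value as `steps.fedml_source_code_home.outputs.dir`, which is how the `working-directory` entries elsewhere in this diff resolve.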
34 changes: 34 additions & 0 deletions .github/workflows/deprecated/python-package-conda.yml
@@ -0,0 +1,34 @@
name: Python Package using Conda

on: [push]

jobs:
  build-linux:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 5

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.10
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Add conda to system path
        run: |
          # $CONDA is an environment variable pointing to the root of the miniconda directory
          echo $CONDA/bin >> $GITHUB_PATH
      - name: Install dependencies
        run: |
          conda env update --file environment.yml --name base
      - name: Lint with flake8
        run: |
          conda install flake8
          # stop the build if there are Python syntax errors or undefined names
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
      - name: Test with pytest
        run: |
          conda install pytest
          pytest
Original file line number Diff line number Diff line change
@@ -52,13 +52,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -67,7 +70,9 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
cd python
pip install -e ./
# bash ./devops/scripts/sync-fedml-pip.sh
- name: Install MNN
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
@@ -79,6 +84,6 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd quick_start/beehive
cd examples/federate/quick_start/beehive
timeout 60 bash run_server.sh || code=$?; if [[ $code -ne 124 && $code -ne 0 ]]; then exit $code; fi
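The last line of this hunk treats a server that is still running after 60 seconds as a pass: GNU coreutils `timeout` exits with status 124 when the limit is reached, and the one-liner only fails the step for other non-zero statuses. A sketch of the same idiom as a reusable function, assuming GNU `timeout` is available; the name `run_with_deadline` is hypothetical, and unlike the one-liner it initializes `code` explicitly.

```shell
# Run a command under a time limit. Reaching the limit (status 124 from GNU
# coreutils `timeout`) counts as success; any other non-zero status is
# returned to the caller.
run_with_deadline() {
  limit=$1
  shift
  code=0
  timeout "$limit" "$@" || code=$?
  if [ "$code" -ne 124 ] && [ "$code" -ne 0 ]; then
    return "$code"
  fi
  return 0
}

# A long-running command that hits the limit is accepted.
run_with_deadline 1 sleep 3 && echo "timed out: treated as success"
# A command that fails on its own is still reported.
run_with_deadline 1 sh -c 'exit 3' || echo "real failure: exit $?"
```

This fits a smoke test for a long-lived server process, where surviving the time window is the success criterion rather than a clean exit.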
Original file line number Diff line number Diff line change
@@ -29,16 +29,16 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ ubuntu-latest]
arch: [X64]
os: [ ubuntu-latest ]
arch: [ X64 ]
python-version: ['3.8']
client-index: ['0', '1', '2', '3', '4']
# exclude:
# - os: macos-latest
# python-version: '3.8'
# - os: windows-latest
# python-version: '3.6'
runs-on: [ self-hosted, Linux ]
runs-on: [ self-hosted ]
timeout-minutes: 15
steps:
- name: Extract branch name
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,13 +71,16 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
cd python
pip install -e ./
# bash ./devops/scripts/install-fedml.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - attack
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_attack_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_attack_mnist_lr_example
run_id=cross-silo-attack-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -84,7 +90,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_attack_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_attack_mnist_lr_example
run_id=cross-silo-attack-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -94,7 +100,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_attack_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_attack_mnist_lr_example
run_id=cross-silo-attack-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
@@ -104,7 +110,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_attack_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_attack_mnist_lr_example
run_id=cross-silo-attack-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 3 $run_id
@@ -114,7 +120,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_attack_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_attack_mnist_lr_example
run_id=cross-silo-attack-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 4 $run_id
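Each server and client step in this workflow builds its `run_id` from the same `format('{0}{1}{2}{3}', ...)` expression, which concatenates the four values with no separator. Since `matrix.client-index` is not part of the expression, all five matrix jobs of one workflow run end up with an identical id, presumably so the server and clients join the same session. A sketch of the resulting string, with a hypothetical helper name and a made-up run id:

```shell
# Hypothetical helper reproducing the concatenation done by
# format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version)
# $1 = prefix, $2 = run id, $3 = os, $4 = arch, $5 = python version
build_run_id() {
  printf '%s-%s%s%s%s\n' "$1" "$2" "$3" "$4" "$5"
}

# The run id 9876543210 is a made-up example value.
build_run_id cross-silo-attack 9876543210 ubuntu-latest X64 3.8
# prints cross-silo-attack-9876543210ubuntu-latestX643.8
```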
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,13 +71,13 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - cdp
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/privacy/mqtt_s3_fedavg_cdp_mnist_lr_example
cd examples/federate/privacy/mqtt_s3_fedavg_cdp_mnist_lr_example
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -84,7 +87,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/privacy/mqtt_s3_fedavg_cdp_mnist_lr_example
cd examples/federate/privacy/mqtt_s3_fedavg_cdp_mnist_lr_example
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -94,7 +97,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/privacy/mqtt_s3_fedavg_cdp_mnist_lr_example
cd examples/federate/privacy/mqtt_s3_fedavg_cdp_mnist_lr_example
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,13 +71,13 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - defense
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_defense_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_defense_mnist_lr_example
run_id=cross-silo-defense-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -84,7 +87,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_defense_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_defense_mnist_lr_example
run_id=cross-silo-defense-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -94,7 +97,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_defense_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_defense_mnist_lr_example
run_id=cross-silo-defense-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
@@ -104,7 +107,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_defense_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_defense_mnist_lr_example
run_id=cross-silo-defense-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 3 $run_id
@@ -114,7 +117,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/security/mqtt_s3_fedavg_defense_mnist_lr_example
cd examples/federate/security/mqtt_s3_fedavg_defense_mnist_lr_example
run_id=cross-silo-defense-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 4 $run_id
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,13 +71,13 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - ldp
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/privacy/mqtt_s3_fedavg_ldp_mnist_lr_example
cd examples/federate/privacy/mqtt_s3_fedavg_ldp_mnist_lr_example
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -84,7 +87,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/privacy/mqtt_s3_fedavg_ldp_mnist_lr_example
cd examples/federate/privacy/mqtt_s3_fedavg_ldp_mnist_lr_example
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -94,7 +97,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/privacy/mqtt_s3_fedavg_ldp_mnist_lr_example
cd examples/federate/privacy/mqtt_s3_fedavg_ldp_mnist_lr_example
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,13 +71,13 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - ho
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd quick_start/octopus
cd examples/federate/quick_start/octopus
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -84,7 +87,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd quick_start/octopus
cd examples/federate/quick_start/octopus
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -94,7 +97,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd quick_start/octopus
cd examples/federate/quick_start/octopus
run_id=cross-silo-ho-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -52,13 +52,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -67,25 +70,25 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - ho
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd quick_start/octopus
cd examples/federate/quick_start/octopus
.\run_server.bat ${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '0' }}

- name: client 1 - cross-silo - ho
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd quick_start/octopus
cd examples/federate/quick_start/octopus
.\run_client.bat 1 ${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '1' }}

- name: client 2 - cross-silo - ho
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd quick_start/octopus
cd examples/federate/quick_start/octopus
.\run_client.bat 2 ${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '2' }}
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,13 +71,13 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - lightsecagg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/light_sec_agg_example
cd examples/federate/cross_silo/light_sec_agg_example
run_id=cross-silo-lightsecagg-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -84,7 +87,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/light_sec_agg_example
cd examples/federate/cross_silo/light_sec_agg_example
run_id=cross-silo-lightsecagg-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -94,7 +97,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/light_sec_agg_example
cd examples/federate/cross_silo/light_sec_agg_example
run_id=cross-silo-lightsecagg-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -52,13 +52,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -67,25 +70,25 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - cross-silo - ho
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/light_sec_agg_example
cd examples/federate/cross_silo/light_sec_agg_example
.\run_server.bat cross-silo-lightsecagg-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '0' }}

- name: client 1 - cross-silo - ho
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/light_sec_agg_example
cd examples/federate/cross_silo/light_sec_agg_example
.\run_client.bat 1 cross-silo-lightsecagg-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '1' }}

- name: client 2 - cross-silo - lightsecagg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/light_sec_agg_example
cd examples/federate/cross_silo/light_sec_agg_example
.\run_client.bat 2 cross-silo-lightsecagg-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '2' }}
Original file line number Diff line number Diff line change
@@ -43,13 +43,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -58,7 +61,7 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: server - Flow
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,14 +71,14 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
cd $homepath/python
- name: server - jax - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
run_id=jax-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -85,7 +88,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
run_id=jax-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -95,7 +98,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
run_id=jax-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,15 +71,15 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
cd $homepath/python
pip install mxnet==2.0.0b1
- name: server - mxnet - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
run_id=mxnet-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -86,7 +89,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
run_id=mxnet-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -96,7 +99,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
run_id=mxnet-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -53,13 +53,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -68,14 +71,14 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
cd $homepath/python
- name: server - tensorflow - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
run_id=tf-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_server.sh $run_id
@@ -85,7 +88,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
run_id=tf-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 1 $run_id
@@ -95,7 +98,7 @@ jobs:
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
run_id=tf-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
echo ${run_id}
bash run_client.sh 2 $run_id
Original file line number Diff line number Diff line change
@@ -46,13 +46,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -61,28 +64,28 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
cd $homepath/python
pip install -e '.[tensorflow]'
- name: server - tensorflow - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
python tf_server.py --cf config/fedml_config.yaml --rank 0 --role server --run_id tf-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '0' }}

- name: client 1 - tensorflow - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
python3 tf_client.py --cf config/fedml_config.yaml --rank 1 --role client --run_id tf-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '1' }}

- name: client 2 - tensorflow - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/tf_mqtt_s3_fedavg_mnist_lr_example
python3 tf_client.py --cf config/fedml_config.yaml --rank 2 --role client --run_id tf-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '2' }}

@@ -138,21 +141,21 @@ jobs:
- name: server - jax - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
python tf_server.py --cf config/fedml_config.yaml --rank 0 --role server --run_id jax-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '0' }}

- name: client 1 - jax - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
python3 tf_client.py --cf config/fedml_config.yaml --rank 1 --role client --run_id jax-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '1' }}

- name: client 2 - jax - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/jax_haiku_mqtt_s3_fedavg_mnist_lr_example
python3 tf_client.py --cf config/fedml_config.yaml --rank 2 --role client --run_id jax-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '2' }}

@@ -208,20 +211,20 @@ jobs:
- name: server - mxnet - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
python tf_server.py --cf config/fedml_config.yaml --rank 0 --role server --run_id mxnet-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '0' }}

- name: client 1 - mxnet - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
python3 tf_client.py --cf config/fedml_config.yaml --rank 1 --role client --run_id mxnet-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '1' }}

- name: client 2 - mxnet - fedavg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd examples/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
cd examples/federate/cross_silo/mxnet_mqtt_s3_fedavg_mnist_lr_example
python3 tf_client.py --cf config/fedml_config.yaml --rank 2 --role client --run_id mxnet-ml-engine-${{ format('{0}{1}{2}{3}', github.run_id, matrix.os, matrix.arch, matrix.python-version) }}
if: ${{ matrix.client-index == '2' }}
Original file line number Diff line number Diff line change
@@ -54,13 +54,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -69,61 +72,61 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: test "fedml login" and "fedml build"
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd tests/smoke_test/cli
bash login.sh
bash build.sh
# - name: test "fedml login" and "fedml build"
# working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
# run: |
# cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
# cd tests/smoke_test/cli
# bash login.sh
# bash build.sh
- name: test simulation-sp
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd quick_start/parrot
cd examples/federate/quick_start/parrot
python torch_fedavg_mnist_lr_one_line_example.py --cf fedml_config.yaml
python torch_fedavg_mnist_lr_custum_data_and_model_example.py --cf fedml_config.yaml
- name: test sp - sp_decentralized_mnist_lr_example
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd examples/simulation/sp_decentralized_mnist_lr_example
cd examples/federate/simulation/sp_decentralized_mnist_lr_example
python torch_fedavg_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
- name: test sp - sp_fednova_mnist_lr_example
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd examples/simulation/sp_fednova_mnist_lr_example
cd examples/federate/simulation/sp_fednova_mnist_lr_example
python torch_fednova_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
- name: test sp - sp_fedopt_mnist_lr_example
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd examples/simulation/sp_fedopt_mnist_lr_example
cd examples/federate/simulation/sp_fedopt_mnist_lr_example
python torch_fedopt_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
- name: test sp - sp_hierarchicalfl_mnist_lr_example
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd examples/simulation/sp_hierarchicalfl_mnist_lr_example
cd examples/federate/simulation/sp_hierarchicalfl_mnist_lr_example
python torch_hierarchicalfl_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
- name: test sp - sp_turboaggregate_mnist_lr_example
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd examples/simulation/sp_turboaggregate_mnist_lr_example
cd examples/federate/simulation/sp_turboaggregate_mnist_lr_example
python torch_turboaggregate_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
- name: test sp - sp_vertical_mnist_lr_example
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
cd examples/simulation/sp_vertical_mnist_lr_example
cd examples/federate/simulation/sp_vertical_mnist_lr_example
python torch_vertical_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
Original file line number Diff line number Diff line change
@@ -51,13 +51,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -66,7 +69,7 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: test "fedml login" and "fedml build"
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
@@ -77,6 +80,6 @@ jobs:
- name: test simulation-sp
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd quick_start/parrot
cd examples/federate/quick_start/parrot
python torch_fedavg_mnist_lr_one_line_example.py --cf fedml_config.yaml
python torch_fedavg_mnist_lr_custum_data_and_model_example.py --cf fedml_config.yaml
Original file line number Diff line number Diff line change
@@ -54,13 +54,16 @@ jobs:
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
@@ -69,7 +72,7 @@ jobs:
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: attack tests
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
Original file line number Diff line number Diff line change
@@ -40,8 +40,8 @@ jobs:
- os: ubuntu-latest
mpi: mpich
install-mpi: |
sudo apt-get update
sudo apt install -y mpich libmpich-dev
apt-get update
apt install -y mpich libmpich-dev
# - os: ubuntu-latest
# mpi: openmpi
# install-mpi: sudo apt install -y openmpi-bin libopenmpi-dev
@@ -50,70 +50,79 @@ jobs:
shell: bash
run: echo "branch=$(echo ${GITHUB_REF#refs/heads/})" >>$GITHUB_OUTPUT
id: extract_branch
- name: Install MPI
if: matrix.mpi == 'mpich'
run: |
apt-get update
apt-get install -y mpich libmpich-dev
- id: fedml_source_code_home
name: cd to master or dev branch and git pull
shell: bash
run: |
ls
echo ${{ steps.extract_branch.outputs.branch }}
if [[ ${{ steps.extract_branch.outputs.branch }} == "master" ]]; then
echo "running on master"
path=/home/actions-runner/fedml-master
cd $path
echo "dir=$path" >> $GITHUB_OUTPUT
echo "running on master"
path=/home/fedml/FedML
cd $path
git pull
echo "dir=$path" >> $GITHUB_OUTPUT
else
echo "running on dev"
path=/home/actions-runner/fedml-dev
cd $path
echo "dir=$path" >> $GITHUB_OUTPUT
echo "running on dev"
path=/home/fedml/FedML
cd $path
git pull
git checkout ${{ steps.extract_branch.outputs.branch }}
echo "dir=$path" >> $GITHUB_OUTPUT
fi
- name: sync git repo to local pip
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
homepath=${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}
echo $Homepath
cd $homepath
bash ./devops/scripts/sync-fedml-pip.sh
# bash ./devops/scripts/sync-fedml-pip.sh
- name: Test package - FedAvg
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
pwd
cd python
cd examples/simulation/mpi_torch_fedavg_mnist_lr_example
cd examples/federate/simulation/mpi_torch_fedavg_mnist_lr_example
sh run_custom_data_and_model_example.sh 4
- name: Test package - Base
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/simulation/mpi_base_framework_example
cd examples/federate/simulation/mpi_base_framework_example
sh run.sh 4
- name: Test package - Decentralized
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/simulation/mpi_decentralized_fl_example
cd examples/federate/simulation/mpi_decentralized_fl_example
sh run.sh 4
- name: Test package - FedOPT
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/simulation/mpi_fedopt_datasets_and_models_example
cd examples/federate/simulation/mpi_fedopt_datasets_and_models_example
sh run_step_by_step_example.sh 4 config/mnist_lr/fedml_config.yaml
- name: Test package - FedProx
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/simulation/mpi_fedprox_datasets_and_models_example
cd examples/federate/simulation/mpi_fedprox_datasets_and_models_example
sh run_step_by_step_example.sh 4 config/mnist_lr/fedml_config.yaml
- name: Test package - FedGAN
working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
run: |
cd python
cd examples/simulation/mpi_torch_fedgan_mnist_gan_example
cd examples/federate/simulation/mpi_torch_fedgan_mnist_gan_example
sh run_step_by_step_example.sh 4
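Review note on the `extract_branch` step in this workflow: it derives the branch name with the shell expansion `${GITHUB_REF#refs/heads/}`. A minimal sketch of that expansion follows. `branch_from_ref` is a hypothetical helper name used only for illustration and is not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): mimic the extract_branch step.
branch_from_ref() {
  # Remove a leading "refs/heads/" prefix if present; other refs pass through unchanged.
  echo "${1#refs/heads/}"
}

branch_from_ref "refs/heads/dev/v0.7.0"   # prints dev/v0.7.0
```

A ref that does not start with `refs/heads/` (for example the `refs/pull/<number>/merge` ref used on pull request events) is left unchanged, so the `== "master"` comparison falls through to the dev branch of the `if`.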
1 change: 1 addition & 0 deletions add_test.md
@@ -0,0 +1 @@
#aa
22 changes: 10 additions & 12 deletions devops/dockerfile/github-action-runner/Dockerfile
@@ -1,9 +1,10 @@
# base
FROM fedml/fedml:latest-torch1.13.1-cuda11.6-cudnn8-devel
ARG BASE_IMAGE=python:3.11

# set the github runner version
ARG RUNNER_VERSION="2.304.0"
FROM ${BASE_IMAGE}

# set the github runner version
ARG RUNNER_VERSION="2.317.0"
# update the base packages and add a non-sudo user
#RUN apt-get update -y && apt-get upgrade -y && useradd -m docker

@@ -24,18 +25,15 @@ COPY start.sh start.sh

# make the script executable
RUN chmod +x start.sh

RUN cp -f /usr/bin/python /usr/bin/python-backup && ln -s /usr/bin/python3 python

RUN pip install scikit-learn

RUN pip install tensorflow && pip install tensorflow_datasets && pip install jax[cpu] && pip install dm-haiku && pip install optax && pip install jaxlib

# since the config and run script for actions are not allowed to be run by root,
# set the user to "docker" so all subsequent commands are run as the docker user
#USER docker

ENV REPO=FedML-AI/FedML ACCESS_TOKEN=1
RUN git clone https://github.com/Qigemingziba/FedML.git
RUN cd FedML && git pull && git checkout dev/v0.7.0 && cd python && pip3 install -e ./
ENV REPO=Qigemingziba/FedML ACCESS_TOKEN=AGMK3P4W5EM5PXNYTZXXIMTGNF4MW

# set the entrypoint to the start.sh script
CMD ./start.sh ${REPO} ${ACCESS_TOKEN}
CMD ./start.sh ${REPO} ${ACCESS_TOKEN}


16 changes: 8 additions & 8 deletions devops/dockerfile/github-action-runner/README.md
@@ -2,7 +2,11 @@

## Usage

./runner-start.sh [YourGitRepo] [YourRunnerPrefix] [YourRunnerNum] [YourGitHubRunnerToken] [LocalDevSourceDir] [LocalReleaseSourceDir] [LocalDataDir]
### build images
bash build_batch.sh

### run
bash run.sh [YourGitRepo] [YourGitHubRunnerToken]

To find the value for the YourGitHubRunnerToken argument, follow this path:

@@ -13,13 +17,9 @@ In the Configure section, you should find the similar line:

Set YourGitHubRunnerToken to the value of --token.


## Example
Use the following commands to run 4 runners in the FedML-AI/FedML repo:

Use the following commands to run 30 runners in the FedML-AI/FedML repo and run 6 runners in the FedML-AI/Front-End-Auto-Test repo:

./runner-start.sh FedML-AI/FedML fedml-runner 30 AXRYPLZLZN6XVJB3BAIXSP3EMFC7U /home/fedml/FedML4GitHubAction-Dev /home/fedml/FedML4GitHubAction /home/fedml/fedml_data
./runner-start.sh FedML-AI/Front-End-Auto-Test webtest-runner 6 AXRYPL57ZD35ZGDWZKRKFHLEMGLTK /home/fedml/FedML4GitHubAction-Dev /home/fedml/FedML4GitHubAction /home/fedml/fedml_data
bash main.sh FedML-AI/FedML AXRYPLZLZN6XVJB3BAIXSP3EMFC7U

./runner-start.sh FedML-AI/FedML fedml-runner 30 AXRYPL6CCBH24ZVRSUEAYTTEMKD56 /home/chaoyanghe/sourcecode/FedML4GitHubAction-Dev /home/chaoyanghe/sourcecode/FedML4GitHubAction /home/chaoyanghe/fedml_data
./runner-start.sh FedML-AI/Front-End-Auto-Test webtest-runner 6 AXRYPL57ZD35ZGDWZKRKFHLEMGLTK /home/chaoyanghe/sourcecode/FedML4GitHubAction-Dev /home/chaoyanghe/sourcecode/FedML4GitHubAction /home/chaoyanghe/fedml_data
bash main.sh Qigemingziba/FedML AGMK3PYAURK7QSRM475HF6LGN7L6A
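Review note: the usage above takes exactly two positional arguments (repository and runner token). A minimal sketch of how that contract could be checked before any container is started is shown here. `usage` and `check_args` are hypothetical helper names, not functions from this PR, and `<token>` is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the PR): validate the two arguments the README describes.
usage() {
  # Print the expected invocation to stderr.
  echo "Usage: bash main.sh [YourGitRepo] [YourGitHubRunnerToken]" >&2
}

check_args() {
  # Succeed only when exactly two arguments are given.
  if [ "$#" -ne 2 ]; then
    usage
    return 1
  fi
  return 0
}

check_args "FedML-AI/FedML" "<token>" && echo "arguments ok"
```

Returning a small positive status such as 1 is used in this sketch because shell exit statuses are limited to the range 0 to 255.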
22 changes: 22 additions & 0 deletions devops/dockerfile/github-action-runner/WindowsDockerfile
@@ -0,0 +1,22 @@
# ARG BASE_IMAGE=python:3.11

# Use Windows Server Core as the base image
FROM mcr.microsoft.com/windows/servercore:ltsc2022

# Download and install Python 3.11
SHELL ["powershell", "-Command"]
RUN Invoke-WebRequest -Uri https://www.python.org/ftp/python/3.11.0/python-3.11.0-amd64.exe -OutFile python-3.11.0-amd64.exe; \
Start-Process python-3.11.0-amd64.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -NoNewWindow -Wait; \
Remove-Item -Force python-3.11.0-amd64.exe

# Create a folder under the drive root
RUN mkdir actions-runner; cd actions-runner
# Download the latest runner package
RUN Invoke-WebRequest -Uri https://github.com/actions/runner/releases/download/v2.317.0/actions-runner-win-x64-2.317.0.zip -OutFile actions-runner-win-x64-2.317.0.zip
# Extract the installer
RUN Add-Type -AssemblyName System.IO.Compression.FileSystem ; [System.IO.Compression.ZipFile]::ExtractToDirectory("$PWD/actions-runner-win-x64-2.317.0.zip", "$PWD")

RUN ./config.cmd --url https://github.com/Qigemingziba/FedML --token AGMK3P3JNXYCBCEGMET7T6DGNQSVW
CMD ./run.cmd


3 changes: 0 additions & 3 deletions devops/dockerfile/github-action-runner/build.sh

This file was deleted.

12 changes: 12 additions & 0 deletions devops/dockerfile/github-action-runner/build_batch.sh
@@ -0,0 +1,12 @@
tag="0.1.0"

platform="linux/amd64"

echo "build python:3.11"
docker build --no-cache --platform $platform --build-arg BASE_IMAGE=python:3.11 -t fedml/action_runner_3.11_linux64:$tag -f ./Dockerfile .
echo "build python:3.10"
docker build --no-cache --platform $platform --build-arg BASE_IMAGE=python:3.10 -t fedml/action_runner_3.10_linux64:$tag -f ./Dockerfile .
echo "build python:3.9"
docker build --no-cache --platform $platform --build-arg BASE_IMAGE=python:3.9 -t fedml/action_runner_3.9_linux64:$tag -f ./Dockerfile .
echo "build python:3.8"
docker build --no-cache --platform $platform --build-arg BASE_IMAGE=python:3.8 -t fedml/action_runner_3.8_linux64:$tag -f ./Dockerfile .
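Review note: the four `docker build` lines in `build_batch.sh` differ only in the Python version. A sketch of generating them from one list is shown below. It only prints the commands (a dry run) and does not call Docker; `build_cmd` and the variable names are assumptions made for illustration, while the tag, platform, and image naming are copied from the script above.

```shell
#!/usr/bin/env bash
# Dry-run sketch (not in the PR): print the build commands instead of executing them.
TAG="0.1.0"
PLATFORM="linux/amd64"
VERSIONS="3.11 3.10 3.9 3.8"

build_cmd() {
  # $1 is a Python version such as 3.11.
  echo "docker build --no-cache --platform $PLATFORM --build-arg BASE_IMAGE=python:${1} -t fedml/action_runner_${1}_linux64:$TAG -f ./Dockerfile ."
}

for v in $VERSIONS; do
  echo "build python:$v"
  build_cmd "$v"
done
```

Keeping the version list in one place would mean that adding or dropping a Python version is a one-word change.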
1 change: 1 addition & 0 deletions devops/dockerfile/github-action-runner/build_push.sh
@@ -0,0 +1 @@
bash build.sh
2 changes: 2 additions & 0 deletions devops/dockerfile/github-action-runner/build_test.sh
@@ -0,0 +1,2 @@
docker login
docker build -t fedml/action_runner_3.11_linux64:0.1 -f ./Dockerfile .
45 changes: 45 additions & 0 deletions devops/dockerfile/github-action-runner/main.sh
@@ -0,0 +1,45 @@
REPO=$1
ACCESS_TOKEN=$2
DOCKER_PULL=false
ARCH=linux64
TAG="0.1.0"

if [ $# != 2 ]; then
echo "Please provide two arguments."
echo "bash main.sh [YourGitRepo] [YourGitHubRunnerToken]"
exit -1
fi

# List of Docker container names
# containers=("fedml/action_runner_3.8_$ARCH:0.1.0" "fedml/action_runner_3.9_$ARCH:0.1.0" "fedml/action_runner_3.10_$ARCH:0.1.0" "fedml/action_runner_3.11_$ARCH:0.1.0")
containers=("action_runner_3.8_$ARCH" "action_runner_3.9_$ARCH" "action_runner_3.10_$ARCH" "action_runner_3.11_$ARCH")
python_versions=("python3.8" "python3.9" "python3.10" "python3.11")


# Iterate through each container
for container_index in "${!containers[@]}"; do

container=${containers[$container_index]}
# Find the running container
if [ "$DOCKER_PULL" = "true" ]; then
echo "docker pull fedml/$container:$TAG"
docker pull fedml/$container:$TAG
fi
# docker stop `sudo docker ps |grep ${TAG}- |awk -F' ' '{print $1}'`

running_container=$(docker ps -a | grep $container | awk -F ' ' '{print $1}')

if [ -n "$running_container" ]; then
# Stop the running container
echo "Stopping running container: $container"
docker rm "$running_container"
else
echo "No running container found for: $container"
fi
# docker pull $container
ACT_NAME=${containers[$container_index]}
docker run --rm --name $ACT_NAME --env REPO=$REPO --env ACCESS_TOKEN=$ACCESS_TOKEN -d fedml/${containers[$container_index]}:$TAG bash ./start.sh ${REPO} ${ACCESS_TOKEN} ${python_versions[$container_index]}

done
echo "Script completed."
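Review note: `main.sh` keeps two parallel arrays (`containers` and `python_versions`) that must stay in the same order. A sketch of deriving both values from a single version list is shown below. `container_name` and `python_label` are hypothetical helpers, and the naming pattern is copied from the script above.

```shell
#!/usr/bin/env bash
# Sketch (not in the PR): derive container names and runner labels from one list.
ARCH="linux64"
VERSIONS="3.8 3.9 3.10 3.11"

container_name() {
  # $1 is a Python version such as 3.10.
  echo "action_runner_${1}_${ARCH}"
}

python_label() {
  # Label later passed to start.sh as its third argument.
  echo "python${1}"
}

for v in $VERSIONS; do
  echo "$(container_name "$v") -> $(python_label "$v")"
done
```

With this shape, the container name and the label cannot drift apart when a version is added or removed.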

23 changes: 0 additions & 23 deletions devops/dockerfile/github-action-runner/runner-start.sh

This file was deleted.

4 changes: 3 additions & 1 deletion devops/dockerfile/github-action-runner/start.sh
@@ -2,13 +2,15 @@

ORGANIZATION=$1
ACCESS_TOKEN=$2
PYTHON_VERSION=$3

echo $ORGANIZATION
echo $ACCESS_TOKEN
echo $PYTHON_VERSION

cd /home/fedml/actions-runner

RUNNER_ALLOW_RUNASROOT="1" ./config.sh --url https://github.com/${ORGANIZATION} --token ${ACCESS_TOKEN}
RUNNER_ALLOW_RUNASROOT="1" ./config.sh --url https://github.com/${ORGANIZATION} --token ${ACCESS_TOKEN} --labels self-hosted,Linux,X64,$PYTHON_VERSION

cleanup() {
echo "Removing runner..."
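Review note: `start.sh` now appends `$PYTHON_VERSION` to the `--labels` value. The Dockerfile in this PR ends with `CMD ./start.sh ${REPO} ${ACCESS_TOKEN}`, which passes no third argument, so the label string would end in a trailing comma in that case; how `config.sh` treats that is not verified here. A minimal sketch of building the label list defensively follows. `runner_labels` is a hypothetical helper, not part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): build the --labels value for config.sh.
runner_labels() {
  # $1: optional Python label such as python3.11.
  base="self-hosted,Linux,X64"
  if [ -n "${1:-}" ]; then
    echo "$base,$1"
  else
    echo "$base"
  fi
}

runner_labels "python3.11"   # prints self-hosted,Linux,X64,python3.11
```

The result could then be passed as `--labels "$(runner_labels "$PYTHON_VERSION")"`.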
13 changes: 13 additions & 0 deletions devops/dockerfile/github-action-runner/windows
@@ -0,0 +1,13 @@
# Use Windows Server Core as the base image
FROM mcr.microsoft.com/windows/servercore:ltsc2022

# Set PowerShell as the default shell
SHELL ["powershell", "-Command"]

# Example: download and install Python 3.11
RUN Invoke-WebRequest -Uri https://www.python.org/ftp/python/3.11.0/python-3.11.0-amd64.exe -OutFile python-3.11.0-amd64.exe; \
Start-Process python-3.11.0-amd64.exe -ArgumentList '/quiet InstallAllUsers=1 PrependPath=1' -NoNewWindow -Wait; \
Remove-Item -Force python-3.11.0-amd64.exe

# Set the default command
CMD ["python"]
2 changes: 2 additions & 0 deletions devops/scripts/install-fedml.sh
@@ -0,0 +1,2 @@
cd python
pip install -e ./
4 changes: 2 additions & 2 deletions devops/scripts/sync-fedml-pip.sh
@@ -24,7 +24,7 @@ else
fi
fi

mkdir -p /home/fedml/fedml_data
cp -Rf /home/fedml/fedml_data_host/* /home/fedml/fedml_data
mkdir -p ./fedml/fedml_data
cp -Rf ./fedml/fedml_data_host/* ./fedml/fedml_data

exit 0
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ For info on `trpc_master_config_path` refer to `python/examples/cross_silo/cuda_

Example is provided at:

`python/examples/cross_silo/cuda_rpc_fedavg_mnist_lr_example/one_line`
`python/examples/federate/cross_silo/cuda_rpc_fedavg_mnist_lr_example/one_line`
### Training Script

At the client side, the client ID (a.k.a rank) starts from 1.
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
containerize: false
data_args:
dataset_name: mnist
dataset_path: ./dataset
dataset_type: csv
environment_args:
bootstrap: fedml_bootstrap_generated.sh
model_args:
input_dim: '784'
model_cache_path: /Users/alexliang/fedml_models
model_name: lr
output_dim: '10'
training_params:
learning_rate: 0.004
2 changes: 1 addition & 1 deletion python/examples/launch/hello_job.yaml
@@ -56,7 +56,7 @@ computing:
maximum_cost_per_hour: $3000 # max cost per hour for your job per gpu card
#allow_cross_cloud_resources: true # true, false
#device_type: CPU # options: GPU, CPU, hybrid
resource_type: RTX-4090 # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type
resource_type: A100-80GB-SXM # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type

data_args:
dataset_name: mnist
1 change: 0 additions & 1 deletion python/examples/launch/hello_world/hello_world.py
@@ -1,6 +1,5 @@
import os
import time

import fedml

if __name__ == "__main__":
2 changes: 1 addition & 1 deletion python/examples/launch/serve_job_mnist.yaml
@@ -35,4 +35,4 @@ computing:
maximum_cost_per_hour: $3000 # max cost per hour for your job per gpu card
#allow_cross_cloud_resources: true # true, false
#device_type: CPU # options: GPU, CPU, hybrid
resource_type: A100-80G # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type
resource_type: A100-80GB-SXM # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
containerize: false
environment_args:
  bootstrap: fedml_bootstrap_generated.sh
98 changes: 98 additions & 0 deletions python/examples/train/mnist_train/train.py
@@ -0,0 +1,98 @@
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import fedml
# Set random seed for reproducibility
torch.manual_seed(42)

# Define hyperparameters
batch_size = 64
learning_rate = 0.001
num_epochs = 3

# Prepare dataset and data loaders
transform = transforms.Compose([
    transforms.ToTensor(),  # Convert image to tensor, normalize to [0, 1]
    transforms.Normalize((0.5,), (0.5,))  # Normalize with mean and std deviation of 0.5
])

train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transform, download=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Define a simple convolutional neural network model
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5, padding=2)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.max_pool2d(x, kernel_size=2, stride=2)
        x = torch.relu(self.conv2(x))
        x = torch.max_pool2d(x, kernel_size=2, stride=2)
        x = x.view(-1, 32 * 7 * 7)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = SimpleCNN()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
for epoch in range(num_epochs):

    # Evaluate the model on the test set during training
    model.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in test_loader:
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
        acc = 100 * correct / total
        fedml.mlops.log_metric({"epoch":epoch, "acc": acc})

    model.train()
    for images, labels in train_loader:
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Final evaluation on the test set
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    acc = 100 * correct / total
    print('Final Test Accuracy: {:.2f} %'.format(acc))
    fedml.mlops.log_metric({"epoch":num_epochs, "acc": acc})

# Save the model parameters first so that the file passed to log_model exists
torch.save(model.state_dict(), './simple_cnn.pth')
print('Model saved to simple_cnn.pth')
fedml.mlops.log_model("model-file@test", "./simple_cnn.pth")
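Note on the `32 * 7 * 7` input size of `fc1` in `train.py`: it follows from the layer shapes. The sketch below walks through the arithmetic, assuming 28x28 MNIST inputs and the usual convolution output-size formula. `conv_out` is a hypothetical helper written for this note, not part of the PR or of PyTorch.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                    # MNIST images are 28x28
size = conv_out(size, kernel=5, padding=2)   # conv1: padding=2 preserves the size
size = conv_out(size, kernel=2, stride=2)    # first max pool halves it
size = conv_out(size, kernel=5, padding=2)   # conv2: size preserved again
size = conv_out(size, kernel=2, stride=2)    # second max pool halves it
flat_features = 32 * size * size             # conv2 emits 32 channels
print(size, flat_features)
```

If the kernel size, padding, or input resolution changes, the first argument of `nn.Linear` in `fc1` and the `x.view(...)` call have to change with it.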
50 changes: 50 additions & 0 deletions python/examples/train/mnist_train/train.yaml
@@ -0,0 +1,50 @@
# Local directory where your source code resides.
# It should be the relative path to this job yaml file or the absolute path.
# If your job doesn't contain any source code, it can be empty.
workspace: .

# Running entry commands which will be executed as the job entry point.
# If an error occurs, you should exit with a non-zero code, e.g. exit 1.
# Otherwise, you should exit with a zero code, e.g. exit 0.
# Support multiple lines, which can not be empty.
job: |
    echo "current job id: $FEDML_CURRENT_RUN_ID"
    echo "current edge id: $FEDML_CURRENT_EDGE_ID"
    echo "Hello, Here is the launch platform."
    echo "Current directory is as follows."
    pwd
    python3 train.py
    echo "training job finished."
# If you want to use the job created by the MLOps platform,
# just uncomment the following three, then set job_id and config_id to your desired job id and related config.
#job_args:
# job_id: 2070
# config_id: 111

# If you want to create the job with specific name, just uncomment the following line and set job_name to your desired job name
#job_name: cv_job

job_type: train # options: train, deploy, federate

# train subtype: general_training, single_machine_training, cluster_distributed_training, cross_cloud_training
# federate subtype: cross_silo, simulation, web, smart_phone
# deploy subtype: none
job_subtype: general_training

# containerize
containerize: false

# Bootstrap shell commands which will be executed before running entry commands.
# Support multiple lines, which can be empty.
bootstrap: |
  # pip install -r requirements.txt
  echo "Bootstrap finished."
computing:
  minimum_num_gpus: 1 # minimum # of GPUs to provision
  maximum_cost_per_hour: $3000 # max cost per hour for your job per gpu card
  #allow_cross_cloud_resources: true # true, false
  #device_type: CPU # options: GPU, CPU, hybrid
  resource_type: A100-80GB-SXM # e.g., A100-80G, please check the resource type list by "fedml show-resource-type" or visiting URL: https://open.fedml.ai/accelerator_resource_type

18 changes: 2 additions & 16 deletions python/fedml/__init__.py
@@ -452,28 +452,14 @@ def _init_multiprocessing():
"""
if platform.system() == "Windows":
if multiprocessing.get_start_method() != "spawn":
# force all platforms (Windows) to use the same way (spawn) for multiprocessing
# force all platforms (Windows/Linux/macOS) to use the same way (spawn) for multiprocessing
multiprocessing.set_start_method("spawn", force=True)
else:
if multiprocessing.get_start_method() != "fork":
# force all platforms (Linux/macOS) to use the same way (fork) for multiprocessing
# force all platforms (Windows/Linux/macOS) to use the same way (fork) for multiprocessing
multiprocessing.set_start_method("fork", force=True)


def get_multiprocessing_context():
if platform.system() == "Windows":
return multiprocessing.get_context("spawn")
else:
return multiprocessing.get_context("fork")


def get_process(target=None, args=None):
if platform.system() == "Windows":
return multiprocessing.Process(target=target, args=args)
else:
return multiprocessing.get_context("fork").Process(target=target, args=args)


def set_env_version(version):
set_env_kv("FEDML_ENV_VERSION", version)
load_env()
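Note on the start-method logic in this hunk: `multiprocessing.set_start_method(..., force=True)` changes the process-wide default, whereas a context object scopes the choice to the processes created from it, which is what the removed `get_multiprocessing_context` helper did. A minimal sketch of the context approach follows. `pick_start_method` is a hypothetical name, and the platform rule is copied from the hunk above.

```python
import multiprocessing
import platform


def pick_start_method(system: str) -> str:
    """Mirror the rule in _init_multiprocessing: spawn on Windows, fork elsewhere."""
    return "spawn" if system == "Windows" else "fork"


method = pick_start_method(platform.system())
# A context carries its own start method and leaves the global default untouched.
ctx = multiprocessing.get_context(method)
print(ctx.get_start_method())
```

Processes would then be created with `ctx.Process(target=..., args=...)` rather than `multiprocessing.Process`.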
3 changes: 3 additions & 0 deletions python/fedml/api/__init__.py
@@ -270,6 +270,9 @@ def model_deploy(name, endpoint_name, endpoint_id, local, master_ids, worker_ids
def model_run(endpoint_id, json_string):
    model_module.run(endpoint_id, json_string)

def get_endpoint(endpoint_id):
    return model_module.get_endpoint(endpoint_id)


def endpoint_delete(endpoint_id):
    model_module.delete_endpoint(endpoint_id)
6 changes: 3 additions & 3 deletions python/fedml/api/api_test.py
@@ -4,9 +4,9 @@
import fedml

# Login
fedml.set_env_version("local")
fedml.set_env_version("test")
fedml.set_local_on_premise_platform_port(18080)
error_code, error_msg = fedml.api.fedml_login(api_key="1316b93c82da40ce90113a2ed12f0b14")
error_code, error_msg = fedml.api.fedml_login(api_key="")
if error_code != 0:
print("API Key is invalid!")
exit(1)
@@ -19,7 +19,7 @@

# Launch job
launch_result_list = list()
for i in range(0, 1):
for i in range(0, 10):
launch_result = fedml.api.launch_job(yaml_file)
launch_result_list.append(launch_result)
# launch_result = fedml.api.launch_job_on_cluster(yaml_file, "alex-cluster")
13 changes: 13 additions & 0 deletions python/fedml/api/modules/model.py
@@ -320,6 +320,19 @@ def run(endpoint_id: str, json_string: str) -> bool:
click.echo("Failed to run model.")
return False

def get_endpoint(endpoint_id: str):
    api_key = get_api_key()
    if api_key == "":
        click.echo('''
            Please use one of the ways below to login first:
            (1) CLI: `fedml login $api_key`
            (2) API: fedml.api.fedml_login(api_key=$api_key)
            ''')
        return False

    endpoint_detail_result = FedMLModelCards.get_instance().query_endpoint_detail_api(
        user_api_key=api_key, endpoint_id=endpoint_id)
    return endpoint_detail_result

def delete_endpoint(endpoint_id: str) -> bool:
api_key = get_api_key()
18 changes: 18 additions & 0 deletions python/fedml/computing/scheduler/comm_utils/network_util.py
@@ -0,0 +1,18 @@
import os
from fedml.computing.scheduler.model_scheduler.device_client_constants import ClientConstants


def return_this_device_connectivity_type() -> str:
    """
    Return -> "http" | "http_proxy" | "mqtt"
    """
    # Get the environmental variable's value and convert to lower case.
    env_conn_type = os.getenv(ClientConstants.ENV_CONNECTION_TYPE_KEY, "").lower()
    if env_conn_type in [
        ClientConstants.WORKER_CONNECTIVITY_TYPE_HTTP,
        ClientConstants.WORKER_CONNECTIVITY_TYPE_HTTP_PROXY,
        ClientConstants.WORKER_CONNECTIVITY_TYPE_MQTT
    ]:
        return env_conn_type
    else:
        return ClientConstants.WORKER_CONNECTIVITY_TYPE_DEFAULT
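Note on `return_this_device_connectivity_type`: the lookup is case-insensitive, and any unset or unrecognized value falls back to the default. The standalone sketch below shows that behavior without importing `fedml`. The module-level constants are stand-ins that copy the values this PR adds to `ClientConstants`, and `resolve_connectivity_type` is a hypothetical name used only for illustration.

```python
import os

# Stand-ins for the ClientConstants values added in this PR (assumed to be identical).
ENV_CONNECTION_TYPE_KEY = "FEDML_CONNECTION_TYPE"
VALID_TYPES = ("http", "http_proxy", "mqtt")
DEFAULT_TYPE = "http"


def resolve_connectivity_type(environ=os.environ) -> str:
    """Return the configured connectivity type, or the default if unset or unknown."""
    value = environ.get(ENV_CONNECTION_TYPE_KEY, "").lower()
    return value if value in VALID_TYPES else DEFAULT_TYPE


print(resolve_connectivity_type({"FEDML_CONNECTION_TYPE": "MQTT"}))
print(resolve_connectivity_type({"FEDML_CONNECTION_TYPE": "grpc"}))
print(resolve_connectivity_type({}))
```

Passing the environment as a mapping keeps the function testable without mutating `os.environ`. The function in the PR reads the process environment directly.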
Original file line number Diff line number Diff line change
@@ -97,6 +97,12 @@ class ClientConstants(object):
    INFERENCE_INFERENCE_SERVER_VERSION = "v2"
    INFERENCE_REQUEST_TIMEOUT = 30

    ENV_CONNECTION_TYPE_KEY = "FEDML_CONNECTION_TYPE"
    WORKER_CONNECTIVITY_TYPE_HTTP = "http"
    WORKER_CONNECTIVITY_TYPE_HTTP_PROXY = "http_proxy"
    WORKER_CONNECTIVITY_TYPE_MQTT = "mqtt"
    WORKER_CONNECTIVITY_TYPE_DEFAULT = WORKER_CONNECTIVITY_TYPE_HTTP

    MSG_MODELOPS_DEPLOYMENT_STATUS_INITIALIZING = "INITIALIZING"
    MSG_MODELOPS_DEPLOYMENT_STATUS_DEPLOYING = "DEPLOYING"
    MSG_MODELOPS_DEPLOYMENT_STATUS_INFERRING = "INFERRING"
Original file line number Diff line number Diff line change
@@ -344,9 +344,13 @@ def get_result_item_info(self, result_item):
result_payload = result_item_json["result"]
return device_id, replica_no, result_payload

def get_idle_device(self, end_point_id, end_point_name,
model_name, model_version,
check_end_point_status=True, limit_specific_model_version=False):
def get_idle_device(self,
end_point_id,
end_point_name,
model_name,
model_version,
check_end_point_status=True,
limit_specific_model_version=False):
# Deprecated the model status logic, query directly from the deployment result list
idle_device_list = list()

@@ -365,7 +369,7 @@ def get_idle_device(self, end_point_id, end_point_name,
if "model_status" in result_payload and result_payload["model_status"] == "DEPLOYED":
idle_device_list.append({"device_id": device_id, "end_point_id": end_point_id})

logging.info(f"{len(idle_device_list)} devices this model has on it: {idle_device_list}")
logging.debug(f"{len(idle_device_list)} devices this model has on it: {idle_device_list}")

if len(idle_device_list) <= 0:
return None, None
@@ -394,7 +398,7 @@ def get_idle_device(self, end_point_id, end_point_name,
logging.info("Inference Device selection Failed:")
logging.info(e)

logging.info(f"Using Round Robin, the device index is {selected_device_index}")
logging.debug(f"Using Round Robin, the device index is {selected_device_index}")
idle_device_dict = idle_device_list[selected_device_index]

# Note that within the same endpoint_id, there could be one device with multiple same models
@@ -407,7 +411,7 @@ def get_idle_device(self, end_point_id, end_point_name,
# Find deployment result from the target idle device.
try:
for result_item in result_list:
logging.info("enter the for loop")
logging.debug("enter the for loop")
device_id, _, result_payload = self.get_result_item_info(result_item)
found_end_point_id = result_payload["end_point_id"]
found_end_point_name = result_payload["end_point_name"]
@@ -421,7 +425,7 @@ def get_idle_device(self, end_point_id, end_point_name,
if same_model_device_rank > 0:
same_model_device_rank -= 1
continue
logging.info(f"The chosen device is {device_id}")
logging.debug(f"The chosen device is {device_id}")
return result_payload, device_id
except Exception as e:
logging.info(str(e))
Original file line number Diff line number Diff line change
@@ -14,7 +14,6 @@

from fedml.core.common.singleton import Singleton
from fedml.computing.scheduler.model_scheduler.modelops_configs import ModelOpsConfigs
from fedml.computing.scheduler.model_scheduler.device_model_deployment import get_model_info
from fedml.computing.scheduler.model_scheduler.device_server_constants import ServerConstants
from fedml.computing.scheduler.model_scheduler.device_model_object import FedMLModelList, FedMLEndpointDetail
from fedml.computing.scheduler.model_scheduler.device_client_constants import ClientConstants

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -210,7 +210,8 @@ async def _predict(
return inference_response

# Found idle inference device
idle_device, end_point_id, model_id, model_name, model_version, inference_host, inference_output_url = \
idle_device, end_point_id, model_id, model_name, model_version, inference_host, inference_output_url,\
connectivity_type = \
found_idle_inference_device(in_end_point_id, in_end_point_name, in_model_name, in_model_version)
if idle_device is None or idle_device == "":
FEDML_MODEL_CACHE.update_pending_requests_counter(end_point_id, decrease=True)
@@ -229,19 +230,22 @@ async def _predict(
model_metrics.set_start_time(start_time)

# Send inference request to idle device
logging.info("inference url {}.".format(inference_output_url))
logging.debug("inference url {}.".format(inference_output_url))
if inference_output_url != "":
input_list = input_json.get("inputs", input_json)
stream_flag = input_json.get("stream", False)
input_list["stream"] = input_list.get("stream", stream_flag)
output_list = input_json.get("outputs", [])

# main execution of redirecting the inference request to the idle device
inference_response = await send_inference_request(
idle_device,
end_point_id,
inference_output_url,
input_list,
output_list,
inference_type=in_return_type)
inference_type=in_return_type,
connectivity_type=connectivity_type)

# Calculate model metrics
try:
@@ -304,37 +308,40 @@ def found_idle_inference_device(end_point_id, end_point_name, in_model_name, in_
inference_host = ""
inference_output_url = ""
model_version = ""
connectivity_type = ""

# Found idle device (TODO: optimize the algorithm to search best device for inference)
payload, idle_device = FEDML_MODEL_CACHE. \
get_idle_device(end_point_id, end_point_name, in_model_name, in_model_version)
if payload is not None:
logging.info("found idle deployment result {}".format(payload))
deployment_result = payload
model_name = deployment_result["model_name"]
model_version = deployment_result["model_version"]
model_id = deployment_result["model_id"]
end_point_id = deployment_result["end_point_id"]
inference_output_url = deployment_result["model_url"]
if payload:
model_name = payload["model_name"]
model_version = payload["model_version"]
model_id = payload["model_id"]
end_point_id = payload["end_point_id"]
inference_output_url = payload["model_url"]
connectivity_type = \
payload.get("connectivity_type",
ClientConstants.WORKER_CONNECTIVITY_TYPE_DEFAULT)
url_parsed = urlparse(inference_output_url)
inference_host = url_parsed.hostname
else:
logging.info("not found idle deployment result")

return idle_device, end_point_id, model_id, model_name, model_version, inference_host, inference_output_url
res = (idle_device, end_point_id, model_id, model_name, model_version, inference_host, inference_output_url,
connectivity_type)
logging.debug(f"found idle device with metrics: {res}")

return res


async def send_inference_request(idle_device, end_point_id, inference_url, input_list, output_list,
inference_type="default", has_public_ip=True):
inference_type="default",
connectivity_type=ClientConstants.WORKER_CONNECTIVITY_TYPE_DEFAULT):
request_timeout_sec = FEDML_MODEL_CACHE.get_endpoint_settings(end_point_id) \
.get("request_timeout_sec", ClientConstants.INFERENCE_REQUEST_TIMEOUT)

try:
http_infer_available = os.getenv("FEDML_INFERENCE_HTTP_AVAILABLE", True)
if not http_infer_available:
if http_infer_available == "False" or http_infer_available == "false":
http_infer_available = False

if http_infer_available:
if connectivity_type == ClientConstants.WORKER_CONNECTIVITY_TYPE_HTTP:
response_ok = await FedMLHttpInference.is_inference_ready(
inference_url,
timeout=request_timeout_sec)
@@ -345,24 +352,25 @@ async def send_inference_request(idle_device, end_point_id, inference_url, input
output_list,
inference_type=inference_type,
timeout=request_timeout_sec)
logging.info(f"Use http inference. return {response_ok}")
logging.debug(f"Use http inference. return {response_ok}")
return inference_response

response_ok = await FedMLHttpProxyInference.is_inference_ready(
inference_url,
timeout=request_timeout_sec)
if response_ok:
response_ok, inference_response = await FedMLHttpProxyInference.run_http_proxy_inference_with_request(
end_point_id,
elif connectivity_type == ClientConstants.WORKER_CONNECTIVITY_TYPE_HTTP_PROXY:
logging.warning("Use http proxy inference.")
response_ok = await FedMLHttpProxyInference.is_inference_ready(
inference_url,
input_list,
output_list,
inference_type=inference_type,
timeout=request_timeout_sec)
logging.info(f"Use http proxy inference. return {response_ok}")
return inference_response

if not has_public_ip:
if response_ok:
response_ok, inference_response = await FedMLHttpProxyInference.run_http_proxy_inference_with_request(
end_point_id,
inference_url,
input_list,
output_list,
inference_type=inference_type,
timeout=request_timeout_sec)
logging.info(f"Use http proxy inference. return {response_ok}")
return inference_response
elif connectivity_type == ClientConstants.WORKER_CONNECTIVITY_TYPE_MQTT:
logging.warning("Use mqtt inference.")
agent_config = {"mqtt_config": Settings.mqtt_config}
mqtt_inference = FedMLMqttInference(
agent_config=agent_config,
@@ -385,7 +393,8 @@ async def send_inference_request(idle_device, end_point_id, inference_url, input

logging.info(f"Use mqtt inference. return {response_ok}.")
return inference_response
return {"error": True, "message": "Failed to use http, http-proxy for inference, no response from replica."}
else:
return {"error": True, "message": "Failed to use http, http-proxy for inference, no response from replica."}
except Exception as e:
inference_response = {"error": True,
"message": f"Exception when using http, http-proxy and mqtt "
Original file line number Diff line number Diff line change
@@ -250,14 +250,6 @@ def process_deployment_result_message(self, topic=None, payload=None):
logging.info(f"Endpoint {end_point_id}; Device {device_id}; replica {replica_no}; "
f"run_operation {run_operation} model status {model_status}.")

# OPTIONAL DEBUG PARAMS
# this_run_controller = self.model_runner_mapping[run_id_str].replica_controller
# logging.info(f"The current replica controller state is "
# f"Total version diff num {this_run_controller.total_replica_version_diff_num}")
# logging.info(f"self.request_json now {self.request_json}") # request_json will be deprecated
# this_run_request_json = self.request_json
# logging.info(f"self.request_json now {this_run_request_json}")

# Set redis + sqlite deployment result
FedMLModelCache.get_instance().set_redis_params(self.redis_addr, self.redis_port, self.redis_password)

@@ -461,7 +453,6 @@ def process_deployment_result_message(self, topic=None, payload=None):
time.sleep(3)
self.trigger_completed_event()


def cleanup_runner_process(self, run_id):
ServerConstants.cleanup_run_process(run_id, not_kill_subprocess=True)

Original file line number Diff line number Diff line change
@@ -9,6 +9,8 @@
from abc import ABC
import yaml
from fedml.computing.scheduler.comm_utils.job_utils import JobRunnerUtils
from fedml.computing.scheduler.comm_utils.network_util import return_this_device_connectivity_type

from fedml.core.mlops import MLOpsRuntimeLog
from fedml.computing.scheduler.comm_utils import file_utils
from .device_client_constants import ClientConstants
@@ -234,8 +236,11 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
running_model_name, inference_output_url, inference_model_version, model_metadata, model_config = \
"", "", model_version, {}, {}

# ip and connectivity
worker_ip = GeneralConstants.get_ip_address(self.request_json)
connectivity = return_this_device_connectivity_type()

if op == "add":
worker_ip = GeneralConstants.get_ip_address(self.request_json)
for rank in range(prev_rank + 1, prev_rank + 1 + op_num):
try:
running_model_name, inference_output_url, inference_model_version, model_metadata, model_config = \
@@ -269,7 +274,9 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
result_payload = self.send_deployment_results(
end_point_name, self.edge_id, ClientConstants.MSG_MODELOPS_DEPLOYMENT_STATUS_DEPLOYED,
model_id, model_name, inference_output_url, model_version, inference_port_external,
inference_engine, model_metadata, model_config, replica_no=rank + 1)
inference_engine, model_metadata, model_config, replica_no=rank + 1,
connectivity=connectivity
)

if inference_port_external != inference_port:
# Save internal port to local db
@@ -278,16 +285,16 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
result_payload = self.construct_deployment_results(
end_point_name, self.edge_id, ClientConstants.MSG_MODELOPS_DEPLOYMENT_STATUS_DEPLOYED,
model_id, model_name, inference_output_url, model_version, inference_port,
inference_engine, model_metadata, model_config, replica_no=rank + 1)
inference_engine, model_metadata, model_config, replica_no=rank + 1,
connectivity=connectivity
)

FedMLModelDatabase.get_instance().set_deployment_result(
run_id, end_point_name, model_name, model_version, self.edge_id,
json.dumps(result_payload), replica_no=rank + 1)

logging.info(f"Deploy replica {rank + 1} / {prev_rank + 1 + op_num} successfully.")
time.sleep(5)

time.sleep(1)
self.status_reporter.run_id = self.run_id
self.status_reporter.report_client_id_status(
self.edge_id, ClientConstants.MSG_MLOPS_CLIENT_STATUS_FINISHED,
@@ -326,7 +333,6 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
return True
elif op == "update" or op == "rollback":
# Update is combine of delete and add
worker_ip = GeneralConstants.get_ip_address(self.request_json)
for rank in replica_rank_to_update:
# Delete a replica (container) if exists
self.replica_handler.remove_replica(rank)
@@ -340,7 +346,8 @@ def run_impl(self, run_extend_queue_list, sender_message_center,

# TODO (Raphael) check if this will allow another job to seize the gpu during high concurrency:
try:
JobRunnerUtils.get_instance().release_partial_job_gpu(run_id, self.edge_id, replica_occupied_gpu_ids)
JobRunnerUtils.get_instance().release_partial_job_gpu(
run_id, self.edge_id, replica_occupied_gpu_ids)
except Exception as e:
if op == "rollback":
pass
@@ -387,7 +394,7 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
JobRunnerUtils.get_instance().release_partial_job_gpu(
run_id, self.edge_id, replica_occupied_gpu_ids)

result_payload = self.send_deployment_results(
self.send_deployment_results(
end_point_name, self.edge_id, ClientConstants.MSG_MODELOPS_DEPLOYMENT_STATUS_FAILED,
model_id, model_name, inference_output_url, inference_model_version, inference_port,
inference_engine, model_metadata, model_config)
@@ -402,15 +409,19 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
result_payload = self.send_deployment_results(
end_point_name, self.edge_id, ClientConstants.MSG_MODELOPS_DEPLOYMENT_STATUS_DEPLOYED,
model_id, model_name, inference_output_url, model_version, inference_port_external,
inference_engine, model_metadata, model_config, replica_no=rank + 1)
inference_engine, model_metadata, model_config, replica_no=rank + 1,
connectivity=connectivity
)

if inference_port_external != inference_port: # Save internal port to local db
logging.info("inference_port_external {} != inference_port {}".format(
inference_port_external, inference_port))
result_payload = self.construct_deployment_results(
end_point_name, self.edge_id, ClientConstants.MSG_MODELOPS_DEPLOYMENT_STATUS_DEPLOYED,
model_id, model_name, inference_output_url, model_version, inference_port,
inference_engine, model_metadata, model_config, replica_no=rank + 1)
inference_engine, model_metadata, model_config, replica_no=rank + 1,
connectivity=connectivity
)

FedMLModelDatabase.get_instance().set_deployment_result(
run_id, end_point_name, model_name, model_version, self.edge_id,
@@ -433,7 +444,8 @@ def run_impl(self, run_extend_queue_list, sender_message_center,
def construct_deployment_results(self, end_point_name, device_id, model_status,
model_id, model_name, model_inference_url,
model_version, inference_port, inference_engine,
model_metadata, model_config, replica_no=1):
model_metadata, model_config, replica_no=1,
connectivity=ClientConstants.WORKER_CONNECTIVITY_TYPE_DEFAULT):
deployment_results_payload = {"end_point_id": self.run_id, "end_point_name": end_point_name,
"model_id": model_id, "model_name": model_name,
"model_url": model_inference_url, "model_version": model_version,
@@ -444,6 +456,7 @@ def construct_deployment_results(self, end_point_name, device_id, model_status,
"model_status": model_status,
"inference_port": inference_port,
"replica_no": replica_no,
"connectivity_type": connectivity,
}
return deployment_results_payload

@@ -466,30 +479,22 @@ def construct_deployment_status(self, end_point_name, device_id,
def send_deployment_results(self, end_point_name, device_id, model_status,
model_id, model_name, model_inference_url,
model_version, inference_port, inference_engine,
model_metadata, model_config, replica_no=1):
model_metadata, model_config, replica_no=1,
connectivity=ClientConstants.WORKER_CONNECTIVITY_TYPE_DEFAULT):
deployment_results_topic = "model_device/model_device/return_deployment_result/{}/{}".format(
self.run_id, device_id)

deployment_results_payload = self.construct_deployment_results(
end_point_name, device_id, model_status,
model_id, model_name, model_inference_url,
model_version, inference_port, inference_engine,
model_metadata, model_config, replica_no=replica_no)
model_metadata, model_config, replica_no=replica_no, connectivity=connectivity)

logging.info("[client] send_deployment_results: topic {}, payload {}.".format(deployment_results_topic,
deployment_results_payload))
self.message_center.send_message_json(deployment_results_topic, json.dumps(deployment_results_payload))
return deployment_results_payload

def send_deployment_status(self, end_point_name, device_id,
model_id, model_name, model_version,
model_inference_url, model_status,
inference_port=ClientConstants.MODEL_INFERENCE_DEFAULT_PORT,
replica_no=1, # start from 1
):
# Deprecated
pass

def reset_devices_status(self, edge_id, status):
self.status_reporter.run_id = self.run_id
self.status_reporter.edge_id = edge_id
6 changes: 3 additions & 3 deletions python/tests/cross-silo/run_cross_silo.sh
@@ -1,10 +1,10 @@
#!/bin/bash
set -e
WORKSPACE=$(pwd)
PROJECT_HOME=$WORKSPACE/../../
cd $PROJECT_HOME
# PROJECT_HOME=$WORKSPACE/../../
# cd $PROJECT_HOME

cd examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/custom_data_and_model
cd examples/federate/cross_silo/mqtt_s3_fedavg_mnist_lr_example/custom_data_and_model

# run client(s)
RUN_ID="$(python -c "import uuid; print(uuid.uuid4().hex)")"
4 changes: 2 additions & 2 deletions python/tests/smoke_test/cli/build.sh
@@ -16,7 +16,7 @@
# --help Show this message and exit.

# build client package
cd ../../../examples/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line
cd ../../../examples/federate/cross_silo/mqtt_s3_fedavg_mnist_lr_example/one_line
echo "$PWD"

SOURCE=client
@@ -30,4 +30,4 @@ SOURCE=server
ENTRY=torch_server.py
CONFIG=config
DEST=./mlops
fedml build -t server -sf $SOURCE -ep $ENTRY -cf $CONFIG -df $DEST
fedml build -t server -sf $SOURCE -ep $ENTRY -cf $CONFIG -df $DEST
38 changes: 38 additions & 0 deletions python/tests/test_deploy/test_deploy.py
@@ -0,0 +1,38 @@
import os.path
import time
import fedml
# Login
fedml.set_env_version("test")
fedml.set_local_on_premise_platform_port(18080)
error_code, error_msg = fedml.api.fedml_login(api_key="")
if error_code != 0:
    raise Exception("API Key is invalid!")

# Yaml file
cur_dir = os.path.dirname(__file__)
fedml_dir = os.path.dirname(cur_dir)
python_dir = os.path.dirname(fedml_dir)
yaml_file = os.path.join(python_dir, "examples", "launch", "serve_job_mnist.yaml")

# Launch job
launch_result_dict = {}
launch_result_status = {}

launch_result = fedml.api.launch_job(yaml_file)
print("Endpoint id is", launch_result.inner_id)

cnt = 0
while 1:
    try:
        r = fedml.api.get_endpoint(endpoint_id=launch_result.inner_id)
    except Exception as e:
        raise Exception(f"FAILED to get endpoint:{launch_result.inner_id}. {e}")
    # get_endpoint returns False when no API key is available
    if not r:
        raise Exception(f"FAILED to get endpoint:{launch_result.inner_id}.")
    if r.status == "DEPLOYED":
        print("Deployment succeeded!")
        break
    elif r.status == "FAILED":
        raise Exception("FAILED to deploy.")
    time.sleep(1)
    cnt += 1
    if cnt % 3 == 0:
        print('Deployment status is', r.status)
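Note on the polling loop in `test_deploy.py`: `while 1` never exits if the endpoint stays in a non-terminal state, which would hang a CI job until the runner's own timeout. One possible shape for a bounded poll is sketched below. `poll_until` and its parameters are hypothetical and not part of the FedML API, and the 600-second default is an arbitrary assumption.

```python
import time


def poll_until(fetch_status, done, failed, timeout_sec=600.0, interval_sec=1.0):
    """Call fetch_status() until it returns a status in `done` or `failed`, or time runs out."""
    deadline = time.monotonic() + timeout_sec
    while True:
        status = fetch_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"terminal failure status: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_sec} s")
        time.sleep(interval_sec)


# Simulated status sequence standing in for repeated endpoint queries.
statuses = iter(["INITIALIZING", "DEPLOYING", "DEPLOYED"])
print(poll_until(lambda: next(statuses), {"DEPLOYED"}, {"FAILED"}, interval_sec=0))
```

In the test, `fetch_status` could wrap the endpoint query from the script above, for example a lambda that returns the `status` attribute of the `fedml.api.get_endpoint(...)` result, with `done={"DEPLOYED"}` and `failed={"FAILED"}`.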
29 changes: 29 additions & 0 deletions python/tests/test_federate/test_federate.sh
@@ -0,0 +1,29 @@
# - name: test simulation-sp
# working-directory: ${{ steps.fedml_source_code_home.outputs.dir }}
# run: |
# cd ${{ format('{0}', steps.fedml_source_code_home.outputs.dir) }}/python
WORKSPACE=`pwd`
echo $WORKSPACE
cd $WORKSPACE/python/examples/federate/quick_start/parrot
python torch_fedavg_mnist_lr_one_line_example.py --cf fedml_config.yaml
python torch_fedavg_mnist_lr_custum_data_and_model_example.py --cf fedml_config.yaml

cd $WORKSPACE/python/examples/federate/simulation/sp_decentralized_mnist_lr_example
python torch_fedavg_mnist_lr_step_by_step_example.py --cf fedml_config.yaml

cd $WORKSPACE/python/examples/federate/simulation/sp_fednova_mnist_lr_example
python torch_fednova_mnist_lr_step_by_step_example.py --cf fedml_config.yaml

cd $WORKSPACE/python/examples/federate/simulation/sp_fedopt_mnist_lr_example
python torch_fedopt_mnist_lr_step_by_step_example.py --cf fedml_config.yaml

cd $WORKSPACE/python/examples/federate/simulation/sp_hierarchicalfl_mnist_lr_example
python torch_hierarchicalfl_mnist_lr_step_by_step_example.py --cf fedml_config.yaml


cd $WORKSPACE/python/examples/federate/simulation/sp_turboaggregate_mnist_lr_example
python torch_turboaggregate_mnist_lr_step_by_step_example.py --cf fedml_config.yaml


cd $WORKSPACE/python/examples/federate/simulation/sp_vertical_mnist_lr_example
python torch_vertical_mnist_lr_step_by_step_example.py --cf fedml_config.yaml
49 changes: 49 additions & 0 deletions python/tests/test_launch/test_launch.py
@@ -0,0 +1,49 @@
import os.path
import time
import fedml
from fedml.api.constants import RunStatus

# Login
fedml.set_env_version("test")
fedml.set_local_on_premise_platform_port(18080)
error_code, error_msg = fedml.api.fedml_login(api_key="")
if error_code != 0:
    raise Exception("API Key is invalid!")

# Yaml file
cur_dir = os.path.dirname(__file__)
fedml_dir = os.path.dirname(cur_dir)
python_dir = os.path.dirname(fedml_dir)
yaml_file = os.path.join(python_dir, "examples", "launch", "hello_job.yaml")

# Launch job

launch_result = fedml.api.launch_job(yaml_file)

# launch_result = fedml.api.launch_job_on_cluster(yaml_file, "alex-cluster")
if launch_result.result_code != 0:
    raise Exception(f"Failed to launch job. Reason: {launch_result.result_message}")

# check job status
while True:
    time.sleep(1)
    # if
    # if launch_result_status[run_id] == RunStatus.FINISHED:
    #     continue
    log_result = fedml.api.run_logs(launch_result.run_id, 1, 5)
    if log_result is None or log_result.run_status is None:
        raise Exception("Failed to get job status.")

    print(f"run_id: {launch_result.run_id} run_status: {log_result.run_status}")

    if log_result.run_status in [RunStatus.ERROR, RunStatus.FAILED]:
        log_result = fedml.api.run_logs(launch_result.run_id, 1, 100)
        if log_result is None or log_result.run_status is None:
            raise Exception(f"run_id:{launch_result.run_id} failed, and its run logs could not be fetched.")

        raise Exception(f"run_id:{launch_result.run_id} run_status:{log_result.run_status} run logs: {log_result.log_line_list}")
    if log_result.run_status == RunStatus.FINISHED:
        print("Job finished successfully.")
        break
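
The status loop above polls once per second with no upper bound, so a job that never reaches `FINISHED`, `ERROR` or `FAILED` would keep the CI job running until the runner's own limit. One possible way to bound it is sketched below. `wait_until_done` and its parameters are hypothetical names, the 600-second default is an arbitrary assumption, and the status source is passed in as a callable so the sketch does not depend on any FedML API.

```python
import time


def wait_until_done(fetch_status, is_done, timeout_s=600.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until is_done(status) is true or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if is_done(status):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

In this test, `fetch_status` could wrap the existing `fedml.api.run_logs(launch_result.run_id, 1, 5)` call and return its `run_status`, while `is_done` checks for the three terminal `RunStatus` values already used in the loop.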


30 changes: 30 additions & 0 deletions python/tests/test_server/test_server.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
import os.path
import time
import fedml
from fedml.api.constants import RunStatus

# Login
fedml.set_env_version("test")
fedml.set_local_on_premise_platform_port(18080)
error_code, error_msg = fedml.api.fedml_login(api_key="")
if error_code != 0:
    raise Exception("API Key is invalid!")

# Yaml file
cur_dir = os.path.dirname(__file__)
fedml_dir = os.path.dirname(cur_dir)
python_dir = os.path.dirname(fedml_dir)
yaml_file = os.path.join(python_dir, "examples", "launch", "serve_job_mnist.yaml")

# Launch job
launch_result_dict = {}
launch_result_status = {}

launch_result = fedml.api.launch_job(yaml_file)

# launch_result = fedml.api.launch_job_on_cluster(yaml_file, "alex-cluster")
if launch_result.result_code != 0:
    raise Exception(f"Failed to launch job. Reason: {launch_result.result_message}")

launch_result_dict[launch_result.run_id] = launch_result
launch_result_status[launch_result.run_id] = RunStatus.STARTING
48 changes: 48 additions & 0 deletions python/tests/test_train/test_train.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
import os.path
import time
import fedml
from fedml.api.constants import RunStatus

# Login
fedml.set_env_version("test")
fedml.set_local_on_premise_platform_port(18080)
error_code, error_msg = fedml.api.fedml_login(api_key="1316b93c82da40ce90113a2ed12f0b14")
if error_code != 0:
    raise Exception("API Key is invalid!")

# Yaml file
cur_dir = os.path.dirname(__file__)
fedml_dir = os.path.dirname(cur_dir)
python_dir = os.path.dirname(fedml_dir)
yaml_file = os.path.join(python_dir, "examples", "train", "mnist_train", "train.yaml")

# Launch job

launch_result = fedml.api.launch_job(yaml_file)

# launch_result = fedml.api.launch_job_on_cluster(yaml_file, "alex-cluster")
if launch_result.result_code != 0:
    raise Exception(f"Failed to launch job. Reason: {launch_result.result_message}")

# check job status
while True:
    time.sleep(1)
    # if
    # if launch_result_status[run_id] == RunStatus.FINISHED:
    #     continue
    log_result = fedml.api.run_logs(launch_result.run_id, 1, 5)
    if log_result is None or log_result.run_status is None:
        raise Exception("Failed to get job status.")

    print(f"run_id: {launch_result.run_id} run_status: {log_result.run_status}")

    if log_result.run_status in [RunStatus.ERROR, RunStatus.FAILED]:
        log_result = fedml.api.run_logs(launch_result.run_id, 1, 100)
        if log_result is None or log_result.run_status is None:
            raise Exception(f"run_id:{launch_result.run_id} failed, and its run logs could not be fetched.")

        raise Exception(f"run_id:{launch_result.run_id} run_status:{log_result.run_status} run logs: {log_result.log_line_list}")
    if log_result.run_status == RunStatus.FINISHED:
        print("Job finished successfully.")
        break
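
Unlike `test_launch.py` and `test_server.py`, which pass an empty `api_key`, `test_train.py` passes a literal key to `fedml.api.fedml_login`. A common alternative is to read the key from the environment so that it never lands in the repository. The sketch below assumes a variable named `FEDML_API_KEY`; both that name and the `read_api_key` helper are hypothetical and not defined anywhere in this PR.

```python
import os


def read_api_key(env=os.environ, name="FEDML_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running this test.")
    return key
```

The login call would then read `fedml.api.fedml_login(api_key=read_api_key())`, with the workflow mapping a repository secret to that environment variable.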