Add Deepspeed integration [WIP] #62

Closed

mayank31398

No description provided.

mayank31398 changed the title from "Add Deepspeed integration" to "Add Deepspeed integration [WIP]" on Jun 14, 2023
jeffra and others added 19 commits June 20, 2023 20:17
* Checkpoint conversion tools

* Fix formatting

* 1) Provide args in converted checkpoint
2) Reshape TP and PP degrees

* Fix typo

* Fix link

* Tweak tag

* Fix converted TP and PP sizes

* For release mode

* Update README

* Nested embedding dicts
Iteration folder
latest checkpoint version file
* add direct meg-ds to hf format script (NVIDIA#110)

* add direct meg-ds to hf format script (part2) (NVIDIA#111)

* add direct meg-ds to hf format script

* split into 2 function

* update the usage doc

* make scripts executable

* add shebang

Co-authored-by: Stas Bekman
GuanhuaWang and others added 7 commits June 21, 2023 19:38
* add checkpoint measurement

* Update CODEOWNERS

* add TFLOP per sec support

* remove duplicate tflops calculation

* remove unnecessary comment

* remove comments

* remove comment
* fix a bug when run on bf16+pp

* add a space to fix the tab error
…VIDIA#106)

use int64_t instead of int32_t to avoid integer overflow.

Signed-off-by: yulu.jia
Co-authored-by: yulu.jia
mayank31398 deleted the deepspeed branch on August 10, 2024 18:37
8 participants