A repository for running GPT-J-6B on low-VRAM systems by splitting the model across RAM, VRAM, and pinned memory.
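The core idea is to keep most transformer blocks in system RAM (pinning some of them for faster transfers) and only the rest on the GPU. Below is a minimal sketch of that block placement, not the repo's exact code; the attribute path `model.transformer.h` is an assumption based on the GPT-Neo-style layout this fork uses.

```python
# Sketch of the block-placement idea: "ram_blocks" transformer blocks stay in
# system RAM; the first "max_shared_ram_blocks" of those are pinned
# (page-locked) so CPU -> GPU copies during the forward pass are faster.
import torch

def place_blocks(model, ram_blocks=23, max_shared_ram_blocks=18):
    device = torch.device("cuda")
    for i, block in enumerate(model.transformer.h):  # assumed module list of blocks
        if i < ram_blocks:
            block.to("cpu")
            if i < max_shared_ram_blocks:
                # Pin each parameter so asynchronous GPU copies can use DMA.
                for p in block.parameters():
                    p.data = p.data.pin_memory()
        else:
            block.to(device)
```

At generation time, each CPU-resident block's weights are then copied to the GPU just before that block runs and freed afterwards, so only a few blocks occupy VRAM at any one time.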
There appear to be some issues with the weights in the drive link: some performance loss is noticeable, most likely caused by a poor 16-bit conversion.
Install the required transformers fork with - pip install git+https://github.com/finetuneanon/transformers@gpt-neo-localattention3
Use the link - https://drive.google.com/file/d/1tboTvohQifN6f1JiSV8hnciyNKvj9pvm/view?usp=sharing to download the model, which has been saved as described here - https://github.com/arrmansa/saving-and-loading-large-models-pytorch
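A minimal loading sketch, assuming the checkpoint is a whole-model pickle produced with torch.save(model), which is one reading of the linked guide; the file name "gpt-j-6b.pt" is hypothetical, so use the name of the file you downloaded.

```python
import torch
import transformers  # the finetuneanon fork; must be importable so unpickling can resolve the model class

# Assumption: the download is a whole-model pickle, so torch.load returns the
# model object (architecture + weights) directly.
model = torch.load("gpt-j-6b.pt", map_location="cpu")
model.eval()
```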
Timings with 16 GB DDR4 RAM and a GTX 1070 8 GB GPU:
23 blocks are kept in RAM (ram_blocks = 23), of which 18 are in shared/pinned memory (max_shared_ram_blocks = 18).
A single forward pass, model(inputs), takes 6.5 seconds.
Generating 25 tokens at 2000-token context takes 35 seconds (1.4 seconds/token).
Timings with 16 GB DDR4 RAM and a GTX 1060 6 GB GPU:
26 blocks are kept in RAM (ram_blocks = 26), of which 18 are in shared/pinned memory (max_shared_ram_blocks = 18).
Generating 25 tokens at 2000-token context takes 40 seconds (1.6 seconds/token).
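A hedged example of how timings like these can be measured. GPT-J uses the GPT-2 BPE tokenizer; the prompt and sampling settings here are placeholders, not the settings the numbers above were produced with.

```python
import time
import torch
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
prompt = "In a shocking finding, " * 400  # filler to approximate a long context
input_ids = tokenizer(prompt, return_tensors="pt").input_ids[:, :2000].to("cuda")

torch.cuda.synchronize()  # make sure pending GPU work doesn't skew the timing
start = time.time()
out = model.generate(input_ids, max_length=input_ids.shape[1] + 25, do_sample=True)
torch.cuda.synchronize()
elapsed = time.time() - start
print(f"{elapsed:.1f} s for 25 tokens ({elapsed / 25:.2f} s/token)")
```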