Drastically improved load speed of llama when low_resource=False #370

cmdr64 · 2023-10-10T20:12:24Z

Added the argument low_cpu_mem_usage=True to the Llama loading code on the low_resource=False path, which drastically reduces model load time. Before this change, loading Llama took more than a minute; now it loads in seconds.

Tested on an NVIDIA GeForce RTX 4090 Founders Edition.
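For readers who want to see the shape of the change, here is a minimal sketch, assuming the model is loaded through Hugging Face transformers' `LlamaForCausalLM.from_pretrained`. The helper names `llama_load_kwargs`, `load_llama`, and `timed`, the `model_path` parameter, and the float16 dtype are hypothetical and are not taken from this PR's diff. Note that `low_cpu_mem_usage=True` generally needs the `accelerate` package installed.

```python
import time


def llama_load_kwargs() -> dict:
    """Extra keyword arguments for the low_resource=False path (hypothetical helper)."""
    # The flag this PR adds: weights are loaded into the model as they are
    # read, instead of first building a fully initialized model in CPU memory.
    return {"low_cpu_mem_usage": True}


def load_llama(model_path: str):
    """Load a Llama checkpoint; model_path is a placeholder, not a path from the PR."""
    # Imported lazily so the helpers above stay usable without these packages.
    import torch
    from transformers import LlamaForCausalLM

    return LlamaForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,  # assumption: half precision
        **llama_load_kwargs(),
    )


def timed(fn):
    """Call fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start
```

To compare load times before and after the change on your own hardware, something like `model, seconds = timed(lambda: load_llama("/path/to/llama"))` should be enough; the path is a placeholder.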

For non low_resource llama loading code, I added an argument low_cpu_mem_usage=True, drastically speeding up load time (cuts more than a minute off of load)

5ab62fe

cmdr64 force-pushed the load-llama-faster branch from 17a1e35 to 5ab62fe on October 11, 2023 00:15
