Welcome to the source code repository for ChatGPDB! Here you'll see how the sausage is made. We use the following technologies:
- Django - Implements and runs the webserver.
- Websockets - Keeps the interface snappy during model inference, which is compute-intensive and slow relative to a typical request lifetime (see the sketch after this list).
- Huggingface - Provides the Python library (transformers) and the model repository used to download and run pre-trained machine learning models like GPT.
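To make the interplay concrete, here is a minimal sketch of how a websocket consumer might hand a chat prompt to the model, assuming Django Channels as the websocket layer. The consumer class and the `run_model_inference` helper are illustrative stand-ins, not the actual ChatGPDB code.

```python
# Minimal sketch (assumes Django Channels; names are illustrative, not the real app code)
import json

from channels.generic.websocket import WebsocketConsumer


def run_model_inference(prompt: str) -> str:
    """Hypothetical stand-in for the actual transformers generation call."""
    return f"(model reply to: {prompt})"


class ChatConsumer(WebsocketConsumer):
    def connect(self):
        # Accept the connection so the browser stays attached while the model thinks.
        self.accept()

    def receive(self, text_data=None, bytes_data=None):
        # The client sends a JSON payload with the user's prompt...
        prompt = json.loads(text_data)["prompt"]
        # ...and the generated reply is pushed back over the same socket, keeping
        # slow generation off the ordinary HTTP request/response path.
        reply = run_model_inference(prompt)
        self.send(text_data=json.dumps({"reply": reply}))
```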
Think ChatGPDB is cool? Want to set it up yourself? Read the instructions below to find out how.
- Clone the GPT2-Large pretrained model by OpenAI. The model is available on HuggingFace. Run the following command inside this repository:
$ git clone https://huggingface.co/gpt2-large
- Use `git lfs` to download the model files (you may have to install this git extension if the command fails). This may take a while, as git-lfs needs to download about 15 GB of model files (depending on how git is configured, LFS may be invoked as part of step 1).
$ cd gpt2-large
$ git lfs pull
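Once the pull finishes, you can sanity-check the download by loading the model directly from the local clone with transformers. This is a quick check, assuming the `gpt2-large` directory sits at the repository root:

```python
# Quick sanity check that the LFS pull fetched usable weights (assumes ./gpt2-large exists)
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("./gpt2-large")
model = GPT2LMHeadModel.from_pretrained("./gpt2-large")

# Generate a few tokens to confirm the weights are wired up correctly.
inputs = tokenizer("Hello, ChatGPDB!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```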
- Create the conda environment with the necessary packages. Note that this environment installs packages capable of accelerating GPT2 inference on NVidia GPUs in a CUDA environment. It is known to work on Linux, but does not work as-is on Mac. Create the environment with:
$ conda env create -f environment.yaml
- Activate the new environment using
conda activate chatgpdb-dev
- Launch the server using `python manage.py runserver` and off you go!
Have an NVidia GPU capable of accelerating PyTorch model inference? Great! The default configuration is already set up to take advantage of it.
Don't have an NVidia GPU but want to try it out anyway? Change the `RUN_CUDA` variable inside the `chatgpdb/settings.py` file to `False`.
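The effect of such a flag typically looks like the following device selection, sketched here under the assumption that the app reads `RUN_CUDA` from Django settings (the actual ChatGPDB wiring may differ):

```python
# Assumed pattern for how a RUN_CUDA-style flag gates device placement;
# not necessarily the exact ChatGPDB implementation.
import torch
from django.conf import settings


def place_model(model):
    """Move the model to the GPU when RUN_CUDA is True and CUDA is available."""
    use_cuda = settings.RUN_CUDA and torch.cuda.is_available()
    device = torch.device("cuda" if use_cuda else "cpu")
    return model.to(device), device
```

Model inputs must be moved to the same device before calling `generate`, which is why the helper returns the chosen device alongside the model.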
Note that many LLMs are large (many millions to many billions of parameters), meaning that GPUs with large amounts of memory are often required for inference.
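As a back-of-the-envelope check: GPT2-Large has about 774 million parameters, so its fp32 weights alone occupy roughly 3 GB before counting activations and framework overhead. You can compute the same figure for the `model` object loaded in the earlier snippet:

```python
# Back-of-the-envelope weight memory for a loaded PyTorch model
# (weights only; activations and CUDA runtime overhead add more).
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters ~= {n_params * 4 / 1e9:.1f} GB as fp32")
```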
Want a longer or shorter response from GPT? Longer responses take longer to generate (unsurprisingly), but may also be more entertaining. To tune how long a sequence the model generates, set the `CHATGPDB_RESPONSE_WORD_COUNT` environment variable to the desired integer value and launch the web server.
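For example: `CHATGPDB_RESPONSE_WORD_COUNT=100 python manage.py runserver`. Internally, reading such a variable and feeding it to generation typically looks like the sketch below; the default of 50 and treating the word count directly as a token budget are illustrative assumptions, not necessarily the app's actual behavior.

```python
# Sketch: turn the CHATGPDB_RESPONSE_WORD_COUNT env var into a generation length.
# The default of 50 and the word-as-token approximation are assumptions for illustration.
import os

response_len = int(os.environ.get("CHATGPDB_RESPONSE_WORD_COUNT", "50"))
outputs = model.generate(**inputs, max_new_tokens=response_len, do_sample=True)
```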
Brought to you by Jason Swails and Thomas Watson