Impressive large language models (LLMs) have been released by technology companies, including OpenAI's ChatGPT, Facebook's Llama, and Google's Bard. LLMs can generate text, summarize documents, answer questions, and more, but attention has largely centered on their ability to answer questions surprisingly well.
These models have grown to 100+ billion parameters and are trained on hundreds of gigabytes of text. That much data, combined with the parameter counts and advances in model architecture, makes for very convincing AI. Still, the models have their limitations and come with warnings about potential biases and hallucinations.
How do we run such large models? It turns out to be surprisingly hard: with so many parameters, simply loading a model can require dozens of gigabytes of memory. But what if we want to run one on our own machines?
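To put rough numbers on that, here is a quick back-of-envelope estimate of how much memory the weights alone take at different precisions. The parameter counts are the published Llama sizes; activations, caches, and runtime overhead all come on top of this, so treat it as a lower bound rather than an exact figure.

# Rough weight memory: number of parameters * bytes per parameter.
# Activations, caches, and runtime overhead are not included.
llama_sizes = {"7B": 7e9, "13B": 13e9, "65B": 65e9}
bytes_per_param = {"fp32": 4, "fp16": 2, "int4": 0.5}

for name, n_params in llama_sizes.items():
    estimates = ", ".join(
        f"{fmt}: {n_params * size / 1e9:.1f} GB"
        for fmt, size in bytes_per_param.items()
    )
    print(f"{name} -> {estimates}")
# e.g. 7B -> fp32: 28.0 GB, fp16: 14.0 GB, int4: 3.5 GB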
With a few tricks and a smaller variant of the model, we can run an LLM on a consumer desktop and pop it into a web app for easy use. Here we are going to use the Facebook Llama 7B model, the smallest variant of Llama. In addition, we are going to quantize the weights down to 4 bits and set the batch size to one. Quantization reduces the precision of the weights while maintaining most of the model's performance; a sketch of the idea follows below. Running with a batch size of one limits throughput and scalability, since requests are no longer processed in parallel, but it keeps memory usage manageable.
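To make the quantization idea concrete, here is a minimal sketch of one simple scheme: each block of float weights is mapped to 4-bit integers plus a single scale factor. This is only illustrative; the actual 4-bit format used by llama.cpp, which Dalai builds on, is more elaborate.

import numpy as np

def quantize_4bit(block):
    # Toy symmetric quantization: 16 integer levels (-8..7) plus one float scale.
    # Assumes the block is not all zeros.
    scale = float(np.abs(block).max()) / 7.0
    q = np.clip(np.round(block / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    return q.astype(np.float32) * scale

block = np.random.randn(16).astype(np.float32)
q, scale = quantize_4bit(block)
approx = dequantize_4bit(q, scale)
print(np.abs(block - approx).max())  # small error, at a quarter of the fp16 footprint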
Finally, we will use the open source Dalai software, which will quantize the model and serve it in our browser!
1. Install packages for the model
sudo apt update
sudo apt upgrade
sudo apt install g++ build-essential python3.10 python3.10-venv
2. Update ~/.bashrc with
alias python=python3
3. Reload the configuration in the current terminal, or restart it
source ~/.bashrc
4. Install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.3/install.sh | bash
export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
5. Install Node with nvm
nvm install 18.15.0
6. Install Dalai and run it
npx dalai llama
npx dalai serve
7. Go to localhost:3000 and start using your very own language model!
There you have it: your own personal language model! There is much more to explore about LLMs, from improving worker productivity to integrating them with existing products. Follow us to learn more… like how you can use the new ChatGPT plugins, which connect the model to external knowledge through web browsing, a code interpreter, and retrieval from self-hosted knowledge bases!