Instant Language Model Web App on Your Home Desktop
By: Travis Harrison
Technology companies have released impressive large language models (LLMs), including OpenAI's ChatGPT, Facebook's Llama, and Google's Bard. LLMs can generate text, summarize text, answer questions, and more, but most of the attention has gone to their ability to answer questions with surprisingly strong answers.
These models have grown to 100+ billion parameters and are trained on hundreds of gigabytes of text data. The volume of data, the number of parameters, and advanced model architectures make for very convincing AI models. The models still have limitations, however, and come with warnings about potential biases and hallucinations.
How do we run such large models? It turns out to be pretty hard: the parameter counts are so large that just loading a model can require dozens of gigabytes of memory. But what if we want to run one on our own machine?
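To get a feel for the memory problem, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the weights themselves (parameter count times bits per parameter) and ignores activations and runtime overhead. The function name `weight_memory_gb` and the 7-billion and 100-billion parameter counts are illustrative choices, not figures from any particular tool.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Illustrative model sizes (assumed for this example).
for name, params in [("7B", 7e9), ("100B", 100e9)]:
    for bits in (32, 16, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name} model at {bits}-bit weights: about {gb:.1f} GB")
```

Under these assumptions, a 7-billion-parameter model drops from about 28 GB at 32-bit precision to about 3.5 GB at 4 bits, which is what brings it within reach of a home desktop.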
We can use a few tricks and a smaller variant of the model to run it on a consumer desktop and put it into a web app for easy use. Here we are going to use the Facebook Llama 7B model, which is the smallest variant of the Llama model. In addition, we are going to quantize the weights down to 4 bits and set the batch size to one. Quantization reduces the precision of the weights while still maintaining most of the model's performance. Reducing the batch size to one removes parallel processing, which limits the scalability of the model and increases latency when several requests arrive at once.
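To make the idea of quantization concrete, here is a minimal sketch of 4-bit quantization using a simple min-max scheme on a small block of made-up weights. This is an assumption-laden illustration of the general technique, not the exact format Dalai uses. The function names and sample values are hypothetical.

```python
def quantize_4bit(weights):
    """Map floats to integer codes in [0, 15] with a per-block scale and offset."""
    lo, hi = min(weights), max(weights)
    # 4 bits give 16 levels, so the range is split into 15 steps.
    # Fall back to 1.0 if every weight in the block is identical.
    scale = (hi - lo) / 15 or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_4bit(codes, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [lo + c * scale for c in codes]


# Made-up example weights (not taken from a real model).
weights = [-0.42, -0.10, 0.03, 0.27, 0.55, -0.31, 0.08, 0.49]
codes, scale, lo = quantize_4bit(weights)
restored = dequantize_4bit(codes, scale, lo)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("4-bit codes:", codes)
print(f"largest rounding error: {max_error:.4f}")
```

Each weight is stored as one of only 16 values, and the rounding error is bounded by half a quantization step. That small loss of precision is the price paid for the large reduction in memory.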
Finally, we will use the open source Dalai software, which will quantize the model and serve it in our browser!
There you have it, your own personal language model! There is much more to explore about LLMs, including how they can improve worker efficiency and how to integrate them with existing products. Follow us to learn more, such as how you can use the new ChatGPT plugins that integrate directly with external knowledge by giving the model access to web browsing, code interpretation, and retrieval from self-hosted knowledge bases!