Christopher Zenzel


Using Generative AI with Llama to Build Sarcastic Weather: Part 2

Creating a sarcastic weather bot with Llama requires downloading an uncensored model from Hugging Face. Before doing that, let's explore the differences between censored and uncensored models.

Censorship vs Uncensored

Most generative AI services on the Internet, such as ChatGPT, apply censorship to protect end users from receiving inappropriate content across a wide range of subjects. On Hugging Face, you can obtain models for Llama, or based on Llama, that have no censorship and are fine-tuned to permit vulgar language.

Hugging Face

Let's dive into Hugging Face. Hugging Face is a website that readily provides models, such as the Llama 7B Uncensored Chat Model from The Bloke. You can find it by browsing to The Bloke's repository on Hugging Face.

Loading the Model

After downloading the model, store it in the models directory inside your llama.cpp repository directory. Now it is time to load the model into a server that is accessible on your computer.

To load the model from The Bloke into llama.cpp's bundled server application, issue the following command in your terminal:

./server -c 4096 -m models/llama_uncensored_.gguf --host 0.0.0.0

Shortly afterwards, you will see the model loading into your computer's memory and/or your graphics processing unit's (GPU's) memory, and you will be able to access the server from a browser on the same machine by navigating to localhost:8080. Because the --host 0.0.0.0 argument binds the server to all network interfaces, you can also reach it from outside a virtual machine, for example from the host computer.
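Beyond the browser UI, the server can be queried programmatically. Below is a minimal sketch of a Python client that posts a prompt to the server's /completion endpoint; the endpoint path and the "prompt", "n_predict", and "content" JSON fields follow llama.cpp's server documentation, but they may differ between versions, so check the README of the version you built.

```python
import json
from urllib.request import Request, urlopen


def build_completion_request(prompt, host="localhost", port=8080, n_predict=128):
    """Build an HTTP POST request for llama.cpp's /completion endpoint.

    The endpoint path and JSON field names are taken from llama.cpp's
    server documentation and may vary between versions.
    """
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return Request(
        f"http://{host}:{port}/completion",
        data=body,
        headers={"Content-Type": "application/json"},
    )


def sarcastic_weather(prompt):
    """Send the prompt to the running server and return the generated text."""
    with urlopen(build_completion_request(prompt)) as resp:
        return json.loads(resp.read())["content"]
```

With the server running, calling `sarcastic_weather("Give me a sarcastic weather report for Seattle.")` should return the model's reply as a string.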

Conclusion

You have now successfully loaded an uncensored Llama model into your computer's local LLM software. I wonder where this world will lead you next in building your own generative AI sarcastic weather bot!