
How to Run ChatGPT Locally on Raspberry Pi with Ollama

If you click our links and make a purchase, we may earn an affiliate commission.

Talking with AI like ChatGPT might be something we’re all getting used to on the daily, but it’s rather frustrating when you run into conversation limits, not to mention feeding AI companies everything they want to know about you. Is it possible to overcome these issues by running an AI chatbot locally on your Raspberry Pi? Yep, and in this post, I’ll walk you through the installation in a few easy steps.

AI chatbots can be self-hosted on a Raspberry Pi. Running large language models (LLMs) on the Raspberry Pi is free, with no usage limits, and can be adapted for different use cases. They can be installed using Ollama with a WebUI chat interface.

Let’s start with a brief overview of why you might want to do this. I’ll also show you how to install everything and give advice on which AI models in particular run well on a Raspberry Pi.

If you’re new to Raspberry Pi or Linux, I’ve got something that can help you right away!
Download my free Linux commands cheat sheet – it’s a quick reference guide with all the essential commands you’ll need to get things done on your Raspberry Pi. Click here to get it for free!

Overview: Why Host an AI Model on Raspberry Pi?

You probably use ChatGPT, Claude, or Grok through a web browser or mobile app. These AI chatbots run from data centers with massive computing resources. So when you ask, “What’s the average size of an African swallow?”, ChatGPT pools these resources to figure out an answer and respond to you quickly.

However, there are alternative ways to access these AI models. One way is to use scripts that connect to the ChatGPT API. Another way is to run the AI model entirely from your own computer (what we’ll be doing in this tutorial).

You might’ve guessed that this won’t be as powerful or as fast as using the online apps, so let me quickly cover why you’d want to do this before I jump into the installation steps.

Difference Between Self-Hosting and Using the ChatGPT API

By the way, hosting your own AI chatbot locally is NOT the same as using the ChatGPT API. The API is a way to access ChatGPT (or other AI models) from your own code or scripts. It also generally costs money, depending on the model you select and how often you call it.

If you’re interested in that approach, we’ve already written a guide on it here: Use the ChatGPT API for any Raspberry Pi Project.

What we’re doing in this article is different: hosting an AI model to run on the Raspberry Pi. There are upsides and downsides to this approach, which I’ll briefly cover next.

Benefits of Running an AI Model Locally

Here are a few reasons you might find it attractive to self-host AI models:

  • Cost – It’s free with unlimited usage; it beats having token limits and monthly subscriptions.
  • Local inference – AI projects can run entirely locally without internet access.
  • Privacy – Ever ask ChatGPT to draw what you look like or pinpoint where you live? It can be scarily accurate with how much data it collects. When you host the AI model yourself, the data is under your control.
  • Security – If you want to perform sensitive functions, like feeding your personal data or customer info to the AI, it’s more secure to keep this data local than sending it over the internet.
  • Switch between different models – You can download different AI models and switch between ones with different strengths—a great way to figure out which ones you like the most.
  • User-friendly interface – There’s no programming knowledge needed, unlike using an API.
  • Customization – Integrate a model into your specific use case.

Limitations of Running an AI Model Locally

OK, that all sounds amazing on paper, but the reality is that there are definitely limitations to running AI models from a Raspberry Pi:

  • Performance – Self-hosted models respond more slowly on Pi hardware than the online apps do.
  • Context limits – The models that run well on the Raspberry Pi will have lower context limits, meaning they can only handle remembering so much of a conversation, like when you want to modify entire programs or summarize multiple chapters of a book.
    For example, one of the models I’ve hosted on my Pi can only handle 128k tokens, whereas the ones you can access through apps have large context windows like 200k tokens and higher.
  • Model complexity – The RAM on a Raspberry Pi limits the parameter size of the models you can run, whereas ChatGPT or Grok online give you access to models with much higher parameter counts, which can understand more complex questions.

Hardware Requirements for Ollama on Raspberry Pi


To follow this guide, here’s what I recommend in terms of hardware:

  • Raspberry Pi: The Raspberry Pi 5 is your best bet here because we need every bit of computing power we can get. It might still be worth a shot on a Raspberry Pi 4 as a learning experiment, but be aware that it might be quite slow.
  • Memory: 4GB minimum, 8GB recommended. The amount of RAM your board has determines how complex a model you can load, or how many you can load simultaneously.
  • Storage: 32GB of disk space gives you enough room to download and play with different models. If possible, I also recommend using faster storage, like an external USB SSD or an internal NVMe SSD, to improve response times.
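Not sure what your board has? Two standard Linux commands will tell you:

```shell
free -h    # total and available RAM
df -h /    # free space on the root filesystem
```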

That covers hardware, but what about software?
We’ll cover those parts in the next section.

Check this: I tried to replace my main PC with a Pi 5, here's what happened.


Ollama Installation on Raspberry Pi

Now that you’ve got the hardware, you’re ready to host AI models on your Raspberry Pi.

Here’s an overview of the big picture steps:

  • Install a 64-bit operating system.
  • Install Ollama to load AI models.
  • Install Open WebUI to get a user-friendly chat interface.

Let’s go over each of these steps in detail.

Installing an Operating System

Flashing Raspberry Pi OS Lite with Raspberry Pi Imager.

The first step is to install a 64-bit operating system. Running AI models on your Raspberry Pi is intensive, so I recommend starting with a lightweight OS to conserve resources.

For this tutorial, I installed Raspberry Pi OS Lite, which is more minimal with no desktop environment, and you can do the same by following our guide. But other lightweight options work too, like Ubuntu Server or DietPi.

(Optional: enable SSH during installation to follow the steps from another PC, like I did.)

Once your system is running, update the installed packages:
sudo apt update
sudo apt upgrade

Restart:
sudo reboot

Are you a bit lost in the Linux command line? Check this article first for the most important commands to remember and a free downloadable cheat sheet so you can have the commands at your fingertips.

That covers the base system, and now it’s time for the key software.

Installing Ollama

The next step is to install Ollama.
Ollama is the software in charge of hosting AI models.

All you need to do is open a terminal and paste one command.
Run the official Ollama installation script:
curl -fsSL https://ollama.com/install.sh | sh

Note: At the time of writing, a bug caused this script to crash on the Pi. It can be worked around by adding a flag that forces the older HTTP/1.1 standard:
curl -fsSL --http1.1 https://ollama.com/install.sh | sh

This script will download Ollama, install it, and update you on its progress.

Output of the Ollama installation script.
(AI performs better on GPU hardware, but as you can see from the last line above, Ollama does not support the Raspberry Pi’s GPU, so it will run on the CPU only.)

When the script is finished, we can call the program with the ollama command:

Running ollama without arguments lists its command-line options.

You could do everything from here on out from the command line. But I think we’re all more familiar with chatting with AI through a web browser. So let’s install a GUI module for that next.
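For instance, a minimal terminal-only session could look like this (gemma3:1b is just an example of a small model; swap in whichever one you prefer):

```shell
ollama pull gemma3:1b   # download a small model from the Ollama library
ollama run gemma3:1b    # start an interactive chat; type /bye to exit
ollama list             # show every model installed locally
```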

Installing WebUI

The Open WebUI chat interface.

The last step is to install Open WebUI.
WebUI creates a user-friendly chat interface for Ollama.

To make it easier to install WebUI on Raspberry Pi hardware, we will run it as a container using Docker. (Learn more about Docker here.)

Install Docker:
sudo apt install docker.io

Load the WebUI container image:
sudo docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama

Pulling and starting the Open WebUI container with Docker.

This Docker command will download WebUI for you and then run it in the background as a server.
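If you want to check that the container actually came up before opening a browser, the standard Docker commands work:

```shell
sudo docker ps                  # the open-webui container should be listed as "Up"
sudo docker logs open-webui     # inspect the startup logs if it isn't
```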

Congrats, you’ve finished the installation process! In the next section, I’ll show you how to use it.

Getting Started with Ollama

Now that you’ve installed Ollama and WebUI, how do you actually talk to the AI?
In this section, you’ll learn how to:

  • Access the web interface.
  • Select and load an AI model.
  • Chat with the AI.

Accessing the Web Interface

Let’s start by browsing to the web interface from another PC or mobile device on your network.
You can also do this from the Raspberry Pi itself.

Point a web browser to your Raspberry Pi’s IP address on port 3000.
For example:
http://192.168.1.100:3000

If you don’t know how to find the IP, check out our guide: 7 Easy Ways to Find your Raspberry Pi IP Address.
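If you already have a terminal open on the Pi (directly or over SSH), a quick way to get it is:

```shell
hostname -I   # prints the Pi's IP address(es) on your network
```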

It took a few tries before it came up for me, so give it some time:
The Open WebUI welcome screen.


Click Get Started at the bottom.
You’ll be prompted to make a local user account:
Creating the admin account in Open WebUI.

Enter your details, and click Create Admin Account to finish the setup.

Success, you’re in!
Now that everything’s running, let’s learn how to load an AI model.

Choosing an AI Model

The great thing about using Ollama to host AI models yourself is that you have a lot of options, such as different versions of deepseek, gemma, and phi. You can load as many models as you like, as long as you have the disk space and memory for them.

However, the most important number to pay attention to when selecting a model is its parameter size. For example, in a model tag like qwen3:7b, the 7b means it was built with 7 billion parameters.

In general, the parameter size tells you a few things:

  • The larger an AI model’s parameter size is, the more complex it is.
    In other words, it’s more likely to give you better answers.
  • Models with larger parameter sizes will take longer to give answers.
  • Models with larger parameter sizes will require more RAM to load.
ChatGPT models available in Ollama next to their RAM requirements.

As you can see from the image above, the ChatGPT models available for Ollama as of this writing are 20b and 120b. The 120b model requires 65GB RAM—too heavy for the Raspberry Pi. You could load the 20b model if you have a Raspberry Pi 5 with 16GB RAM.

If you have enough RAM, try it to see if you can handle ChatGPT on Pi hardware. But I have a feeling it might feel too slow for you. So for this guide, I will be recommending models that are a better fit for the Raspberry Pi.
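If you’re unsure whether a given model will fit in your RAM, a rough rule of thumb (my own estimate, not an official figure) is about half a byte per parameter for the common 4-bit quantized builds, plus some headroom for the runtime and context:

```shell
# Estimate RAM for a 4-billion-parameter model at 4-bit quantization,
# with ~20% headroom for context and runtime overhead.
awk 'BEGIN { params = 4e9; bytes_per_param = 0.5; printf "~%.1f GB\n", params * bytes_per_param * 1.2 / 1e9 }'
# prints ~2.4 GB, so a 4b model leaves room to spare on an 8GB Pi 5
```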

For the Raspberry Pi 5:
– I recommend starting with models containing 4b parameters or less.
– Try the following models first: gemma, qwen, or deepseek.

(gemma3 models with different parameter sizes, RAM requirements, and context windows.)

For the Raspberry Pi 4:
– Try tiny models, maybe 1b parameters or even smaller.
– For example, you could try qwen:0.5b or tinyllama:1.1b.

To see all of the AI models available and their parameter sizes, browse the Ollama library.

Loading the AI Model

Okay, now that you know which model you want, how do you actually get it?

You can load AI models into Ollama from the WebUI interface.
Here’s how:

  • From the Ollama library, copy the name of the model you want to load.
    For this example, let’s use gemma3:4b.
  • In your Ollama web interface, click Select a Model at the top.
    In the search box, paste the model name.
  • Click the Pull option, which means to download it.

Once it’s finished downloading, you can select which model to load by using the dropdown menu at the top.
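Since disk space on a Pi fills up fast, it’s also worth knowing how to clean up models you’re done experimenting with, straight from the terminal (gemma3:4b here is just the example model from above):

```shell
ollama list            # shows installed models and their sizes on disk
ollama rm gemma3:4b    # deletes a model you no longer need
```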

Chatting With an AI Model

From this point forward, it’s just like how you would interact with ChatGPT: type your question, hit Enter, and wait for a response.

The responses will be slower than what you’re used to from cloud-based AI, but in return, you get the benefits of no question limits, no user tracking, and no internet connection required.

I was able to load models with 7 billion parameters on my Raspberry Pi 5 8GB, and although the answers were more accurate, they were also too slow for my liking.

Maybe I’m too impatient. So I went down to a lower parameter count, and I found qwen:0.5b and gemma3:1b were much more responsive on my setup.

Hovering over the info icon at the bottom of an answer gives you an idea of how well a model performed on your particular question.

That being said, this is just the beginning, and we may see performance improvements in time.


🛠 This tutorial doesn't work anymore? Report the issue here, so that I can update it!

If this project doesn’t work as expected on your setup, don’t worry. You can get help directly from me and other Pi users inside the RaspberryTips Community. Try it for $1 and fix it together.

Going Further

I hope that was enough to get you going. If you wanted to go further, here are a few different things you can play around with:

Install AI models with specialized functions – Try specialized models made for coding (e.g., phi3), image recognition (e.g., LLaVA), or even roleplay (e.g., nemotron-mini).

Benchmark to compare AI models – Install a benchmark tool like ollama-benchmark to determine which models perform the best for your hardware and use case.

(A higher tokens/s benchmark generally means better performance.)
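If you’d rather not install anything extra, ollama can report rough throughput on its own: adding the --verbose flag to a run prints timing statistics, including an eval rate in tokens per second, after each answer:

```shell
ollama run gemma3:1b --verbose "Why is the sky blue?"
```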

Optimizing for better performance – After you’ve settled on the model that you think gives the best results for your purpose, research how to tweak its many settings and optimize it for Pi hardware. You can find some of these settings under the ‘Controls’ menu, while other advanced tweaks may need to be entered via terminal commands.

Custom applications – Once you’ve found the perfect model for what you want to do, you can have the AI work alongside a custom program that you’ve designed. For example, you can make a Python game that uses AI to respond to player input, or you can make the AI the brain for a robot that reacts based on certain situations.
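Under the hood, Ollama listens on port 11434 with a simple REST API, so any program that can make HTTP requests can use your local models. A quick sketch with curl (gemma3:1b is just a placeholder for whichever model you’ve pulled):

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:1b",
  "prompt": "Write a one-line greeting for a player entering a dungeon.",
  "stream": false
}'
```

The reply comes back as JSON, with the generated text in its "response" field.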

Now that you’ve got AI models running on your Raspberry Pi, all you need to do to make cool projects is to inject your creativity. After all, you’ll still need to provide some of the intelligence for artificial intelligence.

Whenever you’re ready, here are other ways I can help you:

Test Your Raspberry Pi Level (Free): Not sure why everything takes so long on your Raspberry Pi? Take this free 3-minute assessment and see what’s causing the problems.

The RaspberryTips Community: Need help or want to discuss your Raspberry Pi projects with others who actually get it? Join the RaspberryTips Community and get access to private forums, exclusive lessons, and direct help (try it for just $1).

Master your Raspberry Pi in 30 days: If you are looking for the best tips to become an expert on Raspberry Pi, this book is for you. Learn useful Linux skills and practice multiple projects with step-by-step guides.

Master Python on Raspberry Pi: Create, understand, and improve any Python script for your Raspberry Pi. Learn the essentials step-by-step without losing time understanding useless concepts.

You can also find all my recommendations for tools and hardware on this page.
