How To Build Your First Raspberry Pi Cluster?
Do you have two or more Raspberry Pi at home? Do you want to try putting them together to make a cluster? If so, you’re at the right place. When I bought my second Raspberry Pi, I immediately wanted to build a cluster.
On Raspberry Pi, a cluster can be created by installing the same operating system, apps, and libraries to all nodes. To run commands on all nodes, MPICH is the only app required. There is also a Python library to improve the possibilities: MPI4PY.
As this can be a complex topic for beginners, I’ll start with a little introduction on clusters in general. Then, I’ll explain what I have done and how you can do the same on your side.
If you’re new to Raspberry Pi or Linux, I’ve got something that can help you right away!
Download my free Linux commands cheat sheet – it’s a quick reference guide with all the essential commands you’ll need to get things done on your Raspberry Pi. Click here to get it for free!
Cluster presentation
What’s a cluster?
Basically, a cluster is a group of computers acting as a single entity.
The goal is to make them work together to improve overall performance.
All of the computers in a cluster work on the same task, reducing the time needed to finish it.
Don’t confuse computer clusters with load balancing.
In load balancing architecture, each computer is working on a different task to decrease the master node’s load.
In a cluster, we take advantage of the total power of the cluster to run a task in parallel.
Cluster examples
Computer clusters originated in the 1960s, around the same time as the first work on networking, and are still used today.
The first commercial clustering product in history was ARCnet (see the image on the left).
Its goal was to connect groups of Datapoint 2200 computers.
It’s damn old in computer history :).
At the time of writing, IBM Summit, at Oak Ridge National Laboratory (ORNL), is the biggest supercomputer in the world.
With over 2 million CPU cores and 3,000 TB of RAM, and counting, it will be tough to compete with.
Here is an illustration below if you want to know what it looks like:
Raspberry Pi application
Let’s come back to a more realistic scale and apply that definition to our Raspberry Pi.
As you know, the Raspberry Pi is not very powerful, but it’s cheap.
So, it’s the perfect device to build a cluster.
We can run tasks faster on four devices than on a single one, for a reasonable price.
In this tutorial, I’ll show you how to build your first Raspberry Pi cluster.
You can build a cluster with two nodes to start and add others later if needed.
Prepare your Raspberry Pi cluster
Make a plan
It’s always a good idea to think about what you are building.
I’m doing this exercise for you, with two Raspberry Pi:
- A Raspberry Pi 4B 4 GB: the master node that will control everything
- A Raspberry Pi 3B+: the second node, to increase overall performance
As the preparation phase can be pretty long (especially if you are using many nodes), I’ll prep the 4B only.
Then I’ll copy the SD card to another one, to get a Raspberry Pi 3B+ almost ready without having to do the whole preparation phase on the 3B+.
Finally, there are extra steps for both Raspberry Pi devices to connect them together and run the first script.
If you have more than one node to add, repeat the same process for each node.
Prerequisites
To follow this tutorial, you’ll need:
- 2 or more Raspberry Pi (any model, but I recommend the Raspberry Pi 4B)
- 2 or more SD Cards (check my recommended product page if you need some)
- An inexpensive 5-port gigabit switch to link all Pis together
- Power cables, or a power bank with 2 or more ports
- A Network cable for each Pi (wireless is possible, but not optimal)
- Optional: if you are serious about this project, this cluster case can be useful to stack Raspberry Pi and avoid a giant mess.
The case will optimize your cabling, keep everything tidy, and also cool the nodes correctly. I highly recommend that kind of case if you’re keeping your cluster running frequently.
And for the software, I’ll explain everything in the following parts.
Note: It’s okay to use different-sized SD cards, but you’ll need to install the master on the smallest SD card. Otherwise, you’ll have an issue when flashing a 64 GB image onto a 16 GB SD card :).
Prepare the Master
The first step in my scenario is to make the installation on one Raspberry Pi and then duplicate to the others.
Start with your most powerful Raspberry Pi.
Basic installation
Like most projects, we will start with a Raspberry Pi OS installation.
Use Raspberry Pi Imager to get Raspberry Pi OS Lite from the Raspberry Pi Foundation.
Raspberry Pi OS with desktop is okay, but we don’t need a GUI for this project.
Install Raspberry Pi OS and boot for the first time (if you don’t know how, follow my guide and come back later).
Then you’ll need to follow these additional steps:
- Change a few settings with the raspi-config tool:
- Load raspi-config:
sudo raspi-config
- Enable SSH in Interface Options > SSH.
- Change the host name in System Options > Hostname.
Choose something clear, like “Master.”
- Update your system. As always, start any project with an up-to-date system to avoid any issue.
- Update the repository sources:
sudo apt update
- Upgrade all packages:
sudo apt upgrade
- Reboot to apply changes:
sudo reboot
The basic installation is now complete; we can move on to the specific software for this project.
Are you a bit lost in the Linux command line? Check this article first for the most important commands to remember and a free downloadable cheat sheet so you can have the commands at your fingertips.
MPICH Installation
What’s MPICH?
MPICH is the main tool we’ll need to run a cluster.
MPICH is a free implementation of the MPI standard.
MPI stands for Message Passing Interface and its goal is to manage parallel computing architectures.
In short, this is what will allow us to run a script on several Raspberry Pis simultaneously.
MPICH Installation Steps
We are now ready to start the MPICH installation process. If you want the latest version, you can download MPICH from the official website and compile it from source, but it’s also available in the Raspberry Pi OS repository.
So, this is the easiest way to install it:
sudo apt install mpich
Once done, test to ensure everything is working well with this command:
mpiexec -n 1 date
If the master returns the current date, the MPI installation is complete.
Create a Basic Python Script
Ok, now we’ll create a basic Python script to test it with MPI.
- Go to your home folder and create a script:
cd ~
nano test.py
If you are not used to nano, you can read my guide here for more details.
- Paste this line inside:
print("Hello")
- Save & exit.
- Make sure your script is working directly with Python:
python test.py
This script should display “Hello” in your terminal.
- Then test running it as 4 processes with MPI:
mpiexec -n 4 python test.py
As you can see, this should now display “Hello” four times, so we can also run a Python script four times by using all the processor cores available.
This is nice, but we’re not using the cluster at the moment: it’s just a way to run a script as several processes on one machine.
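To see that mpiexec really starts separate processes, a slightly richer test script can print where each copy runs. This is only a minimal sketch using the Python standard library; the file name whoami.py and the describe_process helper are illustrative names of my own, not something provided by MPICH.

```python
# whoami.py - hypothetical test script, not part of MPICH
import os
import socket


def describe_process():
    """Return a short label identifying the current process."""
    # Host name of the machine plus the operating-system process ID.
    return f"Hello from {socket.gethostname()} (PID {os.getpid()})"


if __name__ == "__main__":
    print(describe_process())
```

Run it the same way as test.py (mpiexec -n 4 python whoami.py). Each line should show a different PID, which confirms that four independent processes were started.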
MPI4PY Installation
What’s MPI4PY?
Out of the box, MPI is meant to be used from compiled languages such as C and Fortran. But since Python is the most common language on Raspberry Pi, we’ll want to add Python capability to our cluster.
To go further with our cluster, we need a library that can be used in a script. The goal of this library is to have communication between all the nodes to run our programs efficiently.
To do this, we’ll be installing the Python library: MPI4PY.
MPI4PY Prerequisites
The MPI4PY installation process is easy, as it’s available with pip (the Python package manager).
But you’ll need to install some Raspberry Pi OS packages before anything else:
sudo apt install python3-pip python3-dev libopenmpi-dev
That’s it, move on to the installation process.
MPI4PY Installation Steps
We can now install the MPI4PY library. The command below uses the version packaged in the Raspberry Pi OS repository:
sudo apt install python3-mpi4py
It can take more or less time depending on your Raspberry Pi model. Be patient.
If this is working correctly, your master installation is ready. MPI can now run Python scripts, and we can start node preparation.
Duplicate the Master
Now that your Pi can run MPI with Python, the next step is to duplicate the master’s SD card onto other cards—one copy for each node.
To do this, we’ll create an image from the SD card and flash it on the other cards. If you are only creating two nodes, it might be faster to repeat the same procedure as on the master. In this case, you can skip this section.
Create the Master Image
On Windows, you’ll need a tool like Win32 Disk Imager.
Download and install it on your computer:
- Start the program.
- In the “Image file” field, choose a temporary directory and a filename such as “cluster_master.img”.
- Then choose the Device letter corresponding to the SD card.
- Finally, press the “Read” button to start the image creation.
This process took about 15 minutes on my computer.
- Once done, eject the master SD card and keep it safe.
On Linux, it should be something like:
sudo dd if=/dev/mmcblk0 of=cluster_master.img bs=4M status=progress
You need to make sure /dev/mmcblk0 is your SD card (the lsblk command lists the available devices). You can easily find help for this command if needed (or use man dd to see all options).
Copy Onto SD Card for Each Node
Once the image is ready, you need to put it on the SD card for each node of your cluster:
- Insert the new SD card into your computer.
- In Win32 Disk Imager, select the image filename and the device letter.
- Click on “Write” to create the same SD card.
If you prefer, you can use Etcher to do this.
I typically use Etcher, but we are already in Win32 Disk Imager, so it’s the same.
As a reminder, each target SD card must be at least as large as the master’s card.
Once again, Linux and macOS users can use the dd command instead if they don’t want to install Etcher.
At the end of this step, you should have one SD card for each node you want to use. All the SD cards should contain the same image from the master we created earlier.
Nodes Configuration
Start All Raspberry Pis
- Insert an SD card in each Raspberry Pi you want to use.
- Start them all.
If you want to use Wi-Fi for one or more nodes, there is an extra step.
For example, in my case I have a Raspberry Pi Zero, and it was easier for me to connect it to my Wi-Fi network.
- Plug a screen and keyboard into the Raspberry Pi you want to use Wi-Fi on.
- Use raspi-config to configure the Wi-Fi:
sudo raspi-config
- Go into System Options > Wireless LAN.
- Follow the wizard to select your network (country, SSID and passphrase).
Find All IP addresses
Once all the Raspberry Pis are started and connected to the network, we need to collect their IP addresses for later use:
- Go back to the master node (directly or with SSH).
- Install nmap:
sudo apt install nmap
nmap is a free tool for network discovery (check the website here).
We’ll use it to find all IP addresses.
- Use this command to find all devices on your network with a host name containing “master”.
For the moment, all the Raspberry Pis have the same host name:
nmap -sP 192.168.1.* | grep master
Change the network subnet if you’re using another one.
- You should get this kind of output:
- I now know my second node IP: 192.168.1.18
You should now have all of your nodes’ IP addresses.
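If you have many nodes, picking the addresses out of the nmap output by eye gets tedious. Here is a rough sketch that extracts them from the text of a ping scan. It assumes each host appears on a line such as "Nmap scan report for master.lan (192.168.1.18)" (or with only the IP when no host name is resolved); the sample text, the extract_hosts function and the master.lan name are made up for the illustration, so check them against what your nmap version prints.

```python
import re

# Assumed line format from an nmap ping scan:
#   "Nmap scan report for master.lan (192.168.1.18)"
# or, when no host name is resolved:
#   "Nmap scan report for 192.168.1.18"
REPORT_LINE = re.compile(
    r"^Nmap scan report for (?:(?P<name>\S+) \()?"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)?$"
)


def extract_hosts(nmap_output):
    """Return (host name or None, IP) pairs found in the scan output."""
    hosts = []
    for line in nmap_output.splitlines():
        match = REPORT_LINE.match(line.strip())
        if match:
            hosts.append((match.group("name"), match.group("ip")))
    return hosts


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample = (
        "Nmap scan report for master.lan (192.168.1.18)\n"
        "Host is up (0.0010s latency).\n"
        "Nmap scan report for 192.168.1.1\n"
        "Host is up (0.0020s latency).\n"
    )
    for name, ip in extract_hosts(sample):
        print(ip, name or "(no host name)")
```

On a real network, you could save the scan to a file without the grep filter and feed the file content to extract_hosts.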
If you don’t know the master’s IP, use this command:
ip addr
You’ll get something like this:
The IP address is on the second line after the “inet” keyword (192.168.1.69 in this screenshot).
The last step is to note these IP addresses in a text file on your Master node:
- Create a new file in your home folder:
cd ~
nano nodes_ips
- In this file, add a node IP on each line (and only the IP).
For example:
192.168.1.15
192.168.1.16
192.168.1.17
192.168.1.18
That’s all for this part.
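A typo in this file is an easy way to lose an evening, so a quick sanity check can help. The sketch below validates the content of a nodes file with Python’s standard ipaddress module; the parse_nodes_file name is my own, and the check only confirms that each line looks like an IP address, not that the node is reachable.

```python
import ipaddress


def parse_nodes_file(text):
    """Return the IP addresses found in text, one per line, in file order.

    Blank lines are skipped; any other line that is not a valid IP
    address raises ValueError with the offending line number.
    """
    nodes = []
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue
        try:
            ipaddress.ip_address(entry)
        except ValueError:
            raise ValueError(f"line {number}: {entry!r} is not an IP address")
        nodes.append(entry)
    return nodes


if __name__ == "__main__":
    # Same example addresses as in the tutorial.
    sample = "192.168.1.15\n192.168.1.16\n192.168.1.17\n192.168.1.18\n"
    print(parse_nodes_file(sample))
```

To check the real file, read it first, for example with open("nodes_ips").read(), and pass the result to parse_nodes_file.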
Change the Hostname for Each Node
We’ll now assign different hostnames for each node:
- From the master node, connect to the first one with SSH:
ssh username@192.168.1.18
Answer “yes” to the question, and log in with your username & password.
- Load raspi-config:
sudo raspi-config
- Go into System Options > Hostname.
- Set a new host name for this node, for example “node1”.
- Exit raspi-config and exit this node with:
exit
Repeat these steps for each node you want to add to the cluster.
Exchange SSH Keys
The last step is to let the master connect to each node via SSH without a password.
To allow this, you need to create an SSH key on the master, and then transfer the public key to all nodes.
- On the master, create the SSH key with:
ssh-keygen -t rsa
Hit Enter to accept the default values (default path and no passphrase).
- This tool generates two keys in the .ssh folder of your home directory:
- id_rsa: your private key, keep it here
- id_rsa.pub: the public key, you’ll need to send it to peers you want to access without a password
- Transfer the public key to all nodes:
scp /home/tom/.ssh/id_rsa.pub tom@192.168.1.18:/home/tom/master.pub
Replace ‘tom’ with your username, here and wherever “username” appears in the steps below.
Run this command with the IP address of each node you want to use.
- Then, go to each node and add the key to the authorized_keys file.
This file lists the public keys allowed to access the system via SSH without a password:
ssh username@192.168.1.18
cat master.pub >> .ssh/authorized_keys
exit
Do this for each node.
If the .ssh folder doesn’t exist on the node, create it before running the cat command:
mkdir .ssh
- Now, you should be able to connect to each node without a password.
You can try it with:
ssh username@192.168.1.18
That’s it, your cluster is ready. We’ll now test it.
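With more than a couple of nodes, testing each SSH login by hand gets repetitive. The following sketch builds one check command per node using the OpenSSH option BatchMode=yes, which makes ssh fail instead of asking for a password. The function names are my own, and by default the script only prints the commands (a dry run), so treat it as a starting point rather than a finished tool.

```python
import subprocess


def ssh_check_command(user, host):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never ask for a password
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable nodes
        f"{user}@{host}",
        "hostname",                 # harmless command to run remotely
    ]


def check_nodes(user, hosts, dry_run=True):
    """Print each check command, or run it when dry_run is False.

    Returns a dict mapping each host to True/False when commands are run,
    and an empty dict in dry-run mode.
    """
    results = {}
    for host in hosts:
        command = ssh_check_command(user, host)
        if dry_run:
            print(" ".join(command))
            continue
        completed = subprocess.run(command, capture_output=True, text=True)
        results[host] = completed.returncode == 0
    return results


if __name__ == "__main__":
    # 'tom' and the IP address are the example values used in this tutorial.
    check_nodes("tom", ["192.168.1.18"])
```

Calling check_nodes with dry_run=False on the master runs the checks for real; any node reported as False still needs its key installed.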
Cluster Usage
The cluster is now available, and we’ll use MPI to run commands simultaneously on each node.
As we already saw, MPI allows you to run basic commands and scripts through the cluster.
Basic Usage
The first thing we can try is to run the same command on each node.
Preferably something that doesn’t return the same thing :).
For example:
mpiexec -hostfile nodes_ips -n 8 hostname
nodes_ips is the file we created before with all IP addresses inside.
And “hostname” is the command we want to run on each node (more about the hostname command here).
8 is the number of processes to start; change it to the number of cores available in your cluster (Raspberry Pi 4B and 3B+ have 4 cores each, so I test with 8).
As a result, you’ll get one line per process, each showing the host name of the node it ran on.
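The value passed to -n is simply the sum of the cores of all nodes. As a small illustration, this sketch builds the command line from a list of core counts; mpiexec_command is a hypothetical helper of mine, and it reuses only the options already shown in this tutorial.

```python
def mpiexec_command(hostfile, cores_per_node, program):
    """Build an mpiexec call that starts one process per available core."""
    total_processes = sum(cores_per_node)
    return ["mpiexec", "-hostfile", hostfile, "-n", str(total_processes), *program]


if __name__ == "__main__":
    # Two nodes with 4 cores each, as in this tutorial (Pi 4B + Pi 3B+).
    print(" ".join(mpiexec_command("nodes_ips", [4, 4], ["hostname"])))
```

With a third 4-core node, the list becomes [4, 4, 4] and the command asks for 12 processes.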
Python script
Test Script
If you followed this tutorial entirely, you should already have a test.py script in the home folder.
You can run it on each node with the same command:
mpiexec -hostfile nodes_ips -n 8 python test.py
This will display “Hello” eight times, once for each process started across the nodes.
We are still not using MPI4PY, but we’ll get to this now.
A New Script
Remember that any script created after cloning the SD cards has to be copied to all nodes.
MPI launches the script on each node, but it doesn’t copy the code automatically.
To do this, follow this short procedure:
- Create the script on the master node.
- Make sure it’s working as expected.
- Then transfer this script to all nodes with scp:
scp /home/tom/test.py tom@192.168.1.18:/home/tom/
Replace ‘tom’ with your username, and repeat with the IP of each node.
It’s important to have the same script on each node, at the same path.
- Then you can run your script with MPI as explained before.
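Repeating the scp command for every node is a good candidate for a small script. Here is one possible sketch: it builds one scp command per node and, by default, only prints them so you can review before copying anything. The scp_commands and deploy names are my own, and the script assumes the SSH keys from the previous section are already in place.

```python
import subprocess


def scp_commands(user, hosts, script_path):
    """Build one scp command per node, keeping the same path on each node."""
    return [
        ["scp", script_path, f"{user}@{host}:{script_path}"]
        for host in hosts
    ]


def deploy(user, hosts, script_path, dry_run=True):
    """Print the scp commands, or run them when dry_run is False."""
    for command in scp_commands(user, hosts, script_path):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first node where the copy fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Example values from this tutorial; replace them with your own.
    deploy("tom", ["192.168.1.18"], "/home/tom/test.py")
```

Using the full script path as the destination keeps the file at the same location on every node, which is what MPI expects when it launches the script.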
Go Further with Python
As mentioned earlier, we didn’t add MPI4PY just to run basic python scripts four times instead of one. MPI4PY is a Python library you can include in your scripts to use specific functions in your cluster.
Here is a quick example:
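#!/usr/bin/env python
from mpi4py import MPI

# Communicator containing every process started by MPI
comm = MPI.COMM_WORLD
# Rank: the number identifying this process (0 is the first one)
rank = comm.rank

if rank == 0:
    # Only rank 0 defines the data
    data = {'a': 1, 'b': 2, 'c': 3}
else:
    data = None

# Broadcast the data from rank 0 to all the other ranks
data = comm.bcast(data, root=0)
print('rank', rank, data)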
The goal of this script is to send data from one process to all the others.
In this script, data is defined only in the first process on the master (rank 0).
And then we sync this data with all running instances with the broadcast function (comm.bcast).
Here is the command to run to try this (it uses the Open MPI launcher; replace <threads> with the number of processes to start):
mpirun.openmpi -np <threads> -machinefile nodes_ips python test.py
When you run this script, all nodes and ranks display the same message:
It’s just an example to show you that you can add more functions in your Python script to take advantage of your cluster. I’m not an expert on this, so you can find more information here.
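A common next step is to give each rank its own share of the work instead of the same data. The sketch below shows only the splitting logic, in plain Python, so it can be tried without MPI; in a real script, rank and size would come from mpi4py (for example comm.rank and comm.size). The chunk_for_rank function is a hypothetical helper of mine, not part of the library.

```python
def chunk_for_rank(n_items, rank, size):
    """Return the (start, stop) range of items handled by this rank.

    Items are split as evenly as possible: the first n_items % size
    ranks each get one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Simulate 8 ranks sharing 100 items, without needing MPI installed.
    for rank in range(8):
        print(rank, chunk_for_rank(100, rank, 8))
```

Each process would then loop over range(start, stop) only, and the partial results could be collected on rank 0 with a function such as comm.gather.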
Related Questions
Can I add more nodes to my cluster now?
You can add more nodes to your existing cluster at any time (that’s what they do with supercomputers). You just need to create a new SD card, follow the node configuration steps for the new node and add the new IP address in the nodes_ips file.
The IP addresses are changing every day, what can I do?
Yes, it’s a problem. I skipped this step for my test, but you need to address it if you want to keep your cluster. Depending on your network, you can either set a reservation in your DHCP server (so each Pi always gets the same IP on boot), or manually set a static IP address in your network configuration (I explain how to do this at the end of this article).
What kind of usage do I really need a cluster for?
In this tutorial, it was mainly the technology and the installation process that interested me, not the possibilities that are now available with this cluster. That is another topic, and I can’t fit everything in a single article. If you want to go further, you can find more projects about clusters on Hackaday.
Conclusion
That’s it, you now know how to build your Raspberry Pi cluster, from two nodes to infinity :).
I really liked writing this tutorial for you. It’s interesting to get an overview of how supercomputers work. And the technology seems to be stable, as I had no issues while creating my cluster (which is rare in computing ^^). I hope you’ll like it too.
By the way, you can check my related article here about what can be the real usage for a Raspberry Pi cluster.
If you have any questions or experiences to share, leave a comment in the community.
I would like to know what you do after these first steps in the supercomputer world 🙂
Thanks for the tutorial! I went ahead and setup an NFS share on the master and had the nodes mount it so I could drop scripts in there. It’s a fun exercise.
Thank you for the concise information and instructions. My Raspberry Pi 3 master and 3 Pi Zeros cluster is working like a charm.
Is there a limit to the number of nodes that you can add to a master controller Pi?
If there is, how do big clusters work? Can you make a cluster of clusters?
Thanx again.
Hi Hal,
It seems possible 🙂
https://www.youtube.com/watch?v=i_r3z1jYHAc
Thank you for your tutorial. I very appreciate your work.
I make a cluster of 5 raspberry pi 3 and he work perfectly.
Thanks for your feedback Hugues!
thanks man it really helped me since this is my first computer i own and you helped me built it
Thanks suvan for your comment!
can i load balance a java application, database or SFTP server with this cluster?
Hi. I have set up a cluster of three.
master is a debian 10 desktop (hostname win-home (don’t ask!) 192.168.0.13
node i is a pi zero w 192.168.0.14
node 2 is a Pi 3 model B 192.168.0.21 all static.
Testing the cluster listing hostname or the python test.py give same result only two show.
By switching the ip addresses arounf on the nodesips file I prove that the middle node is omitted.
It fails completely to run (saying it can’t find the master node ip) unless the first ip is the master node. but if I switch the other two around it is always the middle one of the three that is missing and the commands always ‘hangs’ without finishing.
If I use a multiple number of 3 ie 6 or 9 I get either 2 for each (total 4) or three for each (total 6).
Testing each node separately works. Any ideas?
p.s. At first I thought I might be having problems with multi-node processors but when I increased the count one by one I realised that it definately thought there were only three nodes.
Cannot seem to find .ssh folder on any node besides the master node. As well, I do not seem to have an authorized_keys file on the master node, but a file called ‘known_hosts’. Any idea what to do?
Hi
interesting tutorial, this is about using some python scripts in your example to improve global performance
but can we use this pi cluster to launch any single software (set on each node with the same raspbian image, like you did in your tutorial) ?
Hi patrick, i try to build this cool cluster but het an error command not found at:
sudo /opt/mpi-dl/mpich-3.3/configure –prefix=/opt/mpi
I followed al the mentioned steps. Did i miss something?
Greetings Melvin
When you ran this command “tar zxvf mpich-3.3.tar.gz” you probably got errors. Need to do sudo tar zxvf mpich-3.3.tar.gz. I had the same problem btw.
Hi John,
Thanks
It’s fixed
Do I have to flash SD cards or can i take the time to go through install steps for each node? I have other setup on my 2 worker nodes I would like to keep.
My configuration will be
master – 4b
node 1 – 4b
node 2 – 3b+
I just repeated install on my worker node, so, my next question is everything worked except I am testing the commands which are not working
I had to create the .ssh folder on my lzonepi4b2 which is the worker node to my master lzonepi4b1.
I issue command :
$/opt/mpi/bin/mpiexec -f nodesips -n 2 python test.py
and get:
[mpiexec@lzonepi4b1] HYDU_parse_hostfile (/opt/mpi-dl/mpich-3.3/src/pm/hydra/utils/args/args.c:319): unable to open host file: nodesips
[mpiexec@lzonepi4b1] mfile_fn (/opt/mpi-dl/mpich-3.3/src/pm/hydra/ui/mpich/utils.c:336): error parsing hostfile
[mpiexec@lzonepi4b1] match_arg (/opt/mpi-dl/mpich-3.3/src/pm/hydra/utils/args/args.c:156):
match handler returned error
[mpiexec@lzonepi4b1] HYDU_parse_array (/opt/mpi-dl/mpich-3.3/src/pm/hydra/utils/args/args.c:178): argument matching returned error
[mpiexec@lzonepi4b1] parse_args (/opt/mpi-dl/mpich-3.3/src/pm/hydra/ui/mpich/utils.c:1642): error parsing input array
[mpiexec@lzonepi4b1] HYD_uii_mpx_get_parameters (/opt/mpi-dl/mpich-3.3/src/pm/hydra/ui/mpich/utils.c:1694): unable to parse user arguments
[mpiexec@lzonepi4b1] main (/opt/mpi-dl/mpich-3.3/src/pm/hydra/ui/mpich/mpiexec.c:148): error parsing parameters
What might have i done wrong somewhere?
Nevermind, I mis typed nodesip file instead of using nodesips, sorry for my confusion.
Working now.
I have my Pi’s connected to my home network, but thought we wanted to use the ethernet switch, and power supplies to connect the pis together I didn’t see how you do that in this tutorial, is that at a different location?
I am also not seeing any ip addresses when doing the NMAP command above.
Hi i need help with mu cluster
i followed your tutorial and when i came to the testing part(/opt/mpi/bin/mpiexec -f nodesips -n 2 python test.py)
i got this message
[proxy:0:0@Node1] HYDU_sock_connect (/opt/mpi-dl/mpich-3.3/src/pm/hydra/utils/sock/sock.c:145): unable to get host address for MasterNode (1)
[proxy:0:0@Node1] main (/opt/mpi-dl/mpich-3.3/src/pm/hydra/pm/pmiserv/pmip.c:183): unable to connect to server MasterNode at port 46403 (check for firewalls!)
i only have 1 node
Master-Pi3b+
Node1-pi0w
thank you and sorry for my bad English
Help my pi cluster have 2 master
Hi i need help cluster
I am creating a cluster as described, but it stop at some stage. it is “make: *** No targets specified and no makefile found. Stop.” But after setting it in the previous step, it says there is no makefile. What should I do?
sudo /opt/mpi-dl/mpich-3.3/configure –prefix=/opt/mpi
not make makefile
#1) Is there a way to duplicate the sd card using Raspbian? If not, can I duplicate the sd card using etcher?
#2) I’m using two rpi 4 b’s 4gb for my cluster, can I connect their ethernet cables directly from the router instead of a gigbit switch?
Thank you very much🙏
I keep getting the following error:
pi@raspberrypi:/opt/mpi-build $ sudo /opt/mpi-dl/mpich-3.3/configure – prefix=/opt/mpi
configure: WARNING: you should use –build, –host, –target
configure: WARNING: invalid host type: –
Configuring MPICH version 3.3 with ‘–’ ‘prefix=/opt/mpi’ ‘build_alias=–’ ‘host_alias=–’ ‘target_alias=–’
Running on system: Linux raspberrypi 5.10.11-v7l+ #1399 SMP Thu Jan 28 12:09:48 GMT 2021 armv7l GNU/Linux
checking build system type… Invalid configuration `–’: machine `–’ not recognized
configure: error: /bin/bash /opt/mpi-dl/mpich-3.3/confdb/config.sub – failed
Any ideas?
try -- instead of –
(or the opposite if it’s the comment formatting)