Running an LLM locally is appealing because we can build applications without sending data to third-party services, which sidesteps data-privacy concerns.
This article explains how to deploy GPT4All on a Raspberry Pi and expose a REST API that other applications can call.
Installation
Preparation
- Raspberry Pi 4 (8 GB RAM model)
- Raspberry Pi OS
Reference
- https://qengineering.eu/install-vulkan-on-raspberry-pi.html
- https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/python/README.md
- https://github.com/nomic-ai/gpt4all/issues/1530
Step-by-step instructions
Install Vulkan SDK
Install the dependencies we need:

```shell
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install libxcb-randr0-dev libxrandr-dev -y
sudo apt-get install libxcb-xinerama0-dev libxinerama-dev libxcursor-dev -y
sudo apt-get install libxcb-cursor-dev libxkbcommon-dev xutils-dev -y
sudo apt-get install xutils-dev libpthread-stubs0-dev libpciaccess-dev -y
sudo apt-get install libffi-dev x11proto-xext-dev libxcb1-dev libxcb-*dev -y
sudo apt-get install libssl-dev libgnutls28-dev x11proto-dri2-dev -y
sudo apt-get install x11proto-dri3-dev libx11-dev libxcb-glx0-dev -y
sudo apt-get install libx11-xcb-dev libxext-dev libxdamage-dev libxfixes-dev -y
sudo apt-get install libva-dev x11proto-randr-dev x11proto-present-dev -y
sudo apt-get install libclc-dev libelf-dev mesa-utils -y
sudo apt-get install libvulkan-dev libvulkan1 libassimp-dev -y
sudo apt-get install libdrm-dev libxshmfence-dev libxxf86vm-dev libunwind-dev -y
sudo apt-get install libwayland-dev wayland-protocols -y
sudo apt-get install libwayland-egl-backend-dev -y
sudo apt-get install valgrind libzstd-dev vulkan-tools -y
sudo apt-get install git build-essential bison flex ninja-build -y
sudo apt-get install python3-mako -y
```
Build the Vulkan driver

```shell
# install meson
sudo pip3 install meson
# install mako
sudo pip3 install mako
# install v3dv
cd ~
git clone -b 20.3 https://gitlab.freedesktop.org/mesa/mesa.git mesa_vulkan
# build v3dv (± 30 min)
cd mesa_vulkan
CFLAGS="-mcpu=cortex-a72" \
CXXFLAGS="-mcpu=cortex-a72" \
meson --prefix /usr \
  -D platforms=x11 \
  -D vulkan-drivers=broadcom \
  -D dri-drivers= \
  -D gallium-drivers=kmsro,v3d,vc4 \
  -D buildtype=release build
ninja -C build -j4
sudo ninja -C build install
# check your driver
glxinfo -B
```
Build Python bindings
Set up llmodel

```shell
git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git
cd gpt4all/gpt4all-backend/
vim CMakeLists.txt
```
Find the line `set(LLAMA_KOMPUTE YES)` and comment it out (change it to `# set(LLAMA_KOMPUTE YES)`).
Save the file, and continue with the following commands:
```shell
mkdir build
cd build
cmake .. -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON
cmake --build . --parallel
```

Make sure `libllmodel.*` exists in `gpt4all-backend/build`.
Set up the Python package

```shell
cd ../../gpt4all-bindings/python
pip3 install -e .
```
Test API Using Python
Now we can test GPT4All on the Pi using the following Python script. Note that `ask` must return the generated text, otherwise `answer` would be `None`:

```python
import time
from functools import wraps

from gpt4all import GPT4All


def timer(func):
    """A decorator that prints the execution time of the function it decorates."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"{func.__name__} ran in: {end_time - start_time:.2f} sec")
        return result
    return wrapper


@timer
def ask(prompt):
    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")
    # Return the generated text so the caller can print it.
    return model.generate(prompt, max_tokens=512)


if __name__ == "__main__":
    answer = ask("The capital of France is ")
    print(answer)
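To expose the model as a REST API, as the introduction promises, one option is a minimal sketch built on Python's standard-library `http.server`. The route `/generate`, port `8000`, and the JSON shape (`{"prompt": ...}` in, `{"answer": ...}` out) are my own choices, not from the GPT4All project; the model call is isolated in `generate_fn` so the server logic stays independent of the model:

```python
# Minimal REST wrapper sketch using only the standard library.
# The endpoint name, port, and JSON fields are illustrative choices.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(generate_fn):
    """Build a request handler that answers POST /generate with JSON."""
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/generate":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length) or b"{}")
            answer = generate_fn(body.get("prompt", ""))
            payload = json.dumps({"answer": answer}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, *args):
            # Keep the console quiet on the Pi.
            pass

    return Handler


if __name__ == "__main__":
    # Load the model only when run directly (same model as the test script).
    from gpt4all import GPT4All

    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")
    handler = make_handler(lambda p: model.generate(p, max_tokens=512))
    HTTPServer(("0.0.0.0", 8000), handler).serve_forever()
```

Other applications could then call it with something like `curl -X POST http://<pi-address>:8000/generate -d '{"prompt": "The capital of France is "}'`. For anything beyond a sketch, a framework such as Flask or FastAPI would be a more robust choice.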
Results
It takes the Raspberry Pi more than 2 minutes to generate a response with this 3B model. Still, that is quite respectable for such a tiny machine, isn't it?
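As a rough back-of-the-envelope estimate (both figures are assumptions: the script's `max_tokens=512` cap as the token count, and about 120 seconds as the run time), the throughput works out to only a few tokens per second:

```python
# Rough throughput estimate; 512 tokens and 120 s are assumed
# from the script's max_tokens cap and the ~2-minute runtime.
max_tokens = 512
seconds = 120
tokens_per_second = max_tokens / seconds
print(f"~{tokens_per_second:.1f} tokens/s")  # ~4.3 tokens/s
```

The real rate is likely lower, since a response rarely uses the full token budget.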