Everything you need to know to start fine-tuning LLMs in the privacy of your home

Got a modern Nvidia or AMD graphics card? Custom Llamas are only a few commands and a little data prep away

Setting up Axolotl for AMD Radeon GPUs

Running Axolotl on AMD's Radeon GPUs isn't officially supported yet, but because the app largely runs on top of PyTorch, it's possible to get it working with a few limitations.

The biggest challenge right now is that AMD's Radeon graphics cards don't natively support Flash Attention, which helps keep memory consumption manageable when dealing with larger context lengths.

The good news is that the latest releases of PyTorch can take advantage of experimental ahead-of-time Triton kernel libraries, which allow us to achieve functionality similar to Flash Attention 2 and, crucially, avoid out-of-memory errors.
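On Radeon cards, recent ROCm builds of PyTorch gate these kernels behind an opt-in environment variable. If attention falls back to PyTorch's slow math path when you start training, try exporting the following before launching Axolotl (this is the switch recent ROCm builds check for, but as an experimental feature it may change between releases):

export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1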

Installing Dependencies

We'll start by installing a few dependencies necessary to get everything working.

sudo apt update && sudo apt upgrade
sudo apt install python3-pip git cmake libstdc++-12-dev wget

Update drivers and install ROCm

Next, you'll want to make sure you have AMD's latest DKMS drivers and ROCm stack; in our testing we used the ROCm 6.2 release. For Radeon graphics cards, this can be achieved by executing the following commands:

wget https://repo.radeon.com/amdgpu-install/6.2.2/ubuntu/noble/amdgpu-install_6.2.60202-1_all.deb
sudo apt install ./amdgpu-install_6.2.60202-1_all.deb
sudo amdgpu-install -y --usecase=graphics,rocm
sudo usermod -a -G render,video $LOGNAME
sudo reboot

If you're reading this a few months after publication, you may want to check out AMD's documentation on installing ROCm for up-to-date instructions.
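Once you're back in, it's worth confirming the driver and runtime can actually see your card before going any further. The rocm-smi utility ships with the ROCm stack and should report your GPU's temperature, clock speeds, and VRAM usage:

rocm-smi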

Install Miniconda

Just like with Nvidia GPUs, we'll need to install Miniconda to add support for the older Python 3.10 release in Ubuntu 24.04. You can follow Anaconda's instructions for deploying Miniconda on Linux, or use the quick install sketched below.
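If you'd rather not click through the docs, a typical non-interactive install looks something like this (the installer URL and install prefix here are the usual defaults, but double-check them against Anaconda's current documentation):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3
$HOME/miniconda3/bin/conda init bash

After reopening your shell, create and activate the Axolotl environment by executing the following commands: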

conda create -n axolotl python=3.10
conda activate axolotl

Once activated, you'll also want to install the libstdcxx-ng package; otherwise you'll run into errors when you start training.

conda install -c conda-forge libstdcxx-ng
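A quick sanity check that the environment is active and using the right interpreter:

python --version
which python

The first command should report Python 3.10.x, and the second should point into your Miniconda envs directory.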

Installing Axolotl

With our Python 3.10 environment up and running we can now clone Axolotl from GitHub and begin the installation.

git clone https://github.com/OpenAccess-AI-Collective/axolotl
cd axolotl
pip install packaging ninja
pip install -e .

After a few minutes the bulk of the packages will be installed. However, to make Axolotl work with our Radeon card, we need to remove and replace a few incompatible packages: specifically torch, xformers, and bitsandbytes.

pip uninstall -y torch xformers bitsandbytes

With that done, we can now deploy PyTorch as we normally would by executing:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2

Note: You will see dependency errors; this is totally normal and to be expected. Many of the packages used by Axolotl still don't offer native support for AMD GPUs, so pip is going to complain about missing or mismatched dependencies.
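Before moving on, it's worth checking that the ROCm build of PyTorch can actually see your GPU. ROCm builds expose the device through PyTorch's usual CUDA API, so the following should print True along with the card's name:

python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"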

Installing a working version of bitsandbytes, which is necessary to make QLoRA work, is a little trickier, but thankfully AMD has a fork of the project that seems to do the job just fine. To get started, execute the following commands:

git clone --recurse https://github.com/ROCm/bitsandbytes
cd bitsandbytes
git checkout rocm_enabled_multi_backend
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=hip -DBNB_ROCM_ARCH="gfx1100" -S .
make
pip install .

Note: "gfx1100" is the architecture used for Radeon 7900-series models like the GRE, XT, XTX and W7900. If you're attempting this on another card, you may need to modify this accordingly.

With all of that out of the way, we can move on to fine-tuning with Axolotl.
