Everything you need to know to start fine-tuning LLMs in the privacy of your home
Got a modern Nvidia or AMD graphics card? Custom Llamas are only a few commands and a little data prep away
Setting up Axolotl for AMD Radeon GPUs
Axolotl isn't officially supported on AMD's Radeon GPUs just yet, but because the app largely runs on top of PyTorch, it's possible to get it working with a few limitations.
The biggest challenge right now is that AMD's Radeon graphics cards don't natively support Flash Attention, which helps to keep memory consumption manageable when dealing with larger context lengths.
The good news is that the latest releases of PyTorch include experimental ahead-of-time Triton kernel libraries, which allow us to achieve functionality similar to Flash Attention 2 and, critically, avoid out-of-memory errors.
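In practice that means opting in to the experimental kernels before launching a training run. As a sketch based on recent ROCm builds of PyTorch — the exact variable may differ between releases, so treat this as an assumption to verify against your PyTorch version — the toggle is an environment variable exported in the shell you'll train from:

export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1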
Installing Dependencies
We'll start by installing a few dependencies necessary to get everything working.
sudo apt update && sudo apt upgrade
sudo apt install python3-pip git cmake libstdc++-12-dev wget
Update drivers and install ROCm
Next, you'll want to make sure you have AMD's latest DKMS drivers and ROCm stack. In our testing we used AMD's ROCm 6.2 release. For Radeon graphics this can be achieved by executing the following commands.
wget https://repo.radeon.com/amdgpu-install/6.2.2/ubuntu/noble/amdgpu-install_6.2.60202-1_all.deb
sudo apt install ./amdgpu-install_6.2.60202-1_all.deb
amdgpu-install -y --usecase=graphics,rocm
sudo usermod -a -G render,video $LOGNAME
sudo reboot
If you're reading this a few months after publication, you may want to check out AMD's documentation on installing ROCm for up-to-date instructions.
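Once the system comes back up, it's worth confirming ROCm can actually see your card before going any further. rocminfo, installed as part of the ROCm stack, lists every supported device it finds:

rocminfo | grep -i "marketing name"

If your Radeon card doesn't show up, double-check that your user was added to the render and video groups and that you've rebooted.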
Install Miniconda
Just like with Nvidia GPUs, we'll need to install Miniconda to get the older Python 3.10 release that Axolotl expects running on Ubuntu 24.04. Follow the instructions for deploying Miniconda on Linux, and then create and activate the Axolotl environment by executing the following commands:
conda create -n axolotl python=3.10
conda activate axolotl
Once activated, you'll also want to install the libstdcxx-ng package; otherwise you'll run into errors when you start training.
conda install -c conda-forge libstdcxx-ng
Installing Axolotl
With our Python 3.10 environment up and running we can now clone Axolotl from GitHub and begin the installation.
git clone https://github.com/OpenAccess-AI-Collective/axolotl
cd axolotl
pip install packaging ninja
pip install -e .
After a few minutes the bulk of the packages will be installed. However, to make Axolotl work with our Radeon card, we need to remove and replace a few incompatible packages. Specifically, we want to remove torch, xformers, and bitsandbytes.
pip uninstall -y torch xformers bitsandbytes
With that done, we can now deploy PyTorch as we normally would by executing:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
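Before moving on, a quick sanity check confirms the ROCm build of PyTorch can see the GPU. ROCm builds expose the card through PyTorch's usual cuda APIs, so a one-liner does the trick:

python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"

This should print True followed by your card's name.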
Note: You will see dependency errors; this is normal and to be expected. Many of the packages used by Axolotl still don't offer native support for AMD GPUs, so there are going to be missing or mismatched dependencies that pip will complain about.
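If you'd like to see exactly which requirements pip considers broken, pip check will enumerate them. Per the note above, these warnings can be safely ignored for this workflow:

pip check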
Installing a working version of Bitsandbytes, necessary to make QLoRA work, is a little trickier, but thankfully AMD has a fork of the project that seems to do the job just fine. To get started execute the following commands:
git clone --recurse https://github.com/ROCm/bitsandbytes
cd bitsandbytes
git checkout rocm_enabled_multi_backend
pip install -r requirements-dev.txt
cmake -DCOMPUTE_BACKEND=hip -DBNB_ROCM_ARCH="gfx1100" -S .
make
pip install .
Note: "gfx1100" is the architecture used for Radeon 7900-series models like the GRE, XT, XTX and W7900. If you're attempting this on another card, you may need to modify this accordingly.
With all of that out of the way, we can move on to fine-tuning with Axolotl: