My Local AI Setup

Written by Jeff on December 22, 2024

Current Gaming Computer Setup Turned AI Server

  • 💻 Processor: Intel® Core™ i9-13900KS
  • 🖥️ Mainboard: ASUS ROG Strix Z790-E Gaming WiFi II LGA 1700
  • 🧠 RAM: 128GB DDR5
  • 🎮 GPU: NVIDIA RTX 4090
  • 💾 SSD: 1TB (Samsung 980 Pro)
  • 🌐 Network: Upgraded to 10G (previously using the built-in 2.5G Ethernet on the mainboard)

The move to 10G networking suits my preference for faster, more reliable connections, and I already had a 40G router in my homelab to plug into.

AI Models in Use

With my current setup, I run high-performance AI models tailored to different use cases:

Large-Scale Tasks and Experimentation

  • ๐Ÿ” Model: Llama 3.2 Vision (90B Parameters)
    • ๐Ÿ“ Details: This model relies heavily on my large RAM, providing moderate inference speeds. It excels at resource-intensive tasks and advanced experimentation.

Daily Productivity

Text-Only Models

  • 🖋️ Model: Llama 3.3 (70B Parameters)
    • 📝 Details: Ideal for advanced natural language processing tasks, this model delivers robust and reliable performance for daily use.

Multimodal Models

  • 🖼️ Model: Llama 3.2 Vision (11B Parameters)

    • 📝 Details: Striking a balance between performance and efficiency, this model is excellent for day-to-day multimodal processing.
  • 🌌 Model: InternVL2 (26B Parameters)

    • 📝 Details: With advanced vision-language capabilities, this model excels at complex multimodal tasks while maintaining efficiency for regular use. An example image prompt follows below.

AI Deployment

I deploy my AI projects using Ollama.

Installing Ollama on Fedora 41

  1. 🔄 Update Fedora:

    • Keep your system updated:
      sudo dnf update -y
      
  2. 🛠️ Install Prerequisites:

    • Install essential build tools and libraries:
      sudo dnf install -y gcc make cmake git curl wget
      
  3. 🎮 Install NVIDIA Drivers:

    • Open the Software Center (GNOME Software).
    • Search for "NVIDIA drivers" and install the appropriate ones for your GPU.
    • If Secure Boot is enabled, follow the guided steps for signing the driver. (A command-line alternative is shown after this list.)
  4. ๐Ÿ‹ Set Up Docker (Optional):

    • For containerized environments, install Docker:
      sudo dnf install -y docker
      sudo systemctl start docker
      sudo systemctl enable docker
      
  5. 📥 Download and Install Ollama:

    • Visit the Ollama website for the latest version compatible with Fedora.
    • Use the terminal for installation:
      curl -fsSL https://ollama.ai/install.sh | sh
      
  6. ✅ Verify Installation:

    • Check the installation:
      ollama --version
      
    • Test with a real model from the Ollama library (this pulls it on first run):
      ollama run llama3.2
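
As the command-line alternative to step 3, Fedora's NVIDIA driver is normally installed from RPM Fusion. A minimal sketch, assuming you want the CUDA-enabled driver and are comfortable enabling third-party repositories:

      # Enable the RPM Fusion free and nonfree repositories
      sudo dnf install -y \
        https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm \
        https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
      # Install the driver plus CUDA support
      sudo dnf install -y akmod-nvidia xorg-x11-drv-nvidia-cuda
      # After the kernel module builds and a reboot, this should list the RTX 4090
      nvidia-smi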
      

By following these steps, I successfully set up Ollama on Fedora 41, ensuring smooth operation with my NVIDIA RTX 4090 GPU.
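
Ollama also listens on a local REST API (port 11434 by default), which is how I hook models into other projects. A minimal example against the documented /api/generate endpoint; the model tag and prompt are just placeholders:

      curl http://localhost:11434/api/generate -d '{
        "model": "llama3.2",
        "prompt": "Why is the sky blue?",
        "stream": false
      }'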

Model Recommendations

Small Models (<1B Parameters)

  • SmolLM: 135M, 360M
  • Qwen2.5: 0.5B
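
Each size maps to an Ollama tag of the form name:size, so pulling a specific build is a single command. Two examples, using tags as they appear in the Ollama library at the time of writing:

      ollama pull smollm:135m
      ollama pull qwen2.5:0.5b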

Medium Models (1B - 3B Parameters)

  • Llama 3.2: 1B & 3B
  • Qwen2.5: 1.5B & 3B

Sweet Spot Models (~7B Parameters)

These models are ideal for most modern systems:

  • Llama 3.1: 8B (slightly above 7B but noteworthy)
  • Mistral 7B
  • Ministral 8B 24.10: Successor to Mistral 7B
  • Qwen2.5: 7B
  • Qwen2-VL-7B: Leading multimodal model in this range
  • Zephyr-7b-beta: Fine-tuned from Mistral 7B
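
Once a model is pulled, ollama show prints its architecture, parameter count, and quantization, which helps when comparing candidates in this range (mistral here is just an example):

      ollama pull mistral
      ollama show mistral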

Large Models (11B - 14B Parameters)

For advanced tasks requiring higher specifications:

  • Llama 3.2 Vision: 11B (my go-to multimodal model)
  • Pixtral-12B-2409: Multimodal model by Mistral AI
  • StableLM 2: 12B
  • Qwen2.5: 14B

Advanced Models (20B+ Parameters)

Coding Assistants

  • Qwen2.5-Coder: 32B
  • Deepseek-coder-v2: 16B (Lite) or 236B (full); the 236B version is impractical for most hobbyists.

General Use

  • Llama 3.3: 70B
  • Qwen2.5: 72B
  • Hermes3: 70B
  • Sailor2: 20B (specialized for Southeast Asian languages)

Math & Calculation

  • Command-R: 35B
  • Deepseek-llm: 67B (also excellent for coding tasks)

Additional Notes

  • Moondream: 1.8B (a small vision model)
  • Llava: 13B (previously my go-to multimodal model)

Models I Aspire to Run Locally

  • DeepSeek V2.5: 236B
  • Mistral Large 24.11: 123B
  • Zephyr Orpo: 141B

Running models with 20B+ parameters is generally the territory of dedicated enthusiasts or enterprise-grade AI deployments, as they demand robust hardware and significant resources.
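
A rough way to see why: a model's weight footprint is approximately parameters × bytes per weight, and common 4-bit quantization uses about 0.5 bytes per weight; the 20% allowance for the KV cache and runtime below is my own ballpark assumption:

      # ~70B parameters at 4-bit (~0.5 bytes/weight) plus ~20% overhead, in GB
      echo "70 * 0.5 * 1.2" | bc
      # ≈ 42 GB, well beyond a 24GB RTX 4090, so layers spill into system RAM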