# Unlimited Llama - AI Desktop Assistant (AiLo Core)
## Features

- Smart chat with local GGUF models
- Integrated web search for up-to-date information
- Text-to-Speech (TTS) and Speech Recognition (STT)
- OCR to extract text from images
- Advanced session management
- Supports any LLM model size
- OpenAI-compatible API server
- Export to JSON, TXT, and Markdown
- Integrated distributed computing
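Because the API server is OpenAI-compatible, any OpenAI-style client can talk to it. A minimal sketch using only the standard library; the host, port, and model name are assumptions, so adjust them to match your AiLo server settings:

```python
import json
import urllib.request

# Hypothetical endpoint; adjust host/port to your AiLo server configuration.
API_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "local-gguf",
                       temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt: str) -> str:
    """POST the payload and return the assistant's reply text."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The same payload works with the official `openai` Python client by pointing its `base_url` at the local server.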
## Download
## Quick Start Guide

### First Launch

- Load a model: Model → Load Model
- Start chatting: type in the input box and press Enter
- Sessions are saved automatically
### Web Search

- Enable/disable with the Web Search toggle
- Automatically searches for news, recent info, or local data
- Displays the sources used
### Speech Synthesis (TTS)

- Enable via TTS in the sidebar
- The assistant reads responses aloud
- Use STOP to interrupt
### Speech Recognition

- Voice Input for a single spoken input
- Start Listening for continuous mode
### OCR from Images

- Click Image OCR
- Select an image (PNG, JPG, etc.)
- Extracted text is automatically inserted into the chat
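OCR of this kind is typically a thin wrapper around Tesseract. A minimal sketch using the `pytesseract` and `Pillow` packages; the helper names and the supported-extension list are illustrative, not AiLo's actual code:

```python
from pathlib import Path

# Illustrative extension whitelist; the app may accept other formats.
SUPPORTED = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}

def is_supported_image(path: str) -> bool:
    """Check the file extension against the whitelist, case-insensitively."""
    return Path(path).suffix.lower() in SUPPORTED

def ocr_image(path: str) -> str:
    """Extract text from an image via Tesseract.

    Requires the Tesseract binary plus the pytesseract and Pillow packages.
    """
    if not is_supported_image(path):
        raise ValueError(f"Unsupported image type: {path}")
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(path))
```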
## Troubleshooting

### “Model not found”

- Make sure the GGUF file is in the `/models` folder
- Verify the file extension is `.gguf`
- Check that you have enough disk space

### “Tesseract not found”

- Install Tesseract OCR following the instructions below
- Restart the application after installation
## Configuration: Memory Optimization

### Memory Mapping (MMAP)

- **What it does:** maps the model directly from disk instead of loading it entirely into RAM
- **Benefits:** reduces RAM usage by up to 70%, faster startup
- **Use when:** RAM is limited or models are large (>7 GB)
- **Performance:** slightly slower inference, much lower RAM usage

### Memory Locking (MLOCK)

- **What it does:** locks the model in RAM, preventing it from being swapped to disk
- **Benefits:** maximum performance, consistent response times
- **Use when:** RAM is abundant and performance is critical
- **Performance:** fastest inference, permanent RAM occupation
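In llama-cpp-python these two modes correspond to the `use_mmap` and `use_mlock` flags. A small sketch of the tradeoff above as a heuristic; the 2× RAM threshold is an arbitrary assumption, and the `Llama(...)` call is left commented out because it needs a real model file:

```python
def pick_memory_mode(ram_gb: float, model_gb: float) -> dict:
    """Choose mmap vs. mlock following the tradeoffs above.

    Heuristic (an assumption, tune for your machine): lock the model
    in RAM only when RAM comfortably exceeds twice the model size.
    """
    if ram_gb >= 2 * model_gb:
        return {"use_mmap": False, "use_mlock": True}   # fastest, RAM-hungry
    return {"use_mmap": True, "use_mlock": False}       # lean RAM, pages from disk

# Example: a 7 GB model on a 32 GB machine gets locked into RAM.
flags = pick_memory_mode(ram_gb=32, model_gb=7)

# from llama_cpp import Llama
# llm = Llama(model_path="models/your-model.gguf", **flags)
```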
## System Requirements

### Minimum
- OS: Windows 10/11, macOS 10.15+, Linux (Ubuntu 18.04+)
- RAM: 8 GB (16 GB recommended)
- Disk Space: 2 GB + space for models
- CPU: Modern 64-bit processor
### Recommended
- RAM: 16 GB+ for large models
- GPU: NVIDIA/AMD with CUDA or Metal (optional)
- Disk Space: 10 GB+ for large models
## Installation

### 1. Install Tesseract OCR (required for OCR)

#### Windows

```shell
# Using Chocolatey (recommended)
choco install tesseract
```
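For completeness, the equivalent installs on macOS and Linux; these are the standard Homebrew and APT package names, and other distributions may differ:

```shell
# macOS (Homebrew)
brew install tesseract

# Linux (Debian/Ubuntu)
sudo apt-get install tesseract-ocr
```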