Local AI Run

Run AI locally, zero external dependencies

Double-click to install. Exposes an OpenAI-compatible API for any project. Three services ready out of the box: Vision, Embedding, and Speech-to-Text.

Choose your platform

macOS (Apple Silicon)

v1.0.0

M1 / M2 / M3 / M4, built-in Metal GPU acceleration

5.83 GB
Download Now

Windows (AMD64)

v1.0.0

NVIDIA GPU (CUDA) recommended

Coming soon

Three services, one click

🖼️

Vision-Language (VLM)

Gemma 3n E4B Q4_K_M. Image description, text generation, multimodal Q&A. Endpoint: :8080/v1/chat/completions
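A quick smoke test from the terminal might look like this (the model name and prompt are placeholders; any OpenAI-compatible client pointed at http://127.0.0.1:8080/v1 works the same way):

```shell
# Minimal chat request to the local VLM service.
# "gemma" is a placeholder model name; many local servers ignore this field.
PAYLOAD='{
  "model": "gemma",
  "messages": [{"role": "user", "content": "Describe this image format in one sentence."}]
}'
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "Is the VLM service running?"
```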

🔢

Text Embedding

BGE-base-zh, 768-dim Chinese embeddings for semantic search and similarity. Endpoint: :8081/v1/embeddings
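Embedding requests follow the standard OpenAI shape; a sketch (the model field is an assumption, and some local servers ignore it):

```shell
# Embed a Chinese sentence with the local embedding service;
# the response carries a 768-dimensional vector per input.
PAYLOAD='{"model": "bge-base-zh", "input": "本地运行的语义检索"}'
curl -s http://127.0.0.1:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "Is the embedding service running?"
```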

🎙️

Speech-to-Text (STT)

Whisper large-v3-turbo, multilingual transcription. 8x realtime on M4 Max. Endpoint: :8082/inference
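Transcription is a single multipart upload; a sketch assuming the endpoint follows the whisper.cpp server convention (the audio path is a placeholder):

```shell
# Transcribe a local WAV file via the STT service.
AUDIO=audio.wav  # placeholder path; substitute your own recording
curl -s http://127.0.0.1:8082/inference \
  -F "file=@$AUDIO" \
  -F response_format=json || echo "Is the STT service running?"
```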

System Requirements

|           | macOS (Apple Silicon)       | Windows (AMD64)                    |
|-----------|-----------------------------|------------------------------------|
| Processor | Apple Silicon M1+           | AMD64 / Intel x64                  |
| Memory    | 16 GB+ (32 GB recommended)  | 16 GB+ (32 GB recommended)         |
| GPU       | Built-in Metal (automatic)  | NVIDIA GPU, 8 GB+ recommended      |
| Disk      | 8 GB free (~6.3 GB models)  | 8 GB free (~6.3 GB models)         |
| OS        | macOS 12+                   | Windows 10 / 11                    |

🔒 Security

All services bind to 127.0.0.1 only: no external network access, no cloud costs, fully on-device. Clients must run on the same machine as the engine.
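You can confirm the loopback-only binding yourself; a sketch assuming each service exposes a /health route (a llama.cpp-server convention, not confirmed for this product), where curl prints 000 for a service that is not reachable:

```shell
# Probe all three services on loopback; remote hosts get no answer by design.
for port in 8080 8081 8082; do
  curl -s -o /dev/null -w "port $port: %{http_code}\n" \
    --max-time 2 "http://127.0.0.1:$port/health" || true
done
```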