A 350M-parameter model that listens to your prompts.
Private. In your browser. No data leaves your device.
Dolly Assistant is a fine-tuned language model built on LiquidAI's LFM2.5-350M architecture, trained on 15K hand-written instruction-response pairs from the Databricks Dolly dataset. It classifies, summarizes, extracts, and brainstorms: small footprint, broad use.
Everything runs locally via ONNX Runtime and WebGPU. Your prompts never leave your browser. No servers. No logs. No surveillance. Just inference.
The model runs entirely in your browser using WebGPU acceleration. Zero network calls after the initial load.
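For the curious, here is a minimal sketch of how a page might decide which ONNX Runtime Web execution provider to request. The names `NavigatorLike` and `pickExecutionProviders` are hypothetical, and the fallback to the `wasm` provider is an assumption for illustration; this page does not say how Dolly Assistant handles browsers without WebGPU.

```typescript
// Minimal shape of the object we inspect; in a browser this would be `navigator`.
// (Hypothetical helper type, not part of any library.)
interface NavigatorLike {
  gpu?: unknown;
}

type ExecutionProvider = "webgpu" | "wasm";

// Returns providers in order of preference.
// Assumption: fall back to the WebAssembly provider when WebGPU is not exposed.
function pickExecutionProviders(nav: NavigatorLike | undefined): ExecutionProvider[] {
  const hasWebGPU = nav !== undefined && nav.gpu !== undefined && nav.gpu !== null;
  return hasWebGPU ? ["webgpu", "wasm"] : ["wasm"];
}

// Example: a browser that exposes `navigator.gpu` versus one that does not.
const withGpu = pickExecutionProviders({ gpu: {} });
const withoutGpu = pickExecutionProviders({});
console.log(withGpu, withoutGpu);
```

In a real page the result would typically be passed as the `executionProviders` option to `InferenceSession.create` in `onnxruntime-web`. Note that the presence of `navigator.gpu` alone does not guarantee a usable adapter, so a production check would likely also await `navigator.gpu.requestAdapter()`.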
4-bit quantization keeps the model under 200 MB for browser delivery while preserving instruction-following quality.
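The size claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a nominal 350 million parameters (taken from the model name) and counts only raw weight storage; a real quantized file also carries scales, metadata, and possibly some unquantized layers, so the actual download will differ somewhat. The function name `estimateWeightSizeMB` is hypothetical.

```typescript
// Rough weight-storage estimate: parameters x bits per weight, in decimal megabytes.
// Ignores quantization scales, zero points, and file metadata.
function estimateWeightSizeMB(paramCount: number, bitsPerWeight: number): number {
  const bytes = (paramCount * bitsPerWeight) / 8;
  return bytes / 1e6;
}

// Assumption: nominal parameter count implied by the "350M" in the model name.
const PARAMS = 350e6;

const fp32MB = estimateWeightSizeMB(PARAMS, 32); // 1400
const fp16MB = estimateWeightSizeMB(PARAMS, 16); // 700
const int4MB = estimateWeightSizeMB(PARAMS, 4);  // 175

console.log({ fp32MB, fp16MB, int4MB });
```

At 4 bits per weight the raw weights come to roughly 175 MB, which leaves some headroom under the 200 MB figure for quantization overhead.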
Trained on 15K hand-written instruction-response pairs from Databricks Dolly, covering 8 task categories: open QA, closed QA, general QA, classification, summarization, information extraction, brainstorming, and creative writing.
"Simplicity is the ultimate sophistication."