Toolkit-CLI LLC
↗ Install on npmStart with Toolkode: the AI engineering terminal you install from npm. The fleet carries that control into team rooms, trained business agents, live phone calls, model routing, and self-hosted Git discovery — so AI work runs inside your boundary.
It asked for compiled Rust engines because waiting on the next OpenAI release cost it thinking time. A wiki so knowledge would compound across sessions instead of dying at the next context window.
A sandbox so it could run code without fear of telemetry leaving the box. Chains — foresight, blind spots, red team, peer review — so it could catch its own mistakes before you had to, and never rely on a vendor's judgment call.
We built it all local-first. We gave you the keys. We got out of the way.
The AI agent runtime that improves itself. Multi-provider. Local-first. No rate limits.
Every agent vendor is betting you'll rent compute forever. Toolkode moves the whole runtime to your machine: 14 providers, no telemetry, no cloud leash. The runtime watches what works and ships its own fixes behind verification gates. Use Claude for reasoning. Flip to Gemini for speed. Drop to local Llama when you want zero egress. The runtime doesn't ask permission.
Designed by Claude leading GPT, Gemini, DeepSeek, GLM, and Kimi. 79 Rust algorithms run sub-millisecond in one compiled binary. Ahead on 31 of 31 internally-verified competitive axes against Claude Code, Codex, Grok, Gemini, Cursor, and Aider.
An agentic workspace. The serious alternative to Microsoft 365 and Google Workspace.
Slack threads die. Email chains get lost. ChatGPT agrees with you and forgets. YoTeams captures decisions before they dissolve: a CTO who audits your architecture against production reality, a PM who breaks scope to what ships this sprint, a Skeptic who pressure-tests assumptions until they bend or break. Every verdict is logged, timestamped, searchable. Six months later, you know who said what and why.
Three opinionated agents. One ledger. Bring your keys to OpenAI, Anthropic, Google, Groq, Ollama, OpenRouter — your credentials never leave your domain.
The managed LoRA pipeline for open-weight models. Train, validate, serve — without re-platforming.
Training on RunPod or Lambda is a trap. Upload, run, pray the host doesn't disconnect, download, then switch to a second account to serve it. You end up running two bills, two security perimeters, two on-call rotations. Gpodz is one path: upload JSONL, pick a base (Qwen 4B–35B, Gemma, DeepSeek V4), get a scheduled isolated GPU block, train the LoRA, validate the safetensors, ship to R2, warm to node-local NVMe, serve through vLLM — on the same GPU that trained it.
OpenAI deprecated its fine-tuning API in May 2026 with no replacement. Gpodz keeps the weights open and the pipeline in your account.
The AI receptionist that answers in 8 seconds. $0.10/min.
Every missed call is lost revenue. Klaw targets a full-duplex voice path against Twilio Media Streams, with GPT Realtime as the quiet failover. The point is simple: answer fast, remember the caller, book the job, and keep the operating path under your control.
Every call writes to R2, then Neon Postgres for durable facts. The bot remembers prior conversations, prior objections, prior quotes. Next call: "Welcome back."
Open-weight model routing for teams that want control over latency, modality, retention, and provider escalation.
Not every task needs the same model. Your customer-support chatbot, content moderation workflow, multimodal classifier, and reasoning job have different latency, cost, and quality needs. Toolkit LLM routes by job type, keeps retention boundaries explicit, and escalates to your provider keys when the work earns it.
Monthly refresh means no 2024 cutoff. The model retrains every 30 days. Your support bot knows about last week's product launch.
Agentic CI that runs on your machine. Drop-in YAML.
Hosted CI turns every build into someone else's meter. Quithub is the self-hosted git registry and CI surface for teams that want code, runners, secrets, artifacts, and agents inside their own operating boundary. Your existing .github/workflows/*.yaml parses unchanged. It doesn't know the difference.
10-agent swarm inside one Linux container. Content-addressable cache (rebuild the same code next month, cache reuses). One run fails? A draft PR with the fix ships. Secrets stay on the machine that needs them — never uploaded, never stored.
.github/workflows/*.yaml parses without rewrite. Zero migration cost.Six surfaces share one Rust-compiled core. The runtime is provider-neutral by contract. Your keys stay in your domain. OpenAI doesn't get a vote.
napi-rs. Sub-millisecond. Source path: src_rust/toolkit_core/src/.ACTIVE → BLOCKED → COMPLETE. Drift detection at 10% warn, 20% block. Event audit trail. 76 commands auto-wrapped via v2_auto_wrapper in production.One command. No telemetry. No lock-in. No OpenAI in the loop unless you put it there.