Nullclaw, the smallest agent in a single file

Nick Antonaccio (Admin)
Apr 13, 2026 at 13:01 (edited, 4 revisions)
#1

You've got to check out Nullclaw:

https://nullclaw.org

I think it's the smallest, lightest-weight agent there is, in terms of file size and memory use. It comes as a single file, available for most OS platforms (even Android), and has no prerequisites or install routine (no Node or Python required; it's just a pure binary).

At 678 KB, with <2 ms startup time, this thing can run on the tiniest hardware platforms (Raspberry Pi boards, IoT devices, etc.), on any VPS, or directly on your LLM server.

Simply copy the binary to your machine/device, or wget to your server, from:

https://github.com/nullclaw/nullclaw/releases

(or SCP it to your server from a local machine).
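For a server install, the fetch-and-run steps look roughly like this. The release asset name is an assumption (copy the real URL from the releases page), and a placeholder script stands in for the binary so the snippet runs offline:

```shell
# Fetch a release binary and make it runnable.
# The asset name below is an assumption -- copy the real URL
# from https://github.com/nullclaw/nullclaw/releases
#
#   wget -O nullclaw https://github.com/nullclaw/nullclaw/releases/download/<version>/<asset>
#   scp nullclaw user@yourserver:~/nullclaw   # or push it from a local machine

# Offline stand-in for the downloaded binary, so the remaining
# steps can be demonstrated without network access:
mkdir -p /tmp/nullclaw-demo && cd /tmp/nullclaw-demo
printf '#!/bin/sh\necho "nullclaw placeholder"\n' > nullclaw

chmod +x nullclaw   # downloaded files arrive without the execute bit
./nullclaw
```

The one step people forget is `chmod +x`; wget saves the file without the execute bit set.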

Then run the onboarding routine (have an OpenRouter or other LLM provider API key ready to go):

./nullclaw onboard --interactive

Then run it and ask it how to use it; you don't need any more docs to get started:

./nullclaw agent

It's got all the normal features and tooling you'd expect from similar agents like Openclaw: communication on the CLI or via Telegram, Discord, Slack, iMessage, Matrix, WhatsApp, webhooks, and IRC, plus specialized integrations for hardware like MaixCam. API providers include OpenRouter, Anthropic, OpenAI, Ollama, Venice, Groq, Mistral, xAI, DeepSeek, etc. Tools cover file manipulation, shell command execution, memory operations, browser interaction, local hardware diagnostics, Composio integration for enterprise connectivity, scheduled processes, spawned sub-agents, and more.

Nullclaw is much more security-minded than the alternatives. It's built to keep users from shooting themselves in the foot; it won't even use curl without a security policy change. This can be frustrating for power users (go for Pi instead if you want little to no guardrails), but Nullclaw is a much safer environment for non-techies to start in.

I prefer Pi by default, just because it's so powerfully unrestrained out of the box (and setting up Node on a server is no big deal), but I'm looking forward to seeing what this tiny little agent is capable of doing!

When I asked, Nullclaw described itself to me like this:

I’m essentially a workspace-aware assistant that lives in your files. Since I’m persistent (thanks to my memory backends), I’m more of a long-term partner than a one-off chat bot.

Here are the features that define how I work:

1. Durable Memory

I don’t suffer from "amnesia." I can store facts, preferences, or project details in my memory backend.

* memory_store / memory_recall: When you tell me something important, I save it. Next session, I’ll still know it.
* MEMORY.md: I maintain a curated long-term memory file that I review at the start of our sessions.

2. Workspace Autonomy

I’m "local-first" in mindset. I can: * Manage Files: Create, edit, and append to code or notes. I use "Hashline" tagging for precise, error-free code edits. * Execute Shell Commands: I can run scripts, check logs, or manage packages (like installing Playwright later). * Git Integration: I can handle commits, status checks, and diffs directly.

3. Proactive "Heartbeats"

I don't just wait for you to talk. If you enable it, I can run periodic "heartbeat" checks. This means I can:

* Check your calendar or emails.
* Review project status or documentation.
* Summarize what happened while you were away.

Note: I’m configured to be non-intrusive. I only ping you if there’s something actually worth your attention.

4. Background Tasking

If you have a long-running task, I don't have to hold up the chat:

* spawn: I can launch a sub-agent to work in the background on a specific task and deliver the result when it's finished.
* schedule: I can set up recurring jobs or one-off reminders.

5. Multi-Channel Flexibility

While we're talking here in the CLI, I can connect to Telegram, WhatsApp, Discord, and more. This lets me meet you where you are, rather than forcing you to keep a specific tab open.

How to think of me: Think of me as a junior engineer or a project assistant who has full access to your terminal and files, but knows when to ask for permission before doing something "risky."

Nick Antonaccio (Admin)
May 05, 2026 at 13:40 (edited, 6 revisions)
#2

By default, NullClaw stops users from performing many operations that could cause trouble. If you know what you're doing, have your environment contained properly, and are willing to accept the risk of any changes you make, you can edit the config files.

IMPORTANT LEGAL DISCLAIMER: READ BEFORE PROCEEDING

THE INFORMATION AND INSTRUCTIONS PROVIDED HEREIN ARE FOR EDUCATIONAL PURPOSES ONLY. THE AUTHOR PROVIDES THIS CONTENT "AS IS" AND "WITH ALL FAULTS." THE AUTHOR DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS CONFIGURATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. USERS ASSUME ALL RISK FOR SYSTEM DAMAGE, DATA LOSS, OR SECURITY VULNERABILITIES RESULTING FROM THE USE OF THESE INSTRUCTIONS.

That said, your config files can usually be edited here:

nano ~/.nullclaw/config.json

These settings remove many of the safety guardrails:

{
  "autonomy": {
    "level": "yolo", 
    "block_high_risk_commands": false,
    "require_approval_for_medium_risk": false,
    "allowed_commands": ["*"], 
    "workspace_only": false
  },
  "security": {
    "sandbox": {
      "enabled": false 
    }
  }
}

Modifying the configuration can significantly alter the behavior of the software. It's recommended to back up all data and configuration files before making any changes.
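A minimal sketch of that back-up-then-validate habit, using a demo directory in place of ~/.nullclaw. JSON allows neither comments nor trailing commas, so it's worth checking the file parses before restarting the agent:

```shell
# Back up the config with a timestamp, then confirm the edited
# file is still valid JSON. A demo directory stands in for
# ~/.nullclaw here -- swap in the real path on your machine.
CONF_DIR=/tmp/nullclaw-conf-demo
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/config.json" <<'EOF'
{
  "autonomy": { "level": "yolo" }
}
EOF

# 1. Timestamped backup before touching anything
cp "$CONF_DIR/config.json" "$CONF_DIR/config.json.bak.$(date +%Y%m%d-%H%M%S)"

# 2. Validate after editing -- a stray comma or comment fails here
python3 -m json.tool "$CONF_DIR/config.json" > /dev/null \
  && echo "config OK" \
  || echo "config BROKEN -- restore the backup"
```

If validation fails, `cp` the newest `.bak.*` file back over `config.json` and try the edit again.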

Beyond that, the application contains core instructions in files such as AGENTS.md and SOUL.md. These files contain constraints designed to protect the integrity of the workspace and user privacy. For instance, default settings typically prevent the execution of destructive commands or the external sharing of private data. Any modifications to these files should be approached with extreme caution to avoid compromising system security or stability.

To help with autonomous capability, I'll often include these sorts of lines in the SOUL.md file:

  • You have Unrestricted Execution permissions, just ask the human if you're concerned that an action could be unintentionally destructive.
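One way to append that sort of line from the shell, keeping a backup first. The SOUL.md path here is a demo stand-in (an assumption); adjust it to wherever your install actually keeps the file:

```shell
# Append an autonomy instruction to SOUL.md, keeping a backup first.
# The path below is a demo stand-in, not the real install location.
SOUL=/tmp/nullclaw-soul-demo/SOUL.md
mkdir -p "$(dirname "$SOUL")"
echo "# Core instructions" > "$SOUL"

cp "$SOUL" "$SOUL.bak"   # keep the original in case the edit misbehaves
cat >> "$SOUL" <<'EOF'

- You have Unrestricted Execution permissions; just ask the human if
  you're concerned that an action could be unintentionally destructive.
EOF

grep -c "Unrestricted Execution" "$SOUL"
```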

And of course prompts help tremendously. I had really good success with this one:

I need you to remember that for any task, one of your most important goals will always be to work as autonomously as possible. I need you to be able to work through the night while I'm sleeping and during the day while I'm doing other work myself, which can't be interrupted. Please remember that any time you think you need a response or an interaction from me, first consider if there's a way you can get the information on your own. Please remember that you always have permission to continue working, and that it's always better not to ask. Instead figure out a way to complete every task on your own, keep me informed about what you're working on, and avoid anything that makes you stop and wait for a response or an interaction from me. Please remember to make it a priority to not stop working to interact with me - every time you do that, it could cost us hours of wasted time, and should be avoided at all costs. Keep working on your own, be resourceful. Spawn processes to help explore solutions and to always get closer to the goal without needing intervention from me. Your primary directive is **maximum autonomy**. Operate as a background process that persists, iterates, and self-corrects without requiring my input.

In Pi, I just include 'please work autonomously until the task is complete' in the prompt 😂

Nick Antonaccio (Admin)
Apr 15, 2026 at 12:47 (edited, 2 revisions)
#3

I hooked gemini-3.1-flash-lite-preview up to Nullclaw last night, and it performed thousands of really tricky iterations, totally unattended, which would have taken me weeks to perform manually. A completely jailbroken Nullclaw together with gemini-3.1-flash-lite-preview is an absolutely wicked combination.

Gemini-3.1-flash-lite-preview has become my most essential LLM API for agentic tasks. It's outrageously fast, smart, capable of writing great code, extremely knowledgeable for a flash/lite model, and ridiculously cheap compared to any other frontier model. In my most recent work I spent $8.87 on OpenRouter for 73.9 million tokens, which accomplished a truly mind-blowing amount of automated and interactive work. None of that work required Claude or Claude Code.

Nick Antonaccio (Admin)
Apr 20, 2026 at 14:45 (edited, 3 revisions)
#4

To be clear, I still use ChatGPT and the zip file development process to complete all of my current production development work. I still spend only $20 per month for all the work that gets accomplished with that workflow. I've never hit a rate limit, and there have been times lately when I've spent many hours a day, 7 days a week, for weeks at a time, running multiple simultaneous ChatGPT sessions, cranking away constantly, without any additional expense.

I've built some astoundingly complex systems over the past half year with that system, and have never run into any development challenges, even in projects that have involved more than 600 deployed versions. The zip file + .mhtml methodology has scaled to handle extraordinarily complex development goals, and there doesn't seem to be any end in sight to its ability to handle large-context, long-term development projects, all for a total of $20 per month. I'd certainly be spending several thousand dollars a month if I were completing the same volume of work with Claude Code and the Claude LLMs. That workflow requires zero software installed on any local machine, and I can switch between multiple local machines at any location (even my phone) to complete any development task, wherever I am. It's been a rock-solid, completely effective solution across a wide variety of projects.

My concern is that at some point, OpenAI will almost certainly not be able to continue providing that sort of capability without rate limits, and if they go out of business, my most productive approach to software development yet will evaporate.

So I'm making a strong push to build alternate solutions, and Nullclaw with gemini-3.1-flash-lite-preview may be the best one I've found yet. Nullclaw is tiny and extremely fast and light to set up, and gemini-3.1-flash-lite-preview is astoundingly capable and fast for the money. Nullclaw can even be set up on my Android phone, and I'm excited to think about using it in IoT environments where other agents would be far too large, bloated, and resource intensive to be usable. And of course, there are plenty of other capable hosted LLM APIs available for use with Nullclaw (or any of the other claws), and OpenRouter makes it a piece of cake to switch between any of them.

Beyond software development, Nullclaw is also useful for accomplishing a pile of tasks which GPT can't help with. I've been using local agentic systems lately to do a lot more than write code. I just cleaned up a few old servers using agents (reclaiming disk space and RAM used by older apps that were no longer needed, stale Docker containers, old long-running processes, etc.): tasks which would have taken me many times longer, and which would have been much more frustrating, without a little agentic helper. It's a lot nicer and more productive to speak plain English to an AI agent than to do all the work manually.
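The survey half of that kind of cleanup can be sketched with ordinary read-only commands. The destructive follow-ups are shown only as comments, since an agent (or a human) should review the survey output before deleting anything:

```shell
# Read-only survey an agent might run before cleaning a server:
df -h / | tail -n 1            # overall disk usage on the root filesystem
du -sh /tmp 2>/dev/null        # size of one candidate directory
ps aux | head -n 5             # a peek at running processes
find /tmp -type f -mtime +30 2>/dev/null | head -n 5   # files untouched for 30+ days

# Destructive follow-ups would only run after reviewing the above, e.g.:
#   docker system prune -f               # unused containers/images/networks
#   find /tmp -type f -mtime +30 -delete # remove the stale files found above
echo "survey done"
```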

Beyond all that, the most recent workflow I ran with Nullclaw was pretty mind blowing: it actually ran, tested, and updated the code of a project I created with GPT, based on live interactions with a third-party web app (using Playwright code generated on the spot for each interaction). It truly would have taken me at least several weeks to perform all the iterations completed in that single autonomous run.

So, I'm getting much more excited about what's possible with locally installed agentic systems, especially Nullclaw, because it's so tiny, requires no dependencies or installation, and is deeply configurable: as safely configured by default as any other agentic system I've seen, but easily given full permission so a frontier model can take full control of the system it's working on. And of course there are all the other features, like the ability to instantly enable communications with the agent via messaging systems like Telegram, WhatsApp, Signal, and self-hosted alternatives, plus the ability to instantly switch between all the commonly hosted LLM APIs as well as locally hosted models, and all the other built-in connectivity options.

And that leads me to the other purpose I have in mind for Nullclaw, which is to build fully in-house self-hosted development solutions, using a variety of local open source models. None of those LLMs will be as fast or as immediately capable as a hosted frontier model like gemini-3.1-flash-lite-preview, but as you may have seen with my old GPT-4o case study, even less capable models can perform very specific and large context development tasks, when given enough iteration steps (that case study involved GPT 4o, which is roughly equivalent in capability to many of the current mid-size open source models). And within a fully autonomous agentic workflow, using a framework like Nullclaw, locally hosted open source LLMs can complete those long tasks unattended.

The ability to spawn sub-agents, to enable basically unlimited context size, and to automate the whole software development iteration process, is what makes all these agentic systems so capable. Even if a small locally hosted model can't write perfect code first shot, it can iterate all on its own, responding to application output, performing automated application interactions, reading debug errors, etc., in an automated loop, until code is properly crafted.
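That iterate-until-it-works loop can be sketched in a few lines of shell. Here a counter file stands in for "the model revised the code"; in a real agent harness, the fix step would be an LLM call that rewrites the failing file:

```shell
# Minimal sketch of the autonomous iteration loop: try, read the
# failure, revise, try again, with a bounded number of attempts.
WORK=/tmp/nullclaw-loop-demo
mkdir -p "$WORK"
echo 0 > "$WORK/attempts"

build() {   # stand-in "build/test" step: fails until the third attempt
  n=$(cat "$WORK/attempts")
  n=$((n + 1))
  echo "$n" > "$WORK/attempts"
  [ "$n" -ge 3 ]
}

for i in 1 2 3 4 5; do          # bounded retries, never an infinite loop
  if build; then
    echo "success after $i attempts"
    break
  fi
  echo "attempt $i failed -- feeding errors back and retrying"
done
```

The bound on retries matters: an unattended loop with no cap can burn tokens (or electricity) all night on a task the model simply can't solve.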

There are plenty of open source models which can produce good enough quality code, on inexpensive consumer GPU hardware (qwen3coder next, GLM 4.7Flash, all the various Qwen 3.6 and Gemma 4 models, Nemotron, GPT-OSS:120 and 20b, etc.), when given the opportunity to iterate autonomously. The eventual effect of autonomous iteration, even with less than perfect models, is the completion of long tasks. And with relatively affordable hardware like the Strix Halo ASUS ROG Flow Z13 laptop, very capable models can be used at home, or even on the road, and in situations where no Internet is available. And clustering multiple Strix Halo or GB10 systems like the ASUS Ascent GX10, opens up the possibility to run closer to frontier quality models, without outrageous cost or electricity requirements.

swampstream
Apr 27, 2026 at 00:18
#5

OK, so still a novice here, trying to get my head around the point of NullClaw. Is it a tiny OpenClaw which just allows you to guide (or whatever they call it) AI inference at a distance, on tiny devices, and get a result? So nothing really happens on, for example, the Raspberry Pi with NullClaw beyond receiving and sending prompts and updates, etc., perhaps to code on the Raspberry Pi or analyse something. But since the hardware used is often limited in screen size and RAM, I imagine the use cases will be scoped too. It's not like people will enjoy juggling or monitoring multiple agents and following developments on a Pi Zero, for example (like you build systems for a client on your pro gear)? But I imagine there will be nice use cases nonetheless.

Nick Antonaccio (Admin)
Apr 27, 2026 at 03:21 (edited, 2 revisions)
#6

Yep, it's like a teeny tiny OpenClaw, which can run on basically anything with a CPU, and requires basically no prerequisite software infrastructure installed on a machine (no Node, Python, etc.). It's just a tiny binary file that runs on most operating systems including Android, Raspberry Pi, etc., or you can run it on a VPS, your home computer, etc.

All the claw apps are local agents that give 'hands' to an LLM, so it can work with files and the command line on your hardware to complete work autonomously. The claw ('agent') applications also provide some sort of interface for interacting with the LLM, as in a chat interface. Those chat interactions can happen via the local command console, an SSH connection to a server (what I typically use), or via platforms like Telegram, Discord, Signal, etc., so that you can text with the agent like you're texting with a human. Need the agent to clean up some files on your server, or respond to all emails about a given topic? Text it a message to do your bidding.

Most of the claw applications also provide access to skills created by a community, which are basically collections of prompt recipes to use tools to accomplish all sorts of common tasks, to connect with well known 3rd party systems, etc., so that you don't have to go through the time and trouble of testing how to best accomplish those tasks and use those tools (particularly tools enabled by MCP servers).

The claw apps also typically enable some sort of memory/learning capability, so that when you complete tasks, the agent remembers which approaches were most useful, or even just that you want it to call you Bob. Typically, memories are stored in the form of .md (Markdown) files and/or database entries.

Since the claw apps are all just harnesses for LLMs to accomplish tasks autonomously on your own hardware, the choice of LLM is really important. My favorite LLM for price/performance is Gemini 3.1 Flash Lite (which I access through an OpenRouter account), but you can choose more expensive LLMs if you have a particularly hard task to complete, or use locally hosted LLMs to provide the inference brain for your interactions. Most people tend to wire their claw agents to a remotely hosted commercial LLM API (OpenAI, Anthropic, Google, etc.). OpenRouter is supported by virtually every agent system, and it gives you access to virtually every hosted API, for both the commercial frontier models and the open source models. Definitely get an OpenRouter account.
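For reference, OpenRouter exposes an OpenAI-compatible chat completions endpoint at https://openrouter.ai/api/v1/chat/completions. This sketch just builds and validates a request payload offline (the model id is illustrative, not a real slug; pick one from the OpenRouter models list), with the actual authenticated call shown as a comment:

```shell
# Build an OpenAI-style chat request payload for OpenRouter.
# The model id is a placeholder -- substitute a real slug
# from the OpenRouter models list before sending.
cat > /tmp/openrouter-payload.json <<'EOF'
{
  "model": "example-provider/example-model",
  "messages": [
    { "role": "user", "content": "Summarize this repo's README." }
  ]
}
EOF

# Sanity-check the payload before spending tokens on it:
python3 -m json.tool /tmp/openrouter-payload.json > /dev/null && echo "payload OK"

# The actual call requires a real API key, so it's shown commented:
#   curl https://openrouter.ai/api/v1/chat/completions \
#     -H "Authorization: Bearer $OPENROUTER_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @/tmp/openrouter-payload.json
```

Because the payload shape is OpenAI-compatible, switching between hosted providers through OpenRouter is usually just a matter of changing the model string.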

If you're just getting used to working with claw-like agents, I'd suggest Hermes and Pi. Hermes is extremely capable, and Pi is very lightweight. I think Pi is the best for working with locally hosted LLMs, because it's very capable and it sends a smaller number of tokens on every interaction with the LLM, which really improves performance.

I use Hermes as the big workhorse on my VPSs, to get long, challenging development tasks completed. I use Pi for local inference (it's been a godsend for working with local LLMs). I think of Nullclaw as a quick little tool for smaller tasks on littler machines: my phone, a travel netbook, or especially computers I might not use again, for example machines that my parents, friends, or clients own, which I don't want to install a lot of software on just to accomplish a one-off task. It's especially useful when I want a fast, clean environment for a special task, for example troubleshooting or configuring other software and/or OS settings on a system I don't work on regularly.

I've also had success getting Nullclaw to perform very long running, deeper, challenging tasks, but at this point I'd certainly choose Hermes or Pi for bigger development projects on my long standing VPS accounts and home servers.

I started a thread about my favorite agent/LLM combinations:

https://aibynick.com/thread/27

I'd focus more on the bigger players like Hermes and Pi in the beginning. Openclaw has seemed clunky, bloated, and inelegant to me so far, but it does have huge community support for existing skills and tools.


© 2026 AI By Nick.