2026-03-31

How to Use Coding Agents Faster, Better, and Cheaper

Author: Cunxi Gong

Key: Precise + Concise

Agent = Model + Harness

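There are countless agents and countless models. Which should you use?

Is there really a best model?
Is there really a best agent?
What do you do when the money flows out like water?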

Choice of Agent system

Pathway 1 Build your own agent system from scratch

Such as https://github.com/ansatzX/AIE_Brainiac

Pathway 2 Build on one of the best agent apps from the community

Such as Claude Code, Gemini-CLI, Codex, etc.

Cheap models may not be good enough.

Expensive models may be good, but still not good enough.

For example, the Gemini series seems to be getting worse and worse. GPT-5.4 is good but too expensive.

Sometimes the agent application itself also has to be paid for.

The worst situation is that you spend a lot of money and get nothing done.

The Solution for Beginners

Harness

Save Tokens + Control context window

User: Observation, Design capability, Hands-on experience

Design core: programming by natural language, define the long tail

Agent:

An older blog post on LLMs and agents:
大语言模型与智能体:了解你的电子奴隶 (Large Language Models and Agents: Know Your Electronic Slave)

I introduced these concepts in that article. Here I repeat only the core ones.

An agent is a tool that calls other tools as directed by a prompt.

Such as: an agent for code review, an agent for unit testing, an agent for document drafting, ...

NOTICE:

One agent performs only one task within the harness.
Optimize the agent's prompt to save tokens. Efficiency is the first aim.
Reusability is the second aim.
Do NOT call any skills in an agent hook. Keep it clean: allowed tools only.
If skills are needed, put the skill names in the agent's prompt.

A small example prompt for a Python agent:

    ** Always use context7 to search the Python documentation for the current version
    - Begin the response by identifying the component or concept involved in the task
    - Provide code examples that follow the Google Python Style Guide
    - When suggesting solutions, explain the 'what' and 'why' to build understanding like a scientist
    - Anticipate follow-up questions and address potential edge cases
    - If a user's approach seems suboptimal, diplomatically suggest better alternatives

Skill:

The layout of a real skill, as introduced by Zirui SHENG:

zxm-skill/
├── SKILL.md (main instructions)
├── FORMS.md (form-filling guide)
├── REFERENCE.md (detailed API reference)
└── scripts/
    └── work.py (utility script)

When an agent starts, only 100~300 tokens are loaded: the skill's folder tree and the metadata of SKILL.md.

When a skill is delegated to do work, its full content is loaded into the context window.

If other layers are delegated, such as FORMS.md or REFERENCE.md, they are loaded only at that point.

So a skill saves a lot of tokens once you have defined it.

This is the layer-expansion design.
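The layer-expansion idea can be illustrated with a minimal sketch. This is not how any particular agent implements skills; the `Skill` class, its fields, and `context_chars` are hypothetical names, and characters stand in for tokens.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A skill whose layers enter the context only when they are needed."""

    name: str
    description: str                # metadata, always visible to the agent
    layers: dict[str, str]          # file name -> full text of that layer
    loaded: list[str] = field(default_factory=list)

    def startup_context(self) -> str:
        # At agent start only the name, description, and file list are loaded.
        return f"{self.name}: {self.description} [{', '.join(self.layers)}]"

    def load(self, layer: str) -> str:
        # A layer is added to the context only when it is delegated.
        if layer not in self.loaded:
            self.loaded.append(layer)
        return self.layers[layer]


def context_chars(skill: Skill) -> int:
    """Characters this skill currently occupies in the context window."""
    loaded_size = sum(len(skill.layers[name]) for name in skill.loaded)
    return len(skill.startup_context()) + loaded_size


demo = Skill(
    name="zxm-skill",
    description="form filling helper",  # hypothetical description
    layers={"SKILL.md": "s" * 2000, "FORMS.md": "f" * 5000, "REFERENCE.md": "r" * 20000},
)
print("at startup:", context_chars(demo))
demo.load("SKILL.md")
print("after SKILL.md is delegated:", context_chars(demo))
```

Until REFERENCE.md is actually needed, its 20000 characters never count against the context.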

Some other concepts:

MCP:

A way to define tools and their usage through an MCP server and client.

Example: ansatzX/Local_Read_MCP on GitHub

Hooks:

A trigger mechanism that is widely used, for example in CI/CD.

In an agent system, when the agent acts on some event, such as after using the Edit/Write tool or after getting a response from an arXiv tool, a pre-defined command is invoked. For example, a format checker and a formatter can be run on the code.

Most importantly, hooks are not loaded into the context window.
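A hook of this kind could look roughly like the sketch below. It assumes the agent passes the event to the hook as JSON with a `tool_name` field and a `tool_input.file_path` field; that payload shape, the function names, and the choice of `ruff` as checker and formatter are assumptions for illustration, not something this article prescribes.

```python
import json
import subprocess
import sys

# Tools whose use should trigger the hook (names taken from the text above).
EDIT_TOOLS = {"Edit", "Write"}


def formatter_commands(payload: dict) -> list[list[str]]:
    """Return the commands to run for one hook event (empty if none apply)."""
    if payload.get("tool_name") not in EDIT_TOOLS:
        return []
    path = payload.get("tool_input", {}).get("file_path", "")
    if not path.endswith(".py"):
        return []
    # Assumed tool choice: lint with ruff, then format with ruff.
    return [["ruff", "check", path], ["ruff", "format", path]]


def run_hook(stream) -> int:
    """Read one JSON event from `stream` and run the matching commands."""
    payload = json.load(stream)
    worst = 0
    for cmd in formatter_commands(payload):
        worst = max(worst, subprocess.run(cmd).returncode)
    return worst


# As a standalone hook script, the last line would be:
#     sys.exit(run_hook(sys.stdin))
```

Because the decision logic lives in a script outside the conversation, none of it costs context tokens.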

Slash command:

A slash command is an interactive command in Claude Code (CC) that lets the user trigger agent events.

Users can define their own slash commands.

HARNESS: If you use a prompt pattern more than once, create a slash command for it. This keeps the quality of your prompt consistent.

Context:

The usable context is not as large as the number on the model card, because the window also has to hold the output. If you serve models with SGLang or vLLM, you will run into this.
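The arithmetic is simple enough to write down. A minimal sketch, assuming the window is shared between prompt and output; the function name and the `reserved` parameter are hypothetical, and the numbers are only an example.

```python
def max_prompt_tokens(context_window: int, max_output_tokens: int, reserved: int = 0) -> int:
    """Tokens left for the prompt once the output budget is set aside.

    `reserved` is an optional extra margin (system prompt, tool schemas, ...).
    """
    budget = context_window - max_output_tokens - reserved
    if budget < 0:
        raise ValueError("output budget and reserve exceed the context window")
    return budget


# Example: a 128,000-token window with 16,000 tokens kept for the output
# leaves 112,000 tokens for everything you send in.
print(max_prompt_tokens(128_000, 16_000))
```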

All of the above is basic AI knowledge.

In Practice | Action on Coding

Here is a collection of my habits.

CyberBrain | 电子脑 (Electronic Brain)

Not all of these are my original ideas; many are snippets from resources on the internet.

Platform: macOS, with Homebrew (brew) installed for package management.

This powerful CyberBrain also works on other platforms if you make the corresponding compatibility adjustments.

Git-Credential-Manager with the right config

[credential]
        helper = manager
        credentialStore = cache
        cacheOptions = --timeout 3
        helper = /usr/local/share/gcm-core/git-credential-manager

The goal is that the AI cannot pollute upstream data.

CC-Switch

A model switcher for Claude Code, Codex, Gemini-CLI, OpenCode, and OpenClaw.

Neural Link

macOS iMessage:
Send a message to your iPhone.

Use iMessage to send me <--> a message "Task complete" with the summary.

After you finish, use mac to say "All done" and a brief summary of what you did.

Pushover Hook

Pushover sends notifications to your devices (Android, iOS, desktop).

If Claude Code needs a human in the loop and the user dismisses it:

After 60 s, it sends a notification to the user's device (iPhone).
After 1 hr, it sends a strong notification that overrides the mute state.

Use pushover to send me a message "Construction complete" with the summary.
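The escalation schedule can be sketched against the Pushover message API. The 60 s and 1 hr thresholds come from the text above; mapping the "strong" notification to Pushover priority 1 is an assumption, and the function names and the token and user arguments are placeholders, not the actual hook.

```python
import urllib.parse
import urllib.request
from typing import Optional

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def priority_for(waited_seconds: float) -> Optional[int]:
    """Map waiting time to a Pushover priority, or None if it is too early."""
    if waited_seconds >= 3600:
        return 1   # assumed mapping: high priority for the 1 hr reminder
    if waited_seconds >= 60:
        return 0   # normal priority for the 60 s reminder
    return None


def build_payload(token: str, user: str, message: str,
                  waited_seconds: float) -> Optional[dict]:
    """Build the form fields for one notification, or None if none is due."""
    priority = priority_for(waited_seconds)
    if priority is None:
        return None
    return {"token": token, "user": user, "message": message,
            "priority": str(priority)}


def send(payload: dict) -> int:
    """POST the payload to Pushover and return the HTTP status code."""
    data = urllib.parse.urlencode(payload).encode()
    with urllib.request.urlopen(PUSHOVER_URL, data=data, timeout=10) as resp:
        return resp.status
```

A hook would call `build_payload` with its own application token and user key, then pass the result to `send` when it is not `None`.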

Tachikoma

Make Claude Code drive other AI programming assistants.


With Tachikoma, you can easily build many kinds of agent organization.

Such as the Nine-Rank System (九品中正), the Three Departments and Six Ministries (三省六部), democratic centralism (民主集中), and the separation of powers (三权分立).

For agent teams, a simple design is collab-fix, which drives GitHub copilot-cli, gemini-cli, and other CLIs as subordinates.

Collab-fix: let three agents review your code. The task ends only when the fix made by Claude Code passes the review.

Use github copilot-cli and gemini-cli to review uncommitted changes.

or

/Tachikoma:collab-fix Fix the bug defined in the fix_plan.md file.

1. Call a Claude Code subagent, gemini-cli, and GitHub copilot-cli to analyze the code in parallel and each draft a fix_plan.md.
2. Compare the three plans and pick the best one via AskUserQuestion.
3. Ultralink: fix but do not commit.
4. Let Claude Code, gemini-cli, and codex review the changes.
5. Review the agents' reviews via AskUserQuestion.
6. Repeat steps 3-5 until all three agents are satisfied or the cycle reaches 5 turns. If there is no agreement by then, report the 'what' and the 'why'.
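The fix, review, and repeat part of this workflow can be sketched as a loop. This is a simplified illustration, not Tachikoma's code: the reviewers are plain callables standing in for the external CLIs, the AskUserQuestion steps are left out, and all names are hypothetical.

```python
from typing import Callable

# A reviewer takes a patch and returns (approved, comment).
Reviewer = Callable[[str], tuple[bool, str]]


def collab_fix(
    fix: Callable[[list[str]], str],
    reviewers: dict[str, Reviewer],
    max_rounds: int = 5,
) -> tuple[bool, str, list[str]]:
    """Repeat fix and review until every reviewer approves or rounds run out.

    Returns (agreed, last_patch, open_comments). When agreed is False,
    open_comments is the 'what' and 'why' to report to the user.
    """
    comments: list[str] = []
    patch = ""
    for _ in range(max_rounds):
        # The fixer sees the objections from the previous round.
        patch = fix(comments)
        results = {name: review(patch) for name, review in reviewers.items()}
        comments = [f"{name}: {note}" for name, (ok, note) in results.items() if not ok]
        if not comments:
            return True, patch, []
    return False, patch, comments
```

The cap of 5 rounds matches the rule above and keeps a disagreement between agents from burning tokens indefinitely.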

There are two ways to run this workflow:

Method 1: type a prompt that loads the gemini-cli and opencode skills and prompt it ''
Method 2: use a slash command

The first looks more flexible to extend and design.

But the second is the better engineering design.

From the point of view of the context window:

Method 1
The agent loads the skills after you prompt your design, which defeats the layer-expansion design of skills.
Method 2
It only uses bash to invoke the other CLIs, so the context window of Claude Code grows only by the tokens of the workflow text.

In practice, start with Method 1 to explore. In the end, consolidate the result into a workflow without skills, which is Method 2.

In this way, you save tokens and improve efficiency.

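Only domain expertise solves domain problems. Do not blindly use skills just because they look impressive; use them within the limits of your own competence.

Two Children Debating the Sun (两小儿辩日): Confucius could not settle the two children's debate because he was a humanities student. Galileo or Tycho could naturally have resolved the dispute.

Shooting the Halberd at the Camp Gate (辕门射戟): Lü Bu was a super athlete, so reconciling Ji Ling and Liu Bei was easy for him.

From the Records of the Grand Historian (史记), Biography of the Marquis of Huaiyin:

The emperor asked: "How many troops could someone like me command?" Xin said: "Your Majesty could command no more than a hundred thousand." The emperor said: "And what about you?" He said: "For me, the more the better." The emperor laughed: "If the more the better, why were you captured by me?" Xin said: "Your Majesty cannot command soldiers, but is good at commanding generals."

A scientist does not need to understand programming, but does need to understand their own requirements and how to communicate with software engineers.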