2026-03-31
How to Use Coding Agents Faster, Better, and Cheaper
Author: Cunxi Gong
Key: Precise + Concise
Agent = Model + Harness
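There are countless agents and countless models. Which should you use?
- Is there really a best model?
- Is there really a best agent?
- What do you do when money flows out like water?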
Choice of Agent system
Pathway 1: Build your own agent system from scratch
Such as https://github.com/ansatzX/AIE_Brainiac
Pathway 2: Build on one of the best agent apps from the community
Such as Claude Code, Gemini-CLI, Codex, etc.
Cheap models may not be good enough.
Expensive models may be good, but are still not enough on their own.
For example, the Gemini series seems to be getting worse over time, and GPT-5.4 is good but too expensive.
Some agent applications also have to be paid for.
The worst case is that you spend a lot of money and get nothing done.
The Solution for Beginner
Harness
Save Tokens + Control context window
User: observation, design capability, hands-on experience
Design core: programming in natural language, defining the long tail
Agent:
An older blog post of mine on LLMs and agents:
"Large Language Models and Agents: Know Your Electronic Slaves" (大语言模型与智能体:了解你的电子奴隶)
I introduced these concepts in that article. Here I repeat the core ones.
An agent is a tool that calls other tools as directed by a prompt.
Examples: an agent for code review, an agent for unit tests, an agent for document drafting, ...
NOTICE:
- One agent performs only one task within the harness.
- Optimize the agent's prompt to save tokens. Efficiency is the first aim.
- Reusability is the second aim.
- DO NOT call any skills in an agent hook. Keep it clean: allowed tools only.
- If skills are needed, put the skill names in the agent's prompt.
A small example prompt for a Python agent:
- Always use context7 to search the Python documentation for the current version
- Begin the response by identifying the component or concept involved in the task
- Provide code examples that follow the Google Python Style Guide
- When suggesting solutions, explain the 'what' and 'why' to build understanding, like a scientist
- Anticipate follow-up questions and address potential edge cases
- If a user's approach seems suboptimal, diplomatically suggest better alternatives
Skill:
The layout of a real skill, as introduced by Zirui SHENG:
zxm-skill/
├── SKILL.md (main instructions)
├── FORMS.md (form-filling guide)
├── REFERENCE.md (detailed API reference)
└── scripts/
    └── work.py (utility script)
When an agent starts, only 100~300 tokens are loaded: the skill's folder tree and the metadata of SKILL.md.
When a skill is delegated work, its full content is loaded into the context window.
If deeper layers such as FORMS.md or REFERENCE.md are then needed, they are loaded at that point.
So a skill saves a lot of tokens once you have defined it.
This is the layer-expansion design.
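The layer-expansion idea can be sketched in a few lines of Python. This is only an illustration, not how any particular agent implements skills: the `Skill` class, `read_metadata`, `build_demo_skill`, and the `---` front-matter layout with `name` and `description` keys are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def read_metadata(skill_md: Path) -> dict:
    """Parse only the front-matter block between the first two '---' lines."""
    meta = {}
    with skill_md.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return meta  # no front matter: nothing is preloaded
        for line in fh:
            if line.strip() == "---":
                break  # stop before the body, so the body is never read
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta


class Skill:
    """Hypothetical loader: metadata at startup, everything else on demand."""

    def __init__(self, root: Path):
        self.root = root
        self.meta = read_metadata(root / "SKILL.md")  # layer 0: cheap
        self.loaded = {}  # file name -> text currently in the context window

    def invoke(self) -> str:
        """Layer 1: load the full SKILL.md when the skill is delegated."""
        return self.load("SKILL.md")

    def load(self, name: str) -> str:
        """Layer 2: load FORMS.md, REFERENCE.md, ... only when referenced."""
        if name not in self.loaded:
            self.loaded[name] = (self.root / name).read_text(encoding="utf-8")
        return self.loaded[name]


def build_demo_skill(base: Path) -> Path:
    """Create a tiny throwaway skill folder for the demonstration."""
    root = base / "zxm-skill"
    root.mkdir()
    (root / "SKILL.md").write_text(
        "---\nname: zxm-skill\ndescription: demo skill\n---\n"
        "Main instructions. See FORMS.md for forms.\n",
        encoding="utf-8",
    )
    (root / "FORMS.md").write_text("Form-filling guide.\n", encoding="utf-8")
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill = Skill(build_demo_skill(Path(tmp)))
        print(skill.meta)           # only metadata so far
        skill.invoke()              # full SKILL.md enters the context
        skill.load("FORMS.md")      # deeper layer, loaded on demand
        print(sorted(skill.loaded))
```

The point of the design is that the cost of an unused skill stays at the size of its metadata, however large its reference files are.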
Some other concepts:
MCP:
A way to define tools and their usage through an MCP server and client.
Example: GitHub - ansatzX/Local_Read_MCP
Hooks:
A trigger mechanism that is widely used, for example in CI/CD.
In an agent system, when the agent hits a certain event (for example, after it uses the Edit/Write tool, or after it gets a response from an arXiv tool), a predefined command is invoked, such as a format checker and a formatter run on the code.
Most importantly, hooks are not loaded into the context window.
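A toy dispatcher makes the "hooks stay out of the context" property concrete. Everything here is hypothetical: `HookRunner`, the `post:<tool>` event naming, and `check_format` are invented for the sketch and are not the hook API of Claude Code or any other agent.

```python
from collections import defaultdict
from typing import Callable

HookFn = Callable[[dict], str]


class HookRunner:
    """Hypothetical event dispatcher; not the API of any real agent."""

    def __init__(self):
        self.hooks = defaultdict(list)  # event name -> registered hooks
        self.context = []               # what the model actually sees
        self.hook_log = []              # hook output, kept outside the context

    def on(self, event: str, fn: HookFn) -> None:
        self.hooks[event].append(fn)

    def tool_call(self, tool: str, payload: dict) -> None:
        # The tool call itself is part of the conversation.
        self.context.append(f"{tool}: {payload['path']}")
        # Hooks fire after the tool; their output never enters the context.
        for fn in self.hooks[f"post:{tool}"]:
            self.hook_log.append(fn(payload))


def check_format(payload: dict) -> str:
    """Stand-in for a real formatter or linter invoked by the hook."""
    text = payload["content"]
    return "ok" if text.endswith("\n") else "missing trailing newline"


if __name__ == "__main__":
    runner = HookRunner()
    runner.on("post:Edit", check_format)
    runner.tool_call("Edit", {"path": "a.py", "content": "x = 1"})
    print(runner.context)   # the edit is visible to the model
    print(runner.hook_log)  # the format check result is not
```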
Slash command:
A slash command is an interactive command in CC (Claude Code) that lets the user trigger agent events.
Users can define their own slash commands.
HARNESS: If you use a prompt pattern more than once, turn it into a slash command. Reusing it keeps the quality of your prompt consistent.
Context:
The usable context is smaller than the figure stated on the model card, because the window also has to hold the output. If you have served models with SGLang/vLLM, you will have seen this.
All of the above is basic AI knowledge.
In Practice | Action on Coding
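The arithmetic is simple enough to write down. The function below is a sketch with a hypothetical name, and the numbers in the demo are illustrative, not taken from any model card. For reference, vLLM's `max_model_len` bounds prompt and generated tokens together, which is the same constraint.

```python
def input_budget(context_window: int, max_output_tokens: int,
                 fixed_overhead: int = 0) -> int:
    """Tokens left for the prompt once output and fixed overhead are reserved.

    fixed_overhead covers things that are always present, such as the
    system prompt and tool definitions (an assumption of this sketch).
    """
    budget = context_window - max_output_tokens - fixed_overhead
    if budget < 0:
        raise ValueError("reserved tokens exceed the context window")
    return budget


if __name__ == "__main__":
    # Illustrative numbers only: a 200k window, 32k reserved for output,
    # 8k of system prompt and tool definitions.
    print(input_budget(200_000, 32_000, 8_000))
```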
Here is a collection of my own habits.
Not all of them are my original ideas. Many are snippets from resources on the internet.
Platform: macOS, with brew available to install packages.
This powerful CyberBrain also works on other platforms if you make the corresponding adjustments.
Git-Credential-Manager with the right config
[credential]
helper = manager
credentialStore = cache
cacheOptions = --timeout 3
helper = /usr/local/share/gcm-core/git-credential-manager
This makes sure the AI cannot pollute upstream data.
CC-Switch
Model switcher for Claude Code, Codex, Gemini-CLI, opencode, and openclaw.
Neural Link
- macOS iMessage:
Sends a message to your iPhone
Use imessage to send me <--> a message "Task complete" with the summary.
After you finish, use mac to say "All done" and a brief summary of what you did.
- Pushover Hook
Pushover sends notifications to your devices (Android, iOS, desktop).
If Claude Code needs a human in the loop and the user dismisses it:
- After 60 s, it sends a notice to the user's device (iPhone).
- After 1 hr, it sends a strong notice that overrides the mute state.
Use pushover to send me a message "Construction complete" with the summary.
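The two-stage escalation of the Pushover hook can be sketched as a pure function that decides what to send for a given wait time. The function name and the title text are hypothetical, and mapping the "strong notice" to Pushover priority 1 (high priority, bypasses quiet hours) is my assumption about how the hook is configured. The returned fields would still need the application `token` and `user` key before being posted to Pushover.

```python
from typing import Optional

REMINDER_AFTER_S = 60        # first notice, from the text above
ESCALATE_AFTER_S = 60 * 60   # strong notice, from the text above


def build_notice(waited_s: float, summary: str) -> Optional[dict]:
    """Return message fields for the current wait time, or None if too early."""
    if waited_s < REMINDER_AFTER_S:
        return None
    escalated = waited_s >= ESCALATE_AFTER_S
    return {
        "title": "Claude Code is waiting for you",  # hypothetical wording
        "message": summary,
        # Assumption: 0 = normal, 1 = high priority that bypasses quiet hours.
        "priority": 1 if escalated else 0,
    }


if __name__ == "__main__":
    for waited in (10, 90, 4000):
        print(waited, build_notice(waited, "Permission prompt pending"))
```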
Tachikoma
Make Claude Code drive other AI programming assistants.

With Tachikoma, you can easily build many kinds of agent organization.
For example: the Nine-Rank System, the Three Departments and Six Ministries, democratic centralism, or separation of powers.
For team agents, a simple design is collab-fix, which drives GitHub copilot-cli, gemini-cli, and other CLIs as subordinates.
Collab-fix: let three agents review your code. The task ends only when the fix made by Claude Code has passed the review.
Use github copilot-cli and gemini-cli to review uncommitted changes.
or
/Tachikoma:collab-fix Fix the bug defined in the fix_plan.md file.
1. Call a claude code subagent, gemini-cli, and github copilot-cli to analyze the code in parallel and draft a fix_plan.md
2. Compare the three plans and pick the best one via AskUserQuestion
3. Ultralink: fix but do not commit
4. Let claude code, gemini-cli, and codex review the changes
5. Review the agents' reviews via AskUserQuestion
6. Repeat steps 3-5 until all three agents are satisfied or the loop reaches 5 turns. If there is no agreement by then, report the 'what' and 'why'
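The fix and review loop at the heart of this workflow can be sketched as follows. This is not Tachikoma's implementation: `collab_fix`, `Review`, `make_fixer`, and `make_reviewer` are hypothetical names, the reviewers are stand-in functions rather than real CLI calls, and the human check via AskUserQuestion is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_TURNS = 5  # cap on fix/review cycles, from the workflow above


@dataclass
class Review:
    reviewer: str
    approved: bool
    reason: str = ""


Reviewer = Callable[[str], Review]      # takes a diff, returns a verdict
Fixer = Callable[[List[Review]], str]   # takes prior objections, returns a diff


def collab_fix(fix: Fixer, reviewers: List[Reviewer]) -> dict:
    """Loop fix -> review until every reviewer approves or MAX_TURNS is hit."""
    objections: List[Review] = []
    for turn in range(1, MAX_TURNS + 1):
        diff = fix(objections)                   # fix, but do not commit
        reviews = [r(diff) for r in reviewers]   # independent reviews
        objections = [rv for rv in reviews if not rv.approved]
        if not objections:                       # everyone is satisfied
            return {"agreed": True, "turns": turn, "diff": diff}
    # No agreement: report who objected (the 'what') and why.
    return {"agreed": False, "turns": MAX_TURNS,
            "report": [(o.reviewer, o.reason) for o in objections]}


def make_fixer() -> Fixer:
    """Demo fixer that produces 'patch v1', 'patch v2', ... on each call."""
    state = {"version": 0}

    def fix(objections: List[Review]) -> str:
        state["version"] += 1
        return f"patch v{state['version']}"

    return fix


def make_reviewer(name: str, min_version: int) -> Reviewer:
    """Demo reviewer that approves once the patch version is high enough."""

    def review(diff: str) -> Review:
        ok = int(diff.rsplit("v", 1)[1]) >= min_version
        return Review(name, ok, "" if ok else f"wants v{min_version}+")

    return review


if __name__ == "__main__":
    team = [make_reviewer("a", 1), make_reviewer("b", 2), make_reviewer("c", 3)]
    print(collab_fix(make_fixer(), team))
```

Passing the previous objections back into the fixer is one possible design choice; it lets each new attempt respond to specific complaints instead of starting blind.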
There are two ways to implement this workflow:
- Method 1: type a prompt that loads the gemini-cli and opencode skills and prompt it ''
- Method 2: use a slash command
The first looks more flexible to extend and redesign.
But the second is the better engineering design.
From the context point of view:
- Method 1: the agent loads the skills after you prompt your design, which defeats the layer-expansion design of skills.
- Method 2: it only uses bash to invoke the other CLIs, so the context window of Claude Code only gains the tokens of the workflow text.
In practice, start with method 1 to explore. Then settle the result into a workflow without skills, which is method 2.
This way you save tokens and improve efficiency.
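Only professional knowledge can solve professional problems. Do not blindly use skills just because they look impressive; use them within the range of your own competence.
Two Children Debating the Sun: Confucius could not settle the two children's debate because he was trained in the humanities. Galileo or Tycho could naturally have resolved the dispute.
Shooting the Halberd at the Camp Gate: Lü Bu was a super athlete, so reconciling Ji Ling and Liu Bei was easy for him.
The Records of the Grand Historian, Biography of the Marquis of Huaiyin, says:
The emperor asked: 'How many troops could someone like me command?' Xin said: 'Your Majesty could command no more than a hundred thousand.' The emperor said: 'And what about you?' He said: 'For me, the more the better.' The emperor laughed: 'If the more the better, why were you captured by me?' Xin said: 'Your Majesty cannot command troops, but is good at commanding generals.'
A scientist need not know how to program, but must understand their own requirements and how to communicate with software engineers.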