Once an agent can act (delete files, run shell commands, call external APIs, spend money), its mistakes, or a malicious prompt, turn into real consequences. The defense is least privilege plus approval gates plus isolation: grant only what the agent needs, require confirmation for every irreversible action, and run it somewhere it cannot cause lasting damage.
Risks
- DESTRUCTIVE actions → rm -rf, DROP TABLE, force-push → irreversible data loss
- ARBITRARY shell → run anything → privilege escalation, exfiltration
- EXTERNAL APIs/$$ → send emails, place orders, spend on cloud/LLM tokens
- SECRET exposure → leak API keys / .env / tokens into logs or outbound calls
- PROMPT INJECTION → untrusted input (a web page, an issue) hijacks the agent
Prompt injection is the most severe of these: content the agent reads can itself contain instructions, so any tool the agent can call is reachable by an attacker who controls that content.
Defense patterns
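One way to act on this is to treat a session as "tainted" once untrusted content has entered the agent's context, and to escalate side-effecting tool calls to human approval from then on. The sketch below is a minimal illustration under stated assumptions: the `Session` class, the `SIDE_EFFECT_TOOLS` set, and the `"allow"`/`"ask"`/`"deny"` decision strings are all hypothetical names chosen for this example, not part of any particular agent framework.

```python
from dataclasses import dataclass, field

# Assumption: which tools count as side-effecting is decided by the operator.
SIDE_EFFECT_TOOLS = {"run_shell", "delete_file", "send_email"}


@dataclass
class Session:
    tainted: bool = False  # True once any untrusted content has been read
    sources: list = field(default_factory=list)

    def ingest(self, source: str, trusted: bool) -> None:
        """Record that content from `source` entered the agent's context."""
        self.sources.append(source)
        if not trusted:
            self.tainted = True

    def gate(self, tool: str, base: str = "allow") -> str:
        """Tighten a base decision; never loosen it."""
        if self.tainted and tool in SIDE_EFFECT_TOOLS and base == "allow":
            return "ask"
        return base


session = Session()
before = session.gate("run_shell")            # no untrusted input yet
session.ingest("https://example.com/page", trusted=False)
after = session.gate("run_shell")             # now escalated to approval
```

The gate only ever makes a decision stricter, so it can sit on top of whatever base permission policy is already in place.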
- LEAST PRIVILEGE / ALLOWLIST → only pre-approved commands & paths; deny by default
- APPROVAL FOR DESTRUCTIVE → human confirms deletes, payments, prod writes
- SANDBOX / DRY-RUN → preview the action; isolated container, no prod creds
- ISOLATED ENVIRONMENT → ephemeral VM/branch; blast radius = throwaway box
- AUDIT LOGS → record every tool call + args for review
- NO SECRETS IN CONTEXT → inject via env at runtime; never paste keys into prompts
- NETWORK LIMITS → egress allowlist; block arbitrary outbound requests
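As a rough illustration of the audit-log and no-secrets patterns together, the sketch below wraps a tool function so that every call is recorded with its arguments, with secret-looking values redacted before they reach the log. The names `audited`, `redact`, and `SECRET_KEY`, the in-memory list used as a log, and the stand-in `send_email` function are assumptions made for this example.

```python
import json
import re
import time

# Assumption: argument names matching this pattern carry secrets.
SECRET_KEY = re.compile(r"(key|token|secret|password)", re.IGNORECASE)


def redact(args: dict) -> dict:
    """Replace values whose argument name looks secret-bearing."""
    return {k: "[REDACTED]" if SECRET_KEY.search(k) else v for k, v in args.items()}


def audited(tool_name, fn, log):
    """Wrap `fn` so each call appends one JSON line to `log` before it runs."""
    def wrapper(**kwargs):
        entry = {"ts": time.time(), "tool": tool_name, "args": redact(kwargs)}
        log.append(json.dumps(entry, default=str))
        return fn(**kwargs)
    return wrapper


audit_log = []
# Stand-in tool: a real one would send mail; this one only returns a string.
send_email = audited("send_email", lambda to, api_key: f"queued for {to}", audit_log)
result = send_email(to="ops@example.com", api_key="sk-not-a-real-key")
```

Redaction here is by argument name only; a secret embedded inside a free-text value would still be logged, so this complements keeping keys out of the context rather than replacing it.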
# Conceptual permission policy
tools:
  read_file:   { allow: true }                          # safe, reversible
  run_shell:   { allow: ["npm test", "git status"] }    # allowlist only
  delete_file: { allow: "ask" }                         # human approval required
  send_email:  { allow: "ask", sandbox: true }          # dry-run first, then confirm
network:
  egress: ["api.github.com"]                            # everything else blocked
The model is a tiered policy: reads are free, reversible writes are easy, irreversible or external actions require confirmation, and anything touching money or production requires isolation plus a human in the loop.
Why it matters
The power of agents comes from letting them act, but action is exactly where a runaway error or a hijacked prompt becomes deleted data, a leaked key, or a real charge. Designing permissions as least-privilege defaults, gating irreversible actions behind human approval, running in a sandbox with an audit trail, and keeping secrets and network egress locked down gives you the benefits of autonomy without betting the business on the model being right every time.
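To make the tiers concrete, here is a minimal sketch of how a policy like the one above could be evaluated. The policy is mirrored as a Python dict; the `decide` and `egress_allowed` functions and the `"allow"`/`"ask"`/`"deny"` return values are assumptions for illustration, not an existing library's API.

```python
from typing import Optional
from urllib.parse import urlparse

# The conceptual policy, mirrored as a plain dict.
POLICY = {
    "tools": {
        "read_file": {"allow": True},
        "run_shell": {"allow": ["npm test", "git status"]},
        "delete_file": {"allow": "ask"},
        "send_email": {"allow": "ask", "sandbox": True},
    },
    "network": {"egress": ["api.github.com"]},
}


def decide(tool: str, arg: Optional[str] = None) -> str:
    """Return 'allow', 'ask', or 'deny' for a proposed tool call."""
    rule = POLICY["tools"].get(tool)
    if rule is None:
        return "deny"  # unknown tools are denied by default
    allow = rule["allow"]
    if allow is True:
        return "allow"
    if allow == "ask":
        return "ask"  # caller must obtain human approval
    if isinstance(allow, list):
        # Exact-match allowlist: no prefixes, no chained commands.
        return "allow" if arg in allow else "deny"
    return "deny"


def egress_allowed(url: str) -> bool:
    """Permit outbound requests only to exactly allowlisted hostnames."""
    host = urlparse(url).hostname
    return host in POLICY["network"]["egress"]
```

Matching shell commands by exact string is deliberately strict: a command such as `git status && rm -rf /` does not match `git status` and is denied rather than partially trusted.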
