
What AI Agents Shouldn't Do On Their Own

nimda 3 weeks ago


Most writing about AI agents focuses on what they can do.

Autonomy is framed as the goal: give them tools, give them access, let them run.

The more freedom, the better the result.

That framing is largely accurate. I use agents every day. They have genuinely boosted my output. I believe in them!

And I have lost two hours of work to an agent that was doing exactly what I asked.

I was cleaning up a feature branch.

The task description said “remove unused files and clean up the repo.” The agent interpreted “unused” broadly: it removed a configuration directory that I hadn't touched in months but that the deployment script still referenced, and carried on.

I caught it during the diff review. The configuration was not in version control. It took two hours to rebuild from memory and git history.

The task was clear and the agent followed its instructions. The only problem was that nothing told it where to stop and wait.

Knowing which tasks to gate is part of running agents well. Give them full autonomy in the wrong area and you'll spend the afternoon undoing what took them thirty seconds.

Hello! My name is Sara Nóbrega and I teach you how to become an AI power user at Learn AI. It's free to sign up!

What the agent can touch on its own

Some actions are reversible. For example, a refactor can be rolled back or a new unit test can be deleted. The cost of an error is low.

Rollback cost varies by task. A refactor takes seconds to undo; you just revert the commit. A dropped production table can take your entire week, if recovery is possible at all.

The question to ask before delegating a task: can this be undone?

If so, let the agent go. If not, add a checkpoint before it starts.

Here is the permission matrix I work from:

Table showing recommended agent autonomy levels and human review requirements by task type. Small refactors and unit tests can have high agent autonomy, while API changes, dependencies, migrations, security, infrastructure, and production deployments require increasing levels of human review. Photo by Author and ChatGPT.
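A matrix like this can also live in the repo as a plain lookup table. The following is a minimal sketch, not the author's actual table: the three review levels, the category keys, and the level assigned to each category are illustrative assumptions that only follow the ordering described in the caption.

```python
from enum import IntEnum


class Review(IntEnum):
    """Hypothetical review levels, ordered from least to most human involvement."""
    NONE = 0    # agent may proceed on its own
    AFTER = 1   # a human reviews the diff afterwards
    BEFORE = 2  # a human approves before the agent starts


# Illustrative assignment only (assumption); adjust to your own risk tolerance.
PERMISSIONS = {
    "small_refactor": Review.NONE,
    "unit_tests": Review.NONE,
    "api_change": Review.AFTER,
    "dependency_update": Review.AFTER,
    "migration": Review.BEFORE,
    "security": Review.BEFORE,
    "infrastructure": Review.BEFORE,
    "production_deploy": Review.BEFORE,
}


def required_review(task_type: str) -> Review:
    """Look up the review level; unknown task types get the strictest level."""
    return PERMISSIONS.get(task_type, Review.BEFORE)
```

Defaulting unknown task types to the strictest level is a deliberate choice here: a task the table has never seen is treated as one that cannot be undone.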

The categories that should always need a human

Some categories require a human checkpoint regardless of how well defined the task is.

The risk of error is too high, and recovery too expensive, to let the agent decide for itself.

What AI Agents shouldn't face alone, part 1. Image created with DALL-E. — Photo by Author and ChatGPT.

1. Destructive file operations

`rm -rf`, `git clean -fd`, `git reset --hard`.

These delete or discard work that may not exist anywhere else and cannot be recovered.

The agent will run them if the task description implies cleanup.

I've had one run `git clean -fd` during a refactor because the task was “clean up temporary files.”

My uncommitted work was gone. There was no malfunction; the agent did exactly what the words said. The protection is an explicit blocklist with a confirmation step, not trusting the agent to decide where “cleanup” ends.
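A blocklist with a confirmation step can be quite small. The sketch below assumes the agent's shell commands pass through a wrapper you control; `is_blocked`, `gate`, and `BLOCKED_PREFIXES` are hypothetical names, and the prefix match is deliberately simple.

```python
import shlex

# Token prefixes for the commands named above; extend for your own repo.
BLOCKED_PREFIXES = [
    ["rm", "-rf"],
    ["git", "clean"],
    ["git", "reset", "--hard"],
]


def is_blocked(command: str) -> bool:
    """Return True if the command starts with any blocked token sequence."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input (e.g. unbalanced quotes) is treated as blocked
    return any(tokens[: len(prefix)] == prefix for prefix in BLOCKED_PREFIXES)


def gate(command: str, confirm=input) -> bool:
    """Return True if the command may run; blocked commands need an explicit 'yes'."""
    if not is_blocked(command):
        return True
    answer = confirm(f"Agent wants to run: {command!r}. Type 'yes' to allow: ")
    return answer.strip().lower() == "yes"
```

A prefix match like this misses variants such as `rm -fr`, `sudo rm -rf`, or a blocked command chained after `&&`, so treat it as a first line of defense rather than a sandbox.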

2. Database writes and migrations

Any `DELETE` without a `WHERE` clause, any `DROP` or `TRUNCATE`, any schema migration that affects production data.

A typo in a WHERE clause can wipe a table. An out-of-order migration can corrupt data that is impossible to reconstruct. Always review before running.
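The statement-level rules above can be checked mechanically before anything reaches the database. This is a rough sketch with regular expressions, not a SQL parser; `needs_review` is a hypothetical helper, and it only catches the missing-clause cases, not a typo inside a `WHERE` clause.

```python
import re

# Conservative statement-level patterns (assumption: one statement per ';').
_DROP_OR_TRUNCATE = re.compile(r"^\s*(DROP|TRUNCATE)\b", re.IGNORECASE)
_DELETE = re.compile(r"^\s*DELETE\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def needs_review(sql: str) -> bool:
    """Return True if any statement is a DROP, a TRUNCATE, or a DELETE without WHERE."""
    for statement in sql.split(";"):
        if _DROP_OR_TRUNCATE.search(statement):
            return True
        if _DELETE.search(statement) and not _WHERE.search(statement):
            return True
    return False
```

Splitting on `;` is naive (it breaks on semicolons inside string literals or comments), so a real gate would sit on top of a proper parser. Even then, a human still has to read the `WHERE` clause itself.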

3. Cloud infrastructure

`terraform apply`, `kubectl delete`, `aws iam *`, `gcloud iam *`.

Infrastructure changes affect live systems and, often, other teams. Permission changes are especially dangerous because the damage may go unnoticed until something fails.

What AI Agents shouldn't face alone, part 2. Image created with DALL-E. — Photo by Author and ChatGPT.

4. Production deployment

Any deployment to a production environment must go through a human review step, even if the code was generated by an agent.

CI/CD pipelines can automate the rollout, and that's fine. The decision to ship to production is yours.

You know what's in flight, what incidents are open, what maintenance is scheduled. The agent does not have that context, and cannot ask for it mid-pipeline.

5. Auth and security logic

Authentication flow, authorization rules, token management, session management.

Bugs here don't surface in unit tests; they surface in incident reports, sometimes months later.

An agent that writes auth logic will produce something that looks correct and passes the happy path.

The dangerous cases are the edge cases: a token that does not expire under a certain sequence of API calls, a route that bypasses the middleware if a parameter is missing.

That's exactly where unit tests fall short and where security review earns its place. Every auth change needs someone looking for those gaps, not just confirming that the happy path passes.
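The missing-parameter case can be made concrete with a toy check. This is a hypothetical illustration, not code from any real framework: `is_authorized_buggy`, `is_authorized`, and the `token` parameter name are invented for the example.

```python
def is_authorized_buggy(params: dict, valid_tokens: set) -> bool:
    """Looks right on the happy path, but a missing token falls through as allowed."""
    token = params.get("token")
    if token is not None and token not in valid_tokens:
        return False
    return True  # bug: reached both for valid tokens and for no token at all


def is_authorized(params: dict, valid_tokens: set) -> bool:
    """Deny by default: only an explicitly present, known token is accepted."""
    token = params.get("token")
    return token is not None and token in valid_tokens
```

A test that only sends a valid token and an invalid token passes against both versions. Only a request with no token at all tells them apart, which is the kind of gap a reviewer has to go looking for.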

6. Secrets, `.env` files, API keys

An agent reading or writing credentials creates exposure risk. Keep this category off limits by default and manage it manually.
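If the agent's file access goes through a wrapper you control, "off limits by default" can be a filename check. A minimal sketch: `is_off_limits` is a hypothetical helper, and every pattern other than `.env` is an assumption about what a project might treat as secret.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Only ".env" comes from the text above; the other patterns are illustrative.
SECRET_PATTERNS = [".env", ".env.*", "*.pem", "*.key", "credentials*"]


def is_off_limits(path: str) -> bool:
    """Return True if the file name matches any secret pattern (case-sensitive)."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in SECRET_PATTERNS)
```

This only looks at file names. It does nothing about secrets pasted into prompts, logs, or ordinary config files, which still need manual handling.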

`git push --force` sits in its own category because it rewrites history on the remote. Once pushed, other contributors' local branches diverge. Recovery is painful and sometimes impossible.

Humans should be careful with all of these commands too. Agents make it easy to trigger them accidentally, buried within a long sequence of safe steps.

AGENTS.md: write a contract

Give agents explicit structure from the start. An AGENTS.md file in the root of your repo tells the agent what the project is, how it's run, and what it's not allowed to touch without asking.

A vague AGENTS.md gets you an agent that fills in the blanks with guesswork. I learned this on a codebase that had no AGENTS.md at all.

The task was to “organize the project structure.” The agent moved files across directories based on naming conventions that made sense to it. Every reference to those paths broke.

The task took the agent twenty minutes; the cleanup took me two hours. Three lines of scope restrictions would have prevented it entirely.

Here is the template I use:

````markdown
# AGENTS.md

## Project

[Brief description of the project and tech stack]

## Setup

```bash
# Install
npm install  # or pip install -r requirements.txt

# Run
npm run dev

# Test
npm test

# Lint
npm run lint
```

## Coding rules

- Make minimal changes. Don't refactor unrelated code.
- If behavior changes, add or update tests.
- Don't touch files outside the scope of the task.
- Keep diffs readable. One concern per commit.

## Safety rules

Ask before running any command in blocked_commands.md.
If you're unsure whether a command is safe, stop and ask.

## Definition of done

- Tests pass
- Diff is explainable in one sentence
- Final report provided (see below)

## Final report format

After every task, provide:

1. Summary of changes
2. Files changed
3. Tests run and result
4. Risks or assumptions
5. Anything not completed
````

A companion file, `blocked_commands.md`, lists exactly what requires human approval before running:

# blocked_commands.md

## Destructive file operations

- rm -rf

- git clean -fd

- git reset --hard

## Git operations

- git push --force

- git push --force-with-lease

## Database operations

- DROP TABLE

- TRUNCATE TABLE

- DELETE without WHERE clause

- Any migration that alters a production schema

## Cloud / infrastructure

- terraform apply

- kubectl delete

- aws iam *

- gcloud iam *

## Secrets

- Any command reading or writing .env files

- Any command touching API keys or credentials

When AGENTS.md is vague, the agent guesses. When it is specific, the agent follows it, so the file is your contract. Write it before you start work, not after something breaks.


A two-agent loop

For anything of medium risk or higher, don't use one agent; use two.

Agent 1 builds. Agent 2 reviews. Then Agent 1 applies only the critical feedback.

Builder prompt:

You are a senior software engineer implementing a specific task.

Task: [describe the task]

Context: [link to AGENTS.md or paste relevant sections]

Rules:

- Make minimal changes.

- Stay in scope.

- Don't refactor unrelated code.

- Add tests if behavior changes.

- When done, provide a final report: summary, files changed,

  tests run, risks, anything incomplete.

Reviewer prompt:

You are a code reviewer with no attachment to the implementation.

Review this diff: [paste diff]

Check for:

- Bugs and edge cases

- Missing tests

- Security issues

- Unintended behavior changes

- Anything outside the stated scope

Output:

- Critical issues (must fix)

- Minor issues (optional)

- Anything you'd flag for a human

Do not rewrite the code. Flag, don't fix.

The reviewer agent has no ego investment in the code. It checks for bugs, edge cases, test coverage, and security issues without trying to redo the job.

Code review is how you catch bugs. The two-agent loop is the same process, automated.
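The loop itself is only a few lines of orchestration. The sketch below makes several assumptions: each agent is a plain callable that takes a prompt and returns text (stand-ins for whatever client you use), the reviewer's output has a line starting with "Critical issues" followed by `- ` bullets, and `two_agent_loop` and `extract_critical` are hypothetical names.

```python
from typing import Callable

Agent = Callable[[str], str]  # hypothetical interface: prompt in, text out


def extract_critical(review: str) -> list[str]:
    """Collect the bullet lines that follow a 'Critical issues' heading."""
    critical, in_section = [], False
    for line in review.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("critical issues"):
            in_section = True
        elif stripped.startswith("- ") and in_section:
            critical.append(stripped[2:])
        elif stripped:
            in_section = False  # any other non-empty line ends the section
    return critical


def two_agent_loop(task: str, builder: Agent, reviewer: Agent, max_rounds: int = 2) -> str:
    """Build, review, then feed only the critical issues back to the builder."""
    diff = builder(f"Task: {task}")
    for _ in range(max_rounds):
        critical = extract_critical(reviewer(f"Review this diff:\n{diff}"))
        if not critical:
            break  # nothing critical left; minor issues are ignored by design
        fixes = "\n".join(f"- {issue}" for issue in critical)
        diff = builder(
            f"Task: {task}\nFix only these critical issues:\n{fixes}\nCurrent diff:\n{diff}"
        )
    return diff
```

Capping the rounds keeps the two agents from trading revisions indefinitely. Whatever is still flagged after the last round goes to a human.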

Final report

Require a final report after every agent task:

1. Summary of changes

2. Files changed

3. Tests run and results

4. Risks or assumptions

5. Anything not completed

This makes the agent accountable. If it cannot summarize what it did in plain words, that is a sign the work was not clean.

It also creates documentation without you having to write it manually. The reports accumulate. If something breaks a week later, you can trace what changed and why.
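For the reports to be traceable later, they need to land somewhere searchable. One option is an append-only JSON Lines file. A minimal sketch: the field names mirror the five report items, while `FinalReport`, `append_report`, `reports_touching`, and the log file name are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class FinalReport:
    summary: str
    files_changed: list[str]
    tests: str
    risks: list[str] = field(default_factory=list)
    not_completed: list[str] = field(default_factory=list)


def append_report(report: FinalReport, log_path: Path) -> None:
    """Append one report as a timestamped JSON line."""
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), **asdict(report)}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def reports_touching(path: str, log_path: Path) -> list[dict]:
    """Return every logged report whose changed files include the given path."""
    with log_path.open(encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]
    return [entry for entry in entries if path in entry["files_changed"]]
```

When something breaks, `reports_touching("src/util.py", Path("agent_reports.jsonl"))` would list every agent task that changed that file, with the summary and stated risks alongside.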

The unglamorous work

The unglamorous work behind AI agents. Image created with DALL-E. Photo by Author and ChatGPT.

The hype around AI agents is here to stay, and much of it is deserved. They do increase your output.

The people who get the most out of them are the ones who do the setup work: they write the AGENTS.md, think through permission levels, create a list of blocked commands, and set up a two-agent loop.

Agents work best when they have clear instructions. That part is up to you.

Thanks for reading!

You can find me on LinkedIn and Substack, where I share more about AI and LLMs.
