Knowing when to stop an Agent Loop

May 16, 2023

Background

When using prompts in agent workflows like ReAct, it's important to know when to stop. Imagine you're making a web browser that can perform tasks, like ordering a 12-inch cheese pizza to your apartment. If your agent can order and pay for the pizza, that's great! But be careful: it might keep going and order more pizzas! In these looping workflows, the agent needs a reliable way to signal that it is done.

Naive Approach

When I encountered this problem, my first instinct was to add a boolean property called complete to the JSON response. My prompt looked something like this:

Your goal is to "order me a 12 inch cheese pizza to my apartment"
Here are the tools available:
- CLICK - Click on a button
- FORM_FILL - Fill out a form
- BACK - Go back a page
- GOTO - Go to a URL
You are on johnsofbleecker.com. Respond with the next action you want to perform. If you are done, just say complete: true.
Here is the JSON format expected:
{
  tool: // One of the tools from above, or empty if complete is true
  thought: // Explain why you are picking this tool - (I'll save this tip for another day)
  complete: // Boolean. Set to true if you are done
}

The goal of this prompt is to return JSON that looks like this when you need to use a tool:

{ "tool": "CLICK", "thought": "Click on the order now button" }

And then after the goal is complete something like this:

{ "complete": true, "thought": "Your pizza is ordered. The goal is complete" }

While this sounds good on paper, it leads to a big problem.

Problem

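While this worked occasionally, it often led to a major error: the model would set complete to true in the same response as an action, and write a description in the tool field instead of a tool name. Even with GPT-4, I would receive JSON that looked like this: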

{ "tool": "Click on the order now button", "complete": true, "thought": "Click on the order now button to complete the goal" }
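A response like this can at least be detected in code. Here is a minimal sketch in Python, assuming the tool / thought / complete format from the naive prompt; the function name check_naive_response and the ALLOWED_TOOLS set are hypothetical names chosen for illustration, not code from the original project.

```python
import json

# Tool names from the naive prompt above.
ALLOWED_TOOLS = {"CLICK", "FORM_FILL", "BACK", "GOTO"}


def check_naive_response(raw: str) -> list[str]:
    """Return a list of problems found in a tool / thought / complete reply."""
    data = json.loads(raw)
    problems = []

    tool = data.get("tool") or ""           # treat a missing or null tool as empty
    complete = data.get("complete", False)  # treat a missing flag as "not done"

    if complete and tool:
        # The model claims it is done but also names an action.
        problems.append("complete is true but a tool was also given")
    if tool and tool not in ALLOWED_TOOLS:
        # The model wrote a description instead of a tool name.
        problems.append(f"unknown tool: {tool!r}")
    if not complete and not tool:
        problems.append("neither a tool nor complete was given")
    return problems


bad = '{"tool": "Click on the order now button", "complete": true, "thought": "..."}'
print(check_naive_response(bad))
```

A check like this only tells you that the response is inconsistent. It cannot tell you whether the model meant "click, then stop" or "I am already done", which is why validation alone does not fix the loop.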

I tried different prompts and some conditional logic in code that checked for complete first, but neither solution worked well. It turns out there is a much easier way.

Solution

The solution is actually super simple: just turn complete into a tool. Instead of having a separate boolean, create a new tool. Here’s what my prompt looks like:

Your goal is to "order me a 12 inch cheese pizza to my apartment"
Here are the tools available:
- CLICK - Click on a button
- FORM_FILL - Fill out a form
- BACK - Go back a page
- GOTO - Go to a URL
- CLOSE - Close browser after the action is completed
You are on johnsofbleecker.com. Respond with the next action you want to perform. If you are done, use the CLOSE tool.
Here is the JSON format expected:
{
  tool: // One of the tools from above
  thought: // Explain why you are picking this tool - (I'll save this tip for another day)
}

This simple change led to MUCH better performance. And if you want to fine-tune a model, this format should be easier to train on, since the model only has to get one property right.
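With the stop signal folded into the tool list, the loop only has to look at one field. Below is a rough sketch in Python of what that loop could look like. The names run_agent, next_action, scripted_model, and MAX_STEPS are all hypothetical; next_action stands in for whatever call you use to get the model's JSON reply, and the step cap is an extra safety net that is not part of the prompt above.

```python
import json
from typing import Callable

TOOLS = {"CLICK", "FORM_FILL", "BACK", "GOTO", "CLOSE"}
MAX_STEPS = 20  # assumption: a safety cap in case the model never picks CLOSE


def run_agent(next_action: Callable[[list[dict]], str]) -> list[dict]:
    """Ask the model for actions until it picks the CLOSE tool.

    next_action is a hypothetical stand-in for the model call: it receives
    the history of previous actions and returns the model's raw JSON reply.
    """
    history: list[dict] = []
    for _ in range(MAX_STEPS):
        action = json.loads(next_action(history))
        if action.get("tool") not in TOOLS:
            raise ValueError(f"unknown tool: {action.get('tool')!r}")
        history.append(action)
        if action["tool"] == "CLOSE":
            return history  # the single stop condition
        # A real agent would perform the browser action here.
    raise RuntimeError("agent did not choose CLOSE within MAX_STEPS")


# Scripted replies standing in for a model, for illustration only.
SCRIPT = [
    '{"tool": "CLICK", "thought": "Click on the order now button"}',
    '{"tool": "FORM_FILL", "thought": "Enter the delivery address"}',
    '{"tool": "CLOSE", "thought": "The pizza is ordered"}',
]


def scripted_model(history: list[dict]) -> str:
    # Return the next scripted reply based on how many steps have run.
    return SCRIPT[len(history)]


steps = run_agent(scripted_model)
print([step["tool"] for step in steps])
```

Because stopping is just another tool choice, there is no separate flag that can disagree with the tool field, so the loop needs no special-case logic beyond checking for CLOSE.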

Prompt Wrangler makes it easy to turn GPT prompts into structured APIs. This makes it easier to iterate on prompts without having to make code changes. Best of all, it's 100% free!
Try Out Prompt Wrangler