The AI automation space is buzzing with excitement about browser agents. Everywhere you look, there are bold claims: “Automate everything with AI!” “No-code automation is finally here!” “Let AI do the work for you!”

As someone who’s always eager to test new productivity tools, I decided to put these promises to the test with a real-world use case. What followed was a humbling reminder that we’re still in the early days of this technology.

[Image: ChatGPT Atlas agent mode]

I set out with a straightforward goal: create an automated workflow that would send daily learning materials to my Telegram and Discord channels. Nothing fancy, just a basic n8n workflow (sketched as plain Python after the list, for reference) that would:

- Pull content or generate materials using OpenAI
- Format the content appropriately
- Send it to Telegram and Discord
- Run this daily on a schedule
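
For reference, here's roughly what those four steps amount to as a plain Python script. This is a sketch under assumptions: the model name, prompt, and environment-variable names are placeholders, and the daily schedule would come from n8n's Schedule Trigger (or cron) rather than from the script itself.

```python
import os

import requests

# Placeholder credentials -- supplied via environment variables in this sketch.
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
TELEGRAM_BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
TELEGRAM_CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]
DISCORD_WEBHOOK_URL = os.environ["DISCORD_WEBHOOK_URL"]


def generate_material() -> str:
    """Steps 1-2: generate and format today's material via OpenAI's chat completions API."""
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
        json={
            "model": "gpt-4o-mini",  # placeholder model name
            "messages": [
                {"role": "user", "content": "Write today's short learning tip about automation."}
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"].strip()


def send_telegram(text: str) -> None:
    """Step 3a: Telegram's Bot API sendMessage takes a chat_id and the message text."""
    requests.post(
        f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage",
        json={"chat_id": TELEGRAM_CHAT_ID, "text": text},
        timeout=30,
    ).raise_for_status()


def send_discord(text: str) -> None:
    """Step 3b: Discord webhooks accept a JSON body with a 'content' field."""
    requests.post(DISCORD_WEBHOOK_URL, json={"content": text}, timeout=30).raise_for_status()


if __name__ == "__main__":
    # Step 4 (the daily schedule) lives outside the script: n8n's Schedule Trigger, or cron.
    material = generate_material()
    send_telegram(material)
    send_discord(material)
```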

This seemed like the perfect use case for browser agents. It’s repetitive, rule-based, and exactly the kind of task that AI automation should excel at.

I chose two popular browser agents that have been getting a lot of attention:

ChatGPT Atlas - OpenAI’s browser automation tool that promises to understand and interact with web interfaces intelligently.

Comet Browser - Perplexity’s AI-powered browser agent, designed to automate complex web workflows.

Both tools market themselves as capable of handling automation tasks with minimal human intervention. Let’s see how they performed.

Test 1: ChatGPT Atlas
Time invested: 10 minutes
Expected outcome: A working n8n workflow with OpenAI integration
Actual outcome: Complete failure

Here’s what went wrong:
Problem 1: Node Selection Failure
The most basic task in n8n is selecting and configuring nodes. ChatGPT Atlas struggled with this fundamental operation. It couldn’t properly identify or select the OpenAI node, which was critical for my workflow.
I watched as it hovered over the wrong elements, clicked in the wrong places, and generally seemed confused about the n8n interface structure.

[Image: n8n OpenAI node]

Problem 2: Expression Passing Issues
In n8n, you often need to pass data from one node to another using expressions. This is workflow automation 101. ChatGPT Atlas completely failed at this.

[Image: expression passing issue]

Even when I tried to guide it, it couldn’t understand how to reference previous node outputs or construct basic expressions. What should have been straightforward data flow became an impossible task.
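
For context on what the agent was failing to produce: an n8n expression is just a short template wrapped in double curly braces. Something like the following pulls a field from the incoming item, or from a specific upstream node (the node name and field path here are illustrative; the exact path depends on the node’s output):

```
{{ $json.text }}
{{ $('OpenAI').item.json.text }}
```

The first form references the item flowing into the current node; the second references the output of a named earlier node. That’s the whole skill Atlas couldn’t demonstrate.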


Problem 3: Zero Working Components
After 10 minutes of attempts, troubleshooting, and retries, I had exactly zero working components. Not a single node was properly configured. Not a single connection was made correctly.

[Image: n8n workflow]

It was a complete failure.

Test 2: Comet Browser

After the disappointing Atlas experience, I switched to Comet Browser with renewed hope.

Time invested: 3 minutes
Expected outcome: A working n8n workflow
Actual outcome: Gave up

The Only Success: Schedule Trigger

In three minutes, Comet Browser managed to create one thing: a schedule trigger. That’s it.

[Image: Comet Browser fails]

To be fair, it did that correctly. The trigger was set up and would fire on schedule. But that’s where the success ended.
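
To be fair to Comet, this is also about the easiest node in the whole workflow: a Schedule Trigger needs nothing more than an interval or a cron expression, along the lines of the following for a daily run (the time of day is illustrative):

```
# minute hour day-of-month month day-of-week -> fire daily at 09:00
0 9 * * *
```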

After hitting roadblock after roadblock trying to add the next nodes, configure the OpenAI integration, and set up the message formatting, I made the pragmatic decision to stop.

Three minutes in, and I could already see this was going nowhere. The pattern was identical to Atlas: confusion about interface elements, inability to understand workflow logic, and failure to execute even basic tasks.

What Went Wrong?

Looking back at both experiences, several fundamental issues became clear:

1. Interface Understanding

Browser agents still struggle to truly understand complex web interfaces. n8n’s workflow builder isn’t even that complicated compared to many enterprise tools, yet neither agent could navigate it effectively.

2. Context Retention

AI agents need to maintain context across multiple steps. “I just configured node A, now I need to connect it to node B using the output from A.” Both tools failed at this sequential reasoning.

3. Domain Knowledge Gap

Building workflows requires understanding not just the interface, but the logic of automation itself. What’s a trigger? How do expressions work? How should data flow? The agents lacked this deeper understanding.

4. Error Recovery

When things went wrong (which was constantly), neither agent could diagnose or recover from errors. They would simply retry the same failed action or give up entirely.

The Gap Between Hype and Reality

Here’s the uncomfortable truth: the marketing around browser agents has outpaced the actual capabilities by a significant margin.

What they promise:

  • Automate complex workflows with natural language
  • No technical knowledge required
  • Set-it-and-forget-it automation
  • AI that learns and adapts

What they deliver (so far):

  • Struggles with basic UI navigation
  • Requires constant human supervision
  • Frequent failures on simple tasks
  • Limited learning between attempts

This isn’t to say the technology is worthless. It’s to say we need to be realistic about what it can do today versus what it might do tomorrow.

Despite my frustrating experience, I remain optimistic about browser agents in the long term.

The technology is improving rapidly. Each month brings new updates, better models, and more capable agents. What failed today might work tomorrow.

The vision is compelling: truly autonomous automation that can adapt to changing interfaces, learn from mistakes, and handle complex multi-step processes without human intervention.

But we’re not there yet.