Vibe coding SaaS features with Cursor, Linear, and Reflag

I’ve mostly been using agents for simple bug fixes or tiny UI tweaks, but LLMs have evolved tremendously. I wanted to see whether it’s possible to vibe code an entire feature in a complex SaaS product like Reflag.

TL;DR:

  • It's possible to create whole features with agents in complex SaaS products—at least where you already have structure and code patterns in place
  • A good AGENTS.md file helps
  • You really need to watch for security issues—we still need humans in the loop
  • Agents work best with very specific direction
  • We’re going to see a proliferation of agent-generated PRs, and your tools need to be ready

A little background

In Reflag, it’s not immediately clear who’s responsible for a particular feature flag - so let's add the ability to assign an “owner” to a flag. We’ll also add a new feature view that lets you see just your own flags.

The Reflag app is a straightforward React/Node.js app with a REST API and TypeScript types shared between the frontend and backend. We use Prisma to manage the database schema and Linear for issue management. We have an AGENTS.md file in place with basic instructions for agents - this proved key to a smooth experience working with agents.

For this experiment, I used Cursor in “Agent mode” and just left the model setting on “Auto”.

Let’s start prompting

This is what we’ll end up with:

I started out by just copying the issue description from Linear (Cmd/Ctrl K -> “copy as markdown”) into the prompt:

As part of the archival workflow, it's useful to be able to assign a feature/flag to a particular Reflag user
there's more accountability in that you can see who's in charge of which flags. Like "stage" it helps communicate internally in the org
The assignee is responsible for cleaning up: the slack messages we send can be sent directly or at least mention that person.
We can create a filter that lets you see "my flags" so you can more easily find what you're looking for.

From experience, it’s good to give the LLM some hints if you already have a rough idea about what you’d like the implementation to be like, so I went ahead and added the following super minimal instruction and fired it off:

owner becomes a property on a flag. It should show in the flag sidebar with a drop down that lets you search Reflag users and assign it

Cursor Agent then went to work scanning various parts of the codebase. First the schema file, which contains the database schema:

It also came up with this TODO list which was spot on:

It then went to work on the schema.prisma file, the Zod schemas and the shared types in the API to make the necessary changes. All great!
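
To make that kind of change concrete, here is a minimal sketch of what a shared type and request validation for an owner field could look like. It is not the code Cursor generated: the real change used Prisma and Zod, while this sketch uses a hand-rolled validator to stay dependency-free, and every name in it (`Flag`, `UpdateFlagOwnerBody`, `parseUpdateFlagOwnerBody`, `ownerId`) is an assumption made for illustration.

```typescript
// Hypothetical shared types; the real Reflag types are not shown in this post.
type UserId = string;

interface Flag {
  id: string;
  name: string;
  // null means "no owner assigned".
  ownerId: UserId | null;
}

// Hypothetical request body for changing a flag's owner.
interface UpdateFlagOwnerBody {
  ownerId: UserId | null;
}

// Hand-rolled stand-in for a schema validator: accepts a non-empty string
// or an explicit null (which removes the owner) and rejects everything else.
function parseUpdateFlagOwnerBody(input: unknown): UpdateFlagOwnerBody {
  if (typeof input !== "object" || input === null) {
    throw new Error("body must be an object");
  }
  const ownerId = (input as Record<string, unknown>).ownerId;
  if (ownerId === null) {
    return { ownerId: null };
  }
  if (typeof ownerId === "string" && ownerId.length > 0) {
    return { ownerId };
  }
  throw new Error("ownerId must be a non-empty string or null");
}

// Applying a parsed body to a flag returns a new object instead of mutating.
function applyOwner(flag: Flag, body: UpdateFlagOwnerBody): Flag {
  return { ...flag, ownerId: body.ownerId };
}
```

Treating "no owner" as an explicit `null` rather than a missing field keeps "remove the owner" distinguishable from "the field was left out of the request".
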
It posted a nice summary - which of course I didn’t read very carefully because I was just too excited to run the app and see what it would look like:

And just in case you’re thinking “That’s too much to read. Just give me the TLDR;” - funnily enough the LLM was expecting that, because it also included a very brief summary of the changes:

In hindsight, I should have paid more attention to this list of changes because both of the UI changes in the above are wrong:

  1. It used the wrong component. This component lets you select one of the customer's end users rather than a user of Reflag. I noticed this as soon as I booted up the application.
  2. It implemented the “my flags” view by letting the user search for a name. This isn’t bad, but it wasn’t what I was going for.
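
For contrast, the “my flags” view I had in mind is just a filter on the current user rather than a name search. A minimal sketch, assuming a hypothetical `FlagSummary` shape with an `ownerId` field (in the real app this would more likely be a query parameter on the API than an in-memory filter):

```typescript
// Hypothetical, simplified flag shape for this sketch.
interface FlagSummary {
  id: string;
  name: string;
  ownerId: string | null;
}

// Keep only the flags owned by the currently signed-in user.
function filterMyFlags(
  flags: readonly FlagSummary[],
  currentUserId: string,
): FlagSummary[] {
  return flags.filter((flag) => flag.ownerId === currentUserId);
}

// Made-up sample data to show the behavior.
const sampleFlags: FlagSummary[] = [
  { id: "a", name: "new-onboarding", ownerId: "user-1" },
  { id: "b", name: "dark-mode", ownerId: "user-2" },
  { id: "c", name: "bulk-export", ownerId: "user-1" },
  { id: "d", name: "legacy-cleanup", ownerId: null },
];

const myFlags = filterMyFlags(sampleFlags, "user-1");
console.log(myFlags.map((flag) => flag.id));
```
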

I took a quick glimpse at the code, which looked like a nice continuation of the existing patterns we have in place. Then I booted up the app. It was immediately clear that the sidebar user selector was using the wrong component. No biggie, just ask Cursor to fix it! (And don’t worry about typos in the prompt.)

After some more thinking and tinkering:

My excitement was reaching peak levels. I went ahead and tried out setting an owner. A validation error, how disappointing! Let's just try feeding it back into Cursor:

Here, it got confused and thought it had to do with the “null” case - removing the owner:

I will spare you the details, but it didn’t manage to get itself out of this hole. At some point I stopped it and gave it more precise directions:

It took a few more rounds to get right:

  • It hadn’t added a name or id to the Select which caused a runtime error
  • The component wasn’t as nice as the “Stage” picker next to it. I asked it to make it look like the stage picker and it fixed it, including the [x] button to remove the owner.
  • The component it built was just slopped into the sidebar file. I asked it to move it into a separate file/component

Feature flag it

We have a tiny section in our AGENTS.md telling Cursor to use Reflag for feature flags:

This helps Cursor create flags. The MCP tools contain instructions on how to update the type files locally after a new flag gets created. This works really well!
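
As a rough illustration of why regenerating the type files matters: if flag keys are exposed as a generated union type, a key that doesn't exist yet simply fails to compile. The sketch below shows the general pattern only. It is not Reflag's actual generated output or SDK, and the flag keys and the `isFlagEnabled` helper are made up for the example.

```typescript
// Hypothetical stand-in for a generated type file listing known flag keys.
const knownFlagKeys = ["flag-owner", "example-other-flag"] as const;
type FlagKey = (typeof knownFlagKeys)[number];

// Hypothetical in-memory flag state; a real SDK would fetch this remotely.
const enabledFlags: Record<FlagKey, boolean> = {
  "flag-owner": true,
  "example-other-flag": false,
};

// Only keys present in the generated union are accepted.
function isFlagEnabled(key: FlagKey): boolean {
  return enabledFlags[key];
}

console.log(isFlagEnabled("flag-owner"));

// A key missing from the generated list is rejected at compile time:
// isFlagEnabled("flag-ownr"); // error: not assignable to type 'FlagKey'
```

With this pattern, a compile error on an unknown key is a useful signal for an agent: either the key is misspelled, or the flag still needs to be created and the types regenerated.
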

After feature flagging the new feature, due to how type safety works with Reflag flags, the agent noticed that the flag key it was using was giving a TypeScript error and went looking to fix it:

Great! It created the feature flag on Reflag automatically using the Reflag MCP, then ran the Reflag CLI to update the types locally and ensure type safety:

At this point there were still two big things missing that I could have forgotten about if I had been too confident in Cursor:

  • No tests
  • A glaring security issue

It took a few rounds to get the tests right. Finally I asked it to ensure that you cannot assign a member outside of the organization and clean up the tests a bit:

Phew, happy we caught this before going live.
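
The check itself is small. Here is a minimal sketch of the idea, assuming a hypothetical `Member` record and an in-memory member list; the real implementation would look up membership in the database, and none of these names come from the Reflag codebase.

```typescript
// Hypothetical membership record linking a user to an organization.
interface Member {
  userId: string;
  organizationId: string;
}

// Throws unless the proposed owner belongs to the flag's organization.
// A null owner means "remove the owner" and is always allowed.
function assertOwnerInOrganization(
  members: readonly Member[],
  organizationId: string,
  ownerId: string | null,
): void {
  if (ownerId === null) {
    return;
  }
  const isMember = members.some(
    (member) =>
      member.userId === ownerId && member.organizationId === organizationId,
  );
  if (!isMember) {
    throw new Error(
      `User ${ownerId} is not a member of organization ${organizationId}`,
    );
  }
}

// Made-up sample data: user-2 belongs to a different organization.
const sampleMembers: Member[] = [
  { userId: "user-1", organizationId: "org-a" },
  { userId: "user-2", organizationId: "org-b" },
];

assertOwnerInOrganization(sampleMembers, "org-a", "user-1"); // passes
```

The important part is that the check runs on the server. Limiting the dropdown to organization members in the UI does nothing to stop a hand-crafted API request.
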

Finally, we’re done and the feature went into production behind a feature flag.

Conclusion

This was a great experience. I got really far, really quickly and proved that it is possible to vibe-code entire features in a complex SaaS app, at least in situations where you already have structure and code patterns in place that you just need to extend.

I also learned a few things:

  • A good AGENTS.md file facilitates a much smoother agentic experience.
  • You really need to watch out for security issues. Keep the human in the loop and make sure to review the code well with security in mind.
  • If you give the agent a broadly defined task, expect to go a few rounds of prompting. They’re really good when you give specific directions.

The implication is that we’re going to see a proliferation of agent-generated PRs. Engineers will spend significantly more time on code reviews and on merge conflicts as multiple changes are in flight, and agent-authored PRs will carry a higher likelihood of bugs or unintended side effects.

This is why we’re building Reflag to be agent-ready, so you can mitigate these issues by getting agents to flag their own code.