How I Reduced My AI Coding Costs by Changing One Habit


I started noticing something uncomfortable while using Cursor.

I wasn’t just using AI tools; I was burning through tokens far faster than I expected. At first, I wrote it off as the price of the productivity boost, but once I looked at my usage stats, it was clear something was off.

I wasn’t being efficient. I was just being careless with how I used the tools.


What was happening

After breaking down my usage, a few patterns stood out:

  • I defaulted to expensive models (Sonnet-class reasoning models, for example) for almost everything
  • I asked for full solutions instead of structuring the problem first
  • I retried prompts several times because the context I gave was unclear

In practice, I was driving a Ferrari for tasks that didn’t need that level of power.


What actually worked

Instead of reducing usage, I changed how I used AI in my workflow.


1. Separate thinking from execution

This was the biggest improvement.

I split my workflow into two phases:

Planning (strong model):

  • Break down the problem
  • Define architecture
  • Evaluate trade-offs

Execution (lighter model):

  • Write code
  • Refactor
  • Iterate on small changes

Most tasks don’t need deep reasoning; they need consistency and speed.
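
One way to picture the split is as a small routing table. This is only a sketch: the tier names `strong-model` and `light-model` are placeholders rather than real model identifiers, and the task labels are hypothetical names for the items in the two lists above.

```python
from enum import Enum


class Phase(Enum):
    PLANNING = "planning"
    EXECUTION = "execution"


# Placeholder tier names; substitute whatever models your tool exposes.
MODEL_FOR_PHASE = {
    Phase.PLANNING: "strong-model",
    Phase.EXECUTION: "light-model",
}

# Hypothetical task labels, one per bullet in the planning/execution lists.
PHASE_FOR_TASK = {
    "break_down_problem": Phase.PLANNING,
    "define_architecture": Phase.PLANNING,
    "evaluate_tradeoffs": Phase.PLANNING,
    "write_code": Phase.EXECUTION,
    "refactor": Phase.EXECUTION,
    "small_change": Phase.EXECUTION,
}


def pick_model(task_kind: str) -> str:
    """Return the model tier for a task, defaulting to the lighter one."""
    phase = PHASE_FOR_TASK.get(task_kind, Phase.EXECUTION)
    return MODEL_FOR_PHASE[phase]
```

Defaulting unknown tasks to the lighter tier is a deliberate choice in this sketch: the strong model has to be asked for explicitly.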


2. Stop re-explaining context

A big source of token waste was repeating context.

I used to:

  • Paste large chunks of code
  • Re-explain the same feature multiple times
  • Restart prompts from scratch

Now I:

  • Keep context minimal
  • Only include what’s relevant
  • Build incrementally on previous prompts

This alone reduced both token usage and iteration count.
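
As a rough illustration of "only include what’s relevant", the sketch below filters candidate snippets by keyword and a token budget. Everything in it is an assumption made for the example: the function names are hypothetical, the keyword match is a crude stand-in for relevance, and the 4-characters-per-token figure is a rule of thumb, not how any particular model counts tokens.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_context(
    snippets: list[tuple[str, str]],
    keywords: list[str],
    budget: int,
) -> list[str]:
    """Return the names of snippets worth sending with the prompt.

    snippets is a list of (name, text) pairs. A snippet is kept only if it
    mentions at least one keyword and still fits in the remaining budget.
    """
    kept: list[str] = []
    used = 0
    for name, text in snippets:
        lowered = text.lower()
        if not any(k.lower() in lowered for k in keywords):
            continue  # not relevant to the current request
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # relevant, but too large for what is left of the budget
        kept.append(name)
        used += cost
    return kept
```

With a budget of 20 estimated tokens and the keyword `login`, a short `auth.py` snippet would be kept while an unrelated `billing.py` snippet and an oversized `session.py` snippet would both be left out.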


3. Ask for a plan before asking for code

Jumping straight into implementation usually led to more retries.

Instead, I started asking:

“What’s the best way to approach this?”

This improved:

  • Initial direction
  • Quality of responses
  • Number of iterations needed, which went down

Spending more tokens upfront on planning actually reduced total usage, because fewer implementation attempts had to be thrown away.
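
The arithmetic behind that trade-off is simple enough to write down. The numbers below are illustrative only, not measurements from my usage, and `total_tokens` is a hypothetical helper for the example.

```python
def total_tokens(plan_tokens: int, attempt_tokens: int, attempts: int) -> int:
    """Tokens spent on one task: optional planning plus every code attempt."""
    return plan_tokens + attempt_tokens * attempts


# Illustrative numbers only: no plan and four attempts,
# versus a 1,500-token plan followed by two attempts.
without_plan = total_tokens(plan_tokens=0, attempt_tokens=3000, attempts=4)
with_plan = total_tokens(plan_tokens=1500, attempt_tokens=3000, attempts=2)

print(without_plan)  # 12000
print(with_plan)     # 7500
```

Under these assumptions, planning pays for itself whenever the plan costs less than the attempts it saves.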


4. Use strong models intentionally

Not every task needs a high-end model.

I now reserve them for:

  • Architecture decisions
  • Debugging complex issues
  • Understanding unfamiliar areas

Everything else goes to a cheaper model.


5. Fewer, better prompts

Poor prompts were causing unnecessary iterations.

Now I focus on:

  • Clear goal
  • Explicit constraints
  • Minimal relevant context

Better prompts reduce retries, which directly reduces token usage.
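
A small template can make those three parts hard to forget. This is a minimal sketch; `build_prompt` and its section labels are hypothetical, not a format any tool requires.

```python
def build_prompt(goal: str, constraints: list[str], context: str = "") -> str:
    """Assemble a prompt from a clear goal, explicit constraints, and minimal context."""
    if not goal.strip():
        raise ValueError("a prompt needs a goal")
    parts = [f"Goal: {goal.strip()}"]
    if constraints:
        parts.append("Constraints:")
        parts.extend(f"- {c}" for c in constraints)
    if context.strip():
        parts.append("Context:")
        parts.append(context.strip())
    return "\n".join(parts)


# Example usage with made-up task details.
prompt = build_prompt(
    goal="Add retry logic to fetch_user",
    constraints=["No new dependencies", "Keep the public signature unchanged"],
    context="def fetch_user(user_id): ...",
)
print(prompt)
```

Refusing to build a prompt without a goal is the point of the sketch: if the goal can’t be stated in one line, the request probably isn’t ready to send.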


Where this showed up in practice

This became obvious while working on real production tasks.

Things like refactoring networking layers or setting up CI pipelines don’t require the most powerful models, but deciding how to approach them does.

Separating planning from execution made a noticeable difference in both cost and efficiency.


Takeaway

The main shift wasn’t technical; it was behavioral.

Treating AI as a tool with a cost model changed how I approached problems:

  • Think before prompting
  • Use the right model for the task
  • Avoid unnecessary iterations

Final thoughts

Tools like Cursor are powerful, but without intentional usage, they can become expensive and inefficient.

Optimizing token usage wasn’t just about reducing cost. It made me more deliberate about how I think and build.


If you’re working on similar problems or exploring AI-assisted development, I’d be curious to hear how you’re approaching it.
