Self-Refine

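Also known as: Self-critique loop. Often conflated with Reflexion (Shinn et al., 2023), a related but distinct technique in which an agent stores verbal reflections on failed attempts and reuses them in later trials.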

Generate → critique own output → revise → repeat. The same model acts as both author and editor, so output quality typically improves over a single-pass draft without extra training or a second model.

When to use it

Any generative task: writing, code, design, plans.
When the first draft is acceptable but not great.
Anywhere quality matters more than latency or cost.
Inside agentic systems that have time to iterate.

When not to use it

Real-time chat where users wait for a response.
Trivial tasks where the first draft is fine.
Tight token budgets: a single critique-and-revise round roughly triples cost, and every additional round adds more.
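
The cost point can be sanity-checked by counting model calls. The sketch below is a rough estimate only: `model_calls` is a hypothetical helper, and it assumes one call for the first draft plus a fixed number of calls per refinement round. It counts calls, not tokens; token cost grows faster because each call re-sends the current draft.

```python
def model_calls(iterations: int, steps_per_iteration: int = 3) -> int:
    """Estimate the number of model calls for a self-refine run.

    Assumption (not from the source): one call for the first draft, then
    `steps_per_iteration` calls per round (critique, plan, revise by default).
    """
    if iterations < 0 or steps_per_iteration < 1:
        raise ValueError("iterations must be >= 0 and steps_per_iteration >= 1")
    return 1 + iterations * steps_per_iteration


if __name__ == "__main__":
    # One round of critique + revise: three calls instead of one.
    print(model_calls(iterations=1, steps_per_iteration=2))
    # Three rounds of critique + plan + revise.
    print(model_calls(iterations=3))
```

Under these assumptions a single two-step round accounts for the "roughly triples" figure, while a full three-round loop with a separate planning step comes to ten calls.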

How it works

1. The model produces a first draft.
2. It switches into a critique persona and lists specific weaknesses in the draft (not vague complaints, but "this sentence is awkward because…").
3. It writes a revision plan addressing each weakness.
4. It applies the plan, producing a new draft. The loop typically runs 3 iterations.
5. The final output is the revision produced in the last iteration.
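
The steps above can be sketched as a small orchestration loop. This is a minimal illustration, not a reference implementation: `llm` is a hypothetical callable that takes a prompt string and returns the model's text, and the prompt wording is an assumption. Swap in whatever client you actually use.

```python
from typing import Callable, List


def self_refine(task: str, llm: Callable[[str], str], iterations: int = 3) -> List[str]:
    """Run draft -> critique -> plan -> revise; return every draft in order.

    `llm` is a hypothetical prompt-in, text-out function supplied by the caller.
    The last element of the returned list is the final output.
    """
    drafts = [llm(f"Task: {task}\n\nWrite a first draft.")]
    for i in range(1, iterations + 1):
        draft = drafts[-1]
        # Step 2: critique persona, asked for specific (quoted) weaknesses.
        critique = llm(
            f"Iteration {i} of {iterations}. Act as a strict editor.\n"
            f"Task: {task}\n\nDraft:\n{draft}\n\n"
            "List specific weaknesses. Quote the exact sentence and say why it is weak."
        )
        # Step 3: revision plan, one fix per weakness.
        plan = llm(
            f"Weaknesses:\n{critique}\n\n"
            "Write a revision plan with one concrete fix per weakness."
        )
        # Step 4: apply the plan to get the next draft.
        drafts.append(
            llm(
                f"Task: {task}\n\nDraft:\n{draft}\n\nRevision plan:\n{plan}\n\n"
                "Apply the plan and return only the new draft."
            )
        )
    return drafts


if __name__ == "__main__":
    # Stand-in model so the sketch runs without any API access.
    def echo_llm(prompt: str) -> str:
        return f"[model response to a {len(prompt)}-character prompt]"

    history = self_refine("Write a LinkedIn post about shipping fast.", echo_llm)
    print(len(history), "drafts; final:", history[-1])
```

Keeping each step as a separate call makes the iteration count explicit in code instead of relying on the model to loop on its own, at the price of more calls per run.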

Example

Lazy prompt

Write a LinkedIn post about shipping fast.

Using the technique

Write a LinkedIn post about shipping fast. After drafting, switch to a brutal-editor voice and find 3 specific weaknesses (point at exact sentences). Revise. Repeat twice more.
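
For reuse across tasks, a single-prompt version like the one above can be generated programmatically. The helper below is a hypothetical sketch: the function name, the step labels, and the exact wording are assumptions. It labels every draft and critique with a number so the rounds stay distinct.

```python
def build_self_refine_prompt(task: str, rounds: int = 3, weaknesses: int = 3) -> str:
    """Build one prompt that asks for numbered draft/critique rounds.

    Hypothetical helper; the label wording is illustrative, not prescribed.
    """
    lines = [
        task.strip(),
        "",
        "Work through the following steps in order and label each one:",
        "Draft 1: write the first draft.",
    ]
    for r in range(1, rounds + 1):
        lines.append(
            f"Critique {r}: in a brutal-editor voice, quote {weaknesses} exact "
            f"sentences from Draft {r} and explain why each is weak."
        )
        lines.append(f"Draft {r + 1}: revise Draft {r} to fix every point in Critique {r}.")
    lines.append(f"Final answer: repeat Draft {rounds + 1} on its own.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_self_refine_prompt("Write a LinkedIn post about shipping fast."))
```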

Common pitfalls

If the critique voice is too gentle, the model just re-outputs the same draft slightly reworded.
Without numbering iterations, the model may collapse the loop into one shot.
Some models 'agree' with their own critiques on the surface but don't actually change the substance.
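
One cheap guard against the first and third pitfalls is to measure how much the text actually changed between drafts. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The helper name `barely_changed` and the 0.9 threshold are assumptions to tune, and character-level similarity only flags near-identical rewrites; it cannot judge whether a substantive weakness was fixed.

```python
from difflib import SequenceMatcher


def barely_changed(old_draft: str, new_draft: str, threshold: float = 0.9) -> bool:
    """Return True when two drafts are nearly identical at the character level.

    The 0.9 threshold is an illustrative assumption, not a recommended value.
    """
    ratio = SequenceMatcher(None, old_draft, new_draft).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    before = "Ship fast and learn."
    reworded = "Ship fast and learn!"
    rewritten = "We cut our release cycle from six weeks to two days by deleting the staging queue."
    print(barely_changed(before, reworded))
    print(barely_changed(before, rewritten))
```

When the check fires, a stricter critique prompt or an explicit instruction to rewrite the quoted sentences is a reasonable next step.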

Where this came from

Madaan et al., 2023 — "Self-Refine: Iterative Refinement with Self-Feedback."

Related techniques

Chain-of-Thought (CoT) Prompting

Force the model to think step-by-step before answering. Dramatically improves accuracy on multi-step problems.

Tree-of-Thoughts (ToT) Prompting

Generate multiple reasoning branches per step, evaluate each, and prune. Beats single-path Chain-of-Thought on hard decisions.

Constitutional AI

Train (or prompt) the model with an explicit set of principles, then have it critique its own outputs against them. Anthropic's safety technique.
