Using an LLM as a Rubber Duck for Ideas

One of the most underrated uses of LLMs is in the earliest stage of engineering projects.


We often talk about using LLMs to help with coding, debugging, or small writing tasks. But one of their most underrated uses is in the earliest stage of engineering, when a project exists only as a rough idea, a few notes, or a hunch.

Before there is data or architecture, there is reasoning. We need to know what we are building, who it helps, and why it matters now. Many applied projects break down here. The issue is not implementation, but unclear thinking.

What "rubber ducking" means here

In debugging, "rubber ducking" means explaining your code out loud, line by line, to a rubber duck or anything sitting on your desk. The act of explaining helps you spot gaps you missed.

The same idea works at the design stage. When you explain a problem, unclear parts become obvious.

An LLM works well here because it:

  • listens consistently
  • asks for clarification when something is vague
  • repeats your ideas back in simple language

The goal is not for the LLM to invent ideas for you. It is there to help you see your own thoughts more clearly.

1. Why models fail before they start

A few years ago, we worked on a system to identify PII in tabular data. Offline evaluation looked great. The demo worked. The numbers were strong.

Deployment told a different story.

  • The AI governance team was not aligned on how the model should be used.
  • Product managers could not make a clear case for why this was important at that moment.
  • We never wrote down what success meant or what timeline we were aiming for.

The model itself was fine. The reasoning was not.

Years later, I revisited the project and tried a simple exercise. I pasted the old design notes into an LLM and asked for a plain summary:

Help me answer:

  1. What problem are we actually trying to solve?
  2. What is the baseline today?
  3. What does a "good enough" V1 look like?
  4. What is out of scope?

The model's first response was vague, which meant our own writing was vague. After iterating a few times, the summary became clearer. It did not fix the project, but it exposed the real issue: we had never agreed on the problem in the first place.

I have seen the same pattern across many teams. Most "failed" models are not modeling failures. They are clarity failures.

  • No clear definition of success
  • No written list of what will not be built
  • Requirements living in people's heads rather than in text

Using an LLM as a rubber duck is one way to make these gaps visible early.

2. The blank paper problem

Every new project begins the same way: a blank page and too many possibilities. The hard part is not building the model. The hard part is writing the first real paragraph of the design document.

This simple starter prompt helps you take that first step:

I am exploring [domain/topic]. Help me clarify:

  1. What problem am I solving?
  2. Who experiences this problem?
  3. What is the current baseline?
  4. What is "good enough" for V1?
  5. What is out of scope?

If the response sounds vague, your framing is not clear yet. When the model can restate the problem in a simple, accurate sentence, the foundation is strong enough to move forward.

This early clarification saves time and prevents unnecessary rework.

3. A five-step reasoning loop

Once the basics are clear, I use the same LLM conversation to structure the rest of the thinking. Over time, this has turned into a five-step loop. Each step produces something that fits naturally into a design doc.

Step 1. Clarify the problem

Set the context and the baseline.

Restate the problem as:

  • A single clear sentence
  • Who experiences it
  • The baseline today
  • The V1 definition of "good enough"
  • Three things that are out of scope

Repeat until the LLM can paraphrase it accurately.

Step 2. Frame the vision

A vision guides decisions. It sets boundaries.

Summarize the vision in one sentence. Critique it from:

  • Product impact
  • System complexity
  • User value

Then shift perspectives:

Act as a PM, an engineer, and a domain expert. What feels risky or underspecified?

Ask the LLM to break the vision into sections. This often becomes the outline for the design doc.
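
One lightweight way to run that critique is to loop the same vision statement past each persona. Here is a minimal sketch, assuming the OpenAI Python client; the model name and vision text are placeholders, and any chat-capable LLM works the same way:

    # Sketch: critique one vision statement from three personas.
    # Assumes the OpenAI Python client; swap in any chat LLM API.
    from openai import OpenAI

    client = OpenAI()
    vision = "Employees get grounded answers from internal docs in one search box."

    for role in ["product manager", "engineer", "domain expert"]:
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": f"You are a skeptical {role}."},
                {"role": "user",
                 "content": f"Critique this vision. What feels risky or underspecified?\n\n{vision}"},
            ],
        )
        print(f"--- {role} ---\n{reply.choices[0].message.content}\n")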

Step 3. Decompose the strategy

Strategy is about learning, not features. Break the work into stages (V0 to V2), and let each stage test one assumption.

Example: hybrid search with intent detection and RAG

Imagine building a hybrid search system with:

  • an intent classifier
  • a RAG pipeline for answer-seeking queries
  • a fallback to keyword search when unsure
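
Before prompting, it can help to see the shape of that system in code. A minimal routing sketch follows; the three components are stubs standing in for the real classifier, RAG pipeline, and keyword index:

    # Minimal sketch of the hybrid routing described above.
    # All three components are stubs, not real implementations.

    CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; tune on labeled queries

    def classify_intent(query: str) -> tuple[str, float]:
        # Stub: a real system would call the trained intent classifier.
        return ("answer_seeking", 0.82)

    def rag_answer(query: str) -> str:
        # Stub: retrieve passages, then generate a grounded answer.
        return "generated answer"

    def keyword_search(query: str) -> list[str]:
        # Stub: plain keyword/BM25 lookup.
        return ["doc1", "doc2"]

    def handle_query(query: str) -> dict:
        intent, confidence = classify_intent(query)
        if confidence < CONFIDENCE_THRESHOLD or intent != "answer_seeking":
            # Unsure, or a plain lookup: fall back to keyword search.
            return {"mode": "keyword", "results": keyword_search(query)}
        return {"mode": "rag", "answer": rag_answer(query)}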

Here is the prompt:

If we build this in stages (V0 to V2), what assumption does each stage test?

For each stage:

  • The assumption
  • The smallest experiment possible
  • Acceptance criteria
  • Kill or continue signal

Example output

V0
  • Assumption: We can separate "lookup" from "answer-seeking" queries.
  • Experiment: Label 500 queries and train a classifier.
  • Accept: Meaningful accuracy and stable labels.
  • Kill/continue: If labels drift, simplify taxonomy.
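
As a sketch of that V0 experiment, a simple bag-of-words baseline is enough to test the assumption. The four example queries below are placeholders for the ~500 hand-labeled ones:

    # V0 sketch: can a simple model separate "lookup" from
    # "answer-seeking" queries? Placeholder data stands in for
    # the ~500 hand-labeled examples.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    queries = [
        "how do I rotate api keys",   # answer-seeking
        "q3 revenue report",          # lookup
        "why is my build failing",    # answer-seeking
        "holiday calendar 2024",      # lookup
    ]
    labels = ["answer_seeking", "lookup", "answer_seeking", "lookup"]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    # With the real labeled set, cross-validated accuracy is the
    # "meaningful accuracy" signal this stage looks for.
    print(cross_val_score(clf, queries, labels, cv=2).mean())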

V1
  • Assumption: RAG beats keyword search on usefulness.
  • Experiment: Build simple RAG prototype and run blind eval.
  • Accept: Raters prefer RAG 60–70% of the time.
  • Kill/continue: If baseline wins, revisit retrieval.
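
To check the V1 acceptance bar, a binomial test against the 50/50 baseline tells you whether the raters' preference could be chance. A sketch with made-up counts:

    # V1 sketch: is a 68/100 preference for RAG distinguishable
    # from a coin flip? The counts below are illustrative.
    from scipy.stats import binomtest

    rag_preferred, total_ratings = 68, 100
    result = binomtest(rag_preferred, total_ratings, p=0.5, alternative="greater")
    print(rag_preferred / total_ratings)  # 0.68, inside the 60-70% acceptance band
    print(result.pvalue)                  # small p-value: unlikely to be chance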

V2
  • Assumption: The hybrid system behaves predictably in real use.
  • Experiment: Limited rollout with fallback and logging.
  • Accept: Neutral or positive user feedback.
  • Kill/continue: If confusion rises, tighten thresholds.

This stage-by-stage breakdown helps shift conversations from "what do we build next?" to "what do we need to learn next?".

Step 4. Define non-goals and failure behavior

Clarify what you will not build. Clarify how the system behaves when it is unsure.

List five expansions we should avoid for now. Describe how the system behaves when uncertain:

  • What it shows
  • What it avoids
  • What it logs

This becomes a simple "won't do" list and a clear failure behavior note.
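
In code, that failure behavior can be as small as one guarded response path. A sketch; the threshold and the fallback copy are assumptions to tune:

    # Sketch of explicit failure behavior: what the system shows,
    # avoids, and logs when it is not confident enough to answer.
    import logging

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("search")

    CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; tune on real traffic

    def respond(query: str, answer: str, confidence: float) -> str:
        if confidence >= CONFIDENCE_THRESHOLD:
            return answer
        # Show an honest fallback; avoid presenting a guess as fact;
        # log enough context to review these cases later.
        log.info("low_confidence query=%r confidence=%.2f", query, confidence)
        return "I'm not confident here. Showing keyword matches instead."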

Step 5. Codify and reuse

At the end of the project, record the lessons.

List five rules of thumb from this project that will help in the next one. Keep them short.

These notes build a small memory of your reasoning patterns.

4. Using dialogue as design review

Each reasoning session follows the same pattern: propose, critique, refine. Switching roles helps find weaknesses.

  • If compute is cheap, what becomes the bottleneck?
  • If we stop collecting data tomorrow, what breaks?
  • What is the smallest test that could prove us wrong?

The point is not to let the model decide. The point is to expose weak reasoning before it becomes expensive.

5. The editing rubric

Before implementation, I run a simple quality check.

Rate 1–5:

  1. Is the problem clear?
  2. Does the vision match the problem?
  3. Is the plan feasible?
  4. Is the outcome valuable?
  5. Is the writing plain and grounded?

Anything below 4 means refine the idea.
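
The gate is easy to mechanize once the five ratings exist. A sketch; the scores below are illustrative:

    # Sketch of the rubric gate: any item below 4 flags the idea
    # for another refinement pass.
    RUBRIC = [
        "Is the problem clear?",
        "Does the vision match the problem?",
        "Is the plan feasible?",
        "Is the outcome valuable?",
        "Is the writing plain and grounded?",
    ]

    def needs_refinement(scores: dict[str, int]) -> list[str]:
        return [item for item in RUBRIC if scores.get(item, 0) < 4]

    scores = dict(zip(RUBRIC, [5, 4, 3, 4, 5]))  # illustrative ratings
    print(needs_refinement(scores))  # ['Is the plan feasible?'] -> refine the plan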

6. Making it a shared habit

When teams apply this loop consistently, it stops feeling like a framework and starts feeling natural.

  • Every project begins with a short reasoning note
  • Design docs carry the problem, vision, assumptions, and non-goals
  • Pull requests link back to early decisions

Over time, these notes become a record of how systems were conceived, not just how they were implemented.

7. Template

I'm working on a new project and need help clarifying my thinking. Guide me through this five-step reasoning process:

STEP 1: CLARIFY THE PROBLEM

Restate the problem as:

  • A single clear sentence
  • Who experiences it
  • The baseline today
  • The V1 definition of "good enough"
  • Three things that are out of scope

Ask clarifying questions until you can paraphrase it accurately.

STEP 2: FRAME THE VISION

Help me summarize the vision in one sentence. Then critique it from:

  • Product impact
  • System complexity
  • User value

Act as a PM, an engineer, and a domain expert. What feels risky or underspecified?

STEP 3: DECOMPOSE THE STRATEGY

If we build this in stages (V0 to V2), what assumption does each stage test?

For each stage, help me define:

  • The assumption
  • The smallest experiment possible
  • Acceptance criteria
  • Kill or continue signal

Present this as a table.

STEP 4: DEFINE NON-GOALS AND FAILURE BEHAVIOR

Help me list five expansions we should avoid for now. Describe how the system should behave when uncertain:

  • What it shows
  • What it avoids
  • What it logs

STEP 5: CODIFY LEARNINGS

At the end, help me list five rules of thumb from this project that will help in the next one. Keep them short.

QUALITY CHECK RUBRIC

Rate each aspect from 1–5:

  1. Is the problem clear?
  2. Does the vision match the problem?
  3. Is the plan feasible?
  4. Is the outcome valuable?
  5. Is the writing plain and grounded?

Anything below 4 means we need to refine.


START HERE: My project idea is: [DESCRIBE YOUR PROJECT]
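
To run the template as one ongoing conversation, a small driver loop works. This is a sketch, assuming the OpenAI Python client and that the template above is saved to reasoning_template.txt; the model name and project description are placeholders:

    # Sketch: drive the five-step template as a single multi-turn
    # conversation. Any chat LLM API works the same way.
    from openai import OpenAI

    client = OpenAI()
    template = open("reasoning_template.txt").read()  # the template above
    history = [{"role": "user", "content": template.replace(
        "[DESCRIBE YOUR PROJECT]", "hybrid search for internal docs")}]

    while True:
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=history,
        )
        answer = reply.choices[0].message.content
        print(answer)
        history.append({"role": "assistant", "content": answer})
        turn = input("> ")  # answer its clarifying questions, step by step
        if turn.strip().lower() == "quit":
            break
        history.append({"role": "user", "content": turn})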

In practice

Using an LLM this way does not replace design work. It makes the work easier to do well. It gives you space to think before building and leaves behind artifacts your future self or your team can rely on.

When a project starts with a crisp problem, a simple vision, a short assumptions table, clear non-goals, and a shared rubric, alignment is not something you chase later. It is built into the project from the start.

That is the quiet value of using an LLM as a rubber duck for ideas. It turns scattered thoughts into structure long before the first experiment runs.
