Prompt Engineering
Cheat Sheet
The complete reference for crafting effective prompts across GPT-4o, Claude, and Gemini.
40+ techniques with examples. Bookmark this page.
Basic Techniques
Foundation patterns every prompt engineer should know
Role Prompting
EssentialAssign a persona or expertise to the model to shape tone, depth, and domain focus. The model adapts vocabulary, reasoning style, and assumptions to match.
You are a senior backend engineer with 15 years of experience in distributed systems. Review this architecture diagram and identify potential failure modes.
When to use: Whenever you need domain-specific expertise, consistent tone, or a specific perspective. Works especially well for code review, writing, and analysis.
Few-Shot Prompting
EssentialProvide 2–5 input→output examples before your actual request. The model learns the pattern and applies it consistently. More reliable than describing the format in words.
Classify the sentiment:
Text: "Love this product!" → Positive
Text: "Worst purchase ever" → Negative
Text: "It works I guess" → Neutral
Text: "Shipping was fast but the item broke" →
When to use: Classification, formatting, style matching, data transformation. Essential when zero-shot results are inconsistent.
Chain-of-Thought (CoT)
High ImpactAsk the model to reason step-by-step before giving a final answer. Dramatically improves accuracy on math, logic, and multi-step reasoning tasks.
Solve this step by step, showing your reasoning at each stage before giving the final answer:
A store offers 20% off, then an additional 15% off the sale price. What is the total discount on a $200 item?
When to use: Math problems, logical deductions, complex analysis, debugging. Add "Let's think step by step" as a simple trigger.
Step-by-Step Instructions
EssentialBreak your request into numbered steps to control execution order and ensure nothing is skipped. Works like pseudocode for the model.
Follow these steps exactly:
1. Read the customer email below
2. Identify the core complaint
3. Determine urgency (low/med/high)
4. Draft a response using our brand voice
5. Flag if escalation is needed
When to use: Multi-part tasks, workflows, any time order matters. Prevents the model from skipping steps or combining them incorrectly.
Zero-Shot with Context
EssentialProvide rich context and constraints without examples. Works well when you clearly define what you want, including format, length, tone, and audience.
Write a 280-character tweet thread (5 tweets) explaining quantum computing to a 12-year-old. Use analogies. No jargon. Each tweet should stand alone.
When to use: When the task is clear enough that examples aren't needed. Good for creative tasks, simple transformations, and when you have specific constraints.
Delimiter Separation
EssentialUse clear delimiters (triple quotes, XML tags, markdown headers) to separate instructions from data. Prevents prompt injection and improves clarity.
Summarize the text between the tags.
<document>
[paste long article here]
</document>
Provide a 3-sentence summary.
When to use: Whenever your prompt includes user-supplied data, long documents, or multiple distinct sections. Critical for security.
Advanced Techniques
Power-user patterns for complex reasoning and reliability
Tree of Thought (ToT)
AdvancedThe model explores multiple reasoning paths, evaluates each, and selects the best. Like CoT but branching — better for problems with multiple valid approaches.
Consider 3 different approaches to solving this:
Approach A: [describe]
Approach B: [describe]
Approach C: [describe]
For each, evaluate pros, cons, and likelihood of success. Then pick the best one and execute it.
When to use: Strategy decisions, architecture design, complex debugging where multiple hypotheses exist. Higher token cost but better outcomes.
Self-Consistency
AdvancedGenerate multiple independent answers, then pick the most common result. Reduces variance and catches errors through majority voting.
Solve this problem 3 times independently, using a different method each time. Then compare your answers. If they agree, that's the answer. If they disagree, analyze why and determine the correct one.
When to use: High-stakes calculations, factual questions, when you need confidence in the answer. Best with temperature > 0 for diversity.
Self-Critique / Reflection
AdvancedAsk the model to generate an answer, critique it against specific criteria, then revise. Implements a feedback loop within a single prompt.
Write your answer. Then review it against these criteria:
- Is it factually accurate?
- Is it concise (<200 words)?
- Does it address the user's actual question?
If any criterion fails, revise and output only the final version.
When to use: Quality-critical content, customer-facing text, anywhere you'd normally review output manually.
Meta-Prompting
AdvancedUse the model to generate or improve prompts. Feed it your goal and let it craft the optimal prompt — then use that prompt. Recursion for prompts.
I want to build a prompt that extracts action items from meeting transcripts. The output should be JSON with assignee, task, and deadline.
Write the best possible prompt for this task. Include role, format spec, examples, and edge case handling.
When to use: When you're stuck crafting a prompt, building prompt templates for production, or optimizing existing prompts for better results.
ReAct (Reason + Act)
AdvancedInterleave reasoning and tool-use actions. The model thinks about what to do, executes an action, observes the result, then reasons about the next step.
Use this pattern for each step:
Thought: [what you need to figure out]
Action: [tool to use and input]
Observation: [what the tool returned]
... repeat until solved ...
Final Answer: [conclusion]
When to use: Agent workflows, tool-augmented tasks, research questions requiring multiple lookups.
Prompt Chaining
AdvancedBreak complex tasks into a pipeline of simpler prompts, where each step's output feeds the next. More controllable and debuggable than one mega-prompt.
Step 1: Extract key claims from article → list
Step 2: For each claim, assess if verifiable → filtered list
Step 3: Fact-check each verifiable claim → results
Step 4: Compile into final report
When to use: Complex workflows, content pipelines, anywhere a single prompt would be unreliably long. Each step can use a different model or temperature.
Output Formatting
Control the shape and structure of model outputs
JSON Mode
StructuredForce the model to output valid JSON. Use the API's response_format parameter when available, or specify the schema in your prompt.
Extract entities from this text. Return ONLY valid JSON, no markdown:
{"people": ["name"], "places": ["name"], "dates": ["ISO 8601"]}
Text: "John met Sarah in Paris on March 5, 2025."
Pro tip: Provide the exact JSON schema with example values. Use response_format: {type: "json_object"} in the API for guaranteed valid JSON.
XML Tags
StructuredUse XML-style tags to structure both input and output. Especially effective with Claude, which was trained to respect XML tag boundaries.
<instructions>Analyze the code below</instructions>
<code>
def fib(n): return fib(n-1) + fib(n-2)
</code>
Return your analysis in:
<bugs>...</bugs>
<fix>...</fix>
<explanation>...</explanation>
Pro tip: Claude particularly excels with XML. Use for separating instructions from data and for multi-part outputs. Also helps prevent prompt injection.
Markdown Formatting
StructuredRequest output in markdown for human-readable structured content. Great for documentation, reports, and content that will be rendered in a UI.
Write a technical comparison in markdown with:
- H2 headers for each option
- A pros/cons bullet list under each
- A summary comparison table at the end
- Bold the recommended choice
Pro tip: Specify which markdown elements to use. Models default to markdown, but explicit instructions prevent inconsistent heading levels or missing tables.
Structured Output / Tool Use
StructuredUse the API's function calling / tool use feature to get perfectly typed responses. The model fills in a predefined schema — no parsing needed.
// Define a tool schema:
{
"name": "extract_contact",
"parameters": {
"name": "string",
"email": "string",
"phone": "string | null",
"company": "string | null"
}
}
Pro tip: Most reliable way to get structured data. Use OpenAI's tools, Anthropic's tool_use, or Google's function_calling.
Model-Specific Tips
What works best for each major model family
GPT-4o / OpenAI
- ✓ System message is king — put persona & constraints there
- ✓ Excellent at
response_format: json_object - ✓ Function calling is best-in-class
- ✓ Follows numbered instructions very well
- ⚠ Can be verbose — add "be concise" explicitly
- ⚠ May refuse edge cases — rephrase as hypothetical
- ✓ o1/o3 models: skip CoT prompting — they do it internally
Claude / Anthropic
- ✓ XML tags are a superpower — use
<tags>everywhere - ✓ Excels at long documents (200K context)
- ✓ Best at following nuanced, detailed instructions
- ✓ Strong at code — give full file context
- ✓ Extended thinking for hard reasoning tasks
- ⚠ Can be overly cautious — frame requests clearly
- ✓ Prefill assistant message to steer output format
Gemini / Google
- ✓ Excellent multimodal — images, video, audio natively
- ✓ Massive context (1M+ tokens) — dump everything in
- ✓ Great at grounded search with Google integration
- ✓ Strong at structured data and tables
- ⚠ Less reliable at strict format compliance
- ⚠ May hallucinate on niche topics — provide sources
- ✓ Use system instructions for consistent persona
Common Patterns
Copy-paste templates for everyday tasks
📝 Summarization
Summarize this in [N] bullet points for a [audience]. Focus on [what matters]. Skip [what doesn't].
<text>...</text>
Always specify: length, audience, focus area, and what to omit.
🔍 Extraction
Extract from the text below:
- Company names (exact match)
- Dollar amounts (as numbers)
- Dates (ISO 8601)
Return as JSON array. If a field is ambiguous, use null.
Define exact fields, types, and how to handle missing/ambiguous data.
🏷️ Classification
Classify this support ticket into exactly ONE category:
[bug, feature_request, billing, account, other]
Return: {"category": "...", "confidence": 0.0-1.0, "reasoning": "..."}
Enumerate all valid categories. Add confidence scores for filtering.
✨ Generation
Write a [type] about [topic].
Tone: [casual/formal/technical]
Length: [words/paragraphs]
Audience: [who]
Must include: [key points]
Avoid: [what not to include]
The more constraints you give, the better the output. Underconstrained = generic.
📊 Analysis
Analyze this [data/code/text] and provide:
1. Key findings (top 3-5)
2. Potential issues or risks
3. Recommendations with priority
4. What you'd need to investigate further
Structure the output. Ask for prioritized findings and unknowns.
💻 Code
Write a [language] function that [does what].
Requirements:
- Input: [types and constraints]
- Output: [type and format]
- Handle: [edge cases]
- Style: [conventions]
Include tests.
Specify language, I/O types, edge cases, and conventions. Always ask for tests.
Anti-Patterns
Common mistakes that degrade output quality
Vague Instructions
❌ Bad
Make this better.✅ Good
Improve clarity by: shortening sentences to <20 words, removing jargon, adding a concrete example.The model can't read your mind. Define "better" explicitly.
Negative Instructions Only
❌ Bad
Don't be formal. Don't use bullet points. Don't be too long.✅ Good
Use a casual, conversational tone. Write in flowing paragraphs. Keep under 150 words.Tell the model what TO do, not just what NOT to do. Positive instructions are more reliable.
Overloading a Single Prompt
❌ Bad
Analyze this data, create charts, write a report, email it to the team, and schedule a meeting.✅ Good
Step 1: Analyze this dataset and identify the top 3 trends. Present as a table.Break complex tasks into a chain of focused prompts. Each prompt = one clear job.
No Examples or Format Spec
❌ Bad
Convert this data to a usable format.✅ Good
Convert to CSV with columns: name, email, role. Example row: "John,john@co.com,admin""Usable format" is ambiguous. Show one example of the exact output you want.
Trusting Without Verification
❌ Bad
List all Supreme Court cases about X. [accepts at face value]✅ Good
List cases about X. For each, include the citation so I can verify. Flag any you're uncertain about.Models hallucinate. Ask for citations, confidence levels, and explicit uncertainty markers.
Ignoring Context Window
❌ Bad
[Dumps 100-page doc] Summarize everything in detail.✅ Good
Summarize sections 3-5, focusing on methodology and key findings.Even with large context windows, quality degrades with irrelevant content. Feed only what's needed.
Temperature & Parameters Guide
Dial in the right settings for your task
Temperature Scale
0.0
Deterministic
Extraction, classification, math, code that must be correct
0.3
Focused
Summarization, translation, Q&A, technical writing
0.7
Balanced
General writing, explanations, most tasks (default)
1.0
Creative
Brainstorming, creative fiction, marketing copy, poetry
1.5+
Wild
Experimental only. Incoherent at high values. Rarely useful.
Top-P (Nucleus Sampling)
Controls the cumulative probability pool. Lower = more focused. Usually set one of temp OR top-p, not both.
Max Tokens
Set output length limits. Prevents runaway generation and controls cost. 1 token ≈ ¾ of a word in English.
Other Parameters
Frequency Penalty 0–2
Reduces repetition. 0.3–0.8 for most tasks. Higher = more varied vocabulary.
Presence Penalty 0–2
Encourages new topics. 0.1–0.6 to keep things fresh without losing coherence.
Stop Sequences
Define tokens that halt generation. Useful for structured outputs or preventing over-generation.
Token Optimization Tips
Reduce cost without losing quality
🎯 Prompt Compression
- → Remove filler words: "I would like you to please" → just state the task
- → Use abbreviations in system prompts (models understand shorthand)
- → Replace verbose instructions with 1 example (few-shot > explaining)
- → Use structured data (JSON/YAML) instead of natural language for inputs
⚡ Context Management
- → Summarize conversation history instead of keeping full messages
- → Use RAG to inject only relevant chunks, not entire documents
- → Trim system prompts — most can be cut 50% without quality loss
- → Cache frequent system prompts (OpenAI prompt caching saves 50%)
🔀 Model Routing
- → Use smaller models for simple tasks (classification, extraction)
- → Route to larger models only for reasoning/creative tasks
- → GPT-4o-mini, Claude Haiku, Gemini Flash — 90% of tasks, 10% of cost
- → Build a classifier that picks the model per request
📊 Cost Math Cheat Sheet
💡 Rule of thumb: Input tokens are 2-10× cheaper than output tokens. Keep prompts tight, but focus optimization on reducing output length first.
⚡ Quick Reference: Technique Selector
| Task Type | Best Technique | Temperature | Key Tip |
|---|---|---|---|
| Math / Logic | Chain-of-Thought | 0.0 | Add "show your work" |
| Classification | Few-Shot + JSON | 0.0 | Enumerate all valid labels |
| Code Generation | Role + Step-by-Step | 0.0–0.3 | Specify language, tests, edge cases |
| Creative Writing | Role + Constraints | 0.8–1.0 | Give tone, audience, length |
| Data Extraction | Structured Output | 0.0 | Use tool/function calling |
| Summarization | Delimiter + Constraints | 0.3 | Specify length and focus |
| Strategy / Planning | Tree of Thought | 0.5–0.7 | Explore 3+ approaches first |
| Brainstorming | Role + High Temp | 1.0–1.2 | Ask for quantity, filter later |
Want AI to do this for you?
Set up your own AI assistant that applies these techniques automatically. No prompt engineering needed.
Get started → /setupGenerate Custom Configs
Build optimized prompt templates, system messages, and API configurations for your specific use case.
Build yours → /generator