The Claude Opus 4.7 & 4.8 Playbook for Mining Professionals

If your prompts worked well a few months ago and feel slightly off now, you are not imagining it. The new generation of Claude — Opus 4.7 last quarter and Opus 4.8 released in late May 2026 — did not get worse. They got more literal.

The prompts that worked on the older models, the ones that relied on the model reading between the lines, now land differently. This guide covers the shifts that matter for our industry and the moves that fix them. Everything in here applies to both 4.7 and 4.8 unless a section says otherwise.

A note on Opus 4.8

Released 28 May 2026

Opus 4.8 arrived as a direct successor to 4.7, at the same pricing, and Anthropic itself describes it as a modest but tangible improvement. The good news for everyone reading this playbook: every principle in here applies to 4.8 the same way it applies to 4.7. The model is still literal. Negative instructions still fail. “Go beyond the basics” still outperforms almost anything else you can add to a prompt.

Four refinements are worth knowing before you upgrade:

1. Adaptive thinking. The thinking toggle in 4.8 is smarter. It now decides per turn whether to reason deeply or respond directly, rather than thinking on every prompt. The practical effect: leave thinking on across the board. The cost of leaving it on has dropped because the model only spends the budget when the question genuinely needs it.

2. Effort levels recalibrated. If you use the new effort control in the chat interface or the API, the levels have shifted. Medium now allows more thinking than before. High allows slightly less. Xhigh allows substantially more. If you had a workflow tuned at “high” on 4.7, re-test it at the same level before deciding to adjust.

3. Tool use is no longer reluctant. One of the more visible quirks of 4.7 was its reluctance to use tools like web search. 4.8 has largely fixed this. Shift 5 below is now a softer issue than it was, though stating your verification triggers explicitly is still the safest path for high-stakes deliverables.

4. The model proactively flags its own gaps. 4.8 is around four times less likely than 4.7 to let unsupported claims pass without flagging them. The “tell me anything you noticed but I did not ask about” instruction still helps, but the model does more of this on its own now.

A fifth item worth mentioning for the developers in your network: 4.8 brings dynamic workflows with parallel subagents in Claude Code. For analysts working in chat or Projects, this changes nothing. For anyone running agentic pipelines, it is worth a look.

With those refinements in mind, the rest of this playbook applies to either model.

The one sentence that explains everything

The previous model tried to understand what you meant. The new one does exactly what you typed.

Internalize that and the rest of this guide is just application.

The phrase that outperforms everything else

The single most useful sentence you can put in any prompt:

Go beyond the basics.

Four words. It comes straight out of Anthropic’s own guidance, and once you start using it, you will not stop. Opus 4.7 calibrates the depth of its work to what you literally asked for. Ask for an executive summary of a Pre-Feasibility Study, and you get an executive summary. Ask for the same summary and tell it to go beyond the basics, and now you get the non-obvious risks, the assumptions that should be flagged, the schedule contagion, and the bench-test gaps a senior project director would have caught on the third readthrough.

Where it earns its keep in mining work

Use it for

Executive summaries and board briefs where ‘thorough’ is hard to define upfront.
Risk registers — the headline risks are rarely the ones that hurt you.
Due diligence reviews where you want red flags surfaced, not just data extracted.
Benchmarking studies where the value lies in non-obvious peer comparisons.
Trade-off studies where the obvious answer is rarely the right one.
Stakeholder briefings where what is missing matters as much as what is said.
Reading a competitor’s technical report and asking what is not being told.

Skip it for

Structured data extraction from NI 43-101, SK-1300, or JORC reports.
Tabular outputs with a fixed schema.
Executive summaries with a hard word count.
Quick factual lookups where extra depth becomes friction.

Variations that compound the effect

Go beyond the basics. Include the considerations a senior practitioner in this domain would raise even if I did not list them.

Go beyond the basics. Surface the second-order risks and the assumptions worth challenging.

Go beyond the basics. Treat this as a polished deliverable, not a first draft.

Shift 1 — Literal instruction following

Before: You said “draft a risk register for this project.” The model inferred you wanted technical, commercial, schedule, ESG, and severity. You got a real risk register.

After: You say “draft a risk register.” You get a literal risk register. A short list. No severity. No categorization. Because that is what you asked for.

The fix: Name the categories. Name the columns. Name the scope. Or take the shortcut and tell it to go beyond the basics.

Draft a risk register for the Project X DFS. Cover technical, commercial, ESG, permitting, schedule, and people-and-organization categories. For each risk, include a likelihood score (1 to 5), an impact score (1 to 5), a mitigation, and a residual rating. Go beyond the basics. Surface risks that typically only emerge in execution, not in study phase.

Shift 2 — How much the model “thinks” is now adjustable

In the chat interface you have a thinking toggle that controls how deeply the model reasons before responding. For most mining work, leave extended thinking on. A few extra seconds of response time buys materially better reasoning on multi-step problems.

Where it matters most

Comparing two flowsheets or process options.
Reading and interpreting a long technical report.
Building a benchmarking comparison across multiple projects.
Working through a capital cost estimate review.
Synthesizing market intelligence from multiple sources.

For one-line questions (“what is the typical recovery range for flotation of porphyry copper ore?”) leave it off. For anything where you would normally pull a senior colleague into a meeting room, leave it on.

Shift 3 — Negative instructions do not land

Before: “Don’t use jargon” actually reduced jargon. The model picked up the intent.

After: “Don’t use jargon” attaches to that exact sentence and the rest of the response drifts back to jargon. Worse, “don’t make the summary too long” produces an unpredictably long summary because the model has no number to anchor to.

The fix: State the positive. Instead of “don’t be too technical,” say “write for a CFO with no metallurgy background.” Instead of “don’t use buzzwords,” say “use plain English a board member could read aloud.” Instead of “don’t make it long,” say “keep it under 300 words.”

Every negative instruction is a coin flip. Every positive instruction is a directive.

Shift 4 — Response length now mirrors the question

Short question, short answer. Long, open-ended question, long answer. The model is no longer padding to a comfortable length.

The fix: If your deliverable has a shape, state it.

Summarize the metallurgical testwork section in exactly 5 bullet points, no more than 25 words each.

Write a 1-page memo. 4 paragraphs. No headings.

Produce a side-by-side comparison table with 6 rows.

Shift 5 — The model uses web search less often

The new generation is more confident in its own knowledge and reaches for web search less often than its predecessors. Opus 4.7 was the most noticeable example of this; 4.8 has narrowed the gap considerably, but the instinct toward self-reliance is still there. If your workflow depends on verified commodity prices, current regulatory references, or the latest peer-project announcements, you may still see it skip the step on edge cases.

The fix: Tell it when to verify, not just that it can.

Verify every commodity price you cite with a current web source. Note the date of the reference. Confirm the latest status of the permitting framework in [jurisdiction] by web search before answering. If you reference any peer project for benchmarking, verify the latest publicly disclosed figures.

Shift 6 — The tone got drier

The new model dropped the warm conversational opener. For internal technical work this is an upgrade. For client-facing deliverables, briefing notes, or anything you forward without editing, you may want the warmth back.

The fix: “Acknowledge the framing of my question in one sentence before answering, then use a measured, professional tone throughout.” Or paste two or three sentences in the voice you want and tell the model to match the rhythm.

Voice-by-example beats voice-by-adjective every time.

Shift 7 — Default design aesthetic for slide decks

If you use Claude to draft presentations, investor materials, or board packs, the new model has a strong default look: cream backgrounds, serif display fonts, italic accents, terracotta touches. Beautiful for editorial briefs. Wrong for mining, infrastructure, finance, or any technical board pack.

The fix: Give it a concrete alternative — hex codes, typefaces, the visual feel.

Use a corporate brand palette: dark green (#12370F), deep gold (#BE9E44), and near-black (#050F04). Typography should be a clean serif for headings and a sans-serif for body. The output should read like a board pack, not a magazine.

Shift 8 — The old “CRITICAL: YOU MUST” pattern now backfires

Standing prompts inherited shouty language from older models. The new model does not need it — and over-applies the rule when you shout. Dial it back. “Always include a cost basis for capital estimates” is enough.

The deeper shift most guides skip

What you put in matters more than how you phrase it.

The model is now so literal that the inferential gaps it used to fill have to come from somewhere. If they do not come from your phrasing, they have to come from your context.

For mining work, that means

Upload the report instead of describing it.
Paste the cost breakdown instead of summarizing what it says.
Include the company tone-of-voice sample instead of naming the tone.
Provide 2 or 3 examples of past deliverables so the model can match the format.
Drop in the relevant section of the NI 43-101 instead of asking the model to recall it.

A practical rule: if your prompt is 80% instruction and 20% context, you are still in the old paradigm. Flip the ratio.

The examples multiplier

Three to five good examples will outperform almost any amount of instruction. Wrap them in simple tags so the model knows they are examples, not instructions.

<example> Input: [a passage from a prior memo] Output: [the summary or analysis you produced from it] </example>

Examples teach three things at once: the format, the tone, and the depth. Paste two or three well-written memos and the output quality jumps immediately.

The magic clarification line

Before you start, ask me any clarifying questions you need to produce the best output. Do not assume. Once I answer, proceed.

A close second, for restoring proactive-assistant behaviour:

After completing the task, tell me anything you noticed that I should be aware of but did not ask about.

The self-check move

For high-stakes outputs, add a verification step at the end:

Before you finish, audit your answer against these criteria: [list them]. If anything falls short, revise it before responding.

The 12 verbs that make the model sing

The model responds to verbs more than adjectives:

Draft

Review

Compare

Rank

Extract

Summarize

Reconcile

Benchmark

Validate

Forecast

Quantify

Synthesize

Chat vs Projects

You will use the model in two main surfaces. They behave slightly differently.

Chat. Most forgiving. You iterate live, so a vague first prompt costs you a follow-up turn, not a project. Still, front-load the context in the first message and you will save half your back-and-forth.

Projects. Where you do ongoing work over multiple sessions. Useful for standing engagements: a specific study, a recurring market intelligence brief, a long client account. The model retains documents and instructions across sessions, but it still does not infer cross-session intent the way the old model did. Re-state the goal at the start of each session, reference earlier decisions explicitly, and upload reference documents once then point to them in each new chat.

When a prompt that worked last month stops working

This will happen. Here is the debug order to run:

Debug order

Did you switch models? Every shift in this guide applies.
Is there a hidden negative instruction? Search the prompt for ‘don’t’ and ‘avoid’ — rewrite them as positive instructions.
Did the input grow? Long context pushes important instructions away from the question. Move critical instructions to the end, just before the question.
Did you lose examples? If the prompt used to have 3 examples and now has 1, restore them.
Did you add a ‘CRITICAL’? Remove it. Test again.
Did you forget to say ‘go beyond the basics’? Add it back.

Nine times out of ten, the regression is in one of those six lines, not in the model.

Five old prompt habits to break this week

Replace these

‘Help me write…’ → ‘Write…’ The hedge wastes tokens and softens the output.
‘Could you possibly…’ → drop the politeness. It does not improve quality.
‘Don’t make it too long.’ → replace with a number. ‘Under 300 words.’
‘Be thorough.’ → ‘go beyond the basics’ plus the specific elements you want.
‘Make it good.’ → state the criteria for good. The model does not share your taste unless you describe it.

A production-ready prompt skeleton

You are a [senior process engineer / project economist / mining analyst / due diligence reviewer] with deep experience in [commodity / jurisdiction / project stage]. Context: [Paste the relevant report sections, data tables, prior memos, or source material here.] Task: [Verb-led, scope-specified ask. e.g. "Benchmark the capital intensity of Project X against the 5 peer projects listed below..."] Examples of the output shape: <example> [A prior deliverable that matches the format and depth you want] </example> Before finishing, audit your answer against the requirements above. Revise if anything falls short. If anything in this brief is unclear, ask me before starting. Do not assume. Rules: Operating rules for this task: 1. Do not invent facts. If something is not in the context provided, treat it as unknown. 2. Never fabricate numerical values. If a value is missing, return "not provided in source" and continue. 3. For every value cited, state the source: either the document and section, or the assumption. 4. Distinguish between values quoted directly from source documents and values derived through calculation. For derived values, show the inputs. 5. Declare every unit, currency, and basis conversion explicitly, including the conversion factor used. 6. Time-stamp every reference where time matters (commodity prices, regulatory citations, peer project disclosures, tax rates). 7. State a confidence level for any interpretive answer: High (source-backed), Medium (interpolated from analogous data), Low (generalized industry estimate). 8. Where the data provided is insufficient to support a conclusion, say so explicitly. Do not manufacture a conclusion to be helpful. 9. If any part of this brief is unclear before you start, ask. Do not assume. 10. Ask clarifying questions if required.

The audit that catches 80% of weak prompts

Read your prompt out loud. Ask yourself:

Would a graduate engineer with no context know exactly what to deliver from this brief?

If yes, send it. If no, name the missing piece.

One closing move

The model is no longer guessing at your intent. So stop hiding it. And when in doubt, tell it to go beyond the basics. You will be surprised how often that is the entire difference between an output you can ship and one you have to redo.

Author

Francisca Lombard

Founder, LOMexcel · Mining Consulting and AI Advisory

The Claude Opus 4.7 & 4.8 Playbook for Mining Professionals.