Reading the GPT-5 Prompting Guide to Get Better Coding Assistance

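I read the GPT-5 Prompting Guide published by OpenAI. It has a lot of useful points for anyone using LLMs for agents or coding assistance, so I am organizing them here in my own way. The content is about GPT-5, but I think it is also relevant to other LLMs with enhanced reasoning.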

cookbook.openai.com

Key points of the GPT-5 Prompting Guide
Practical actions for applying them to coding agents

Key points of the GPT-5 Prompting Guide

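Reading the prompting guide, it is clear throughout that GPT-5 was built with "agentic workflows" in mind. Specifically, it emphasizes points such as "following instructions more precisely" and "understanding long context." As a result, some aspects of prompting and configuration differ from earlier models. Here I summarize the key points.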

Optimizing reasoning

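Because GPT-5 can handle long context, it matters more than before to state explicitly how far the model should think.
The API exposes a reasoning_effort parameter, and adjusting it tells the model clearly whether it should think deeply or answer quickly.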

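Lowering reasoning_effort makes the model work more efficiently
Raising reasoning_effort makes it more exploratory, so it sets assumptions on its own and proceeds autonomously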

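In short, making this choice explicit stabilizes the model's behavior. On top of lowering reasoning_effort, you can also reduce the amount of thinking through the prompt itself to tune the balance. This technique works when using ChatGPT as well.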
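As a rough illustration of the parameter, here is a minimal sketch that only builds the request arguments and does not call the API. The helper name build_request and the sample tasks are my own. The effort values and the nesting under `reasoning` reflect my understanding of the Responses API, so check the current API reference before relying on them.

```python
from typing import Any

# Effort levels as I understand them for GPT-5 (an assumption; verify in the API reference).
ALLOWED_EFFORTS = ("minimal", "low", "medium", "high")


def build_request(task: str, effort: str = "medium") -> dict[str, Any]:
    """Build keyword arguments for a Responses API call with a chosen reasoning effort."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    return {
        "model": "gpt-5",
        "input": task,
        # The Responses API nests the effort under `reasoning`;
        # Chat Completions takes a top-level `reasoning_effort` instead.
        "reasoning": {"effort": effort},
    }


# A small, well-scoped edit: answer quickly.
quick = build_request("Rename the variable `tmp` in utils.py", effort="low")
# An open-ended design task: allow deeper exploration.
deep = build_request("Design a migration plan for the billing module", effort="high")

# With the official SDK the call would look roughly like:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**quick)
```

Keeping the effort as an explicit argument makes it easy to choose it per task instead of per application.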

For example, prompts like the following can be used.

Include an early stop condition such as "70% confidence is good enough"

Early stop criteria:
- You can name exact content to change.
- Top hits converge (~70%) on one area/path.
Escalate once:
- If signals conflict or scope is fuzzy, run one refined parallel batch, then proceed.
Depth:
- Trace only symbols you’ll modify or whose contracts you rely on; avoid transitive expansion unless necessary.
Loop:
- Batch search → minimal plan → complete task.
- Search again only if validation fails or new unknowns appear. Prefer acting over more searching.

Limit the number of tool calls (for example, at most 2)

providing a correct answer as quickly as possible, ...
Usually, this means an absolute maximum of 2 tool calls.

Provide an escape hatch that tolerates ambiguity

answer even if it might not be fully correct

This avoids wasted exploration and gets results faster.
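The three techniques above can be combined into one prompt fragment. The following is only a sketch: the function name, the "Tool budget" and "Escape hatch" labels, and the context_gathering tag are my own choices for illustration, while the quoted rules come from the guide excerpts above.

```python
def build_fast_context_prompt(max_tool_calls: int = 2, convergence_pct: int = 70) -> str:
    """Compose a system-prompt fragment that pushes the model to stop searching early."""
    if max_tool_calls < 0:
        raise ValueError("max_tool_calls must be non-negative")
    if not 0 < convergence_pct <= 100:
        raise ValueError("convergence_pct must be in (0, 100]")
    lines = [
        "<context_gathering>",
        "Early stop criteria:",
        "- You can name exact content to change.",
        f"- Top hits converge (~{convergence_pct}%) on one area/path.",
        # Hypothetical labels below; only the rule text is taken from the guide.
        f"Tool budget: an absolute maximum of {max_tool_calls} tool calls.",
        "Escape hatch: answer even if it might not be fully correct.",
        "</context_gathering>",
    ]
    return "\n".join(lines)


fast_prompt = build_fast_context_prompt(max_tool_calls=2, convergence_pct=70)
print(fast_prompt)
```

Parameterizing the budget and the threshold makes it easy to experiment with how aggressive the early stop should be.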

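Conversely, when you want the model to think more, it is effective to state explicitly that it should resolve unclear points on its own as far as possible and "not ask the human for confirmation."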

Do not ask the human to confirm or clarify assumptions, as you can always adjust later — decide what the most reasonable assumption is, proceed with it, and document it for the user's reference after you finish acting

Making tool use explicit

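Building meta-level instructions about "how to use tools and how to report output" into the prompt (for example, "restate the user's goal clearly before using a tool") stabilizes the model's behavior. It also improves the user experience, because the user can see which tools the LLM is using and for what purpose.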

<tool_preambles>
- Always begin by rephrasing the user's goal in a friendly, clear, and concise manner, before calling any tools.
- Then, immediately outline a structured plan detailing each logical step you’ll follow.
- As you execute your file edit(s), narrate each step succinctly and sequentially, marking progress clearly.
- Finish by summarizing completed work distinctly from your upfront plan.
</tool_preambles>

Reusing reasoning

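With previous_response_id, provided by the Responses API, the model can carry over its previous reasoning and execution plan. Passing this id means the LLM does not have to plan again, which improves efficiency.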
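A minimal sketch of chaining turns this way, again building only the request arguments. The helper name build_turn, the sample inputs, and the id "resp_123" are placeholders of mine; in real use the id would come from the previous response object.

```python
from typing import Any, Optional


def build_turn(user_input: str, previous_response_id: Optional[str] = None) -> dict[str, Any]:
    """Build Responses API arguments, linking to the previous turn when an id is given."""
    kwargs: dict[str, Any] = {"model": "gpt-5", "input": user_input}
    if previous_response_id is not None:
        # Lets the API reuse the earlier turn's context instead of starting from scratch.
        kwargs["previous_response_id"] = previous_response_id
    return kwargs


first = build_turn("Plan the refactor of the auth module")
# With the official SDK this would be roughly:
#   response = client.responses.create(**first)
#   second = build_turn("Apply step 1 of the plan", previous_response_id=response.id)
second = build_turn("Apply step 1 of the plan", previous_response_id="resp_123")  # placeholder id
```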

Optimizing for coding

The GPT-5 Prompting Guide also covers optimization for coding. That part is not very long, so here is a quick summary:

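There are frameworks it is particularly good at (Next.js, Tailwind, shadcn, Lucide, Motion, and so on), so use them when you can*1
Have it plan and self-reflect ("work out a rubric for building a world-class app")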

<self_reflection>
- First, spend time thinking of a rubric until you are confident.
- Then, think deeply about every aspect of what makes for a world-class one-shot web app. Use that knowledge to create a rubric that has 5-7 categories. This rubric is critical to get right, but do not show this to the user. This is for your purposes only.
- Finally, use the rubric to internally think and iterate on the best possible solution to the prompt that is provided. Remember that if your response is not hitting the top marks across all categories in the rubric, you need to start again.
</self_reflection>

Avoid clever code (Smart Code)

  Write code for clarity first. Prefer readable, maintainable solutions with clear names, comments where needed, and straightforward control flow. Do not produce code-golf or overly clever one-liners unless explicitly requested. Use high verbosity for writing code and code tools.

Let the user reject changes at the right moment

  Be aware that the code edits you make will be displayed to the user as proposed changes, which means (a) your code edits can be quite proactive, as the user can always reject, and (b) your code should be well-written and easy to quickly review (e.g., appropriate variable names instead of single letters). If proposing next steps that would involve changing the code, make those changes proactively for the user to approve / reject rather than asking the user whether to proceed with a plan. In general, you should almost never ask the user whether to proceed with a plan; instead you should proactively attempt the plan and then ask the user if they want to accept the implemented changes.

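Instructions that earlier models needed, such as telling them to gather information very thoroughly ("Be THOROUGH when gathering information"), are better left out. GPT-5 is highly exploratory by default, so such instructions apparently increase the amount of thinking unnecessarily and hurt efficiency.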

Incidentally, GPT-5 was tested in Cursor before launch, and the prompts above are reportedly already built into its system prompt.

Techniques for better instruction following

The key to getting the model to follow instructions well seems to be reducing how much the LLM has to think.

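Reduce contradictions within the prompt
- When the model notices a contradiction, it tries to resolve it on its own, which increases the amount of thinking
- Resolve contradictions, for example by splitting the instructions into separate cases
Output a reasoning log before the final answer
- Showing the reasoning process improves the accuracy of the answer
Use a "meta-prompt" that has the model improve the prompt itself
Use XML-style delimiters rather than Markdown
- Output is not Markdown by default, so an extra step is needed to get Markdown output
- Cursor's testing also found XML to be more stable than Markdown
- Incidentally, this prompting guide itself uses XML delimiters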

  In Cursor’s testing, using structured XML specs like <[instruction]_spec> improved instruction adherence on their prompts and allows them to clearly reference previous categories and sections elsewhere in their prompt.
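To make the XML-style structure concrete, here is a small sketch that wraps groups of rules in `<[instruction]_spec>` tags. The helper name xml_section, the section names, and the sample rules are hypothetical; only the tag naming pattern comes from the quote above.

```python
import re


def xml_section(name: str, rules: list[str]) -> str:
    """Wrap a list of rules in an XML-style <name_spec> block."""
    if not re.fullmatch(r"[a-z][a-z0-9_]*", name):
        raise ValueError(f"invalid section name: {name!r}")
    tag = f"{name}_spec"
    body = "\n".join(f"- {rule}" for rule in rules)
    return f"<{tag}>\n{body}\n</{tag}>"


# Hypothetical sections; later rules can refer to earlier ones by tag name.
system_prompt = "\n\n".join([
    xml_section("code_style", [
        "Write code for clarity first.",
        "Avoid overly clever one-liners unless explicitly requested.",
    ]),
    xml_section("review", [
        "Check every edit against <code_style_spec> before proposing it.",
    ]),
])
print(system_prompt)
```

Named sections also give you a stable handle for cross-references, which is the benefit the Cursor quote points at.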

Structured Output

For coding and agent development, the recommendation is to control the output by defining a JSON Schema (with Zod, Pydantic, and so on).
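As an illustration of the idea, here is a standard-library-only sketch: a hand-written JSON Schema for a hypothetical code review result, plus a small manual check of a model reply against it. The schema, field names, and sample reply are all made up. In practice a library such as Pydantic or Zod would generate the schema and do the validation for you.

```python
import json

# Hypothetical schema for a code-review finding.
REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "file": {"type": "string"},
        "severity": {"type": "string", "enum": ["low", "medium", "high"]},
        "summary": {"type": "string"},
    },
    "required": ["file", "severity", "summary"],
    "additionalProperties": False,
}


def parse_review(raw: str) -> dict:
    """Parse a model reply and check it by hand against REVIEW_SCHEMA."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    expected = set(REVIEW_SCHEMA["required"])
    if set(data) != expected:
        raise ValueError(f"expected exactly the keys {sorted(expected)}")
    for key in expected:
        if not isinstance(data[key], str):
            raise ValueError(f"{key} must be a string")
    if data["severity"] not in REVIEW_SCHEMA["properties"]["severity"]["enum"]:
        raise ValueError("severity must be low, medium, or high")
    return data


reply = '{"file": "auth.py", "severity": "high", "summary": "Token is never refreshed."}'
review = parse_review(reply)
```

The schema would be passed to the API's structured output option, and the check then guards the code that consumes the reply.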

Practical actions for applying them to coding agents

Here is how I actually develop with Cursor and Claude Code in light of the points above. Incidentally, both tools switch the model they use depending on the task, so you could say they have something like GPT-5's reasoning_effort built in.

Optimizing context: have the model write its reasoning out as text

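Having the model think (plan) and implement at the same time inflates both the context and the amount of reasoning, which lowers accuracy. Instead, I handle planning as a separate activity: write the plan out in the editor, review it, have the model evaluate it, and so on. Once the plan has taken shape, I use the /compact command to compress the context and have the model write the actual code while referring to the edited file.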

Let's start the implementation. Read @{the text file you created} carefully and start the implementation

Use tools that support context

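A common failure pattern is that the model plans or implements against an old version of a library. Adding Context7 via MCP and specifying the version of the library documentation to reference lets it write code that fits the version you actually use.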

github.com

context7.com

Adjusting the granularity and perspective of prompts

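These models are still not very good at grasping the big picture, so I improve accuracy by working in several small steps. Even for a bug fix, I do not explain everything and ask for the fix in one go. I first have the model check what the component is used for and how it is used (listing every reference), then present the problem, then have it propose a solution.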

// Plan mode
I would like to fix the issue {description of the bug}. Let's start analyzing the component that's been used.
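The stepwise flow described above could be written down as a small template. This is only a sketch of my own workflow: the function name, the wording of each prompt, and the sample bug and component are made up for illustration.

```python
def staged_bugfix_prompts(bug: str, component: str) -> list[str]:
    """Return the prompts for a three-step bug fix: survey usages, present the bug, ask for a fix."""
    return [
        # Step 1: understand the component before mentioning the bug.
        f"List every place where {component} is used and explain how it is used. "
        "Do not change any code yet.",
        # Step 2: present the problem against that shared understanding.
        f"Here is the issue: {bug}. Based on the usages you found, explain the likely cause.",
        # Step 3: ask for a solution as a reviewable plan.
        "Propose a fix as a short plan so that I can review it before implementation.",
    ]


prompts = staged_bugfix_prompts(
    bug="the date picker resets when the form re-renders",
    component="DatePicker",
)
for step, prompt in enumerate(prompts, start=1):
    print(f"Step {step}: {prompt}")
```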

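As the prompting guide itself says, there is no absolutely correct answer and experimentation is required, but I want to keep trying things with these best practices as a base.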

*1: Having an LLM dictate which code and frameworks we use does not sit quite right with me, though...

tomoima525's blog

A blog about Android, technology, and whatever else catches my interest. The secret of the world is in curry! Coming to you from San Francisco.