codex

Lightweight coding agent that runs in your terminal

$ anc audit codex --json

agent Rust openai/codex

MUST 19 / 28SHOULD 13 / 211 failing14 warningsaudited 2026-06-01 · anc 0.5.0

74pass rate

2/8principles met

Eight principles, scored

Behavioral and source checks. Every non-pass row carries a copy-paste remediation prompt.

Non-Interactive by Default

secret-bearing flag(s) without `*-file` companion or stdin path: --remote-auth-token-env. Flag values leak via process tables, shell history, and CI logs; provide stdin support or a `--<flag>-file` variant. · no default-value annotations found in --help. SHOULD-tier — agents reading help text need to see what value a flag falls back to when omitted (`[default: <value>]` per clap convention).

Remediation · P1

Make codex satisfy P1 (Non-Interactive by Default) from the agent-native CLI standard. Address:
- Secret-bearing flags expose stdin or *-file companion (secret-bearing flag(s) without `*-file` companion or stdin path: --remote-auth-token-env. Flag values leak via process tables, shell history, and CI logs; provide stdin support or a `--<flag>-file` variant.)
- `--help` advertises default values for flags (no default-value annotations found in --help. SHOULD-tier — agents reading help text need to see what value a flag falls back to when omitted (`[default: <value>]` per clap convention).)
Requirements: https://anc.dev/p1

fail

Structured, Parseable Output

--output/--format flag detected but could not validate JSON via safe probes (--help/--version override output flags in most CLIs) · no --json or --jsonl short alias found. Agents and pipelines benefit from short forms alongside the canonical `--output` enum.

Remediation · P2

Make codex satisfy P2 (Structured, Parseable Output) from the agent-native CLI standard. Address:
- Structured output support (--output/--format flag detected but could not validate JSON via safe probes (--help/--version override output flags in most CLIs))
- --json / --jsonl short aliases for --output (no --json or --jsonl short alias found. Agents and pipelines benefit from short forms alongside the canonical `--output` enum.)
- `--raw` flag for pipe-safe unformatted output (no `--raw` flag advertised. MAY-tier — useful for pipelines that want to strip formatting before piping to other tools.)
Requirements: https://anc.dev/p2

warn

Progressive Help Discovery

no `examples` subcommand or `--examples` flag found. MAY-tier — a curated usage block keeps agents from hunting through long help text. · no paired text + `--output json` example found within 5 lines in top-level or any subcommand `--help`. Pairing keeps agents from reverse-engineering the JSON invocation from the text one.

Remediation · P3

Make codex satisfy P3 (Progressive Help Discovery) from the agent-native CLI standard. Address:
- `examples` subcommand or `--examples` flag for curated usage patterns (no `examples` subcommand or `--examples` flag found. MAY-tier — a curated usage block keeps agents from hunting through long help text.)
- Help text pairs human and `--output json` example invocations (no paired text + `--output json` example found within 5 lines in top-level or any subcommand `--help`. Pairing keeps agents from reverse-engineering the JSON invocation from the text one.)
Requirements: https://anc.dev/p3

warn

P4
Fail-Fast, Actionable Errors
All 3 checks pass.
pass

Safe Retries & Mutation Boundaries

write-pattern subcommand(s) present (update) but no read-pattern surface detected. If the CLI is write-only by design the MUST is satisfied vacuously; otherwise expose the read surface with agent-recognizable verbs (list/get/show/query/find/search).

Remediation · P5

Make codex satisfy P5 (Safe Retries & Mutation Boundaries) from the agent-native CLI standard. Address:
- Read and write surfaces are both visible in subcommand list (write-pattern subcommand(s) present (update) but no read-pattern surface detected. If the CLI is write-only by design the MUST is satisfied vacuously; otherwise expose the read surface with agent-recognizable verbs (list/get/show/query/find/search).)
Requirements: https://anc.dev/p5

warn

Composable, Predictable Command Structure

7/24 subcommand(s) follow standard verb names. Non-standard: review, mcp, plugin, mcp-server, app-server, remote-control, completion, sandbox, debug, working, resume, the, fork, most, cloud, exec-server, features. MAY-tier — community-standard verbs (get/list/create/update/delete) help agents predict subcommand behavior across CLIs. · no `--color` flag advertised. MAY-tier — `auto|always|never` lets agents and pipelines override the TTY-based default.

Remediation · P6

Make codex satisfy P6 (Composable, Predictable Command Structure) from the agent-native CLI standard. Address:
- Subcommand verbs follow community-standard names (7/24 subcommand(s) follow standard verb names. Non-standard: review, mcp, plugin, mcp-server, app-server, remote-control, completion, sandbox, debug, working, resume, the, fork, most, cloud, exec-server, features. MAY-tier — community-standard verbs (get/list/create/update/delete) help agents predict subcommand behavior across CLIs.)
- `--color` flag for explicit color control (no `--color` flag advertised. MAY-tier — `auto|always|never` lets agents and pipelines override the TTY-based default.)
- Subcommand naming follows a consistent verb/noun convention (subcommand naming is inconsistent: 7 non-verb subcommand(s) (mcp, plugin, working, the, most, cloud, features) mix verb and non-verb children at the second level, so an agent cannot predict where the action lives. SHOULD-tier: pick a consistent shape (all verb-first, all noun-verb hierarchy, or any combination where each non-verb group's children are uniformly verbs). The verb list is a heuristic; inspect `--help` to confirm.)
- Operations are subcommands, not verb-shaped flags (top-level verb-shaped flag(s) found: --search. Operations belong under the `Commands:` block (`tool search "q"`), not on the flag namespace where they fight the `--help` filtering agents rely on.)
Requirements: https://anc.dev/p6

warn

Bounded, High-Signal Responses

no --quiet/-q flag detected in --help output · no `--verbose` / `-v` flag advertised. SHOULD-tier — agents debugging failures need a way to escalate diagnostic detail.

Remediation · P7

Make codex satisfy P7 (Bounded, High-Signal Responses) from the agent-native CLI standard. Address:
- Quiet mode available (no --quiet/-q flag detected in --help output)
- `--verbose` flag for diagnostic escalation (no `--verbose` / `-v` flag advertised. SHOULD-tier — agents debugging failures need a way to escalate diagnostic detail.)
- Help text advertises TTY-aware verbosity behavior (no TTY-aware language found in `--help`. MAY-tier — automatic verbosity reduction when stdout is piped or redirected lets agents skip the explicit `--quiet` flag. Behavioral probes cannot simulate a real TTY without a pty crate, so this audit relies on documented intent.)
Requirements: https://anc.dev/p7

warn

P8
Discoverable Through Agent Skill Bundles
All 3 checks pass.
pass

All Audits

P1: Non-Interactive by Default

PASS	Non-interactive by default
SKIP	Non-interactive gate flag advertised in --help	target satisfies P1 via alternative gate (help-on-bare or stdin-primary)
PASS	Flags advertise env-var bindings in --help
FAIL	Secret-bearing flags expose stdin or *-file companion	secret-bearing flag(s) without `*-file` companion or stdin path: --remote-auth-token-env. Flag values leak via process tables, shell history, and CI logs; provide stdin support or a `--<flag>-file` variant.
WARN	`--help` advertises default values for flags	no default-value annotations found in --help. SHOULD-tier — agents reading help text need to see what value a flag falls back to when omitted (`[default: <value>]` per clap convention).
PASS	Rich-TUI affordance for TTY contexts

P2: Structured, Parseable Output

WARN	Structured output support	--output/--format flag detected but could not validate JSON via safe probes (--help/--version override output flags in most CLIs)
SKIP	Structured-output CLI exposes its schema at runtime	no structured-output indicator (--output / --format / json / jsonl) in --help
WARN	--json / --jsonl short aliases for --output	no --json or --jsonl short alias found. Agents and pipelines benefit from short forms alongside the canonical `--output` enum.
WARN	`--raw` flag for pipe-safe unformatted output	no `--raw` flag advertised. MAY-tier — useful for pipelines that want to strip formatting before piping to other tools.
SKIP	`--output` advertises additional formats beyond text/json	no `--output` or `--format` flag advertised; vacuous skip for MAY-tier extra formats.
PASS	Bad invocation exits with structured usage-error code (2)
SKIP	Errors emit JSON envelope with `error`/`kind`/`message` under `--output json`	binary does not advertise `--output json` in --help; MUST applies only to CLIs that opt into the JSON contract.
SKIP	JSON success and error envelopes share their non-payload key set	binary does not advertise `--output json` in --help; envelope-consistency only applies to CLIs that opt into the JSON contract.

P3: Progressive Help Discovery

PASS	Help flag produces useful output
PASS	Version flag works (`--version` plus short alias)
PASS	Version flag works (`--version` plus short alias)
WARN	`examples` subcommand or `--examples` flag for curated usage patterns	no `examples` subcommand or `--examples` flag found. MAY-tier — a curated usage block keeps agents from hunting through long help text.
PASS	Short `-h` summary differs from `--help` long form
PASS	Each subcommand's `--help` ships at least one invocation example
WARN	Help text pairs human and `--output json` example invocations	no paired text + `--output json` example found within 5 lines in top-level or any subcommand `--help`. Pairing keeps agents from reverse-engineering the JSON invocation from the text one.

P4: Fail-Fast, Actionable Errors

PASS	Rejects invalid arguments
PASS	Error messages include a hint or remediation phrase
SKIP	`--output json` produces JSON-formatted errors	binary does not advertise `--output json` in --help; SHOULD applies only to CLIs that opt into the JSON contract.

P5: Safe Retries & Mutation Boundaries

SKIP	Destructive subcommands require `--force` or `--yes`	no destructive subcommands detected; MUST applies conditionally to CLIs with destructive operations.
WARN	Read and write surfaces are both visible in subcommand list	write-pattern subcommand(s) present (update) but no read-pattern surface detected. If the CLI is write-only by design the MUST is satisfied vacuously; otherwise expose the read surface with agent-recognizable verbs (list/get/show/query/find/search).

P6: Composable, Predictable Command Structure

PASS	Handles SIGPIPE gracefully
SKIP	Pager-using CLI ships --no-pager escape hatch	no pager signal (less/more/$PAGER/--pager) in --help
PASS	Respects NO_COLOR
WARN	Subcommand verbs follow community-standard names	7/24 subcommand(s) follow standard verb names. Non-standard: review, mcp, plugin, mcp-server, app-server, remote-control, completion, sandbox, debug, working, resume, the, fork, most, cloud, exec-server, features. MAY-tier — community-standard verbs (get/list/create/update/delete) help agents predict subcommand behavior across CLIs.
WARN	`--color` flag for explicit color control	no `--color` flag advertised. MAY-tier — `auto\|always\|never` lets agents and pipelines override the TTY-based default.
SKIP	Input-accepting commands read from stdin when no file is given	no input-accepting subcommand detected (process/parse/convert/transform/analyze/validate/format/lint/audit); vacuous skip for the conditional SHOULD.
WARN	Subcommand naming follows a consistent verb/noun convention	subcommand naming is inconsistent: 7 non-verb subcommand(s) (mcp, plugin, working, the, most, cloud, features) mix verb and non-verb children at the second level, so an agent cannot predict where the action lives. SHOULD-tier: pick a consistent shape (all verb-first, all noun-verb hierarchy, or any combination where each non-verb group's children are uniformly verbs). The verb list is a heuristic; inspect `--help` to confirm.
WARN	Operations are subcommands, not verb-shaped flags	top-level verb-shaped flag(s) found: --search. Operations belong under the `Commands:` block (`tool search "q"`), not on the flag namespace where they fight the `--help` filtering agents rely on.

P7: Bounded, High-Signal Responses

WARN	Quiet mode available	no --quiet/-q flag detected in --help output
WARN	`--verbose` flag for diagnostic escalation	no `--verbose` / `-v` flag advertised. SHOULD-tier — agents debugging failures need a way to escalate diagnostic detail.
SKIP	`--limit` / `--max-results` flag for list operations	no list-style subcommand detected (list/ls/search/query/find/show/get); vacuous skip for the list-only SHOULD.
SKIP	Cursor-based pagination flags for list traversal	no list-style subcommand detected; vacuous skip for the list-only MAY.
SKIP	`--timeout` flag for long-running operations	no long-running subcommand detected (serve/daemon/watch/tail/monitor/follow/run/start/stream); vacuous skip for the conditional SHOULD.
WARN	Help text advertises TTY-aware verbosity behavior	no TTY-aware language found in `--help`. MAY-tier — automatic verbosity reduction when stdout is piped or redirected lets agents skip the explicit `--quiet` flag. Behavioral probes cannot simulate a real TTY without a pty crate, so this audit relies on documented intent.

P8: Discoverable Through Agent Skill Bundles

PASS	Skill bundle has install path (`tool skill install [<host>]`)
PASS	`skill install --all` for multi-runtime install
PASS	`skill update` / `skill upgrade` for bundle refresh

Details

Version scored: 0.135.0
Audit date: 2026-06-01 17:35:49 UTC
Duration: 3.1s
Platform: linux/x86_64
Mode: command
Anc build: 0.5.0
Install: bun add -g @openai/codex

Embed the badge

This score (74%) clears the badge floor (70%). Copy this into your README:

[![agent-native](https://anc.dev/badge/codex.svg)](https://anc.dev/score/codex)

Preview:

Reproduce this scorecard for codex locally and inspect the failing audits:

anc audit --command codex --output json

Install anc first if you don't have it. Add --output json to get the same JSON shape committed under scorecards/.