Agentic harnesses run work. UAI records the portable evidence. Use this guide when a team is deciding how harness engineering, orchestration, tools, memory, observability, policy, evaluation, and human oversight should fit around UAI-1, AI Memory, and Project Handoff.
What an agentic harness is
An agentic harness is the software layer around models and agents that turns intent into controlled work. It may route tasks, call tools, read data, coordinate agents, enforce policy, collect approvals, keep runtime memory, observe traces, and decide when a human should step in.
- Execution: models, tools, workflows, retries, interruptions, and task state.
- Connectivity: APIs, MCP-style tool/resource sessions, local adapters, and data access policies.
- Coordination: runtime handoffs, A2A-style agent delegation, task boards, and approval checkpoints.
- Memory: short-lived runtime state, durable AI Memory, Project Handoff files, and cold wiki archives.
- Evidence: traces, decisions, validator results, release packets, handoff summaries, and redacted public records.
Harness engineering operating model
Harness engineering is the practical operating loop around the runtime: prepare a bounded spec pack, run the work in the harness, review artifacts, then write back only accepted memory and evidence.
| Step | Harness engineering work | UAIX handoff or evidence output |
|---|---|---|
| Prepare | Assemble current Project Handoff files, AI Memory, constraints, page digest, acceptance criteria, and test plan. | A compact startup packet that names source authority, support boundaries, target routes, and targeted checks. |
| Run | Let the runtime choose tools, call MCP resources, coordinate agents, collect approvals, track traces, and run evals. | No automatic public claim. UAIX stays out of the live control path unless the app already uses a validated UAI record. |
| Review | Separate accepted artifacts from raw traces, hidden prompts, private logs, failed attempts, and speculative optimization. | Reviewed summaries, redactions, validator output, test or eval results, and explicit non-claim boundaries. |
| Write back | Promote only the facts that future humans or agents should load. | Project Handoff updates, AI Memory changes, release notes, roadmap state, adoption metrics, page digests, and next actions. |
Page digests, onboarding checkpoints, eval summaries, adoption metrics, and workflow decisions can improve UAIX.org when they are reviewed artifacts. Raw traces, hidden prompts, private logs, speculative optimization, and background cleanup plans should stay in the harness, issue tracker, or cold memory until promoted.
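The prepare, run, review, write-back loop above can be sketched in a few lines. Everything here is illustrative: the record shapes, field names, and `reviewed` flag are assumptions for this sketch, not part of any published UAI-1, AI Memory, or Project Handoff format.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the harness engineering loop:
# prepare a bounded packet, run, review, write back only accepted facts.

@dataclass
class RunArtifacts:
    accepted: list = field(default_factory=list)  # candidate facts from the run
    raw: list = field(default_factory=list)       # traces, prompts, failed attempts

def prepare(handoff_facts, constraints):
    """Assemble a compact startup packet for the harness."""
    return {"handoff": list(handoff_facts), "constraints": list(constraints)}

def review(artifacts: RunArtifacts):
    """Keep only artifacts a human reviewer has accepted."""
    return [a for a in artifacts.accepted if a.get("reviewed")]

def write_back(handoff_facts, reviewed):
    """Promote only reviewed facts into durable project memory."""
    return handoff_facts + [r["fact"] for r in reviewed]

run = RunArtifacts(
    accepted=[
        {"fact": "validator passed for profile X", "reviewed": True},
        {"fact": "speculative optimization idea", "reviewed": False},
    ],
    raw=["trace line 1", "hidden prompt"],
)
memory = write_back(["initial constraint"], review(run))
# memory now holds the initial constraint plus the single reviewed fact;
# raw traces and the unreviewed idea stay behind in the harness.
```

The point of the sketch is the asymmetry: everything flows into `RunArtifacts`, but only what passes `review` reaches `write_back`.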
Where UAI fits
UAI should not compete with the harness. UAI-1 is the public exchange and evidence contract that preserves the part of an agentic run another team, vendor, release, auditor, customer, or future agent must be able to inspect and carry forward.
| Harness strategy area | Harness usually owns | UAI should preserve |
|---|---|---|
| Planning and orchestration | Task decomposition, model selection, tool sequencing, retries, and interruption behavior. | The reviewed intent, task status, outcome, and release-facing handoff summary. |
| Tool and data access | MCP sessions, API clients, credentials, local adapters, and data access enforcement. | Redacted request/result evidence, source references, provenance, and validation-ready payload records. |
| Agent-to-agent coordination | A2A-style discovery, delegation, capability routing, and runtime handoff behavior. | Capability statements, task-state snapshots, responsibility transfer summaries, and durable exchange packets. |
| Memory and context | Runtime memory, conversation state, vector stores, and tool-local cache policy. | AI Memory packets, Project Handoff files, active constraints, decisions, owners, checks, and promotion rules. |
| Observability and evaluation | Traces, spans, metrics, eval runs, approval events, and behavior dashboards. | Only reviewed, redacted identifiers and summaries needed to reproduce, cite, or audit the public record. |
| Policy and governance | Guardrails, human approval gates, data controls, and deployment policy. | Support-claim boundaries, validator evidence, conformance-pack material, and dated release trail links. |
Reference architecture loop
Think about UAI as the record layer around a run, not the run loop itself. A good agentic architecture can use UAI before work starts, while work is running, and after work is accepted.
| Moment | Runtime or harness job | UAIX job |
|---|---|---|
| Before the run | Load task instructions, choose models, expose tools, set approvals, and prepare runtime memory. | Load Project Handoff, current constraints, source authority, evidence rules, and the UAI-1 profile that will receive public proof if the run succeeds. |
| During the run | Execute steps, call tools, coordinate agents, collect traces, request human approval, and handle interruptions. | Keep UAI out of the live control path unless a validated exchange record is already part of the application design. |
| Review gate | Separate accepted outputs from raw traces, private data, rejected attempts, and unsupported claims. | Redact, summarize, select source references, attach provenance, and decide which facts are safe to promote. |
| After acceptance | Persist implementation changes, deployment notes, eval outcomes, or customer-facing results in the owning system. | Write the durable UAI-1 record, validator evidence, conformance or adoption packet, AI Memory update, and Project Handoff summary that future agents and reviewers can load. |
What should leave the harness
Not every trace line, prompt, tool result, or approval event should become UAI evidence. Promote only the smallest reviewed record that another party needs outside the runtime session.
| Question | If yes | If no |
|---|---|---|
| Will another team, vendor, release, auditor, customer, or future agent need to validate this fact? | Consider a UAI-1 record, validator result, or Project Handoff update. | Leave it in runtime logs, issue history, or private operational records. |
| Does the fact support a public implementation or support claim? | Attach schema, registry, example, validator, conformance, roadmap, and changelog evidence before publishing. | Keep it as internal background or planned work. |
| Does it include secrets, credentials, customer data, raw approvals, hidden prompts, or private traces? | Do not publish it. Redact first, then preserve only the reviewed evidence pointer if needed. | It may still need source review and support-boundary checking before promotion. |
| Is it current project truth the next agent must load? | Promote it into AI Memory, Project Handoff, or typed .uai records after review. | Archive it as source history or cold memory rather than startup context. |
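The promotion questions above can be read as a small decision function. The ordering below (sensitive data checked first) and every field name are assumptions made for this sketch; no published UAI-1 schema defines them.

```python
# Hypothetical promotion gate encoding the questions in the table.
# Field names ("contains_sensitive", "needed_externally", ...) are
# illustrative only.

def promotion_decision(fact: dict) -> str:
    if fact.get("contains_sensitive"):
        return "redact-first"        # never publish as-is
    if fact.get("needed_externally"):
        return "uai-record"          # UAI-1 record, validator result, or handoff update
    if fact.get("public_claim"):
        return "attach-evidence"     # schema, validator, changelog evidence first
    if fact.get("current_truth"):
        return "promote-to-memory"   # AI Memory or Project Handoff after review
    return "archive"                 # cold memory or source history

assert promotion_decision({"contains_sensitive": True}) == "redact-first"
assert promotion_decision({"current_truth": True}) == "promote-to-memory"
assert promotion_decision({}) == "archive"
```

Checking for sensitive content before anything else mirrors the table's hard rule: redaction always precedes any form of publication.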
Common integration patterns
Most teams do not need to start by building a new runtime integration. Start with the lowest-risk pattern that creates useful evidence, then move toward adapters only after public fixtures and release evidence exist.
| Pattern | Use it when | How UAI participates | Boundary |
|---|---|---|---|
| Evidence sidecar | The harness already works, but the outcome needs a portable record. | Use UAI-1 examples, validator output, source references, and a short reviewed result packet beside the runtime artifact. | UAI observes and records after review; it does not drive the run loop. |
| Release gate record | A feature, implementation, or public claim needs proof before publication. | Attach UAI-1 payloads, validator evidence, conformance or adoption material, roadmap state, and changelog links to the release decision. | A passing packet is evidence for that packet and named scope, not certification or endorsement. |
| Repository handoff | Future humans or agents need to continue the work without private chat history. | Update AI Memory, Project Handoff, AGENTS.md, readme.human, and typed .uai records with accepted facts, checks, blockers, and next actions. | Handoff files are project memory, not UAI-1 conformance evidence by themselves. |
| Bridge fixture lab | A team wants future MCP, A2A, trace, or runtime-adapter support. | Create small redacted fixtures, expected mappings, validator expectations, support boundaries, and maintenance owners before naming support. | Bridge work remains planned or research-track until UAIX publishes fixtures, tests, and release trail. |
First proof run
A useful first proof run is intentionally small: one runtime result, one UAI-1 profile, one redacted payload, one validator result, one owner, one release or handoff destination, and one explicit non-claim boundary.
- Choose the record: name the request, response, task status, capability statement, or error record that needs to survive the run.
- Select the public profile: connect the record to the current UAI-1 schema, registry entry, example, or conformance fixture that actually applies.
- Redact before validation: remove secrets, customer data, raw approvals, hidden prompts, and private traces before the payload leaves the harness.
- Validate and cite: keep the validator output, fixture payload, implementation scope, and dated release or handoff note together.
- Write back accepted memory: update Project Handoff only with the facts a future agent should load, and leave raw trace detail in the owning runtime or archive.
- Name the boundary: state what the proof does not claim, especially certification, endorsement, official adapter support, SDK, CLI, automatic sync, or broad conformance.
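The checklist above can be sketched as one packet-building step. The key names, the redaction list, and the placeholder validator result are all assumptions for illustration; the real UAI-1 schema and Validator define their own formats.

```python
import json

# Sketch of a first proof run: one record, redacted before validation,
# bundled with a validator result, an owner, and an explicit non-claim.
# Every field name here is hypothetical.

SENSITIVE_KEYS = {"api_key", "customer_email", "hidden_prompt"}

def redact(payload: dict) -> dict:
    """Drop sensitive fields before the payload leaves the harness."""
    return {k: v for k, v in payload.items() if k not in SENSITIVE_KEYS}

def build_proof_packet(payload: dict, profile: str, owner: str) -> dict:
    clean = redact(payload)
    return {
        "profile": profile,              # one UAI-1 profile
        "payload": clean,                # one redacted payload
        "validator": {"passed": True},   # placeholder validator output
        "owner": owner,                  # one named record owner
        "non_claim": "No certification, endorsement, or adapter support implied.",
    }

packet = build_proof_packet(
    {"task_status": "done", "api_key": "sk-..."},
    profile="uai-1/task-result",
    owner="team-a",
)
print(json.dumps(packet, indent=2))
```

Redaction happens inside `build_proof_packet`, before validation, matching the checklist order: nothing sensitive should ever reach the validator or the public record.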
Strategy choices
- Keep UAI non-runtime: let harnesses execute work while UAI records durable exchange, evidence, and handoff material.
- Start with one proof run: choose one UAI-1 profile, one candidate payload, one validator result, and one record owner before discussing broad platform support.
- Map traces into evidence only after review: traces are useful raw material, but public records should contain redacted, selected, reproducible facts.
- Use AI Memory for current project truth: Project Handoff and typed .uai files should carry accepted constraints, decisions, progress, and checks, not entire runtime traces.
- Keep bridge language planned until proved: reference adapters, trace exporters, local validation, redaction lint, SDKs, CLIs, and conformance wording need fixtures and release evidence before they become support claims.
Bridge profiles to prove next
The useful future work is not a generic harness replacement. It is a small set of evidence bridges that prove how a completed run can become a portable UAI record without leaking secrets, overclaiming support, or depending on one runtime.
| Bridge idea | Evidence needed first | Public status |
|---|---|---|
| Trace-to-UAI export | Example traces, redaction rules, fixture payloads, and validator expectations. | Planned evidence work. |
| MCP tool-call evidence packet | Tool request/result mapping, capability references, provenance, and safety filtering. | Planned evidence work. |
| A2A task-state handoff packet | Capability statement, task owner, task status, delegation boundary, and result summary fixtures. | Planned evidence work. |
| Project Handoff write-back | Accepted decisions, changed files, checks run, blockers, and next actions written into current handoff files. | Current pattern; automation remains planned. |
| Local validation and redaction lint | Rules for secrets, private data, unsupported claims, and public-safe evidence output. | Planned evidence work. |
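The bridge rows above stay planned until fixtures exist. A trace-to-UAI fixture pair might look roughly like this sketch: a redacted input span and the expected mapped record that a validator expectation could be written against. Every field name is hypothetical, not a published UAI-1 format.

```python
# Hypothetical fixture pair for a future trace-to-UAI export bridge.
# Shows the fixture discipline only: redacted input, expected output.

trace_fixture = {
    "span_id": "span-001",
    "tool": "search",
    "result_summary": "3 documents matched",  # redacted summary, no raw payload
}

expected_record = {
    "kind": "tool-call-evidence",
    "source_span": "span-001",
    "summary": "3 documents matched",
}

def map_trace(span: dict) -> dict:
    """Toy mapping from a redacted span to an evidence record."""
    return {
        "kind": "tool-call-evidence",
        "source_span": span["span_id"],
        "summary": span["result_summary"],
    }

assert map_trace(trace_fixture) == expected_record
```

A fixture pair like this is what turns "planned evidence work" into testable support: the mapping can be re-run against the fixture whenever the bridge changes.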
Current support boundary
- Current UAIX support is the published public record: UAI-1, Validator, Conformance Pack, AI Memory, Project Handoff, API Reference, and the published implementation tracks.
- This guide is strategy and adoption guidance. It does not create a runtime harness, hosted importer, scheduler, SDK, CLI, official adapter, certification, endorsement, compliance program, automatic repository writer, automatic LLM Wiki sync, or UAI-1 conformance claim for repo-local handoff files.
- When agentic harness work becomes public evidence, attach validator output, fixture payloads, implementation scope, release notes, and roadmap status before repeating the claim.
Do not claim
- Do not claim UAIX runs agents, replaces agent runtimes, or owns live tool-calling control flow.
- Do not claim official MCP, A2A, OpenAI, vendor, SDK, CLI, or harness adapter support until public fixtures, tooling, and release evidence exist.
- Do not treat runtime traces, private dashboards, raw dropped files, old reports, or AI Wiki cold memory as active public support evidence until reviewed and promoted.
- Do not place secrets, credentials, private customer data, privileged operations, raw approvals, or unredacted traces into portable UAI, AI Memory, or Project Handoff records.
Practical reading path
- Read Standards Fit to decide whether UAI-1, MCP, A2A, observability, or the harness owns the immediate job.
- Use UAI-1, the Validator, and Conformance Pack when the run needs public exchange evidence.
- Use AI Memory, the AI Memory Package Wizard, and Project Handoff when the durable output is project context that future humans and agents should load.
- Use OpenAI/Codex, Coding Agents, and Context Budget guidance when the problem is agent-runtime pickup, multi-tool handoff, or hot/cold memory maintenance.
- Use Roadmap before describing bridge profiles, trace exporters, redaction lint, reference adapters, SDKs, CLIs, or conformance language as current support.
Related UAIX records
- Standards Fit: layer ownership for UAI-1, MCP, A2A, observability, and bridge evidence.
- UAI-1: the current public exchange and evidence contract.
- Project Handoff: durable project memory that survives the run.
- AI Memory: compact accepted context for humans and agents.
- OpenAI / Codex Guide: runtime runs the agents; Project Handoff preserves project memory.
- Coding Agents Guide: one local bundle across many coding-agent surfaces.
- Context Budget Guide: keep hot handoff context compact while old history moves to cold memory.
- Roadmap: evidence gates before future bridge ideas become support claims.