OpenAI-compatible gateway
UnifiedMemory exposes OpenAI-style routes for clients that already know how to call chat completions or responses APIs.
POST /v1/chat/completions
POST /v1/responses
Use these when you want familiar model-client shapes with UnifiedMemory auth, scope, and optional memory context.
Auth
Authorization: Bearer $UNIFIED_MEMORY_API_KEY
Gateway keys should be scoped. Each key determines the container, source app, agent identity, and available tools for the requests it signs.
Chat completions example
Common request fields:
| Field | Required | Type | Description |
|---|---|---|---|
| model | Yes | string | Provider/model route. MiniMax routes should use configured model names. |
| messages | Yes for chat completions | array | OpenAI-style chat messages. |
| input | Yes for responses | string or array | OpenAI Responses-style input. |
| memory | No | object | UnifiedMemory options such as container_tag, trust_policy, and max_context_tokens. |
| stream | No | boolean | Streaming hint when supported by the gateway route. |
| metadata | No | object | Caller metadata. Do not include secrets. |
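The required-field rules in the table can be checked on the client before a request is sent. The following is a minimal Python sketch under stated assumptions: `build_payload`, its `route` labels, and its argument names are hypothetical helpers for illustration, not part of UnifiedMemory, and the gateway may apply further validation of its own.

```python
from typing import Any, Optional


def build_payload(
    route: str,
    model: str,
    *,
    messages: Optional[list] = None,
    input_value: Any = None,
    memory: Optional[dict] = None,
    stream: Optional[bool] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Assemble a request body that follows the field table (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    payload: dict = {"model": model}

    if route == "chat.completions":
        # messages is required only on the chat completions route.
        if not messages:
            raise ValueError("messages is required for chat completions")
        payload["messages"] = messages
    elif route == "responses":
        # input is required only on the responses route (string or array).
        if input_value is None or input_value == "" or input_value == []:
            raise ValueError("input is required for responses")
        payload["input"] = input_value
    else:
        raise ValueError(f"unknown route label: {route!r}")

    # Optional fields are included only when the caller supplies them.
    for key, value in (("memory", memory), ("stream", stream), ("metadata", metadata)):
        if value is not None:
            payload[key] = value
    return payload
```

Keeping optional fields out of the body unless they are set leaves the gateway free to apply its own defaults.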
curl https://edge-api.jithendranara.dev/v1/chat/completions \
  -H "Authorization: Bearer $UNIFIED_MEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/MiniMax-M2.7",
    "messages": [
      {"role": "system", "content": "Use memory only when supplied."},
      {"role": "user", "content": "What should I know before editing the docs?"}
    ],
    "memory": {
      "container_tag": "jeethendra",
      "trust_policy": "balanced",
      "max_context_tokens": 900
    }
  }'
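For clients that do not shell out to curl, the same call can be prepared with the Python standard library. This is a sketch rather than an official client: `make_chat_request` is a hypothetical helper, the host is copied from the curl example, and the fallback key `example-key` is a placeholder used only so the snippet runs without the environment variable set. The request is built but not sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://edge-api.jithendranara.dev"  # host taken from the curl example


def make_chat_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "minimax/MiniMax-M2.7",
    "messages": [
        {"role": "system", "content": "Use memory only when supplied."},
        {"role": "user", "content": "What should I know before editing the docs?"},
    ],
    "memory": {
        "container_tag": "jeethendra",
        "trust_policy": "balanced",
        "max_context_tokens": 900,
    },
}

# Placeholder fallback so the sketch runs without a real key configured.
api_key = os.environ.get("UNIFIED_MEMORY_API_KEY", "example-key")
request = make_chat_request(payload, api_key)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request, timeout=30) as resp:
#     data = json.load(resp)
```

Reading the key from the environment mirrors the `$UNIFIED_MEMORY_API_KEY` variable in the curl example and keeps the secret out of source code.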
Responses example
curl https://edge-api.jithendranara.dev/v1/responses \
  -H "Authorization: Bearer $UNIFIED_MEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/MiniMax-M2.7",
    "input": "Summarize the current memory docs rules.",
    "memory": {
      "container_tag": "jeethendra",
      "include_courtroom_links": true
    }
  }'
Response shape
The gateway preserves the client-compatible response envelope and may add memory diagnostics under metadata or response extensions:
{
  "id": "resp_01",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Before editing docs, keep public pages curated and exclude proof artifacts."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 620,
    "completion_tokens": 28,
    "total_tokens": 648
  },
  "memory": {
    "container_tag": "jeethendra",
    "tokens_injected": 210,
    "trust_policy": "balanced",
    "cross_agent_candidates_suppressed": 0
  }
}
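Because the memory diagnostics are an optional addition to the standard envelope, a client should read them defensively. The sketch below assumes the top-level `memory` key shown in the example above; `summarize_response` is a hypothetical helper, and diagnostics that arrive under `metadata` or another extension would need their own lookup.

```python
from typing import Any, Dict


def summarize_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the assistant text and optional memory diagnostics from a response."""
    choices = response.get("choices") or []
    # Standard chat-completion envelope: the first choice carries the message.
    content = None
    if choices:
        content = (choices[0].get("message") or {}).get("content")

    # The memory block is an extension and may be absent entirely.
    memory = response.get("memory") or {}
    usage = response.get("usage") or {}
    return {
        "content": content,
        "total_tokens": usage.get("total_tokens"),
        "tokens_injected": memory.get("tokens_injected"),
        "trust_policy": memory.get("trust_policy"),
    }
```

A client written against the plain OpenAI envelope can ignore the extra keys, so reading them this way does not change behavior when they are missing.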
Common outcomes
| Status | Meaning |
|---|---|
| 200 | Model response returned. |
| 401 | Missing or invalid key. |
| 403 | Key lacks gateway or memory capability. |
| 404 | Model, container, or route unavailable. |
| 422 | Invalid OpenAI-style request payload. |
| 503 | Upstream model or cognition provider unavailable. |
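A client can turn the table above into a small lookup to decide what to do with each status. In this sketch, `classify_status` is a hypothetical helper, and treating only 503 as retryable is an assumption on the grounds that an unavailable upstream may recover; the source does not specify a retry policy.

```python
from typing import Tuple

# Meanings copied from the status table above.
STATUS_MEANINGS = {
    200: "Model response returned.",
    401: "Missing or invalid key.",
    403: "Key lacks gateway or memory capability.",
    404: "Model, container, or route unavailable.",
    422: "Invalid OpenAI-style request payload.",
    503: "Upstream model or cognition provider unavailable.",
}

# Assumption: only an unavailable upstream is worth retrying.
RETRYABLE = {503}


def classify_status(status: int) -> Tuple[str, bool]:
    """Return (meaning, should_retry) for a gateway HTTP status code."""
    meaning = STATUS_MEANINGS.get(status, "Status not listed in the gateway docs.")
    return meaning, status in RETRYABLE
```

The 401, 403, and 422 cases point to a key or payload problem that a retry would not fix, which is why they are left out of `RETRYABLE` here.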
Boundaries
- Gateway memory context still follows trust and scope gates.
- Test, eval, and forensic rows are not injected just because a chat client asks for them.
- Use Agent Context when you want the memory packet without model synthesis.