OpenAI-compatible gateway

UnifiedMemory exposes OpenAI-style routes for clients that already know how to call chat completions or responses APIs.

POST /v1/chat/completions
POST /v1/responses

Use these when you want familiar model-client shapes with UnifiedMemory auth, scope, and optional memory context.

Auth

Authorization: Bearer $UNIFIED_MEMORY_API_KEY

Gateway keys should be scoped: the key determines the container, source app, agent identity, and available tools.

Chat completions example

Common request fields:

| Field | Required | Type | Description |
| --- | --- | --- | --- |
| model | Yes | string | Provider/model route. MiniMax routes should use configured model names. |
| messages | Yes for chat completions | array | OpenAI-style chat messages. |
| input | Yes for responses | string or array | OpenAI Responses-style input. |
| memory | No | object | UnifiedMemory options such as container_tag, trust_policy, and max_context_tokens. |
| stream | No | boolean | Streaming hint when supported by the gateway route. |
| metadata | No | object | Caller metadata. Do not include secrets. |

curl https://edge-api.jithendranara.dev/v1/chat/completions \
  -H "Authorization: Bearer $UNIFIED_MEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/MiniMax-M2.7",
    "messages": [
      {"role": "system", "content": "Use memory only when supplied."},
      {"role": "user", "content": "What should I know before editing the docs?"}
    ],
    "memory": {
      "container_tag": "jeethendra",
      "trust_policy": "balanced",
      "max_context_tokens": 900
    }
  }'
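
For clients that are not shell scripts, the same request can be assembled with Python's standard library. This is a minimal sketch that mirrors the curl call above: `build_chat_request` is a hypothetical helper name, and the base URL is copied from the example rather than being a confirmed default for every deployment. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Base URL copied from the curl example; adjust for your own deployment.
BASE_URL = "https://edge-api.jithendranara.dev"


def build_chat_request(api_key, model, messages, memory=None):
    """Build (but do not send) a POST /v1/chat/completions request.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    payload = {"model": model, "messages": messages}
    if memory is not None:
        # Optional UnifiedMemory options such as container_tag or trust_policy.
        payload["memory"] = memory
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` with a valid key read from the `UNIFIED_MEMORY_API_KEY` environment variable, and decode the body with `json.load`.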

Responses example

curl https://edge-api.jithendranara.dev/v1/responses \
  -H "Authorization: Bearer $UNIFIED_MEMORY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/MiniMax-M2.7",
    "input": "Summarize the current memory docs rules.",
    "memory": {
      "container_tag": "jeethendra",
      "include_courtroom_links": true
    }
  }'
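
Because the two routes differ only in whether they require `messages` or `input`, a small client-side check can catch malformed payloads before they reach the gateway. The sketch below encodes the request-fields table as written; `validate_payload` is a hypothetical name, and the gateway's own validation may be stricter or differ in detail.

```python
def validate_payload(route, payload):
    """Return a list of problems found in a request payload.

    Hypothetical client-side check based on the request-fields table;
    an empty list means no problems were found by this check.
    """
    problems = []
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")
    if route == "/v1/chat/completions":
        if not isinstance(payload.get("messages"), list):
            problems.append("messages must be an array")
    elif route == "/v1/responses":
        if not isinstance(payload.get("input"), (str, list)):
            problems.append("input must be a string or array")
    else:
        problems.append("unknown route: " + route)
    # Optional fields only need a type check when they are present.
    for field in ("memory", "metadata"):
        if field in payload and not isinstance(payload[field], dict):
            problems.append(field + " must be an object")
    if "stream" in payload and not isinstance(payload["stream"], bool):
        problems.append("stream must be a boolean")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem at once.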

Response shape

The gateway preserves the client-compatible response envelope and may add memory diagnostics under metadata or as response extensions, such as the top-level memory object in this example:

{
  "id": "resp_01",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Before editing docs, keep public pages curated and exclude proof artifacts."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 620,
    "completion_tokens": 28,
    "total_tokens": 648
  },
  "memory": {
    "container_tag": "jeethendra",
    "tokens_injected": 210,
    "trust_policy": "balanced",
    "cross_agent_candidates_suppressed": 0
  }
}
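
Since the memory diagnostics are optional, a client should read them defensively. The following sketch pulls the assistant text and a few diagnostics out of a decoded response body; `summarize_completion` is a hypothetical helper, and it assumes the diagnostics sit in a top-level `memory` object as in the sample above, which a given route may not guarantee.

```python
def summarize_completion(body):
    """Extract assistant text and optional memory diagnostics.

    Hypothetical helper; `body` is the decoded JSON response as a dict.
    Missing sections yield None instead of raising.
    """
    choices = body.get("choices") or []
    content = None
    if choices:
        content = (choices[0].get("message") or {}).get("content")
    # Assumption: diagnostics live under a top-level "memory" key,
    # as in the sample response. They may be absent entirely.
    memory = body.get("memory") or {}
    usage = body.get("usage") or {}
    return {
        "content": content,
        "total_tokens": usage.get("total_tokens"),
        "tokens_injected": memory.get("tokens_injected"),
        "trust_policy": memory.get("trust_policy"),
    }
```

A standard OpenAI-style body without the `memory` extension still parses; the memory fields simply come back as `None`.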

Common outcomes

| Status | Meaning |
| --- | --- |
| 200 | Model response returned. |
| 401 | Missing or invalid key. |
| 403 | Key lacks gateway or memory capability. |
| 404 | Model, container, or route unavailable. |
| 422 | Invalid OpenAI-style request payload. |
| 503 | Upstream model or cognition provider unavailable. |
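
A client can turn the outcomes above into a simple retry decision. This sketch copies the status meanings from the table; the choice to retry only on 503 is an assumption made here (the other errors describe key or request problems that a retry would not fix), and `classify_status` is a hypothetical name.

```python
# Status meanings copied from the outcomes table.
OUTCOMES = {
    200: "Model response returned.",
    401: "Missing or invalid key.",
    403: "Key lacks gateway or memory capability.",
    404: "Model, container, or route unavailable.",
    422: "Invalid OpenAI-style request payload.",
    503: "Upstream model or cognition provider unavailable.",
}

# Assumption: only an unavailable upstream is treated as transient.
RETRYABLE = {503}


def classify_status(status):
    """Return (meaning, should_retry) for an HTTP status code."""
    meaning = OUTCOMES.get(status, "Status not listed in the outcomes table.")
    return meaning, status in RETRYABLE
```

Statuses outside the table fall through to a generic message and are not retried, which keeps unexpected errors visible instead of looping on them.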

Boundaries

  • Gateway memory context still follows trust and scope gates.
  • Test, eval, and forensic rows are not injected just because a chat client asks for them.
  • Use Agent Context when you want the memory packet without model synthesis.