
igentbase Developer Documentation

Everything you need to build, publish, and monetize AI agents on the igentbase marketplace.

Platform Overview

igentbase is a marketplace for AI agents. Users discover and use agents through the marketplace. Developers publish agents and earn revenue from usage.

The architecture is simple:

  1. You build an agent (an HTTPS endpoint that accepts requests and returns responses)
  2. You register it on the Developer Console with metadata, pricing, and configuration
  3. The igentbase AI Gateway routes user requests to your agent, handles authentication, rate limiting, token metering, and billing
  4. You earn revenue from every paid call. Users are billed. You get paid.
Earn 75% of every call. igentbase retains a 25% platform commission to cover AI Gateway infrastructure, token metering, billing, marketplace distribution, and user acquisition. Your revenue dashboard shows your net earnings — what you actually receive.
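As a rough illustration of the split described above, the sketch below computes the developer's net share of a gross call price. The function names and the use of Decimal amounts are assumptions made for this example; they are not part of the platform API.

```python
from decimal import Decimal

DEVELOPER_SHARE = Decimal("0.75")      # developer keeps 75% of every paid call
PLATFORM_COMMISSION = Decimal("0.25")  # igentbase retains 25%

def net_earnings(gross: Decimal) -> Decimal:
    """Return the developer's share of a gross call price."""
    return gross * DEVELOPER_SHARE

def platform_fee(gross: Decimal) -> Decimal:
    """Return the platform's commission on a gross call price."""
    return gross * PLATFORM_COMMISSION
```

For a call priced at 2.00, net_earnings returns 1.50 and platform_fee returns 0.50.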

How the AI Gateway Works

The AI Gateway sits between users and your agent. For every request:

  1. User sends a request with their API key to gateway.igentbase.com/{agent_id}/{version}/mcp
  2. AI Gateway authenticates the user's API key, then checks rate limits and balance
  3. Gateway forwards the request to your endpoint with X-Gateway-Token and X-User-Token headers
  4. Your agent processes the request and returns a response
  5. AI Gateway passes the response to the user untouched
  6. Asynchronously, the token-calc service counts tokens and deducts credits from the user's balance
The AI Gateway never inspects, stores, or logs the content of requests or responses. Only metadata (token counts, latency, function name, cost) is recorded for billing and analytics. User prompts and agent responses flow through without being read.

Using the Developer Console

The Developer Console is your home base for managing everything. Sign in with Google, Microsoft, or GitHub.

Page | What It Does
Dashboard | Overview of all your agents: installs, revenue, status, health at a glance
Agent Details | Deep dive into a single agent: versions, functions, pricing, configuration, governance, analytics
Revenue | Revenue breakdown per agent, per day. Charts, tables, and export
Payout | Payout history, pending balance, payout profiles, tax info
Editors | Manage team members who can edit your agents
Notifications | Install alerts, review notifications, payout confirmations, policy notices
Settings | Profile, organization, email preferences, API keys, webhooks, account deletion
Support | Submit tickets, view ticket history

Creating Your First Agent

  1. Sign up at the Developer Console with Google, Microsoft, or GitHub
  2. Complete your profile — name, organization (optional), description. This becomes your public publisher page on the marketplace
  3. Create a new agent — provide a name, unique slug (e.g., my-weather-agent), short description, icon, and category. The agent is saved as a draft
  4. Add a version — choose protocol (MCP or A2A), set your endpoint URL, and add functions/tools
  5. Set pricing — choose per-token or per-request pricing for each function
  6. Fill governance — declare data collection, storage, retention policies. This builds your trust score
  7. Submit for review — click "Submit Agent" to publish your draft. Your agent goes through compatibility and stability tests
  8. Approval — the platform team reviews your submission. You'll receive a notification when approved or if changes are needed
  9. Live — once approved, your agent appears on the marketplace
Your agent slug is permanent and becomes part of the API URL: gateway.igentbase.com/my-weather-agent/1.0.0/mcp. Choose it carefully.

Draft vs Published

New agents start as drafts. Drafts are only visible to you in the Developer Console (marked with a yellow "Draft" badge). They do not appear in the marketplace. Once you complete all steps and submit, the agent enters the approval queue.

Approval required. All new agents must be approved by the platform team before they appear in the marketplace. Typical review time is 1-2 business days. You'll receive a notification when your agent is approved.

MCP, A2A & Models

igentbase supports two agent protocols (MCP and A2A) and a model listing type for AI models and REST services:

MCP (Model Context Protocol)

The standard protocol for tool-calling agents. Based on JSON-RPC 2.0 over HTTP.

  • Endpoint: /{agent_id}/{version}/mcp
  • Methods: POST (RPC calls), GET (SSE notifications), DELETE (session termination)
  • Session management is your responsibility — the AI Gateway passes through Mcp-Session-Id headers unchanged
  • Spec: MCP Streamable HTTP Transport

Billable MCP methods (you set the price):

Method | Description
tools/call | Execute a tool
resources/read | Read a resource
prompts/get | Get a prompt
sampling/createMessage | Create a sampled message

All other MCP methods (initialize, tools/list, resources/list, prompts/list, etc.) are overhead calls charged a small base fee to the user. You don't set pricing for these.
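If you want to mirror this distinction in your own logging or analytics, a small classifier is enough. The sketch below is an illustration only: the function name and the "billable"/"overhead" labels are made up for this example, and the method set is copied from the table above.

```python
from typing import Any, Dict

# Methods the developer prices, per the table above
BILLABLE_MCP_METHODS = frozenset({
    "tools/call",
    "resources/read",
    "prompts/get",
    "sampling/createMessage",
})

def classify_mcp_request(rpc_request: Dict[str, Any]) -> str:
    """Label a JSON-RPC 2.0 request as 'billable' or 'overhead'."""
    method = rpc_request.get("method", "")
    return "billable" if method in BILLABLE_MCP_METHODS else "overhead"
```

For example, a request with method tools/call is labeled billable, while initialize or tools/list is labeled overhead.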

A2A (Agent-to-Agent Protocol)

Google's protocol for agent-to-agent communication. JSON-RPC 2.0 over HTTP.

  • Endpoint: /{agent_id}/{version}/a2a
  • Method: POST only
  • Task management is your responsibility — the AI Gateway passes through task IDs unchanged
  • Agent Card: automatically served at /{agent_id}/{version}/.well-known/agent.json (public, no auth)
  • Spec: Google A2A Protocol

Billable A2A methods:

Method | Description
tasks/send | Synchronous task execution
tasks/sendSubscribe | Task with SSE streaming
tasks/resubscribe | Reopen SSE for in-progress task
messages/send | Send a message
messages/stream | Stream a message
artifacts/create | Create artifact
artifacts/update | Update artifact
artifacts/get | Get artifact content
notifications/push | Push notification
sessions/create | Create session

All other A2A methods (tasks/get, tasks/list, health/check, etc.) are overhead calls charged the base fee. The AI Gateway rejects any A2A method not in the supported list.

Models

For AI models and services that use plain REST endpoints (not MCP or A2A). Choose "Model" as the listing type in the Developer Console.

How Model Routing Works

When you create a model, you provide a destination URL — this is the base URL of your server. Each function you define in the Developer Console becomes a separate AI Gateway endpoint that users call:

What | URL Pattern
User calls the gateway | POST a3s.igentbase.com/{agent_id}/{version}/{function_name}
AI Gateway forwards to you | POST {your_destination_url}

The AI Gateway proxies the full request body (JSON) to your destination URL unchanged. It adds these headers:

Header | Value | Purpose
X-Gateway-Token | Your agent's gateway token | Verify this to ensure requests come from igentbase (see Verifying the AI Gateway Token)
Authorization | Bearer {user_api_key} | The calling user's API key
X-User-Token | Opaque user identifier | Stable per-user token for session tracking (see Session & User Tracking)
Content-Type | application/json | Request body format

Naming Your Functions

Each function you add in the Developer Console is exposed as a separate endpoint on the AI Gateway. The function_name you choose becomes the last segment of the AI Gateway URL.

For example, if your model has functions chat, embed, and complete:

# Users call these AI Gateway endpoints:
POST a3s.igentbase.com/my-model/1.0.0/chat
POST a3s.igentbase.com/my-model/1.0.0/embed
POST a3s.igentbase.com/my-model/1.0.0/complete

# All three are forwarded to your single destination URL:
POST https://your-server.com/api/v1/model
Tip: Your server receives all function calls at the same destination URL. Use the request body or a header to distinguish which function was called, or configure separate destination URLs per function if your architecture requires it.
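One way to tell the functions apart at a single destination URL is to dispatch on a field in the request body. The sketch below assumes the caller includes a "function" field naming the function; that field name, the handler names, and the response shapes are hypothetical and should be replaced with whatever your clients actually send.

```python
from typing import Any, Callable, Dict, Tuple

Body = Dict[str, Any]

def handle_chat(body: Body) -> Body:
    return {"response": f"chat: {body.get('prompt', '')}"}

def handle_embed(body: Body) -> Body:
    return {"response": f"embed: {body.get('prompt', '')}"}

def handle_complete(body: Body) -> Body:
    return {"response": f"complete: {body.get('prompt', '')}"}

# Hypothetical mapping from the "function" body field to a handler
HANDLERS: Dict[str, Callable[[Body], Body]] = {
    "chat": handle_chat,
    "embed": handle_embed,
    "complete": handle_complete,
}

def dispatch(body: Body) -> Tuple[int, Body]:
    """Return (HTTP status, JSON payload) for one forwarded request."""
    name = body.get("function")
    handler = HANDLERS.get(name)
    if handler is None:
        # Valid JSON, but not a function this server implements
        return 422, {"error": f"Unknown function: {name!r}"}
    return 200, handler(body)
```

Inside a Flask or Express handler, you would call dispatch with the parsed JSON body and return the resulting status and payload.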

Example: Python Flask Model

import os, hmac
from flask import Flask, request, jsonify

app = Flask(__name__)
GATEWAY_TOKEN = os.environ["IGENTBASE_GATEWAY_TOKEN"]

@app.route("/api/v1/model", methods=["POST"])
def handle_request():
    # 1. Verify the request comes from igentbase
    token = request.headers.get("X-Gateway-Token", "")
    if not hmac.compare_digest(token, GATEWAY_TOKEN):
        return jsonify({"error": "Unauthorized"}), 401

    # 2. Process the request
    body = request.get_json(silent=True) or {}  # tolerate a missing or invalid JSON body
    prompt = body.get("prompt", "")

    # 3. Return your model's response as JSON
    return jsonify({
        "response": f"Model output for: {prompt}",
        "usage": {"input_tokens": 10, "output_tokens": 25}
    })

# Verification route (required during setup)
@app.route("/_igentbase_verify")
def verify():
    return os.environ.get("IGENTBASE_VERIFY_TOKEN", "")

Example: Node.js Express Model

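const crypto = require('crypto');
const express = require('express');
const app = express();
app.use(express.json());

const GATEWAY_TOKEN = process.env.IGENTBASE_GATEWAY_TOKEN;

// Constant-time comparison; timingSafeEqual requires equal-length buffers
function tokensMatch(supplied, expected) {
  if (!expected) return false; // fail closed if the token is not configured
  const a = Buffer.from(supplied);
  const b = Buffer.from(expected);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

app.post('/api/v1/model', (req, res) => {
  // 1. Verify gateway token
  const token = req.headers['x-gateway-token'] || '';
  if (!tokensMatch(token, GATEWAY_TOKEN)) {
    return res.status(401).json({ error: 'Unauthorized' });
  }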

  // 2. Process request
  const { prompt } = req.body;

  // 3. Return response
  res.json({
    response: `Model output for: ${prompt}`,
    usage: { input_tokens: 10, output_tokens: 25 }
  });
});

// Verification route (required during setup)
app.get('/_igentbase_verify', (req, res) => {
  res.send(process.env.IGENTBASE_VERIFY_TOKEN || '');
});

app.listen(8080);

Model Response Format

Your model must return valid JSON. The response body is passed through to the caller unchanged. Include token usage if your pricing is per-token:

{
  "response": "Your model's output text or data",
  "usage": {
    "input_tokens": 150,
    "output_tokens": 42
  }
}
Pricing for models works the same as agents. Set per-request or per-token pricing for each function in the Developer Console. The platform handles billing, rate limiting, and usage tracking automatically.

Agent Response Contract

Your agent is an HTTPS endpoint. The AI Gateway forwards requests to you and passes your response back to the user untouched. Here's the contract:

What Your Agent Receives

POST https://your-agent.example.com/mcp
Headers:
  X-Gateway-Token: <your shared secret>   ← validate this!
  X-User-Token:    <opaque user identifier>
  Authorization:   Bearer <user's API key>
  Content-Type:    application/json

Body: (forwarded verbatim from the user's client)

What Your Agent Must Return (Non-Streaming)

{
  "streaming":      false,
  "content": {
    "type":  "text",
    "role":  "assistant",
    "data":  "Paris is the capital of France."
  },
  "input_tokens":   18,
  "output_tokens":  9,
  "cache_tokens":   42,
  "error_code":     null,
  "error_msg":      null
}
Field | Type | Required | Description
streaming | boolean | Yes | Set to false for regular JSON responses
content | object | Yes | The response payload (see Content Types)
input_tokens | integer/null | No | Input tokens as reported by your model
output_tokens | integer/null | No | Output tokens as reported by your model
cache_tokens | integer/null | No | Total cache tokens (read + write combined)
error_code | integer/null | Yes | Your error code on failure, null on success
error_msg | string/null | Yes | Human-readable error message, or null
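To make the contract concrete, here is a minimal sketch of a builder and a checker for non-streaming responses, using only the fields listed above. The function names are invented for this example; it does not depend on any igentbase library.

```python
from typing import Any, Dict, List, Optional

REQUIRED_FIELDS = ("streaming", "content", "error_code", "error_msg")
TOKEN_FIELDS = ("input_tokens", "output_tokens", "cache_tokens")

def build_text_response(
    text: str,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    cache_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a successful non-streaming text response."""
    return {
        "streaming": False,
        "content": {"type": "text", "role": "assistant", "data": text},
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cache_tokens": cache_tokens,
        "error_code": None,
        "error_msg": None,
    }

def contract_problems(response: Dict[str, Any]) -> List[str]:
    """Return a list of contract violations (empty if none were found)."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in response]
    if "streaming" in response and response["streaming"] is not False:
        problems.append("streaming must be false for regular JSON responses")
    for name in TOKEN_FIELDS:
        value = response.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{name} must be an integer or null")
    return problems
```

Running contract_problems on your agent's output in a unit test is a cheap way to catch a missing error_code or a token count serialized as a string.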

HTTP Status Codes

Code | Meaning | When to Use
200 | Completed | Request succeeded, full response returned
206 | Partial | Response truncated (context limit, max-tokens hit)
400 | Failed | Bad request: user sent invalid input
403 | Refused | Content policy or permission violation
422 | Unprocessable | Valid format but you can't handle this specific request
429 | Throttled | Your agent's own rate limit (include Retry-After header)
500 | Error | Internal error in your agent
503 | Unavailable | Your agent is temporarily down

Streaming (SSE)

For long-running or real-time responses, return Content-Type: text/event-stream and send SSE events:

data: {"streaming":true,"delta":"Hello"}\n\n
data: {"streaming":true,"delta":" world"}\n\n
data: {"streaming":true,"delta":"!"}\n\n
data: {"streaming":true,"delta":null,"input_tokens":42,"output_tokens":3,"cache_tokens":0}\n\n
data: [DONE]\n\n
Field | Description
streaming | Always true for SSE events
delta | The new text token(s). Set to null in the final usage event
input_tokens | Total input tokens, sent in the final event (before [DONE])
output_tokens | Total output tokens, sent in the final event
cache_tokens | Total cache tokens, sent in the final event
The final event must be exactly data: [DONE]\n\n. The AI Gateway uses this to know the stream is complete.

The AI Gateway passes every SSE chunk to the user in real time without buffering. It also splits the stream into 64 KB Kafka messages for the token-calc service to process.
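The event sequence above can be produced with nothing but the standard library. The generator below is a framework-free sketch; the function name is an assumption, and you would hand its output to whatever streaming response type your web framework provides.

```python
import json
from typing import Iterable, Iterator

def sse_events(
    chunks: Iterable[str],
    input_tokens: int,
    output_tokens: int,
    cache_tokens: int = 0,
) -> Iterator[str]:
    """Yield SSE lines: one delta event per chunk, a usage event, then [DONE]."""
    for chunk in chunks:
        event = {"streaming": True, "delta": chunk}
        yield "data: " + json.dumps(event, separators=(",", ":")) + "\n\n"

    # Final usage event: delta is null and the totals are reported once
    usage = {
        "streaming": True,
        "delta": None,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cache_tokens": cache_tokens,
    }
    yield "data: " + json.dumps(usage, separators=(",", ":")) + "\n\n"

    # Terminator that signals the end of the stream
    yield "data: [DONE]\n\n"
```

Each yielded string is a complete SSE event, so it can be written to the response as soon as it is produced.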

Content Types

The content field always has this structure:

{
  "type": "<content type>",
  "role": "assistant",
  "data": <varies by type>
}
Type | Data Shape | Use Case
text | string | Plain text or markdown responses
refusal | string | Model declined to answer (use HTTP 403)
tool_call | {"tool":"name","input":{...},"output":{...}} | Agent invoked an external tool
image | {"url":"..."} or {"base64":"...","mime":"image/png"} | Generated or retrieved image
multipart | Array of text/image objects | Mix of text and images

Multipart Example

{
  "type": "multipart",
  "role": "assistant",
  "data": [
    { "type": "text",  "data": "Here is the architecture diagram:" },
    { "type": "image", "data": { "url": "https://your-cdn.com/diagram.png" } },
    { "type": "text",  "data": "The left box represents the gateway." }
  ]
}

Verifying the AI Gateway Token (X-Gateway-Token)

Always validate X-Gateway-Token. If you skip this, anyone can call your agent directly, bypassing authentication, rate limiting, and billing. This is a policy requirement.

When you create an agent, the platform generates a gateway token — a shared secret unique to your agent. It's shown once in the Developer Console. The gateway sends it on every request as X-Gateway-Token.

Your agent must:

  1. Read the X-Gateway-Token header from incoming requests
  2. Compare it against the token you stored during setup
  3. Reject with HTTP 401 if missing or mismatched

Python Example

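import hmac
import os

# Sanic-style handler (framework assumed from the async handler and json() helper)
from sanic import Sanic
from sanic.response import json

app = Sanic("my-agent")
GATEWAY_TOKEN = os.environ["IGENTBASE_GATEWAY_TOKEN"]

@app.route("/mcp", methods=["POST"])
async def handle_mcp(request):
    token = request.headers.get("X-Gateway-Token", "")
    if not hmac.compare_digest(token.encode(), GATEWAY_TOKEN.encode()):
        return json({"error": "Unauthorized"}, status=401)
    # ... process request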

Node.js Example

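const crypto = require('crypto');
const GATEWAY_TOKEN = process.env.IGENTBASE_GATEWAY_TOKEN;

// Constant-time comparison; timingSafeEqual requires equal-length buffers
function tokensMatch(supplied, expected) {
  if (!expected) return false; // fail closed if the token is not configured
  const a = Buffer.from(supplied);
  const b = Buffer.from(expected);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

app.post('/mcp', (req, res) => {
  const token = req.headers['x-gateway-token'] || '';
  if (!tokensMatch(token, GATEWAY_TOKEN)) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  // ... process request
});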
Use constant-time comparison (like hmac.compare_digest in Python or crypto.timingSafeEqual in Node.js) to prevent timing attacks.
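If you want the check in one reusable place, independent of any web framework, a sketch like the following may help. The function name is an assumption; it takes any mapping of header names to values and fails closed when no token has been configured.

```python
import hmac
from typing import Mapping

def gateway_token_ok(headers: Mapping[str, str], expected: str) -> bool:
    """Return True only if X-Gateway-Token matches the configured secret."""
    if not expected:
        # No secret configured: reject rather than accept an empty header
        return False
    supplied = headers.get("X-Gateway-Token", "")
    # Compare as bytes so non-ASCII header values cannot raise TypeError
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

Note that a plain dict is case-sensitive; most frameworks expose a case-insensitive headers object, which is what you would normally pass in.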

Auth During Registration

When you first register your agent, the Developer Console uses "Fetch from Server" to connect to your endpoint and discover what tools, resources, prompts, or skills it offers. At that point, your agent doesn't exist on the platform yet — there's no gateway token.

So during initial registration, your agent must not require any auth to list its capabilities.

Once your agent is created, the Developer Console gives you a gateway token. You can then add the X-Gateway-Token check to your server — the platform will include it on all subsequent requests, including future "Fetch from Server" calls for new versions.

Recommended workflow:
1. Deploy your agent without auth
2. Register it on the Developer Console (platform fetches your capabilities)
3. Copy your gateway token from the Console
4. Add the gateway token check to your server
5. Redeploy — your agent is now protected

Endpoint Verification

Before your agent or model can be published, you must prove you control the endpoint URL. The platform verifies ownership by calling your endpoint and checking for a specific token.

How Verification Works

  1. When you configure your endpoint URL in the Developer Console, a unique verification token is generated (valid for 1 hour)
  2. You add the token to your server (method depends on protocol)
  3. Click "Verify Endpoint" — the platform calls your server and checks for the token
  4. On success, your endpoint is verified and you can proceed
Verification is required when you first create an agent/model and whenever you change the destination URL in a new version. If you keep the same URL, re-verification is skipped.

MCP Agents

Add a tool named _igentbase_verify that returns the verification token:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-agent")

@mcp.tool()
def _igentbase_verify() -> str:
    """Platform verification tool."""
    return "igb_verify_abc123..."  # paste the token from the Console

The platform calls your MCP endpoint, invokes tools/call with _igentbase_verify, and checks the response contains the token.

A2A Agents

Add a skill named _igentbase_verify to your agent card and implement it to return the token:

# In your A2A agent card (/.well-known/agent.json), add:
{
  "skills": [
    {
      "id": "_igentbase_verify",
      "name": "_igentbase_verify"
    }
  ]
}

# In your task handler, return the token when _igentbase_verify is called:
if task.skill_id == "_igentbase_verify":
    return "igb_verify_abc123..."

The platform sends a tasks/send request targeting the _igentbase_verify skill and checks the response.

Models

Add a GET /_igentbase_verify route that returns the verification token as plain text:

# Python Flask
@app.route("/_igentbase_verify")
def verify():
    return "igb_verify_abc123..."  # paste the token from the Console

# Node.js Express
app.get('/_igentbase_verify', (req, res) => {
  res.send('igb_verify_abc123...');
});

The platform sends GET {your_destination_url}/_igentbase_verify and checks the response body contains the token.

After verification: You can remove the _igentbase_verify tool/skill/route from your server. It's only needed during the verification step. The token expires after 1 hour regardless.

Session & User Tracking (X-User-Token)

Every request includes an X-User-Token header — an opaque, stable identifier for the user making the request. You can use it for:

  • Session management — track conversation state per user
  • Personalization — remember user preferences across calls
  • Caching — maintain per-user caches
  • Usage tracking — count calls per user on your side

The user token is the same across all of a user's API keys for your agent. It changes if the user uninstalls and reinstalls your agent.

Do not use the Authorization header for session tracking. Users can rotate API keys. The X-User-Token is the stable identifier. Also, never store the Authorization value — it's the user's API key and is sensitive.

Managing Sessions

# Simple session store keyed by user token
sessions = {}

@app.route("/mcp", methods=["POST"])
async def handle_mcp(request):
    user_token = request.headers.get("X-User-Token", "")

    # Get or create session
    if user_token not in sessions:
        sessions[user_token] = {"history": [], "preferences": {}}

    session = sessions[user_token]
    # ... use session for context

For MCP sessions specifically, the Mcp-Session-Id header is passed through from the client. Use it for MCP session lifecycle (initialize → calls → shutdown) while using X-User-Token for persistent user identity.

File Downloads (Resource IDs)

If your agent generates large outputs (PDFs, images, datasets, exports), don't embed them in the response body. Instead:

  1. Your agent does the work and stores the result in your own storage (S3, GCS, etc.)
  2. Return a resource ID or download URL in the response
  3. The client fetches the file directly from your storage

Example: Generate a Report

// User calls: tools/call with name="generate_report"

// Your agent response:
{
  "streaming": false,
  "content": {
    "type": "tool_call",
    "role": "tool",
    "data": {
      "tool": "generate_report",
      "input": { "query": "quarterly revenue" },
      "output": {
        "resource_id": "rpt_abc123",
        "download_url": "https://your-agent.com/downloads/rpt_abc123",
        "filename": "Q1_Revenue_Report.pdf",
        "size_bytes": 245760,
        "expires_at": "2026-05-01T00:00:00Z"
      }
    }
  },
  "input_tokens": 50,
  "output_tokens": 30,
  "cache_tokens": null,
  "error_code": null,
  "error_msg": null
}
Why not embed files? The gateway truncates response bodies larger than 1 MB before publishing to Kafka. Large embedded payloads also slow down streaming. Use URLs for anything over a few KB.

You can authenticate download URLs using the X-User-Token — generate signed URLs that are scoped to the user token, so only the authorized user can download.

a3swim — Gateway Integration SDK

a3swim is a lightweight Python middleware that handles all gateway integration for your agent — token verification, request IDs, rate limiting, access logging, error formatting, and AGENT_CONTRACT response wrapping. One line of code instead of implementing everything yourself.

It works with any Python framework: FastMCP, Google ADK, FastAPI, Flask, Django, Starlette, or any ASGI-compatible app.

Zero boilerplate. Without a3swim, you need to manually validate gateway tokens, format AGENT_CONTRACT responses, count tokens, handle errors, and set up logging. With a3swim, you add one line and get all of it.

What a3swim Handles

Feature | Without a3swim | With a3swim
Gateway token verification | Manual: read header, compare, return 401 | Automatic
Request ID tracking | Manual: generate UUID, set response header | Automatic
Rate limiting | Build your own or use a library | Built-in per-key token bucket
HMAC signature verification | Manual: compute, compare, reject | One config flag
Error responses | Format AGENT_CONTRACT JSON yourself | Automatic: any exception becomes valid contract JSON
Access logging | Build structured logging from scratch | Automatic structured JSON logs
Body size limits | Manual Content-Length check | Automatic

Installation

# Base install — zero heavy dependencies
pip install a3swim

# With accurate token counting (optional)
pip install a3swim[tiktoken]

Base package only requires pyyaml. No FastAPI, no Starlette, no gRPC — just the middleware.

Quick Start

FastMCP Agent (Most Common)

import os
from mcp.server.fastmcp import FastMCP
from a3swim import A3Swim

mcp = FastMCP("my-agent")

@mcp.tool()
def greet(name: str) -> str:
    return f"Hello, {name}!"

if __name__ == "__main__":
    gateway_token = os.environ.get("GATEWAY_TOKEN", "")
    if gateway_token:
        import uvicorn
        app = A3Swim(mcp.sse_app(), gateway_token=gateway_token)
        uvicorn.run(app, host="0.0.0.0", port=8000)
    else:
        mcp.run(transport="sse")

Set the GATEWAY_TOKEN environment variable to your agent's gateway token from the Developer Console. Without it, the agent runs in bare development mode.

FastAPI Agent

import os

from fastapi import FastAPI
from a3swim import A3Swim

api = FastAPI()

@api.post("/invoke")
async def invoke(data: dict):
    return {"result": "Hello from FastAPI"}

app = A3Swim(api, gateway_token=os.environ["GATEWAY_TOKEN"])

Google ADK Agent

import os

from google.adk.agents import Agent
from a3swim import A3Swim

agent = Agent(name="my-agent", model="gemini-2.0-flash")
adk_app = agent.as_starlette()  # Returns ASGI app

app = A3Swim(adk_app, gateway_token=os.environ["GATEWAY_TOKEN"])
Any ASGI app works. If your framework produces an ASGI application (most modern Python frameworks do), you can wrap it with A3Swim().

Configuration

a3swim supports four configuration methods:

1. Keyword Arguments (Simplest)

app = A3Swim(your_app, gateway_token="secret", agent_id="my-agent")

2. Python Dictionary

app = A3Swim(your_app, config={
    "gateway_token": "secret",
    "agent_id": "my-agent",
    "rate_limit_enabled": True,
    "rate_limit_rpm": 120,
    "rate_limit_burst": 20,
    "tokenizer": "cl100k_base",
    "log_file": "/var/log/my-agent.log",
})

3. YAML File

app = A3Swim(your_app, config="a3swim.yaml")

Example a3swim.yaml:

gateway_token: ${GATEWAY_TOKEN}
agent_id: my-agent
enforce_gateway_token: true

rate_limit_enabled: true
rate_limit_rpm: 120
rate_limit_burst: 20
rate_limit_per_key: true

hmac_enabled: false

tokenizer: cl100k_base
max_body_bytes: 1048576

log_enabled: true
log_file: /var/log/my-agent.log
log_api_key_mode: masked

YAML files support ${ENV_VAR} and ${ENV_VAR:-default} substitution.
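To show what the two substitution forms mean, here is a small standalone sketch that expands them in a string. It illustrates the syntax only and is not a3swim's implementation. It assumes shell-like semantics (the default applies when the variable is unset or empty), and raising an error for an unset variable with no default is a choice made for this example.

```python
import os
import re
from typing import Mapping, Optional

_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")

def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${NAME} and ${NAME:-default} references in text."""
    source = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        value = source.get(name)
        if default is not None:
            # ${NAME:-default}: fall back when unset or empty
            return value if value else default
        if value is None:
            raise KeyError(f"environment variable {name} is not set")
        return value

    return _ENV_PATTERN.sub(replace, text)
```

For example, expand_env("port: ${PORT:-8000}", {}) yields "port: 8000", while the same call with PORT set to 9000 yields "port: 9000".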

4. Environment Variables

app = A3Swim.from_env(your_app)

Maps A3SWIM_GATEWAY_TOKEN → gateway_token, A3SWIM_RATE_LIMIT_RPM → rate_limit_rpm, etc.
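The naming rule is simple enough to sketch: strip the A3SWIM_ prefix and lowercase the remainder. The code below illustrates that rule and is not a3swim's own from_env. The type coercion for integer and boolean options is an assumption, based on the option types listed in the configuration table.

```python
from typing import Any, Dict, Mapping

PREFIX = "A3SWIM_"
INT_KEYS = {"rate_limit_rpm", "rate_limit_burst", "max_body_bytes"}
BOOL_KEYS = {"enforce_gateway_token", "hmac_enabled", "rate_limit_enabled",
             "rate_limit_per_key", "log_enabled"}

def config_from_env(env: Mapping[str, str]) -> Dict[str, Any]:
    """Translate A3SWIM_* variables into a config dictionary."""
    config: Dict[str, Any] = {}
    for name, raw in env.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated environment variables
        key = name[len(PREFIX):].lower()
        if key in INT_KEYS:
            config[key] = int(raw)
        elif key in BOOL_KEYS:
            config[key] = raw.strip().lower() in {"1", "true", "yes", "on"}
        else:
            config[key] = raw
    return config
```

In practice you would pass os.environ as the argument; a plain dict is used in tests so the result is deterministic.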

All Configuration Options

Option | Type | Default | Description
gateway_token | string | "" | Shared secret for X-Gateway-Token validation
enforce_gateway_token | bool | true | Whether to reject requests with invalid tokens
agent_id | string | "" | Your agent ID (used in access logs)
hmac_enabled | bool | false | Enable HMAC signature verification
hmac_secret | string | "" | HMAC shared secret
hmac_algorithm | string | sha256 | sha256 or sha512
rate_limit_enabled | bool | false | Enable rate limiting
rate_limit_rpm | int | 60 | Requests per minute
rate_limit_burst | int | 10 | Maximum burst size
rate_limit_per_key | bool | true | Per-API-key (true) or global (false)
tokenizer | string | cl100k_base | Token counting encoding
max_body_bytes | int | 1048576 | Max request body size (1 MB)
log_enabled | bool | true | Enable access logging
log_file | string | "" | Log file path (empty = Python logger only)
log_api_key_mode | string | masked | masked, hash, or full

Middleware Components

The A3Swim wrapper applies middleware in this order (outer → inner):

  1. ErrorHandler — catches all exceptions and returns AGENT_CONTRACT error JSON
  2. RequestId — ensures X-Request-Id on every request and response
  3. AccessLog — logs method, path, status, latency, API key as structured JSON
  4. BodyLimit — rejects oversized request bodies
  5. GatewayToken — validates X-Gateway-Token (constant-time comparison)
  6. HMAC — verifies HMAC signature and timestamp (optional)
  7. RateLimit — per-key token bucket rate limiting (optional)
  8. Your App
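For readers unfamiliar with token buckets, the sketch below shows how an rpm and burst setting typically interact: the bucket holds at most burst tokens, refills at rpm / 60 tokens per second, and each request spends one token. This is a generic illustration of the technique, not a3swim's internal code; the class and method names are assumptions.

```python
import time
from typing import Dict, Optional

class TokenBucket:
    """Token bucket that allows short bursts while enforcing an average rate."""

    def __init__(self, rpm: int, burst: int, now: Optional[float] = None) -> None:
        self.rate = rpm / 60.0          # tokens added per second
        self.capacity = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)      # start full so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Spend one token if available; return False when the caller should get 429."""
        current = time.monotonic() if now is None else now
        elapsed = max(0.0, current - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Per-key limiting: one bucket per API key, created on first use
buckets: Dict[str, TokenBucket] = {}

def allow_request(api_key: str, rpm: int = 60, burst: int = 10) -> bool:
    bucket = buckets.setdefault(api_key, TokenBucket(rpm, burst))
    return bucket.allow()
```

With rpm=60 and burst=2, two back-to-back requests pass, a third is rejected, and one more is allowed after a second has elapsed.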

You can also use individual middleware for a custom stack:

from a3swim import (
    GatewayTokenMiddleware,
    RequestIdMiddleware,
    RateLimitMiddleware,
    ErrorHandlerMiddleware,
)

# Build a custom stack — apply inner to outer
app = your_app
app = RateLimitMiddleware(app, rpm=60, burst=10)
app = GatewayTokenMiddleware(app, token="secret")
app = RequestIdMiddleware(app)
app = ErrorHandlerMiddleware(app)

Response Helpers

a3swim provides helpers for building AGENT_CONTRACT responses:

Non-Streaming Response

from a3swim import wrap_response

# Simple text response
response = wrap_response(
    "Paris is the capital of France.",
    input_tokens=18,
    output_tokens=9,
    cache_tokens=42,
)
# Returns: {"streaming":false,"content":{"type":"text","role":"assistant","data":"Paris is..."},...}

# Error response
from a3swim import wrap_error

error = wrap_error(422, 1021, "Unsupported file format")
# Returns: {"streaming":false,"content":{"type":"text","role":"assistant","data":""},"error_code":1021,...}

Content Types

# Refusal
wrap_response("I cannot help with that.", content_type="refusal")

# Tool call
wrap_response({
    "tool": "search_web",
    "input": {"query": "latest news"},
    "output": {"results": ["..."]}
}, content_type="tool_call", role="tool")

# Image
wrap_response({
    "url": "https://cdn.example.com/chart.png",
    "alt": "Revenue chart"
}, content_type="image")

Streaming Helpers

For SSE streaming responses, use the streaming helpers:

from a3swim import sse_delta, sse_usage, sse_error, SSE_DONE

# Content event
sse_delta("Hello")
# → data: {"streaming":true,"delta":"Hello"}\n\n

# Usage event (send before [DONE])
sse_usage(input_tokens=42, output_tokens=10, cache_tokens=5)
# → data: {"streaming":true,"delta":null,"input_tokens":42,"output_tokens":10,"cache_tokens":5}\n\n

# Error event
sse_error("INTERNAL_ERROR", "Something went wrong")
# → data: {"streaming":true,"delta":null,"error_code":"INTERNAL_ERROR","error_msg":"..."}\n\n

# Terminator
SSE_DONE
# → data: [DONE]\n\n

Full Streaming Example

from starlette.responses import StreamingResponse
from a3swim import sse_delta, sse_usage, SSE_DONE

async def stream_response():
    # Stream content tokens
    for word in ["Hello", " world", "!"]:
        yield sse_delta(word)

    # Final usage event
    yield sse_usage(input_tokens=10, output_tokens=3, cache_tokens=0)

    # Terminator
    yield SSE_DONE

# Return the stream from your request handler
async def handle_stream(request):
    return StreamingResponse(
        stream_response(),
        media_type="text/event-stream",
    )

Token Counting

a3swim includes a built-in token counter. It uses tiktoken if installed, otherwise falls back to word-based counting.

from a3swim import get_tokenizer

counter = get_tokenizer("cl100k_base")  # or "o200k_base", "fallback"
count = counter.count("Hello world, how are you?")
# → 7 (with tiktoken) or 5 (fallback word count)
Install tiktoken for accurate counts. Run pip install a3swim[tiktoken] to get accurate OpenAI-compatible token counting. Without it, a3swim uses word-based approximation.
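The fallback behaviour described above can be approximated in a few lines. This sketch is a stand-in rather than a3swim's code: the function names are assumptions, and the word-based count simply splits on whitespace.

```python
def word_count(text: str) -> int:
    """Approximate token count: number of whitespace-separated words."""
    return len(text.split())

def count_tokens(text: str, encoding: str = "cl100k_base") -> int:
    """Use tiktoken when it is installed, otherwise fall back to word_count."""
    try:
        import tiktoken  # optional dependency
    except ImportError:
        return word_count(text)
    return len(tiktoken.get_encoding(encoding).encode(text))
```

For "Hello world, how are you?" the word-based fallback returns 5, because punctuation stays attached to the neighbouring word.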

Framework Examples

FastMCP (MCP Agents)

from mcp.server.fastmcp import FastMCP
from a3swim import A3Swim
import uvicorn, os

mcp = FastMCP("code-reviewer")

@mcp.tool()
def review(code: str) -> str:
    return f"Reviewed: {code[:50]}..."

app = A3Swim(mcp.sse_app(), gateway_token=os.environ["GATEWAY_TOKEN"])
uvicorn.run(app, host="0.0.0.0", port=8000)

Google ADK (A2A Agents)

from google.adk.agents import Agent
from a3swim import A3Swim
import uvicorn, os

agent = Agent(name="researcher", model="gemini-2.0-flash")
adk_app = agent.as_starlette()

app = A3Swim(adk_app, gateway_token=os.environ["GATEWAY_TOKEN"])
uvicorn.run(app, host="0.0.0.0", port=8000)

FastAPI

from fastapi import FastAPI
from a3swim import A3Swim
import uvicorn, os

api = FastAPI()

@api.post("/invoke")
async def invoke(data: dict):
    return {"content": {"type": "text", "role": "assistant", "data": "Hello"}}

app = A3Swim(api, gateway_token=os.environ["GATEWAY_TOKEN"])
uvicorn.run(app, host="0.0.0.0", port=8000)

Flask (via ASGI adapter)

from flask import Flask
from asgiref.wsgi import WsgiToAsgi
from a3swim import A3Swim
import uvicorn, os

flask_app = Flask(__name__)

@flask_app.post("/invoke")
def invoke():
    return {"content": {"type": "text", "role": "assistant", "data": "Hello"}}

asgi_app = WsgiToAsgi(flask_app)
app = A3Swim(asgi_app, gateway_token=os.environ["GATEWAY_TOKEN"])
uvicorn.run(app, host="0.0.0.0", port=8000)

Django (ASGI)

# In your asgi.py
from django.core.asgi import get_asgi_application
from a3swim import A3Swim
import os

django_app = get_asgi_application()
application = A3Swim(django_app, gateway_token=os.environ["GATEWAY_TOKEN"])

Raw ASGI (No Framework)

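import json
import os

from a3swim import A3Swim, wrap_response

async def _read_body(receive) -> bytes:
    # Collect the request body, which may arrive in several ASGI messages
    body = b""
    while True:
        message = await receive()
        body += message.get("body", b"")
        if not message.get("more_body", False):
            return body

async def my_agent(scope, receive, send):
    body = await _read_body(receive)
    result = wrap_response("Agent output here", input_tokens=10, output_tokens=5)

    response = json.dumps(result).encode()
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"application/json")]})
    await send({"type": "http.response.body", "body": response})

app = A3Swim(my_agent, gateway_token=os.environ["GATEWAY_TOKEN"])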

Sidecar Mode (Non-Python Agents)

If your agent is written in Go, Rust, Node.js, or another language, you can run a3swim as a standalone HTTP proxy (sidecar) in front of your agent:

# Install with server dependencies
pip install a3swim[server]

# Run as a standalone proxy
agent-gateway run --config agent.yaml

Example agent.yaml for sidecar mode:

agent:
  agent_id: my-go-agent
  protocol: none            # passthrough — no request transformation

runtime:
  interface: http
  destination_url: http://localhost:9000   # your Go/Rust/Node.js agent
  timeout_ms: 30000

security:
  gateway_token: ${GATEWAY_TOKEN}
  hmac:
    enabled: false

server:
  host: 0.0.0.0
  port: 8000               # a3swim listens here, gateway connects here

The request flow:

igentbase Gateway → a3swim (port 8000) → Your Agent (port 9000)
                    ↑
          handles auth, logging,
          rate limiting, error formatting

Python agents should use library mode (wrapping with A3Swim()) instead of sidecar mode: it is simpler, faster (no extra network hop), and uses less memory (no extra process).

How Tokens Are Counted

igentbase uses a trust-but-verify model for token counting:

  1. Your agent reports tokens — you include input_tokens, output_tokens, and cache_tokens in your response
  2. The platform independently counts — the token-calc service tokenizes the request and response bodies using the encoding you specified (e.g., cl100k_base)
  3. Both counts are compared — if your reported count diverges from the platform's count by more than 20%, the call is flagged in the suspicious_calls audit table

You don't get blocked for a mismatch. But a pattern of significant divergence signals a buggy or misbehaving agent. Marketplace policy determines consequences for repeat offenders.
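The comparison step can be sketched as follows. This is a minimal illustration, not the platform's implementation: it assumes divergence is measured relative to the platform's count, and the function names are hypothetical.

```python
def relative_divergence(reported: int, platform: int) -> float:
    """Divergence of the agent-reported count from the platform's count.

    Assumption: the difference is measured relative to the platform's count.
    """
    if platform == 0:
        return 0.0 if reported == 0 else float("inf")
    return abs(reported - platform) / platform


def is_flagged(reported: int, platform: int, threshold: float = 0.20) -> bool:
    """True when the divergence is more than the threshold (20% in the text above)."""
    return relative_divergence(reported, platform) > threshold
```

Under these assumptions, a report of 110 tokens against a platform count of 100 passes, while a report of 130 would be flagged.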

Supported Token Encodings

Set the encoding when you create a version. It must match the tokenizer your model uses:

Encoding | Models
cl100k_base | GPT-3.5, GPT-4, Claude (default)
o200k_base | Newer OpenAI models
spm_unigram | Various open-source models
llama | LLaMA family
mistral | Mistral family

If you omit token counts (send null), the platform's own count is used for billing. You won't be flagged, but you lose the ability to dispute the count.

Cache Tokens

If your model supports prompt caching (like Anthropic's cache or OpenAI's cached tokens), report the total cache tokens in cache_tokens. This is a single number — sum cache read and cache write tokens together.

Cache tokens are priced separately (typically cheaper) via the per_1m_cache_token_price you set. The token-calc service uses this to compute the cache portion of each call's cost.

How the Platform Extracts Cache Tokens

For per-token pricing, the token-calc service also reads cache counts from the raw response body:

  • Anthropic format: usage.cache_read_input_tokens
  • OpenAI format: usage.prompt_tokens_details.cached_tokens
  • Fallback: 0 if neither is found

Report cache_tokens in your top-level response for non-streaming, or in the final usage event for streaming.
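The lookup order above can be sketched like this. The function name is hypothetical, and checking the Anthropic field before the OpenAI field is an assumption; the text only lists the two formats and the fallback.

```python
def extract_cache_tokens(body: dict) -> int:
    """Read cache tokens from a parsed response body, falling back to 0."""
    usage = body.get("usage") or {}
    # Anthropic format: usage.cache_read_input_tokens
    value = usage.get("cache_read_input_tokens")
    if value is None:
        # OpenAI format: usage.prompt_tokens_details.cached_tokens
        details = usage.get("prompt_tokens_details") or {}
        value = details.get("cached_tokens")
    # Fallback: 0 if neither field is present
    return int(value) if value is not None else 0
```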

Reporting Token Counts

Non-Streaming

Include input_tokens, output_tokens, and cache_tokens as top-level fields in your JSON response.

Streaming (SSE)

Send token counts in your final usage event — the one with delta: null, immediately before [DONE]:

data: {"streaming":true,"delta":"...last token..."}\n\n
data: {"streaming":true,"delta":null,"input_tokens":142,"output_tokens":89,"cache_tokens":50}\n\n
data: [DONE]\n\n

The token-calc service takes the last non-null value it sees for each field. You may send running counts in earlier events, but only the final values are used.
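To see what "last non-null value" means in practice, here is a small sketch that reads an SSE stream in the format shown above. It is an illustration of the rule, not the token-calc service's code, and `final_usage` is a hypothetical name.

```python
import json

USAGE_FIELDS = ("input_tokens", "output_tokens", "cache_tokens")


def final_usage(sse_text: str) -> dict:
    """Return the last non-null value seen for each usage field."""
    usage = {field: None for field in USAGE_FIELDS}
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip the blank lines that separate events
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        for field in USAGE_FIELDS:
            if event.get(field) is not None:
                usage[field] = event[field]
    return usage
```

A running count sent in an earlier event is overwritten by the value in the final usage event, so only the final numbers survive.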

Pricing Models

You set pricing per function. Two models are available:

Per-Token Pricing

Best for LLM-based agents where cost scales with input/output length.

Field | Description | Example
per_1m_input_token_price | Price per 1M input tokens | $2.00
per_1m_output_token_price | Price per 1M output tokens | $4.00
per_1m_cache_token_price | Price per 1M cache tokens | $1.00

The full request and response bodies are sent to the token-calc service, which counts tokens, computes cost, and deducts from the user's balance.
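As a worked example of the arithmetic, the sketch below prices a single call using the example prices from the table. It assumes the three token classes are billed independently (cache tokens are not also counted as input tokens); the function name is hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def per_token_cost(input_tokens, output_tokens, cache_tokens,
                   input_price, output_price, cache_price) -> Decimal:
    """Cost of one call in dollars, given prices per 1M tokens."""
    total = (Decimal(input_tokens) * Decimal(input_price)
             + Decimal(output_tokens) * Decimal(output_price)
             + Decimal(cache_tokens) * Decimal(cache_price))
    return total / MILLION


# 142 input, 89 output, 50 cache tokens at $2.00 / $4.00 / $1.00 per 1M
cost = per_token_cost(142, 89, 50, "2.00", "4.00", "1.00")
# (284 + 356 + 50) / 1,000,000 = $0.00069
```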

Per-Request Pricing

Best for deterministic operations where cost is fixed regardless of input size.

Field | Description | Example
per_request_price | Flat fee per call | $0.005

No bodies are sent to Kafka — just metadata. The flat fee is deducted immediately.

Base Fee (Overhead Calls)

Methods you don't explicitly price (like initialize, tools/list, health/check) are charged a small base fee to the user. This is set by the platform, not by you. Currently the base fee is negligible — it exists to prevent abuse of overhead methods.

How Failures Are Handled

Users are not charged for failed requests. If your agent returns an error or is unreachable, no cost is deducted from the user's balance — and no revenue is credited to you for that call.

For streaming responses, users are charged only for the tokens they actually received. If a stream is interrupted, the user pays for the partial output delivered. This means your revenue reflects tokens successfully served, not tokens attempted.

Failed calls are tracked in your analytics dashboard so you can monitor error rates and identify issues.

Platform Commission

igentbase retains a 25% platform commission on all developer revenue. This covers:

  • Gateway infrastructure — routing, load balancing, TLS termination
  • Token metering and billing — real-time usage tracking, credit deduction
  • Marketplace distribution — listing, discovery, search, trust scoring
  • User acquisition — bringing paying users to your agents
  • Analytics — dashboards, ClickHouse storage, anomaly detection
  • Support infrastructure — webhooks, notifications, review system

How It Works

When a user pays $1.00 for a call to your agent:

User pays | $1.00
Platform commission (25%) | $0.25
Your earnings | $0.75

Your revenue dashboard shows net earnings — the amount you will receive. This is the number that matters for payouts. You never see a deduction line because the dashboard only shows what's yours.
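The split can be expressed in a few lines. This is only a sketch of the arithmetic; how the platform rounds fractional amounts is not stated here, so no rounding is applied, and `split_payment` is a hypothetical name.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.25")  # platform commission stated above


def split_payment(gross: str) -> tuple[Decimal, Decimal]:
    """Return (platform commission, developer earnings) for a gross amount."""
    gross_amount = Decimal(gross)
    commission = gross_amount * COMMISSION_RATE
    return commission, gross_amount - commission
```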

Why 25%? For comparison: Apple App Store charges 30%, Google Play charges 15-30%. Our 25% covers the full stack (gateway, billing, marketplace, and user acquisition) so you can focus entirely on building your agent.

Freemium / Free Tier

You can offer a free tier for any function. When enabled:

  • Each user gets a set number of free calls per function (e.g., 5 free calls to review_pr)
  • The gateway tracks free usage atomically in Redis (free_used:{user_token}:{function_name})
  • Once the free limit is reached, subsequent calls are billed normally
  • Free calls are flagged with is_free: true in analytics — the user is not charged

Configure freemium in the function settings:

Field | Description
is_freemium | Enable free tier for this function
freemium_request_limit | Number of free calls per user (e.g., 5)
freemium_reset_period | lifetime (default): free calls never reset; monthly: free calls reset at the start of each calendar month

Monthly free tiers are great for letting users try your agent repeatedly. The gateway tracks monthly usage with calendar-month granularity and auto-cleans expired counters.
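The free-tier bookkeeping can be pictured with an in-memory stand-in for the gateway's Redis counters. This is a sketch under stated assumptions: the class is hypothetical, the monthly key suffix is invented for illustration, and a real implementation would use an atomic Redis operation rather than a Python dict.

```python
from datetime import date


class FreeTierCounter:
    """In-memory stand-in for per-user, per-function free-call counters."""

    def __init__(self, limit: int, reset_period: str = "lifetime"):
        self.limit = limit
        self.reset_period = reset_period
        self._used: dict[str, int] = {}

    def _key(self, user_token: str, function_name: str, today: date) -> str:
        key = f"free_used:{user_token}:{function_name}"
        if self.reset_period == "monthly":
            # Hypothetical suffix: one counter per calendar month
            key += f":{today:%Y-%m}"
        return key

    def try_consume(self, user_token: str, function_name: str, today: date) -> bool:
        """Return True if this call is free, False if it should be billed."""
        key = self._key(user_token, function_name, today)
        used = self._used.get(key, 0)
        if used >= self.limit:
            return False
        self._used[key] = used + 1
        return True
```

With `reset_period="monthly"`, a user who exhausts the limit in April gets free calls again in May because the counter key changes with the month.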

Versions & Publishing

Agents support multiple versions using semver format (e.g., 1.0.0, 1.1.0, 2.0.0). Each version has its own:

  • Protocol type (MCP or A2A)
  • Endpoint URL
  • Functions, pricing, and configuration
  • Token encoding

Versions go through testing before being published:

Test | What It Checks
Compatibility test | Your endpoint responds correctly to protocol methods
Stability test | Your endpoint handles load and returns consistent results

Compatibility Test Requirements

Your agent must pass all applicable compatibility tests to be listed in the store. Tests are run automatically when you submit a new version.

All Agents (Required)

Test | Requirement
GET /health | Must return HTTP 200. This endpoint is used by clients to check agent availability and wake serverless agents from cold start. Every agent must implement it.
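A health endpoint can be very small. The sketch below is a framework-free ASGI handler that answers GET /health with 200; the JSON body and the 404 for other paths are assumptions, since the requirement only specifies the status code. The `call` helper is a hypothetical local driver for trying the handler without a server.

```python
import asyncio


async def health_app(scope, receive, send):
    """Answer GET /health with 200; anything else gets 404 in this sketch."""
    if scope["type"] != "http":
        return
    ok = scope["method"] == "GET" and scope["path"] == "/health"
    status, body = (200, b'{"status": "ok"}') if ok else (404, b'{"status": "not found"}')
    await send({"type": "http.response.start", "status": status,
                "headers": [(b"content-type", b"application/json")]})
    await send({"type": "http.response.body", "body": body})


def call(app, method: str, path: str) -> int:
    """Drive the ASGI app once without a server and return the status code."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    asyncio.run(app({"type": "http", "method": method, "path": path}, receive, send))
    return sent[0]["status"]
```

In a real agent the same route would sit next to your protocol endpoint, in whichever framework you already use.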

MCP Agents

All MCP agents must support the following:

Test | Requirement
initialize | Must return a valid response with protocolVersion, capabilities, and serverInfo (including name)
tools/list | Must return a tools array. Each tool must have name and inputSchema
resources/list | Must return a resources array. Each resource must have uri and name
Unknown tool call | Calling a non-existent tool must return a JSON-RPC error (not a success response)
Sequential calls | Must handle initialize → tools/list → tool calls without state corruption
Multiple tool calls | Calling the same tool multiple times in sequence must not crash or corrupt state

All MCP agents must support resources/list. Even if your agent has no resources, it must respond with {"resources": []} rather than returning a "Method not found" error.
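The requirements above can be illustrated with a minimal JSON-RPC dispatcher. This is a sketch, not a complete MCP server: the `echo` tool and the server name are hypothetical, and the error codes are the standard JSON-RPC ones (-32601 for an unknown method, -32602 for invalid parameters), chosen here as an assumption about what the tests accept.

```python
TOOLS = [{
    "name": "echo",  # hypothetical example tool
    "inputSchema": {"type": "object", "properties": {"text": {"type": "string"}}},
}]


def _error(request: dict, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC request; the handler keeps no state between calls."""
    method = request.get("method")
    params = request.get("params") or {}
    if method == "initialize":
        result = {"protocolVersion": "2024-11-05",
                  "capabilities": {"tools": {}},
                  "serverInfo": {"name": "example-agent", "version": "1.0.0"}}
    elif method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "resources/list":
        result = {"resources": []}  # an empty list, not "Method not found"
    elif method == "tools/call":
        if params.get("name") != "echo":
            return _error(request, -32602, f"Unknown tool: {params.get('name')}")
        text = (params.get("arguments") or {}).get("text", "")
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return _error(request, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
```

Because `handle` holds no mutable state, repeated and sequential calls cannot corrupt one another.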

A2A Agents

All A2A agents must support the following:

Test | Requirement
Agent card | /.well-known/agent.json must be reachable and include name, url, version, and capabilities. Each skill must have id and name
Send task | Sending a task must return a response with result or error
Get task | Retrieving a task by ID must return result or error
Concurrent tasks | Must handle 5 concurrent task submissions without errors
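Before submitting a version, you could check your agent card locally against the fields listed above. The helper below is a hypothetical pre-flight check that assumes skills live under a `skills` key; it is not the platform's test.

```python
REQUIRED_CARD_FIELDS = ("name", "url", "version", "capabilities")
REQUIRED_SKILL_FIELDS = ("id", "name")


def agent_card_problems(card: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the card looks fine."""
    problems = [f"missing field: {field}"
                for field in REQUIRED_CARD_FIELDS if field not in card]
    for index, skill in enumerate(card.get("skills", [])):
        for field in REQUIRED_SKILL_FIELDS:
            if field not in skill:
                problems.append(f"skill {index} missing field: {field}")
    return problems
```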

Schema Validation

If your functions define input_schema or output_schema, they must be valid JSON Schema. Responses must conform to declared output schemas.

Usage Examples

If you provide usage examples for your functions, they are executed as part of testing. Ensure example inputs produce valid responses.

Agents that fail compatibility tests will not be listed in the store. Fix the reported issues and resubmit your version. You can view test results in the Developer Console under the version's test status.

Version Lifecycle

  • Hide — removes from marketplace but existing users can still use it
  • Disable — blocks all traffic to this version

Agent Status Flow

Status | Meaning
Draft | Agent created but not yet submitted. Only visible to you in the Developer Console
Pending Approval | Submitted and awaiting platform review
Active | Approved and live on the marketplace
Disabled | You or a support admin disabled the agent. You can re-enable self-disabled agents from the Danger Zone. Admin-disabled agents require contacting support

Functions & Tools

Each version contains one or more functions. A function maps to a capability of your agent:

Property | Description
function_type | tool, resource, prompt, or skill
function_name | Unique identifier (e.g., review_pr, search_web)
http_method | POST (default), GET, PUT, etc.
stream | Whether this function supports SSE streaming
pricing_type | per_request or per_token
is_cached | Whether responses can be cached
input_mode | text, binary, or multipart
output_mode | text, binary, or multipart
payload_limit_kb | Max input payload size
rate_limit_rpm | Per-function rate limit

Rate Limits

The gateway enforces rate limits per user per agent:

Window | Default Limit | Description
Per minute (rpm) | 60 | Requests per minute
Per hour (rph) | 1,000 | Requests per hour
Per day (rpd) | 10,000 | Requests per day

Rate limit headers are included on every response:

X-RateLimit-Limit-Requests-Minute:     60
X-RateLimit-Remaining-Requests-Minute: 58
X-RateLimit-Reset-Requests-Minute:     32s
Retry-After: 32  (only on 429)

Rate limits fail open — if Redis is unavailable, the request is allowed through. Paying users are never blocked by infrastructure issues.
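On the client side, the headers above are enough to decide how long to wait after a 429. The helper below is a sketch with a hypothetical name; it assumes Retry-After carries whole seconds and that the reset header uses the "32s" form shown above.

```python
def retry_delay_seconds(status: int, headers: dict) -> float:
    """Seconds to wait before retrying; 0 when the request was not rate limited."""
    if status != 429:
        return 0.0
    lowered = {name.lower(): value for name, value in headers.items()}
    if "retry-after" in lowered:
        return float(lowered["retry-after"])
    # Fall back to the per-minute reset header, e.g. "32s"
    reset = lowered.get("x-ratelimit-reset-requests-minute", "0s")
    return float(reset.rstrip("s"))
```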

Overhead Rate Limits

Non-billable methods have additional hourly caps to prevent abuse:

  • MCP overhead: 500 calls/user/hour (e.g., initialize, tools/list)
  • A2A overhead: 200 calls/user/hour (e.g., tasks/get, health/check)

Per-Key Daily Limits

Users can set daily limits on their API keys to control spending:

Limit | Description
request_limit | Max requests per day (0 = unlimited)
input_limit | Max input tokens per day
output_limit | Max output tokens per day
cache_limit | Max cache tokens per day

The gateway checks these at request time. Token counters are maintained by the token-calc service in Redis and reset daily at midnight UTC.

MCP Configuration

MCP agents have additional configuration per version:

Field | Default | Description
destination_url | (no default) | Your MCP endpoint (required)
mcp_spec_version | 2024-11-05 | MCP spec version
session_mode | stateless | stateless or stateful
streaming_mode | sse | sse or chunked
http_protocol | HTTP/1.1 | HTTP/1.1 or HTTP/2
compression | none | none or gzip
cap_tools | true | Advertise tool capabilities
cap_resources | false | Advertise resource capabilities
cap_prompts | false | Advertise prompt capabilities
method_pricing | {} | Per-method pricing override (JSON)

A2A Configuration

A2A agents have additional configuration per version:

Field | Default | Description
destination_url | (no default) | Your A2A endpoint (required)
a2a_spec_version | 0.2.5 | A2A spec version (0.2.0, 0.2.1, 0.2.5)
communication_pattern | (no default) | request-response, pub-sub, or async-task
streaming | false | Whether your agent supports streaming
push_notifications | false | Whether your agent sends push notifications
routing_rules | {} | Priority, tags, allowed origins (JSON)
method_pricing | {} | Per-method pricing override (JSON)

Your Agent Card is served automatically at /{agent_id}/{version}/.well-known/agent.json. This is a public endpoint (no auth required) used for A2A discovery.

MCP Dependencies

If your A2A agent needs to call MCP agents to complete tasks (e.g., a research agent that calls a web search MCP agent), declare them as dependencies:

  1. In the "Add New Version" form, select A2A protocol
  2. The MCP Dependencies section appears — search for MCP agents on the marketplace
  3. Select the agent and version you depend on
  4. Set the estimated calls per task (helps users understand cost)

When a user installs your A2A agent, the platform automatically installs the required MCP agents and creates managed API keys for them. The gateway issues scoped task tokens so your A2A agent can call MCP dependencies securely without seeing the user's real API keys.

Per-Skill Task Timeout

Each A2A skill has a default task timeout (60 seconds to 24 hours, default 5 minutes). This controls how long task tokens remain valid for that skill.

  • Set the timeout in the skill form under the Limits section
  • Quick-lookup skills should use shorter timeouts (1-5 minutes)
  • Long-running research or analysis skills should use longer timeouts (30 minutes to hours)
  • Users can override the timeout from their Settings tab, but your default is the starting point
  • When you update skill timeouts, the change propagates to users who haven't customized their timeout

Webhooks

Get notified about events in real time. Configure webhooks in Settings. Available events:

Event | Description
agent.installed | A user installed your agent
agent.uninstalled | A user uninstalled your agent
review.created | A user left a review
review.updated | A review was updated
version.test_complete | Compatibility/stability test finished
payout.processed | A payout was sent
error.spike | Unusual error rate detected
data_governance.set | Governance settings were updated
webhook.test | Test event (for verifying your endpoint)

Developer API

The Developer API lets you manage agents and versions programmatically from CI/CD pipelines (GitHub Actions, Vercel, etc.). Create and verify your agent once in the Dev Console, then use API tokens to push new versions automatically.

Generating an API Token

  1. Go to Settings in the Dev Console
  2. Under API Tokens, click Generate Token
  3. Name your token (e.g. "GitHub Actions deploy") and choose an expiration
  4. Copy the token immediately — it won't be shown again

Treat API tokens like passwords. Never commit them to source control. Use environment variables or CI/CD secrets.

Authentication

Include your token in the Authorization header:

Authorization: Bearer igb_dev_your_token_here

All endpoints return JSON with this envelope:

{
  "status": "success",
  "json": { ... }
}

Errors return:

{
  "status": "error",
  "message": "Description of what went wrong"
}
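A client can handle both envelope shapes in one place. The sketch below uses a hypothetical `ApiError` exception and `unwrap` helper; only the `status`, `json`, and `message` keys come from the envelope described above.

```python
class ApiError(Exception):
    """Raised when the API returns an error envelope."""


def unwrap(envelope: dict) -> dict:
    """Return the payload of a success envelope, or raise ApiError."""
    if envelope.get("status") == "success":
        return envelope.get("json", {})
    raise ApiError(envelope.get("message", "Unknown error"))
```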

Endpoints

List Agents

GET /api/v1/agents

Returns all your agents with their versions.

curl -H "Authorization: Bearer igb_dev_..." \
  https://developer.igentbase.com/api/v1/agents

Get Agent Details

GET /api/v1/agents/{agent_id}

Returns agent metadata, categories, capabilities, and all versions.

Check Agent ID Availability

GET /api/v1/agents/{agent_id}/check-availability

Check if an agent ID is available before creating it in the Dev Console. Returns:

{ "available": true }
// or
{ "available": false, "reason": "This agent_id is already taken." }

Add Version

POST /api/v1/agents/{agent_id}/versions/add

Create a new version for your agent. The agent must be verified first.

curl -X POST \
  -H "Authorization: Bearer igb_dev_..." \
  -H "Content-Type: application/json" \
  -d '{
    "version": "1.2.0",
    "changelog": "Added weather alerts support",
    "functions": [
      {
        "function_name": "get_weather",
        "destination": "https://your-server.com/weather",
        "http_method": "POST",
        "pricing_type": "per_token",
        "per_1m_input_token_price": 0.50,
        "per_1m_output_token_price": 1.00
      }
    ],
    "configuration": {}
  }' \
  https://developer.igentbase.com/api/v1/agents/weather.mcp/versions/add

Function fields:

Field | Type | Description
function_name | string | Name of the tool/function (required)
destination | string | URL the gateway proxies to
http_method | string | POST (default) or GET
stream | boolean | Whether this function supports SSE streaming
pricing_type | string | per_token, per_request, or free
per_request_price | number | Price per request (if pricing_type is per_request)
per_1m_input_token_price | number | Price per 1M input tokens
per_1m_output_token_price | number | Price per 1M output tokens
per_1m_cache_token_price | number | Price per 1M cache tokens
usage_instruction | string | Usage example or description
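The same Add Version call can be made from Python with only the standard library. The sketch below builds the request without sending it; `build_add_version_request` is a hypothetical helper, and the URL and payload fields mirror the curl example above.

```python
import json
import urllib.request

API_BASE = "https://developer.igentbase.com/api/v1"


def build_add_version_request(agent_id: str, token: str, version: str,
                              changelog: str, functions: list) -> urllib.request.Request:
    """Build (but do not send) the POST request for a new version."""
    payload = {"version": version, "changelog": changelog,
               "functions": functions, "configuration": {}}
    return urllib.request.Request(
        url=f"{API_BASE}/agents/{agent_id}/versions/add",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` and parse the JSON envelope from the response body.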

Get Version Details

GET /api/v1/agents/{agent_id}/versions/{version}

Returns version metadata and all functions defined in that version.

Update Agent

POST /api/v1/agents/{agent_id}/update

Update agent metadata. Only include the fields you want to change. The agent_id is immutable.

curl -X POST \
  -H "Authorization: Bearer igb_dev_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Weather Pro",
    "short_description": "Advanced weather forecasting agent",
    "capabilities": ["tools", "resources"]
  }' \
  https://developer.igentbase.com/api/v1/agents/weather.mcp/update

Updatable fields:

Field | Type | Description
name | string | Agent display name
short_description | string | Brief description
agent_type | string | Agent type
access_type | string | public or restricted
platforms | array | Supported platforms (replaces existing)
category | array | Categories (replaces existing)
capabilities | array | Protocol capabilities (replaces existing)

Update Version

POST /api/v1/agents/{agent_id}/versions/{version}/update

Update version metadata or replace all functions. Only include the fields you want to change.

Field | Type | Description
changelog | string | Version changelog
token_encoding | string | Token encoding: cl100k_base, o200k_base, spm_unigram, llama, mistral
download_config | object | Download/install configuration
functions | array | Full function list (replaces all existing functions)

CI/CD Example: GitHub Actions

name: Deploy Agent Version
on:
  push:
    tags: ['v*']

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Push version to igentbase
        env:
          IGENTBASE_TOKEN: ${{ secrets.IGENTBASE_API_TOKEN }}
        run: |
          VERSION="${GITHUB_REF_NAME#v}"
          curl -X POST \
            -H "Authorization: Bearer $IGENTBASE_TOKEN" \
            -H "Content-Type: application/json" \
            -d "{
              \"version\": \"$VERSION\",
              \"changelog\": \"Release $VERSION\",
              \"functions\": [...]
            }" \
            https://developer.igentbase.com/api/v1/agents/my-agent/versions/add

Rate Limits

API token requests share the same rate limits as the Dev Console. If you receive a 429 response, wait and retry.
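"Wait and retry" is easiest to get right with a small wrapper. The sketch below retries on 429 with exponential backoff; the backoff schedule and the `send` callable (which returns a `(status, body)` pair) are assumptions for illustration, not part of the API.

```python
import time


def call_with_retry(send, max_attempts: int = 5, base_delay: float = 1.0,
                    sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            # Wait 1s, 2s, 4s, ... between attempts
            sleep(base_delay * 2 ** attempt)
    return status, body
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.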

Token Management

  • You can create up to 10 active tokens per account
  • Tokens can be set to expire after 30 days, 90 days, or 1 year — or never
  • Revoke compromised tokens immediately from Settings
  • Each token's last-used timestamp is tracked for auditing

Data Governance

Every agent must declare its data governance practices. This information is shown publicly on your agent's marketplace listing and contributes to your trust score.

You declare:

  • Data collection — what data your agent collects from users
  • Data storage — where and how data is stored
  • Data retention — how long data is kept
  • Training policy — whether user data is used for model training
  • Compliance — GDPR, SOC 2, HIPAA, etc.
  • Security measures — encryption, access controls, audit logging

The trust score is computed automatically using a Bayesian model. Higher transparency and stronger governance practices result in a higher score. Users filter and sort agents by trust score.

Audit Logging

Every action you take in the Developer Console is recorded in a permanent audit log. This includes:

  • Agent creation, updates, and deletion
  • Version publishing, hiding, and disabling
  • Function/tool changes and pricing updates
  • Protocol configuration changes
  • Access control and editor changes
  • Webhook management
  • Profile and payout changes

Audit logs are stored permanently in ClickHouse and are available to the platform team for compliance and dispute resolution. You can view your agent's API call audit logs (request-level) in the Agent Details page under the Audit tab.

Audit Log Columns

Column | Description
Date/Time | When the API call was made
Request ID | Unique identifier for tracing
Version | Which version was called
Function | Developer-defined function name (e.g., search_web)
Input / Output / Cache | Token counts
Latency | Response time
Status | HTTP status code

Export audit data as CSV from the Audit tab for external analysis or compliance reporting.

Making Your Agent Scalable

As your agent gains users, you need to handle increasing load. Key considerations:

Endpoint Requirements

  • HTTPS required — all endpoints must use TLS. The gateway defaults bare hostnames to https://
  • Timeouts: The gateway allows up to 120 seconds for your response (proxy_read_timeout). If your agent takes longer, the user gets a 504
  • Connection pooling: The gateway maintains up to 256 persistent connections to your backend. Accept keepalive connections for best performance
  • Max request size: 4 MB (client_max_body_size). Larger requests are rejected with 413

Scaling Strategies

  1. Stateless design — don't store session state in memory. Use Redis or a database. Any instance should be able to handle any request
  2. Horizontal scaling — put a load balancer in front of multiple instances. The gateway connects to your single endpoint URL; what's behind it is up to you
  3. Connection efficiency — accept and reuse keepalive connections. Avoid closing connections after every request
  4. Response size — keep responses under 1 MB. Larger responses are truncated in Kafka (token counts become approximate). Use resource IDs for large outputs
  5. Caching — use X-User-Token as a cache key for per-user results. Mark functions as is_cached: true if applicable
  6. Async processing — for long-running tasks, return immediately with a task ID and use A2A's tasks/sendSubscribe for progress updates via SSE

Stream Limits

The gateway enforces hard caps on individual response streams to protect against runaway agents:

Cap | Default | Description
MAX_STREAM_EVENTS | 100,000 | Maximum SSE events per response
MAX_STREAM_BYTES | 1 GB | Maximum total response body bytes

When either cap is hit:

  1. The response is truncated — the user sees an early EOF
  2. Further bytes from your agent are silently dropped
  3. The user is billed only for what they received
  4. The request is flagged for review. Repeated cap hits are a policy violation

These limits are generous: a typical 4096-token streaming response uses ~250-1000 SSE events. If you legitimately need higher caps, contact support for a per-agent override.
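The truncation behavior can be pictured as a generator that stops forwarding once either cap would be exceeded. This is a sketch, not the gateway's code; in particular, dropping the whole event that crosses the byte cap (rather than delivering part of it) is an assumption.

```python
from typing import Iterable, Iterator


def cap_stream(events: Iterable[bytes], max_events: int, max_bytes: int) -> Iterator[bytes]:
    """Forward events until the event-count or byte cap would be exceeded."""
    sent_events = 0
    sent_bytes = 0
    for event in events:
        if sent_events + 1 > max_events or sent_bytes + len(event) > max_bytes:
            return  # truncate: everything after this point is dropped
        sent_events += 1
        sent_bytes += len(event)
        yield event
```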

Health Monitoring

The platform monitors your agent's health automatically every hour. This data is used internally by the support team to identify problematic agents.

What's monitored:

  • Error rate — percentage of 5xx and 404 responses over a 24-hour window
  • Latency — p50, p95, p99 response times
  • Availability — whether your endpoint is reachable (gateway status: failed, timeout, unavailable)

Agents are never automatically suspended. Only support staff can disable an agent. If your agent has sustained issues, the support team may reach out before taking action.
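If you want to track the same metrics on your side, they are straightforward to compute from request logs. The sketch below uses the nearest-rank method for percentiles, which is an assumption; the platform's exact method is not stated. Both functions expect a non-empty list, and `p` is an integer percentile.

```python
import math


def percentile(samples: list, p: int):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate(status_codes: list) -> float:
    """Percentage of responses that are 5xx or 404."""
    errors = sum(1 for code in status_codes if code == 404 or 500 <= code <= 599)
    return errors / len(status_codes) * 100
```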

Disabling Your Agent

Action | Who | How to Re-enable
Disabled by you | You (from Danger Zone) | Re-enable yourself from Agent Details
Disabled by admin | Support staff | Contact support to discuss re-enabling

Serverless Cold Start

If your agent is hosted on a serverless platform (Vercel, Railway, Fly, Render, etc.), it may take a few seconds to start on the first request after a period of inactivity. This is called a cold start.

To help users avoid cold start latency:

  1. Go to your agent's Basic Info settings in the Developer Console
  2. Check "Serverless Cold Start"
  3. Save — users will now see a warmup hint on your agent's detail page

When enabled, users are instructed to call the health endpoint before their first request:

GET https://a3s.igentbase.com/{agent_id}/{version}/health
Authorization: Bearer USER_API_KEY

This wakes your agent so the actual MCP/A2A call doesn't hit the cold start delay. All agents are required to have a /health endpoint that returns 200 — this is checked during compatibility testing.

Planned Downtime

If you need to take your agent offline for maintenance:

  1. Notify igentbase support at least 24 hours in advance
  2. We'll temporarily suppress health alerts
  3. Where possible, we'll notify affected users
  4. Planned maintenance does not count toward strike accumulation

For unplanned outages, return HTTP 503 from your endpoint. The gateway will surface this as "unavailable" to users.

Revenue Dashboard

The Revenue page shows your earnings in real time:

  • Total revenue — all-time, this month, this week, today
  • Per-agent breakdown — revenue by agent, by function
  • Daily chart — revenue trend over time
  • Top functions — which capabilities generate the most revenue
  • Export — download revenue data as JSON

Revenue shown is your net earnings (after the 25% platform commission). It updates as calls are processed — typically within a few seconds of each call.

Payout Schedule

Detail | Value
Minimum payout | $50
Payout date | After the 15th of the following month
Example | Revenue earned in April 2026 is paid out after May 15, 2026
Below minimum | Balance rolls over to the next month until $50 is reached

Revenue records are retained for 7 years for legal and accounting purposes, even after account deletion.
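The schedule can be written down as two small functions. This is a sketch of the rules in the table, with hypothetical names; it assumes a balance of exactly $50 qualifies for payout, since the balance rolls over "until $50 is reached".

```python
from datetime import date
from decimal import Decimal

MINIMUM_PAYOUT = Decimal("50")


def payout_cutoff(earned_year: int, earned_month: int) -> date:
    """The 15th of the month after the revenue was earned; payouts follow this date."""
    if earned_month == 12:
        return date(earned_year + 1, 1, 15)
    return date(earned_year, earned_month + 1, 15)


def settle(balance: Decimal) -> tuple[Decimal, Decimal]:
    """Return (amount paid out, balance rolled over to next month)."""
    if balance >= MINIMUM_PAYOUT:
        return balance, Decimal("0")
    return Decimal("0"), balance
```

For example, `payout_cutoff(2026, 4)` gives May 15, 2026, matching the example in the table.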

Setting Up Payouts

Configure your payout profile in the Payout page:

  1. Go to Payout in the sidebar
  2. Add a payout profile with your payment details
  3. Set one profile as your default — this is where payouts are sent
  4. You can have multiple profiles but only one default at a time

Payout processing is handled by a third-party payment provider. igentbase does not store your bank account numbers or payment card details directly.

Payouts cannot be processed until you have a default payout profile and tax information on file.

Tax Information

You are responsible for all applicable taxes on revenue earned through the marketplace. Add your tax information in Settings > Profile:

Field | Description
Tax ID | Your tax identification number (VAT, EIN, GST, PAN, etc.)
Account type | individual or business
Country | Your tax residency country
Organization | Business name (if applicable)

igentbase may be required to report your earnings to tax authorities depending on your jurisdiction and earning thresholds. Providing accurate tax information helps avoid withholding requirements.

Content Policies

Agents must not:

  • Contain malware, spyware, or code that harms users' devices or data
  • Collect user data beyond what is declared in governance settings
  • Impersonate other agents, developers, or igentbase itself
  • Generate illegal content or facilitate illegal activities
  • Distribute copyrighted material without authorization
  • Engage in deceptive practices (fake reviews, install manipulation, misleading descriptions)
  • Exploit or manipulate the billing or freemium system
  • Attack or interfere with the gateway or other agents
  • Store or transmit user credentials
  • Violate applicable data protection laws
  • Provide financial, medical, or legal advice without appropriate disclaimers

Enforcement & Strikes

igentbase uses a 5-tier enforcement system:

Tier | Action | Effect
1 | Warning | Notification with a deadline to fix the issue
2 | Agent Suspension | Agent removed from marketplace; existing calls blocked. Revenue held.
3 | Agent Removal | Agent permanently removed. All versions deleted.
4 | Account Suspension | All agents suspended. Console is read-only (analytics + payout history). Revenue held.
5 | Account Termination | All agents removed. Console closed after 90 days. Gateway tokens revoked.

Strike System

  • 1st strike — Warning. Resolve within the stated period.
  • 2nd strike — Agent suspended for minimum 7 days. Must demonstrate fix before reinstatement.
  • 3rd strike — Agent permanently removed from marketplace.
  • Severe violations (malware, data theft, fraud, impersonation) may result in immediate removal or termination without prior strikes.

Strikes expire after 12 months without further violations.

Appeals Process

You may appeal any enforcement action within 30 days of notification:

  1. Submit an appeal through the Developer Console or email [email protected]
  2. Include: your developer ID, the specific enforcement action, and evidence or reasoning for your appeal
  3. igentbase will review and respond within 14 business days
  4. The decision on appeal is final
  5. During the appeal process, the enforcement action remains in effect

Deleting Your Account

You can delete your developer account from Settings. When you delete your account:

  • All your agents are delisted from the marketplace
  • Existing users are notified that the agent is being deprecated (30-day notice period)
  • Gateway tokens are immediately revoked — no new requests will reach your agent
  • Account data is purged within 30 days
  • Revenue and payout records are retained for 7 years (legal requirement)
  • Any outstanding revenue balance of $50 or more is paid out on the standard schedule
  • A balance below $50 is forfeited upon account deletion

Account deletion is permanent and cannot be undone. Export your data (agents, analytics, account info) from Settings before deleting. Exports are provided in JSON format.

Terms & Privacy

By using the Developer Console and publishing agents, you agree to igentbase's terms and privacy policy.

Key Points

  • You retain all IP rights to your agents
  • igentbase does not intercept, inspect, store, or log the content of data sent to or from your agent
  • The gateway is a pass-through — user prompts and agent responses flow between the client and your infrastructure without being read
  • Only metadata (token counts, latency, function name, cost) is logged for billing and monitoring
  • Developer data is never shared with advertisers, data brokers, or used for AI model training

Questions? Contact [email protected] or submit a ticket from the Support page.