AI SDK
The AI SDK gives you programmatic access to Collate’s AI Studio: create personas and agents, invoke them via the API, and stream responses in real time. It is available for Python, TypeScript, and Java, and as a standalone CLI.
Available SDKs
| SDK | Package | Install |
|---|---|---|
| Python | data-ai-sdk | pip install data-ai-sdk |
| TypeScript | @openmetadata/ai-sdk | npm install @openmetadata/ai-sdk |
| Java | org.open-metadata:ai-sdk | Maven / Gradle |
| CLI | ai-sdk | Install script |
Prerequisites
You need:
- A Collate instance with AI Studio Agents enabled
- A Bot JWT token for API authentication
Configuration
Set the following environment variables:
| Variable | Required | Default | Description |
|---|---|---|---|
| AI_SDK_HOST | Yes | - | Your Collate server URL |
| AI_SDK_TOKEN | Yes | - | Bot JWT token |
| AI_SDK_TIMEOUT | No | 120 | Request timeout in seconds |
| AI_SDK_VERIFY_SSL | No | true | Verify SSL certificates |
| AI_SDK_MAX_RETRIES | No | 3 | Number of retry attempts |
| AI_SDK_RETRY_DELAY | No | 1.0 | Base delay between retries (seconds) |
Client Initialization
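A minimal setup sketch, assuming the environment variables from the Configuration table above and standard Bearer authentication for the Bot JWT token. The helper names below are illustrative, not the confirmed SDK surface.

```python
import os

def load_config() -> dict:
    """Read AI SDK settings from the environment, applying the documented defaults."""
    host = os.environ.get("AI_SDK_HOST")
    token = os.environ.get("AI_SDK_TOKEN")
    if not host or not token:
        raise ValueError("AI_SDK_HOST and AI_SDK_TOKEN are required")
    return {
        "host": host.rstrip("/"),
        "token": token,
        "timeout": float(os.environ.get("AI_SDK_TIMEOUT", "120")),
        "verify_ssl": os.environ.get("AI_SDK_VERIFY_SSL", "true").lower() == "true",
        "max_retries": int(os.environ.get("AI_SDK_MAX_RETRIES", "3")),
        "retry_delay": float(os.environ.get("AI_SDK_RETRY_DELAY", "1.0")),
    }

def auth_headers(config: dict) -> dict:
    """Bot JWT tokens are sent as a standard Bearer header."""
    return {"Authorization": f"Bearer {config['token']}"}

# Placeholder values for illustration only
os.environ.setdefault("AI_SDK_HOST", "https://collate.example.com/")
os.environ.setdefault("AI_SDK_TOKEN", "<bot-jwt>")
config = load_config()
print(config["host"], config["timeout"], config["max_retries"])
```

Missing required variables fail fast here rather than at the first request, which makes misconfiguration easier to diagnose.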
Manage Personas
A persona defines the behavioral instructions and personality of an AI Studio Agent. Each persona contains a system prompt that shapes how the agent responds. Multiple agents can share the same persona.
Create a Persona
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique identifier (alphanumeric, no spaces) |
| description | string | Yes | Role and behavior description |
| prompt | string | Yes | System prompt prepended to every agent conversation |
| display_name | string | No | Human-readable name (defaults to name) |
| provider | string | No | Default LLM provider: openai, anthropic, azure_openai (default: openai) |
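A sketch of assembling a create-persona request body from the field table above. The key names mirror the table; whether the server accepts exactly this wire format is an assumption.

```python
VALID_PROVIDERS = {"openai", "anthropic", "azure_openai"}

def build_persona_payload(name, description, prompt,
                          display_name=None, provider="openai"):
    """Validate and assemble the persona fields documented above."""
    if not name.isalnum():
        raise ValueError("name must be alphanumeric with no spaces")
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"provider must be one of {sorted(VALID_PROVIDERS)}")
    return {
        "name": name,
        "description": description,
        "prompt": prompt,
        "display_name": display_name or name,  # defaults to name
        "provider": provider,
    }

payload = build_persona_payload(
    name="DataSteward",
    description="Answers data governance questions",
    prompt="You are a careful data steward. Cite metadata when possible.",
)
print(payload["display_name"], payload["provider"])
```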
List Personas
Get a Persona by Name
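A common pattern after listing is to select one persona by its unique name. The list-response shape below (a list of persona objects with a "name" key) is an assumption for illustration; the SDK’s actual return type may differ.

```python
def get_persona_by_name(personas, name):
    """Find a persona in a listed collection, or raise KeyError."""
    for persona in personas:
        if persona.get("name") == name:
            return persona
    raise KeyError(f"no persona named {name!r}")

# Synthetic stand-in for a list-personas response
listed = [
    {"name": "DataSteward", "provider": "openai"},
    {"name": "Analyst", "provider": "anthropic"},
]
print(get_persona_by_name(listed, "Analyst")["provider"])
```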
Manage Agents
An agent combines a persona’s behavioral instructions with Collate’s MCP tools to form a purpose-built AI assistant. Agents must be API-enabled to be invoked via the SDK.
Create an Agent
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique identifier (alphanumeric, PascalCase/camelCase) |
| description | string | Yes | Purpose shown in AI Studio |
| persona | string | Yes | Name of an existing persona |
| display_name | string | No | Human-readable name (defaults to name) |
| api_enabled | boolean | No | Must be true for SDK invocation (default: false) |
| abilities | array | No | Allowed MCP tool names (all tools if omitted) |
| prompt | string | No | Additional system prompt appended to persona’s base prompt |
| provider | string | No | LLM provider: openai, anthropic, azure_openai (default: openai) |
| bot_name | string | No | Collate bot for metadata operations |
Valid abilities entries are MCP tool names, for example: search_metadata, get_entity_details, get_entity_lineage, create_glossary, create_glossary_term, create_lineage, patch_entity.
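A sketch of assembling a create-agent request body from the field table above, including a check that requested abilities are known MCP tool names. The exact wire format is an assumption.

```python
# Tool names taken from the MCP Tools section of this document
KNOWN_TOOLS = {
    "search_metadata", "semantic_search", "get_entity_details",
    "get_entity_lineage", "create_glossary", "create_glossary_term",
    "create_lineage", "patch_entity", "get_test_definitions",
    "create_test_case", "root_cause_analysis",
}

def build_agent_payload(name, description, persona, *, api_enabled=False,
                        abilities=None, prompt=None, provider="openai",
                        display_name=None, bot_name=None):
    """Validate and assemble the agent fields documented above."""
    if abilities is not None:
        unknown = set(abilities) - KNOWN_TOOLS
        if unknown:
            raise ValueError(f"unknown abilities: {sorted(unknown)}")
    payload = {
        "name": name,
        "description": description,
        "persona": persona,
        "api_enabled": api_enabled,  # must be True for SDK invocation
        "provider": provider,
        "display_name": display_name or name,
    }
    if abilities is not None:
        payload["abilities"] = list(abilities)
    if prompt:
        payload["prompt"] = prompt
    if bot_name:
        payload["bot_name"] = bot_name
    return payload

agent = build_agent_payload(
    "LineageHelper", "Explains lineage", "DataSteward",
    api_enabled=True, abilities=["search_metadata", "get_entity_lineage"],
)
print(agent["api_enabled"], agent["abilities"])
```

Omitting abilities grants the agent all tools, so restricting the list is the safer default for narrowly scoped agents.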
List Agents
Invoke an Agent
Send a message to an API-enabled agent and receive a response.
Single Invocation
The response contains the following fields:
| Field | Type | Description |
|---|---|---|
| conversation_id | string | Use for multi-turn follow-ups |
| response | string | The agent’s text response |
| tools_used | array | MCP tools the agent invoked |
| usage | object | Token usage (prompt_tokens, completion_tokens, total_tokens) |
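A sketch of reading an invocation response. The sample object below follows the response-field table above; the call that produces it is omitted, and its exact method name is not assumed here.

```python
# Synthetic response in the documented shape
sample_response = {
    "conversation_id": "conv-123",
    "response": "The orders table has 3 upstream sources.",
    "tools_used": ["get_entity_lineage"],
    "usage": {"prompt_tokens": 512, "completion_tokens": 48, "total_tokens": 560},
}

def summarize(resp):
    """Pull out the fields most callers care about."""
    usage = resp["usage"]
    # total_tokens is the sum of prompt and completion tokens
    assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
    return (resp["conversation_id"], len(resp["tools_used"]), usage["total_tokens"])

print(summarize(sample_response))
```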
Streaming
Use streaming to receive real-time output as the agent generates its response. Each stream event has one of the following types:
| Type | Fields | Description |
|---|---|---|
| start | conversation_id | Agent started processing |
| content | content | Text chunk from the response |
| tool_use | tool_name | Agent is invoking an MCP tool |
| end | - | Response complete |
| error | error | An error occurred |
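A sketch of consuming the event types from the table above. The events here are synthetic; a real stream would come from the SDK’s streaming method.

```python
def consume(events):
    """Accumulate text chunks and tool invocations from a stream of events."""
    text, tools, conversation_id = [], [], None
    for event in events:
        kind = event["type"]
        if kind == "start":
            conversation_id = event["conversation_id"]
        elif kind == "content":
            text.append(event["content"])
        elif kind == "tool_use":
            tools.append(event["tool_name"])
        elif kind == "error":
            raise RuntimeError(event["error"])
        elif kind == "end":
            break
    return conversation_id, "".join(text), tools

# Synthetic event sequence in the documented shape
events = [
    {"type": "start", "conversation_id": "conv-42"},
    {"type": "tool_use", "tool_name": "search_metadata"},
    {"type": "content", "content": "Found "},
    {"type": "content", "content": "3 tables."},
    {"type": "end"},
]
print(consume(events))
```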
Multi-Turn Conversations
The Conversation class automatically manages context across messages.
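A minimal sketch of what that context management looks like, assuming the first reply carries a conversation_id that subsequent requests echo back. This stand-in Conversation and its transport are illustrative only, not the SDK’s real classes.

```python
class Conversation:
    def __init__(self, transport):
        self._transport = transport      # callable(message, conversation_id)
        self.conversation_id = None

    def send(self, message):
        reply = self._transport(message, self.conversation_id)
        # Remember the id so follow-up messages share the same context
        self.conversation_id = reply["conversation_id"]
        return reply["response"]

def fake_transport(message, conversation_id):
    # A real transport would POST to the agent; this stub just echoes.
    return {"conversation_id": conversation_id or "conv-1",
            "response": f"echo: {message}"}

conv = Conversation(fake_transport)
print(conv.send("What tables feed orders?"))
print(conv.conversation_id)
print(conv.send("And downstream?"))
```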
Async Support (Python)
All sync methods have async counterparts with the a prefix:
| Sync | Async |
|---|---|
| agent.call() | await agent.acall() |
| agent.stream() | async for event in agent.astream() |
| conv.send() | await conv.asend() |
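A sketch of the async pattern, using a stand-in acall coroutine in place of the SDK’s real methods. The main benefit of the a-prefixed methods is issuing several agent calls concurrently.

```python
import asyncio

async def acall(message):
    """Stand-in for an async agent call; a real one would await network I/O."""
    await asyncio.sleep(0)
    return f"reply to: {message}"

async def main():
    # Two calls run concurrently instead of back to back
    replies = await asyncio.gather(acall("q1"), acall("q2"))
    return replies

print(asyncio.run(main()))
```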
Error Handling
| Code | Exception | Description |
|---|---|---|
| 401 | AuthenticationError | Invalid or expired JWT token |
| 403 | AgentNotEnabledError | Agent exists but is not API-enabled |
| 404 | AgentNotFoundError | No agent with the given name exists |
| 409 | CONFLICT | Agent or persona with the same name already exists |
| 429 | RateLimitError | Too many requests; retry after the indicated delay |
| 500 | AgentExecutionError | Internal error during agent execution |
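A sketch of mapping status codes to the exception classes in the table. The class definitions below are stand-ins so the example is self-contained; the real SDK presumably exports its own exception hierarchy.

```python
class AISDKError(Exception): ...
class AuthenticationError(AISDKError): ...
class AgentNotEnabledError(AISDKError): ...
class AgentNotFoundError(AISDKError): ...
class RateLimitError(AISDKError): ...
class AgentExecutionError(AISDKError): ...

STATUS_TO_ERROR = {
    401: AuthenticationError,
    403: AgentNotEnabledError,
    404: AgentNotFoundError,
    429: RateLimitError,
    500: AgentExecutionError,
}

def raise_for_status(status, detail=""):
    """Raise the documented exception for an error status; pass on success."""
    exc = STATUS_TO_ERROR.get(status)
    if exc is not None:
        raise exc(detail or f"HTTP {status}")

try:
    raise_for_status(404, "no agent named 'LineageHelper'")
except AgentNotFoundError as err:
    print(type(err).__name__, err)
```

Catching the shared base class is a convenient way to handle all SDK failures in one place while still allowing specific handling for retryable cases like RateLimitError.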
CLI
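The CLI reads the same environment variables as the SDKs (see Configuration above). The export lines below are grounded in that table; the CLI’s own subcommands and flags are not documented here, so consult its help output rather than guessing them.

```shell
# Configure the CLI through the shared AI SDK environment variables
export AI_SDK_HOST="https://collate.example.com"
export AI_SDK_TOKEN="<bot-jwt>"
# Subcommands and flags are CLI-specific; see the ai-sdk help output
```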
MCP Tools
Collate exposes an MCP server that turns your metadata into a set of tools any LLM can use. Unlike generic MCP connectors that only read raw database schemas, Collate’s MCP tools give your AI access to the full context of your data platform: descriptions, owners, lineage, glossary terms, tags, and data quality results. The MCP endpoint is available at POST /mcp using the JSON-RPC 2.0 protocol.
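A sketch of the JSON-RPC 2.0 envelope such a request would carry. The tools/call method and its params shape follow the MCP protocol; the tool name and arguments below are illustrative, and the body would be POSTed to the /mcp endpoint with the Bearer token header.

```python
import json

def mcp_request(request_id, tool, arguments):
    """Build the JSON-RPC 2.0 envelope for invoking an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

body = mcp_request(1, "search_metadata", {"query": "orders table"})
print(json.dumps(body))
```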
Available Tools
| Tool | Description |
|---|---|
| search_metadata | Search across all metadata in Collate (tables, dashboards, pipelines, topics, etc.) |
| semantic_search | AI-powered semantic search that understands meaning and context beyond keyword matching |
| get_entity_details | Get detailed information about a specific entity by ID or fully qualified name |
| get_entity_lineage | Get upstream and downstream lineage for an entity |
| create_glossary | Create a new glossary in Collate |
| create_glossary_term | Create a new term within an existing glossary |
| create_lineage | Create a lineage edge between two entities |
| patch_entity | Update an entity’s metadata (description, tags, owners, etc.) |
| get_test_definitions | List available data quality test definitions |
| create_test_case | Create a data quality test case for an entity |
| root_cause_analysis | Analyze root causes of data quality failures |