Environment Variables Reference
The Hybrid Runner supports two ways to pass environment variables to ingestion pods.
Setting Environment Variables
- Via config.ingestionPods.customConfig.containerParams.env: Use containerParams to set environment variables on ingestion pods.
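As a rough illustration, the values below sketch how an environment variable might be declared under that path. The nesting follows the dotted path above; the Kubernetes-style name/value list and the variable EXAMPLE_LOG_LEVEL are assumptions made for illustration, not values taken from the chart.

```yaml
config:
  ingestionPods:
    customConfig:
      containerParams:
        env:
          # Hypothetical variable; replace with the one your ingestion pods need.
          - name: EXAMPLE_LOG_LEVEL
            value: "DEBUG"
```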
Runner Environment Variables
- DYNAMIC_INGESTION_VERSION_ENABLED: When enabled, the Runner automatically resolves the ingestion pod image tag to match your Collate server version. Disable it only if you mirror Collate images to your own private registry and manage versioning manually.
Note: Keep DYNAMIC_INGESTION_VERSION_ENABLED set to true. This ensures ingestion fixes and updates are applied automatically, without you having to manage image tags yourself.
Advanced Configuration
Use the options below to customise workflow behaviour, configure container settings, and run multiple Hybrid Runner instances in the same cluster.
Defining Workflow Fields
Override workflow-level and container-level fields using custom configuration.
- Workflow Parameters (workflowParams): Use workflowParams to override workflow-level fields. A typical use case is defining tolerations or pod affinity/anti-affinity.
- Container Parameters (containerParams): Use containerParams to override container-level fields such as environment variables.
Note: Custom workflow fields require Hybrid Runner Helm chart version 1.12.5 or later.
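A workflowParams override for tolerations could look roughly like the sketch below. It assumes that workflowParams accepts a tolerations list in the standard Kubernetes form; the taint value ingestion is hypothetical, so check both against your chart version and cluster.

```yaml
config:
  ingestionPods:
    customConfig:
      workflowParams:
        # Assumed field name; mirrors the standard Kubernetes pod spec.
        tolerations:
          - key: dedicated        # taint key on the target nodes
            operator: Equal
            value: ingestion      # hypothetical taint value
            effect: NoSchedule
```

Container-level overrides would sit under containerParams at the same nesting level.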
Adding pod labels
Set pod labels on ingestion pods for pod security policies, cost attribution, or workload identification.
Note: Pod labeling requires Hybrid Runner Helm chart version 1.12.9 or later.
Argo Workflows executor
Set pod labels via config.ingestionPods.customConfig.workflowParams.
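One way this might look is sketched below. podMetadata is the Argo Workflows spec field that carries pod labels and annotations; that the chart passes it through unchanged under workflowParams is an assumption, and both label keys and values are hypothetical.

```yaml
config:
  ingestionPods:
    customConfig:
      workflowParams:
        # Assumption: workflowParams forwards Argo Workflows spec fields as-is.
        podMetadata:
          labels:
            cost-center: data-platform   # hypothetical label
            workload: ingestion          # hypothetical label
```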
Simple Kubernetes executor
Use a podSpecFilePath override file to set labels and annotations on ingestion
pod templates. Create a YAML file with a metadata block alongside any other
pod spec overrides.
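A minimal sketch of such an override file follows. The file name, the labels and annotations, and the nodeSelector entry are all hypothetical; the layout assumes the file uses the usual Kubernetes pod template structure with metadata and spec blocks.

```yaml
# pod-spec-override.yaml (hypothetical file name)
metadata:
  labels:
    cost-center: data-platform     # hypothetical label
  annotations:
    example.com/owner: data-team   # hypothetical annotation
spec:
  # Any other pod spec overrides sit alongside the metadata block.
  nodeSelector:
    workload: ingestion            # hypothetical node label
```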
Running Multiple Instances in the Same Cluster
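Argo Workflows is a cluster-wide application. When deploying multiple Hybrid Runner instances in a single cluster, only one instance should install Argo Workflows. For each additional instance, disable the Argo Workflows installation in that instance's Helm values (see installArgoWorkflows in the executor table below).
Configuring Node Scheduling for Ingestion Pods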
By default, ingestion pods are scheduled on any available node in your cluster. If your cluster uses node taints to isolate workloads, you must configure tolerations and node affinity so ingestion pods can be scheduled on the correct nodes.
If every node in your cluster has a NoSchedule taint and no tolerations are configured, ingestion pods will fail to schedule and remain stuck in the Pending state. This is the most common cause of scheduling failures after deploying the Hybrid Runner.
Choosing the Right Configuration Key
The configuration key depends on which executor your Hybrid Runner uses:
| Executor | Environment variable | When to use |
|---|---|---|
| Argo Workflows (default) | ARGO_PIPELINE_TYPE_CONFIGS | Hybrid Runner deployed with installArgoWorkflows: true |
| Simple Kubernetes | SIMPLEK8S_PIPELINE_TYPE_CONFIGS | Hybrid Runner deployed without Argo Workflows |
Do not set both environment variables. Each executor reads from its own configuration key. Setting SIMPLEK8S_PIPELINE_TYPE_CONFIGS when using Argo has no effect, and vice versa.
Argo Workflows Executor
Add the following to your values.yaml. Replace openmetadata-hybrid-runner
with the taint value used in your cluster:
The toleration field accepts a single string value. The runner creates a fixed dedicated=<value> Kubernetes toleration for all ingestion pods of that type. For full Kubernetes toleration objects (multiple taints, custom operators or effects), use config.ingestionPods.customConfig.workflowParams instead.
Simple Kubernetes Executor
If you’re using the Simple Kubernetes executor without Argo Workflows, use SIMPLEK8S_PIPELINE_TYPE_CONFIGS instead:
Supported Configuration Fields
Each pipeline type (automation, metadata, profiler, lineage) supports
the following fields:
| Field | Description |
|---|---|
| toleration | A single taint value string. The runner creates a dedicated=<value> Kubernetes toleration for ingestion pods of that type. For full toleration control, use config.ingestionPods.customConfig.workflowParams. |
| affinity | Node and pod affinity rules; supports nodeAffinity, podAffinity, and podAntiAffinity |
| nodeSelector | Key-value labels to target specific nodes |
| priorityClass | Kubernetes priority class name for the pod |
| resources | CPU and memory requests and limits |
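The sketch below shows one plausible shape for per-pipeline-type settings, written as YAML for readability. Only the pipeline type names and field names come from the text and table above. How the value is encoded inside ARGO_PIPELINE_TYPE_CONFIGS or SIMPLEK8S_PIPELINE_TYPE_CONFIGS (for example, as a JSON string) is an assumption, as are all concrete values.

```yaml
# Hypothetical per-pipeline-type settings; the encoding is assumed, not confirmed.
metadata:
  toleration: openmetadata-hybrid-runner   # becomes dedicated=openmetadata-hybrid-runner
  nodeSelector:
    workload: ingestion                    # hypothetical node label
  priorityClass: ingestion-low             # hypothetical priority class name
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
profiler:
  toleration: openmetadata-hybrid-runner
```

With toleration set this way, the target nodes would need a taint whose key is dedicated and whose value is openmetadata-hybrid-runner.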
Prometheus Metrics
The Hybrid Runner exposes operational metrics in a Prometheus-compatible format via an HTTP endpoint. These metrics provide insight into agent state, activity, and performance.
The available metrics may evolve over time. Inspect the /metrics endpoint directly for the latest set of available metrics.
Configuration
Configure the metrics endpoint in your Helm values:
- port: Port on which the metrics endpoint is served (default: 8989).
- path: HTTP path for accessing metrics (default: /metrics).
Accessing Metrics
Once configured, access metrics on the Runner at the configured port and path (by default, port 8989 and path /metrics).
Example Metric
A value of 1.0 means connected; 0 means disconnected.
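To collect these metrics with Prometheus, a scrape job along the following lines could be added to the Prometheus configuration. The job name and target host are hypothetical; the port and path are the defaults listed above.

```yaml
scrape_configs:
  - job_name: hybrid-runner            # hypothetical job name
    metrics_path: /metrics             # default path from the Helm values
    static_configs:
      - targets:
          # Hypothetical service address; 8989 is the default metrics port.
          - hybrid-runner.example.svc:8989
```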
Hybrid Runner Images
This section covers how to host Collate images in your own container registry and how to manage image tags when you prefer to control versioning yourself.
Hosting Your Own Docker Images
To mirror Collate images to your own container registry (for example, Google Artifact Registry), set the following values. Mirror both the Hybrid Runner image and the Ingestion pod image; both are required for the Runner to function correctly.
- Hybrid Runner image: Sets the repository, tag, and pull credentials for the main Runner pod.
- Ingestion pod image: Sets the repository, tag, and pull credentials for the pods that execute ingestion jobs.
Managing Your Own Image Tags
By default, the Hybrid Runner automatically resolves image tags to match your Collate server version (for example, om-1.11.1-cl-1.11.1). To manage your
own tags, disable automatic resolution using the DYNAMIC_INGESTION_VERSION_ENABLED environment variable. For configuration details, see Runner Environment Variables.
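In Kubernetes environment-variable notation, disabling automatic resolution might look like the entry below. This is only a sketch: where the entry belongs in the Runner's Helm values is not shown in this section, and the string form "false" is an assumption.

```yaml
# Environment entry for the Runner; its placement in values.yaml is not shown here.
- name: DYNAMIC_INGESTION_VERSION_ENABLED
  value: "false"   # assumed string form; the recommended setting is true
```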