Deploy Collate

Complete Prerequisites and Deploy Argo Workflows before this step.

IAM Roles (ROSA)

This section is ROSA/AWS-specific. On non-AWS OpenShift, skip to ECR Credentials. Inject AWS credentials through your own mechanism (static Secret via envFrom, HashiCorp Vault, or your cloud’s workload identity), and set secretsManager.provider: db in the Helm values.

IRSA for Collate Application

A single IAM role (openmetadata-rosa-role) covers all AWS service access for the Collate pod: Secrets Manager and S3. It binds to the openmetadata service account created by the chart.
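AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
NAMESPACE="openmetadata"
ROLE_NAME="openmetadata-rosa-role"
# OIDC_ENDPOINT must already hold the cluster's OIDC issuer without the https:// prefix.
: "${OIDC_ENDPOINT:?Set OIDC_ENDPOINT before generating the trust policy}"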

cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_ENDPOINT}:sub": "system:serviceaccount:${NAMESPACE}:openmetadata"
        }
      }
    }
  ]
}
EOF

aws iam create-role \
  --role-name "$ROLE_NAME" \
  --assume-role-policy-document file://trust-policy.json

# Secrets Manager
aws iam attach-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/SecretsManagerReadWrite

# S3 — asset uploads and Argo workflow logs
aws iam attach-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
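The trust policy above interpolates ${OIDC_ENDPOINT}, which has to be the cluster's OIDC issuer without the https:// scheme, and its sub condition has to match the service account exactly. The following is a minimal sketch of building both strings. The helper names are hypothetical, and the oc query in the usage comment assumes a ROSA cluster that publishes its issuer on the cluster Authentication resource.

```shell
# Strip the URL scheme (and one trailing slash) from an OIDC issuer so the
# result can be used in the oidc-provider ARN and as the condition-key prefix.
oidc_endpoint_from_issuer() {
  local issuer="$1"
  issuer="${issuer#https://}"   # drop the leading scheme
  issuer="${issuer%/}"          # drop a trailing slash, if present
  printf '%s\n' "$issuer"
}

# Build the "sub" claim the trust policy condition must match exactly.
service_account_subject() {
  local namespace="$1" service_account="$2"
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$service_account"
}

# Example usage (assumes a logged-in oc session; not executed here):
#   ISSUER=$(oc get authentication.config.openshift.io cluster \
#     -o jsonpath='{.spec.serviceAccountIssuer}')
#   OIDC_ENDPOINT=$(oidc_endpoint_from_issuer "$ISSUER")
#   service_account_subject openmetadata openmetadata
```

Run the usage lines before generating trust-policy.json so the heredoc expands to a complete ARN.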

ECR Credentials

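Collate images are hosted in a private AWS ECR registry. The Collate team will provide ECR credentials; use them to create an image pull secret in the openmetadata namespace. The namespace must exist before the secret can be created, so if it does not exist yet, run oc new-project openmetadata (see Create the Collate Namespace) first: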
oc create secret docker-registry collate-ecr-secret \
  -n openmetadata \
  --docker-server=<ecr-registry>.dkr.ecr.<region>.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region <region>)"
ECR tokens expire every 12 hours. You must refresh the collate-ecr-secret before expiry or implement automated rotation.
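One possible way to automate the rotation is sketched below. It is not a Collate-provided tool: the function names, the one-hour safety margin, and the idea of running it from a scheduler are all assumptions. The refresh itself re-renders the secret with --dry-run=client and applies it, so the existing collate-ecr-secret is updated in place.

```shell
# ECR tokens are valid for 12 hours (43200 seconds).
ECR_TOKEN_TTL_SECONDS=43200

# Return 0 when a token created at $1 (epoch seconds) should be refreshed
# at time $2, keeping a safety margin of $3 seconds (default: one hour).
ecr_token_needs_refresh() {
  local created="$1" now="$2" margin="${3:-3600}"
  [ $(( now - created )) -ge $(( ECR_TOKEN_TTL_SECONDS - margin )) ]
}

# Re-render the pull secret with a fresh token and apply it in place.
# Arguments: <registry-host> <region>
refresh_ecr_secret() {
  local registry="$1" region="$2"
  oc create secret docker-registry collate-ecr-secret \
    -n openmetadata \
    --docker-server="$registry" \
    --docker-username=AWS \
    --docker-password="$(aws ecr get-login-password --region "$region")" \
    --dry-run=client -o yaml | oc apply -f -
}
```

A scheduler (for example a cron job on a host that holds the Collate-provided credentials) could call refresh_ecr_secret whenever ecr_token_needs_refresh returns 0.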

Create the Collate Namespace

oc new-project openmetadata

Create Required Secrets

The chart reads the database password from an existing Kubernetes secret:
oc create secret generic openmetadata-db-secret \
  -n openmetadata \
  --from-literal=openmetadata-mysql-password=<db-password>

Configure values-openshift.yaml

Add the Helm chart repo:
helm repo add open-metadata https://helm.open-metadata.org/
helm repo update
Update the following required fields in values-openshift.yaml:
| Field | Description | Example |
| --- | --- | --- |
| image.repository | ECR image repository (provided by Collate) | <ecr-account>.dkr.ecr.<region>.amazonaws.com/collate-customers-<region> |
| image.tag | Image tag (provided by Collate) | om-1.12.0-cl-1.12.0 |
| serviceAccount.annotations | IRSA role ARN | arn:aws:iam::<id>:role/openmetadata-rosa-role |
| openmetadata.config.database.host | RDS endpoint | <cluster>.<region>.rds.amazonaws.com |
| openmetadata.config.database.auth.username | DB username | openmetadata_user |
| openmetadata.config.elasticsearch.host | OpenSearch endpoint | <domain>.opensearch.amazonaws.com |
| openmetadata.config.pipelineServiceClientConfig.argoWorkflows.ingestionImage | ECR ingestion image (provided by Collate) | <ecr-account>.dkr.ecr.<region>.amazonaws.com/collate-customers-ingestion-<region>:<tag> |
Annotations in the snippet below indicate which values are ROSA/AWS-specific and which apply to any OpenShift:
# Collate image from ECR (provided by the Collate team)
image:
  repository: <ecr-registry>.dkr.ecr.<region>.amazonaws.com/collate-customers-<region>
  tag: <image-tag>
  pullPolicy: IfNotPresent

# ECR pull secret
imagePullSecrets:
  - name: collate-ecr-secret

# [ROSA] IRSA — chart creates the SA and applies the annotation automatically.
# On non-AWS OpenShift, remove the annotation and inject credentials via a static Secret.
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/openmetadata-rosa-role

# [Any OpenShift] Route replaces Ingress on all OpenShift clusters.
ingress:
  enabled: false
route:
  enabled: true
  tls:
    enabled: true
    termination: edge
    insecureEdgeTerminationPolicy: Redirect

# [Any OpenShift] Required for restricted-v2 SCC on all OpenShift 4.x clusters.
podSecurityContext: {}
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop: [ALL]
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault

openmetadata:
  config:
    # Database connection
    database:
      host: <rds-endpoint>.<region>.rds.amazonaws.com
      port: 3306
      driverClass: com.mysql.cj.jdbc.Driver
      dbScheme: mysql
      databaseName: openmetadata_db
      auth:
        username: openmetadata_user
        password:
          secretRef: openmetadata-db-secret
          secretKey: openmetadata-mysql-password

    # OpenSearch / Elasticsearch connection
    elasticsearch:
      host: <opensearch-endpoint>
      port: 443
      searchType: opensearch
      scheme: https

    # Argo Workflows pipeline client
    pipelineServiceClientConfig:
      enabled: true
      type: "argoWorkflows"
      metadataApiEndpoint: "http://openmetadata:8585/api"
      argoWorkflows:
        ingestionImage: "<ecr-registry>.dkr.ecr.<region>.amazonaws.com/collate-customers-ingestion-<region>:<image-tag>"
        imagePullSecrets: "collate-ecr-secret"
        serviceAccountName: "openmetadata"

    # [ROSA] AWS Secrets Manager.
    # On non-AWS OpenShift, set provider: db to use the database as secrets backend.
    secretsManager:
      enabled: true
      provider: managed-aws

extraEnvs:
  # [ROSA] AWS SDK v2 requires AWS_REGION explicitly.
  - name: AWS_REGION
    value: "us-east-2"  # example value; replace with your AWS region

# [Any OpenShift] /opt/openmetadata/logs requires a writable emptyDir under restricted-v2 SCC.
extraVolumes:
  - name: logs
    emptyDir: {}
extraVolumeMounts:
  - name: logs
    mountPath: /opt/openmetadata/logs
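
The snippet above contains several angle-bracket placeholders (for example <region> and <image-tag>) that must be replaced before deploying. As a rough pre-deploy check, a helper along these lines can list any that remain. The function name is hypothetical, and the pattern only recognizes placeholders made of letters, underscores, and hyphens.

```shell
# Print every line of a values file that still contains a <placeholder>
# token, prefixed with its line number. Returns 1 if any are found.
find_unfilled_placeholders() {
  local file="$1"
  if grep -nE '<[A-Za-z_-]+>' "$file"; then
    return 1
  fi
  return 0
}

# Example usage (not executed here):
#   find_unfilled_placeholders values-openshift.yaml || echo "fill in the values listed above"
```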

Deploy

helm upgrade --install openmetadata open-metadata/openmetadata \
  --namespace openmetadata \
  --values values-openshift.yaml
For subsequent upgrades:
helm upgrade openmetadata open-metadata/openmetadata \
  --namespace openmetadata \
  --values values-openshift.yaml \
  --reuse-values

Optional: Enable Prometheus Metrics

Set the following in values-openshift.yaml and redeploy:
serviceMonitor:
  enabled: true

Post-Installation

Verify the Deployment

# Watch pods come up
oc get pods -n openmetadata -w

# Check migration init container logs
oc logs -n openmetadata -l app.kubernetes.io/name=openmetadata -c run-db-migrations --tail=50

# Confirm IRSA env vars are injected
oc exec -n openmetadata deployment/openmetadata -- env | grep -E "AWS_ROLE_ARN|AWS_WEB_IDENTITY"

# Get the application route URL
oc get route openmetadata -n openmetadata -o jsonpath='{.spec.host}'

# Health check
curl -s https://$(oc get route openmetadata -n openmetadata -o jsonpath='{.spec.host}')/healthcheck
Expected health check response: {"status":"OK"}
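Because the pod can take a while to finish migrations, a single curl may run before the application is ready. The sketch below polls the same endpoint until the expected body appears. The function name, attempt count, and delay are assumptions, not chart defaults.

```shell
# Poll a health-check URL until the response body contains "status":"OK".
# Arguments: <url> [attempts] [delay-seconds]
wait_for_healthy() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}"
  local i=1 body
  while [ "$i" -le "$attempts" ]; do
    # Tolerate connection errors while the route or pod is still starting.
    body=$(curl -s --max-time 5 "$url" || true)
    case "$body" in
      *'"status":"OK"'*)
        echo "healthy after $i attempt(s)"
        return 0
        ;;
    esac
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}

# Example usage (not executed here):
#   HOST=$(oc get route openmetadata -n openmetadata -o jsonpath='{.spec.host}')
#   wait_for_healthy "https://${HOST}/healthcheck"
```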

Configure Reindexing

After the initial deployment, trigger a re-index from the Collate UI: Settings → Collate → Re-Index

Environment Variables Reference

| Variable | Description | Default | Required |
| --- | --- | --- | --- |
| AWS_REGION | AWS region for SDK v2 clients (Secrets Manager, S3). AWS_DEFAULT_REGION set by the chart is insufficient for SDK v2. | set in extraEnvs | Yes (ROSA) |
| AWS_ROLE_ARN | IAM role ARN, injected automatically by ROSA OIDC when SA is annotated | injected by ROSA | Auto |
| AWS_WEB_IDENTITY_TOKEN_FILE | Path to OIDC token, injected automatically by ROSA | injected by ROSA | Auto |
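
To check all three variables in one pass, the output of env from the pod can be piped through a small filter such as the one below. This is an illustrative sketch with a hypothetical function name; it only checks that each variable is present, not that its value is correct.

```shell
# Read `env` output on stdin and print each expected variable that is absent.
# Returns 1 if at least one is missing, 0 otherwise.
missing_irsa_vars() {
  local env_dump name status=0
  env_dump=$(cat)
  for name in AWS_REGION AWS_ROLE_ARN AWS_WEB_IDENTITY_TOKEN_FILE; do
    if ! printf '%s\n' "$env_dump" | grep -q "^${name}="; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Example usage (not executed here):
#   oc exec -n openmetadata deployment/openmetadata -- env | missing_irsa_vars
```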

Troubleshooting

Pod Stuck in Pending

oc describe pod -n openmetadata -l app.kubernetes.io/name=openmetadata
| Symptom | Cause | Fix |
| --- | --- | --- |
| Insufficient cpu / memory | Cluster at capacity | Reduce resources.requests in values-openshift.yaml |
| ImagePullBackOff | ECR pull secret missing or expired | Recreate collate-ecr-secret with a fresh ECR token |
| unable to validate against any SCC | Security context incompatible with SCCs | Ensure securityContext in values matches restricted-v2 requirements |

Migration Init Container Fails

oc logs -n openmetadata -l app.kubernetes.io/name=openmetadata -c run-db-migrations
| Error | Fix |
| --- | --- |
| Unable to load region from any of the providers | Add AWS_REGION to extraEnvs in values-openshift.yaml |
| Unable to contact EC2 metadata service / STS error | IRSA annotation missing or trust policy mismatch; verify with oc get sa openmetadata -n openmetadata -o yaml |
| Communications link failure | RDS unreachable; check openmetadata-db-secret and RDS security group |
| Access denied for user | Wrong DB credentials in openmetadata-db-secret |

IRSA Not Working (ROSA Only)

# 1. Confirm SA has the annotation
oc get sa openmetadata -n openmetadata -o jsonpath='{.metadata.annotations}'

# 2. Confirm IRSA env vars are injected into the pod
oc exec -n openmetadata deployment/openmetadata -- env | grep AWS

# 3. Test credentials from inside the pod
oc exec -n openmetadata deployment/openmetadata -- aws sts get-caller-identity
If step 3 fails: verify the OIDC provider ARN in the trust policy exactly matches your cluster’s OIDC endpoint, and the sub condition matches system:serviceaccount:openmetadata:openmetadata exactly.
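The sub condition can be compared mechanically once the live trust policy is saved to a file. The helper below is a sketch with a hypothetical name. It does a fixed-string search for the quoted subject, so it confirms the exact string is present but does not validate the rest of the policy.

```shell
# Return 0 if the trust policy JSON in $1 contains the exact quoted sub claim
# for service account $3 in namespace $2.
trust_policy_matches_sa() {
  local policy_file="$1" namespace="$2" service_account="$3"
  grep -qF "\"system:serviceaccount:${namespace}:${service_account}\"" "$policy_file"
}

# Example usage (not executed here; the output file name is arbitrary):
#   aws iam get-role --role-name openmetadata-rosa-role \
#     --query 'Role.AssumeRolePolicyDocument' --output json > live-trust-policy.json
#   trust_policy_matches_sa live-trust-policy.json openmetadata openmetadata \
#     || echo "sub condition does not match"
```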

Appendix: OpenShift vs EKS Differences

| Area | EKS | ROSA (OpenShift) |
| --- | --- | --- |
| OIDC provider | Created manually via eksctl | Built into ROSA automatically |
| Pod security | Pod Security Admission (PSA) | Security Context Constraints (SCC) |
| SCC required | N/A | restricted-v2 for Collate; anyuid for Argo |
| Ingress | Kubernetes Ingress | OpenShift Route |
| TLS | Managed by ingress controller / cert-manager | Managed by OpenShift router |
| Image names | Short names allowed | Fully qualified required (e.g. docker.io/...) |
| ECR pull | Node IAM role or pull secret | Image pull secret using Collate-provided credentials |
| CLI | kubectl | oc (superset of kubectl) |
| Logs directory | Writable by default | Requires emptyDir volume for /opt/openmetadata/logs |
| AWS SDK region | AWS_DEFAULT_REGION sufficient | AWS_REGION required for SDK v2 Java clients |