Kubernetes Deployment
Deploy Collate on Kubernetes for production-grade scalability and reliability.
Prerequisites
- Kubernetes cluster (1.19+)
- kubectl configured
- Helm 3.x installed
- Minimum 8 CPU cores and 16GB RAM
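The tooling prerequisites can be checked from a terminal before installing. The following is a minimal sketch: `require_tool` and `check_prerequisites` are hypothetical helper names introduced here, and they only confirm that each binary is on the PATH, not that the cluster meets the CPU and memory minimums.

```shell
# Hypothetical helper: succeed if the named command is on the PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check both CLI prerequisites and report every missing tool, not just the first.
check_prerequisites() {
  rc=0
  for tool in kubectl helm; do
    require_tool "$tool" || rc=1
  done
  return "$rc"
}
```

After `check_prerequisites` succeeds, `kubectl version` and `helm version` show the exact versions to compare against the list above.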
Quick Installation
Add Helm Repository
helm repo add open-metadata https://helm.open-metadata.org/
helm repo update
Install with Default Values
helm install collate open-metadata/collate
Install with Custom Values
# Download values file
curl -sL https://raw.githubusercontent.com/open-metadata/collate-helm-charts/main/charts/collate/values.yaml > values.yaml
# Edit values.yaml as needed
vim values.yaml
# Install
helm install collate open-metadata/collate -f values.yaml
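After either install path, it helps to confirm that the release deployed and its pods became ready. This is a minimal sketch, assuming the chart creates a Deployment named `collate` (the name the troubleshooting commands in this guide use); `verify_release` is a hypothetical helper, and the 300-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: show the release status, wait for the rollout, list pods.
# Assumes the Deployment shares the release name.
verify_release() {
  release="${1:-collate}"
  helm status "$release" &&
    kubectl rollout status "deployment/$release" --timeout=300s &&
    kubectl get pods
}
```

Calling `verify_release collate` stops at the first failing step, which narrows down whether the problem is the release itself or the pods behind it.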
Configuration
Basic Configuration
# values.yaml
collate:
  config:
    database:
      host: mysql.default.svc.cluster.local
      port: 3306
      databaseName: collate_db
      username: collate_user
      password: secure_password
    elasticsearch:
      host: elasticsearch.default.svc.cluster.local
      port: 9200
    airflow:
      host: http://airflow.default.svc.cluster.local:8080
      username: admin
      password: admin
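The example values store the database password in plain text. One way to keep it out of the file is to pass it at install time with Helm's `--set` flag. The sketch below assumes the key path `collate.config.database.password`, inferred from the values layout in this section rather than confirmed against the chart; `install_with_secret` and the `DB_PASSWORD` variable are hypothetical names.

```shell
# Hypothetical helper: install or upgrade the release, reading the database
# password from the environment instead of values.yaml.
# The --set key path is an assumption based on the values layout in this guide.
install_with_secret() {
  : "${DB_PASSWORD:?set DB_PASSWORD before calling}"
  helm upgrade --install collate open-metadata/collate \
    -f values.yaml \
    --set "collate.config.database.password=$DB_PASSWORD"
}
```

Values given with `--set` override the same keys in `values.yaml`. Passwords that contain commas need escaping, because `--set` uses commas to separate entries.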
High Availability Setup
# High availability configuration
replicaCount: 3
resources:
  requests:
    memory: "4Gi"
    cpu: "2"
  limits:
    memory: "8Gi"
    cpu: "4"
# Database HA
postgresql:
  enabled: true
  auth:
    database: collate_db
    username: collate_user
    password: secure_password
  primary:
    persistence:
      size: 100Gi
  readReplicas:
    replicaCount: 2
# Elasticsearch HA
elasticsearch:
  enabled: true
  replicas: 3
  minimumMasterNodes: 2
  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "4Gi"
      cpu: "2"
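The Elasticsearch values pair `replicas: 3` with `minimumMasterNodes: 2`, which matches the usual majority rule for avoiding split brain: a quorum is more than half of the master-eligible nodes. A small sketch of that arithmetic, assuming every replica is master-eligible (`quorum` is a hypothetical helper name):

```shell
# Majority quorum for n master-eligible nodes: floor(n / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 3   # prints 2, the minimumMasterNodes value used with 3 replicas
quorum 5   # prints 3
```

If the replica count changes, the minimum master node setting should be recomputed in the same way.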
Storage Configuration
Persistent Volumes
# Storage configuration
persistence:
  enabled: true
  storageClass: "fast-ssd"
  size: 20Gi
# Database persistence
postgresql:
  primary:
    persistence:
      enabled: true
      storageClass: "fast-ssd"
      size: 100Gi
# Elasticsearch persistence
elasticsearch:
  persistence:
    enabled: true
    storageClass: "fast-ssd"
    size: 50Gi
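Persistent volume claims stay in the Pending state when the named StorageClass does not exist, so it is worth checking for `fast-ssd` before installing. A minimal sketch, with `check_storage` as a hypothetical helper; substitute the class name your cluster actually provides.

```shell
# Hypothetical helper: confirm the StorageClass exists, then list volume claims
# in the current namespace so their status (Bound or Pending) is visible.
check_storage() {
  class="${1:-fast-ssd}"
  kubectl get storageclass "$class" &&
    kubectl get pvc
}
```

`kubectl get storageclass` with no name lists every class available in the cluster.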
Ingress Configuration
# Ingress for external access
ingress:
  enabled: true
  className: "nginx"
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - host: collate.company.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: collate-tls
      hosts:
        - collate.company.com
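Once the ingress is applied, external access can be checked end to end: the Ingress object, the TLS secret it references, and an HTTPS request to the host. A minimal sketch, assuming the Ingress is named `collate` as in the troubleshooting commands in this guide; `check_ingress` is a hypothetical helper.

```shell
# Hypothetical helper: check the Ingress, its TLS secret, and the HTTPS endpoint.
check_ingress() {
  host="${1:-collate.company.com}"
  kubectl get ingress collate &&
    kubectl get secret collate-tls &&
    curl -sSI "https://$host/"   # -I fetches response headers only
}
```

A missing `collate-tls` secret usually means the certificate has not been issued yet, so the curl step is skipped until the earlier checks pass.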
Monitoring and Observability
Prometheus Integration
# Enable Prometheus metrics
metrics:
  enabled: true
  port: 9464
  path: /metrics
  serviceMonitor:
    enabled: true
    labels:
      prometheus: kube-prometheus
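Before relying on the ServiceMonitor, the metrics endpoint can be checked directly through a temporary port-forward. A minimal sketch, assuming the Deployment is named `collate` and its pods listen on the metrics port configured above; `check_metrics` is a hypothetical helper, and the 3-second wait is an arbitrary choice.

```shell
# Hypothetical helper: forward the metrics port locally and fetch the endpoint.
check_metrics() {
  kubectl port-forward deployment/collate 9464:9464 >/dev/null 2>&1 &
  pf_pid=$!
  sleep 3   # give the tunnel a moment to open
  curl -sf http://localhost:9464/metrics
  rc=$?
  kill "$pf_pid" 2>/dev/null || true   # always stop the port-forward
  return "$rc"
}
```

The function returns curl's exit status, so a non-zero result means the endpoint did not answer successfully.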
Health Checks
# Health check configuration
livenessProbe:
  httpGet:
    path: /health
    port: 8585
  initialDelaySeconds: 60
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /health
    port: 8585
  initialDelaySeconds: 60
  periodSeconds: 30
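The probe timings determine how quickly Kubernetes reacts to an unhealthy pod. The configuration does not set `failureThreshold`, so the Kubernetes default of 3 consecutive failures applies. A rough sketch of the resulting reaction time, with `failure_window` as a hypothetical helper name:

```shell
# Approximate seconds of continuous failure before the kubelet acts:
# periodSeconds * failureThreshold.
failure_window() {
  period="$1"
  threshold="${2:-3}"   # Kubernetes default when failureThreshold is unset
  echo $(( period * threshold ))
}

failure_window 30   # prints 90
```

With the values above, a pod that stops answering `/health` is restarted (liveness) or removed from service endpoints (readiness) after roughly 90 seconds, and no probe runs during the first 60 seconds after the container starts.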
Troubleshooting
Check pod logs:
kubectl logs -f deployment/collate
kubectl describe pod <pod-name>
Database connection issues
Verify database connectivity:
kubectl exec -it deployment/collate -- nc -zv mysql.default.svc.cluster.local 3306
Check ingress status:
kubectl get ingress
kubectl describe ingress collate