dbt Artifact Configuration

Using dbt Cloud? You don’t need storage configuration. Go directly to the dbt Cloud API guide for a simpler setup.
This guide is for dbt Core users. When using dbt Core, Collate needs access to dbt-generated artifacts (manifest.json, catalog.json, run_results.json) to extract metadata, lineage, and test results. Since dbt Core runs within your infrastructure, you must configure a storage method to make these artifacts accessible to Collate.

How It Works

With dbt Core, making the artifacts available to Collate takes three steps:
  1. Generate artifacts: Run dbt run, dbt test, and dbt docs generate to create manifest.json, catalog.json, and run_results.json
  2. Upload to storage: Configure your workflow to upload these files to S3, GCS, Azure, HTTP server, or shared filesystem
  3. Configure Collate: Point Collate to the storage location to pull and process the artifacts
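
Put together, a typical dbt Core run looks like the sketch below. S3 stands in for whichever storage method you pick in the next section; the bucket name and prefix are placeholders:

# 1. Generate the artifacts in target/
dbt run            # builds models, writes manifest.json
dbt test           # records test results in run_results.json
dbt docs generate  # writes catalog.json with column metadata

# 2. Upload only the JSON artifacts to storage
aws s3 sync target/ s3://your-dbt-artifacts/dbt/ --exclude "*" --include "*.json"

# 3. Collate then pulls and processes the artifacts from that location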

Storage Options for dbt Core

Choose the storage method that matches your infrastructure:

Quick Setup Summary

AWS S3

# Create bucket
aws s3 mb s3://your-dbt-artifacts

# Upload artifacts (manual)
aws s3 sync target/ s3://your-dbt-artifacts/dbt/ --exclude "*" --include "*.json"

# Or use automated Airflow DAG (see detailed guide)
View complete S3 guide →

Google Cloud Storage

# Create bucket
gsutil mb gs://your-dbt-artifacts

# Upload artifacts (manual)
gsutil -m cp target/*.json gs://your-dbt-artifacts/dbt/

# Or use Cloud Composer DAG (see detailed guide)
View complete GCS guide →

Azure Blob Storage

# Create storage account and container
az storage account create \
  --name yourdbtartifacts \
  --resource-group your-resource-group
az storage container create \
  --name dbt-artifacts \
  --account-name yourdbtartifacts

# Upload artifacts
az storage blob upload-batch \
  --account-name yourdbtartifacts \
  --destination dbt-artifacts \
  --source target/ \
  --pattern "*.json"
View complete Azure guide →

HTTP Server

# Upload via rsync to static file server
rsync -avz target/*.json user@server:/var/www/dbt/

# Or configure S3 + CloudFront for HTTPS access
View complete HTTP guide →

Local/Shared Filesystem

# Docker Compose volume mount example
volumes:
  - ./dbt/target:/dbt-artifacts

# Collate reads directly from /dbt-artifacts path
View complete Local guide →

Common Requirements Across All Methods

Regardless of which storage method you choose, you need:

1. Required dbt Artifacts

| File | Generated By | Purpose |
|------|--------------|---------|
| manifest.json | dbt run, dbt compile, dbt build | Required - model definitions, sources, lineage, tests |
| catalog.json | dbt docs generate | Recommended - column names, types, descriptions |
| run_results.json | dbt test, dbt run, dbt build | Optional - test execution results and timing |

2. dbt Command Sequence

Run these commands to generate all artifacts:
dbt run          # Execute models, generates manifest.json
dbt test         # Run tests, updates run_results.json
dbt docs generate # Generate catalog.json with column metadata
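
If you prefer a single command, dbt build runs models and tests together, covering the first two steps. Either way, it is worth confirming that all three files landed in the default target/ directory before uploading (the jq check assumes jq is installed):

# Confirm the artifacts exist and the manifest parses
ls -l target/manifest.json target/catalog.json target/run_results.json
jq '.metadata.dbt_version' target/manifest.json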

3. Collate Configuration

Once the artifacts are accessible, configure a Collate ingestion workflow that points at their storage location.
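
The exact YAML depends on the storage method you chose. As a rough sketch, a dbt ingestion workflow in the OpenMetadata-style YAML that Collate builds on might look like the following for the S3 option; field names such as dbtConfigSource, dbtBucketName, and dbtObjectPrefix are illustrative here, so confirm them against the Collate ingestion reference:

source:
  type: dbt
  serviceName: your-database-service     # the Collate service your dbt models map to
  sourceConfig:
    config:
      type: DBT
      dbtConfigSource:                   # illustrative S3 variant; other storage methods swap this block
        dbtConfigType: s3
        dbtSecurityConfig:
          awsRegion: us-east-1
        dbtPrefixConfig:
          dbtBucketName: your-dbt-artifacts
          dbtObjectPrefix: dbt
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: https://your-collate-host/api
    authProvider: openmetadata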

Troubleshooting Common Issues

| Issue | Possible Cause | Solution |
|-------|----------------|----------|
| Artifacts not found | Upload failed or wrong path | Verify the artifacts uploaded successfully to the correct location |
| Access denied | Insufficient permissions | Check IAM policies, service account permissions, or access keys |
| Stale metadata | Old artifacts | Ensure the upload happens after dbt completes; verify timestamps |
| Missing lineage | No compiled_code in manifest | Run dbt compile or dbt docs generate before upload |
| Missing column descriptions | No catalog.json | Ensure dbt docs generate runs and catalog.json is uploaded |
| Test results not showing | No run_results.json | Ensure dbt test runs and run_results.json is uploaded |
For storage-specific troubleshooting, see the individual guides.
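
For the manifest-related rows above, two quick jq probes against a local copy of the artifacts (paths are illustrative) can narrow down the cause:

# Stale metadata: check when the manifest was generated
jq '.metadata.generated_at' target/manifest.json

# Missing lineage: check whether any node carries compiled_code
jq '[.nodes[] | has("compiled_code")] | any' target/manifest.json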

Next Steps

  1. Choose your storage method using the quick setup summary above
  2. Follow the detailed guide for your chosen method
  3. Configure Collate ingestion after artifacts are accessible
  4. Set up scheduling to keep metadata synchronized
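
For step 4, any scheduler works as long as the upload runs after dbt finishes; the Airflow and Cloud Composer DAGs in the detailed guides are the more robust route, but a minimal cron sketch (schedule, paths, and bucket are placeholders) looks like this:

# Rebuild, document, and upload the artifacts every day at 06:00
0 6 * * * cd /opt/dbt/project && dbt build && dbt docs generate && aws s3 sync target/ s3://your-dbt-artifacts/dbt/ --exclude "*" --include "*.json"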
Questions? See the main dbt Overview or dbt Troubleshooting guide.