MLflow Connector | Collate ML Model Integration

This section provides guides and references for using the MLflow connector. You can configure and schedule MLflow metadata and profiler workflows from the Collate UI:

Requirements
Metadata Ingestion
Troubleshooting

Requirements

To extract metadata, OpenMetadata needs two elements:

Tracking URI: Address of the local or remote tracking server. See the MLflow documentation for more information.
Registry URI: Address of local or remote model registry server.

Metadata Ingestion

Connection Details

trackingUri: MLflow experiment tracking URI. E.g., http://localhost:5000
registryUri: MLflow model registry backend. E.g., mysql+pymysql://mlflow:password@localhost:3307/experiments
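
As a rough illustration of what these two fields contain, the sketch below stores the example values from above in a plain dictionary and splits each URI into its scheme, host, and port using the Python standard library. The `connection` dictionary and the `summarize_uri` helper are hypothetical names used only for this example; they are not part of the connector.

```python
from urllib.parse import urlparse

# Example values copied from the field descriptions above.
# The dictionary layout is an assumption made for illustration only.
connection = {
    "trackingUri": "http://localhost:5000",
    "registryUri": "mysql+pymysql://mlflow:password@localhost:3307/experiments",
}


def summarize_uri(uri: str) -> tuple:
    """Return (scheme, host, port) for a server-style URI.

    Hypothetical helper: it only checks that the URI has a scheme and a
    host, which is a quick sanity check before pasting it into the UI.
    """
    parsed = urlparse(uri)
    if not parsed.scheme or not parsed.hostname:
        raise ValueError(f"Not a server-style URI: {uri!r}")
    return parsed.scheme, parsed.hostname, parsed.port


for field, uri in connection.items():
    print(field, summarize_uri(uri))
```

A check like this only catches malformed values; it does not confirm that the tracking server or registry database is reachable, which is what Test Connection is for.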

Test the Connection

Once the credentials have been added, click on Test Connection and Save the changes.

Configure Metadata Ingestion

In this step, we will configure the metadata ingestion pipeline. Please follow the instructions below.

Metadata Ingestion Options

Name: The name of the ingestion pipeline. You can customize the name or use the generated one.

Mark Deleted ML Models (toggle): Set the Mark Deleted ML Models toggle to flag ML Models as soft-deleted if they are no longer present in the source system.
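
The idea behind this toggle can be sketched as a set comparison: any model that was ingested earlier but is no longer returned by the source gets flagged as deleted rather than removed. The function below is a minimal illustration under that assumption, not the connector's actual implementation; `flag_soft_deleted` and the model names are hypothetical.

```python
def flag_soft_deleted(previously_ingested: set, present_in_source: set) -> dict:
    """Map each previously ingested model name to a 'deleted' flag.

    Hypothetical helper: a model is flagged (True) when it was ingested
    before but no longer appears in the source system.
    """
    return {
        name: name not in present_in_source
        for name in sorted(previously_ingested)
    }


# Made-up model names for illustration.
flags = flag_soft_deleted(
    previously_ingested={"churn_predictor", "fraud_detector"},
    present_in_source={"churn_predictor"},
)
print(flags)
```

With the toggle off, the assumption in this sketch is that nothing would be flagged and stale entries would simply remain as they are.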

ML Model Filter Pattern (Optional): Controls whether an ML Model is included in metadata ingestion.

Include: Explicitly include ML Models by adding a list of comma-separated regular expressions to the Include field. OpenMetadata will include all ML Models with names matching one or more of the supplied regular expressions. All other ML Models will be excluded.
Exclude: Explicitly exclude ML Models by adding a list of comma-separated regular expressions to the Exclude field. OpenMetadata will exclude all ML Models with names matching one or more of the supplied regular expressions. All other ML Models will be included.
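
The Include and Exclude rules above can be sketched with Python's `re` module. This is an illustration only: the `is_included` helper is hypothetical, and the exact matching semantics (anchoring, case sensitivity, and how the two lists combine when both are supplied) are assumptions rather than documented connector behavior. Here, a name must match at least one include pattern, if any are given, and must not match any exclude pattern.

```python
import re
from typing import List, Optional


def is_included(
    name: str,
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> bool:
    """Decide whether an ML Model name passes the filter patterns.

    Assumption: with both lists present, include is checked first and
    exclude then removes any remaining matches.
    """
    if include and not any(re.search(pattern, name) for pattern in include):
        return False
    if exclude and any(re.search(pattern, name) for pattern in exclude):
        return False
    return True


# Made-up model names for illustration.
models = ["churn_predictor", "churn_predictor_test", "fraud_detector"]
kept = [
    m for m in models
    if is_included(m, include=["^churn_.*"], exclude=[".*_test$"])
]
print(kept)
```

Trying patterns against a list of known model names in this way can help confirm that a regular expression selects what you expect before saving it in the pipeline configuration.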

Enable Debug Log (toggle): Set the Enable Debug Log toggle to set the default log level to debug.

Schedule the Ingestion and Deploy

Scheduling can be set up at an hourly, daily, weekly, or manual cadence. The timezone is UTC. Select a Start Date to schedule the ingestion. Adding an End Date is optional.

Review your configuration settings. If they match what you intended, click Deploy to create the service and schedule metadata ingestion. If something doesn’t look right, click the Back button to return to the appropriate step and change the settings as needed.

View the Ingestion Pipeline

Once the workflow has been successfully deployed, you can view the Ingestion Pipeline running from the Service Page.

If AutoPilot is enabled, workflows like usage tracking, data lineage, and similar tasks will be handled automatically. Users don’t need to set up or manage them - AutoPilot takes care of everything in the system.

Troubleshooting

MLflow Troubleshooting

Learn more about how to troubleshoot common MLflow connector issues and resolve configuration or ingestion errors.
