Build a Connector
This design doc walks through developing a connector for OpenMetadata. Ingestion is a simple Python framework to ingest metadata from various sources. Please look at our framework APIs.
Workflow
A workflow is a simple orchestration job that runs the components in order. A workflow consists of a Source and a Sink. It also provides support for Stage and BulkSink. Workflow execution happens in a serial fashion:
- The workflow runs the source component first. The source retrieves a record from external sources and emits the record downstream.
- If the processor component is configured, the workflow sends the record to the processor next.
- There can be multiple processor components attached to the workflow. The workflow passes a record to each processor in the order they are configured.
- Once the processors are finished, the modified record is sent to the sink.
- The above steps are repeated for each record emitted from the source.
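The serial flow described above can be sketched in a few lines of plain Python. This is only an illustration of the ordering (source, then each processor in configured order, then sink), not the framework's actual Workflow class. The names run_workflow, sample_source, add_owner, and uppercase_name are hypothetical and chosen for this example.

```python
from typing import Callable, Iterable, List

Record = dict  # hypothetical record type used only in this sketch


def run_workflow(
    source: Iterable[Record],
    processors: List[Callable[[Record], Record]],
    sink: Callable[[Record], None],
) -> None:
    """Pass each record through the processors in order, then to the sink."""
    for record in source:  # the source emits one record at a time
        for process in processors:  # processors run in configured order
            record = process(record)
        sink(record)  # the sink receives the (possibly modified) record


def sample_source() -> Iterable[Record]:
    """Hypothetical source standing in for an external system."""
    yield {"name": "orders"}
    yield {"name": "customers"}


def add_owner(record: Record) -> Record:
    """Hypothetical processor: attach an owner to the record."""
    return {**record, "owner": "data-team"}


def uppercase_name(record: Record) -> Record:
    """Hypothetical processor: normalize the record name."""
    return {**record, "name": record["name"].upper()}


collected: List[Record] = []
run_workflow(sample_source(), [add_owner, uppercase_name], collected.append)
```

Here the sink is simply a list's append method, so collected ends up holding every record after both processors have run.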
Each Step comes from a generic definition that includes:
- create, to initialize the actual step.
- close, in case there’s any connection that needs to be terminated.
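As a rough illustration of what such a generic definition might look like, the sketch below uses Python's abc module. It is an assumption made for this example, not the framework's real Step class: the config argument, the InMemoryStep subclass, and the connected flag are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class Step(ABC):
    """Hypothetical generic step; not the framework's actual class."""

    @classmethod
    @abstractmethod
    def create(cls, config: Dict[str, Any]) -> "Step":
        """Initialize the actual step from its configuration."""

    @abstractmethod
    def close(self) -> None:
        """Terminate any connection the step holds open."""


class InMemoryStep(Step):
    """Hypothetical concrete step used only to exercise the interface."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connected = True  # stands in for a real connection

    @classmethod
    def create(cls, config: Dict[str, Any]) -> "InMemoryStep":
        return cls(name=config["name"])

    def close(self) -> None:
        self.connected = False


step = InMemoryStep.create({"name": "demo"})
step.close()
```

Making both methods abstract means a subclass that forgets to implement create or close cannot be instantiated, which surfaces the mistake early.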
Source
The Source is the connector to external systems. It outputs records for downstream components to process.
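A minimal sketch of a source, assuming the external system is just a list of table names held in memory. The class name TableNameSource, the emit_records method, and the record shape are hypothetical and do not reflect the framework's real Source API.

```python
from typing import Dict, Iterator, List


class TableNameSource:
    """Hypothetical source that emits one record per table name."""

    def __init__(self, tables: List[str]) -> None:
        self._tables = tables

    @classmethod
    def create(cls, config: Dict[str, List[str]]) -> "TableNameSource":
        # Assumption: the config carries the table names directly.
        return cls(tables=config["tables"])

    def emit_records(self) -> Iterator[Dict[str, str]]:
        # A generator lets downstream components pull records one at a time.
        for table in self._tables:
            yield {"type": "table", "name": table}

    def close(self) -> None:
        # Nothing to disconnect here; a real source would release its client.
        self._tables = []


source = TableNameSource.create({"tables": ["orders", "customers"]})
records = list(source.emit_records())
source.close()
```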
Sink
The Sink receives the records emitted by the source, one at a time.
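The one-at-a-time contract can be illustrated with a sink that writes each record as it arrives. This is a sketch under assumed names: ListSink and write_record are hypothetical, and a real sink would send each record to the metadata server rather than append it to a list.

```python
from typing import Dict, List


class ListSink:
    """Hypothetical sink that stores each record as it arrives."""

    def __init__(self) -> None:
        self.written: List[Dict[str, str]] = []
        self.closed = False

    def write_record(self, record: Dict[str, str]) -> None:
        # Called once per record; no batching happens in a plain sink.
        if self.closed:
            raise RuntimeError("sink is closed")
        self.written.append(record)

    def close(self) -> None:
        self.closed = True


sink = ListSink()
for emitted in ({"name": "orders"}, {"name": "customers"}):
    sink.write_record(emitted)
sink.close()
```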
Stage
It can be used to store the records or to aggregate the work done by a processor.
BulkSink
It can be used to bulk update the records generated in a workflow.
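One way the two components could work together is sketched below: a stage stores records in a JSON-lines file, and a bulk sink later reads the whole file and applies the updates in one batch. The file-based hand-off is an assumption made for this example, and FileStage, DictBulkSink, stage_record, and write_records are hypothetical names.

```python
import json
import os
import tempfile
from typing import Dict


class FileStage:
    """Hypothetical stage that stores records as JSON lines in a file."""

    def __init__(self, path: str) -> None:
        self._file = open(path, "w", encoding="utf-8")

    def stage_record(self, record: Dict[str, str]) -> None:
        self._file.write(json.dumps(record) + "\n")

    def close(self) -> None:
        self._file.close()


class DictBulkSink:
    """Hypothetical bulk sink that applies all staged records in one update."""

    def __init__(self, path: str, target: Dict[str, Dict[str, str]]) -> None:
        self._path = path
        self._target = target  # stands in for the metadata store

    def write_records(self) -> int:
        with open(self._path, encoding="utf-8") as staged:
            batch = [json.loads(line) for line in staged]
        # A single update call applies the whole batch at once.
        self._target.update({record["name"]: record for record in batch})
        return len(batch)


with tempfile.TemporaryDirectory() as tmp_dir:
    staged_path = os.path.join(tmp_dir, "staged.jsonl")
    stage = FileStage(staged_path)
    for name in ("orders", "customers"):
        stage.stage_record({"name": name})
    stage.close()  # flush to disk before the bulk sink reads the file

    catalog: Dict[str, Dict[str, str]] = {}
    updated = DictBulkSink(staged_path, catalog).write_records()
```

Closing the stage before the bulk sink runs matters in this sketch, since unflushed records would otherwise be missing from the batch.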