Overview of Data Profiler

The profiler in Collate helps you understand the shape of your data and quickly validate assumptions about it. As part of profiler ingestion, it also captures table usage statistics over a period of time. Data profiles let you check, for example, for null values in non-null columns or for duplicates in a unique column. The descriptive statistics provided give you a better understanding of column data distributions. Watch the video to understand Collate’s native Data Profiler and Data Quality tests.
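To make the kinds of checks described above concrete, here is a minimal standalone sketch in plain Python. It is not Collate's implementation or API: the function names `profile_column` and `check_assumptions`, the metric names, and the sample data are all hypothetical, chosen only to illustrate how a column profile can surface nulls in a non-null column or duplicates in a unique column.

```python
from collections import Counter
from statistics import mean, median


def profile_column(values):
    """Compute a few descriptive statistics for one column (hypothetical helper)."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    profile = {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        # Rows beyond the first occurrence of each repeated value.
        "duplicate_count": len(non_null) - len(counts),
    }
    # Add numeric statistics only when every non-null value is a number.
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(non_null):
        profile.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean(numeric),
            median=median(numeric),
        )
    return profile


def check_assumptions(profile, *, not_null=False, unique=False):
    """Return a list of violated assumptions for a column profile (hypothetical helper)."""
    problems = []
    if not_null and profile["null_count"] > 0:
        problems.append(f"{profile['null_count']} null value(s) in a non-null column")
    if unique and profile["duplicate_count"] > 0:
        problems.append(f"{profile['duplicate_count']} duplicate value(s) in a unique column")
    return problems


# Made-up sample data: an ID column that is assumed to be non-null and unique.
customer_ids = [101, 102, 102, None, 104]
profile = profile_column(customer_ids)
violations = check_assumptions(profile, not_null=True, unique=True)

print(profile)
print(violations)
```

In this sketch the profile is computed once and the assumptions are checked against it afterwards, which mirrors the general idea of collecting metrics during profiling and validating them separately.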

Profiler Tab

Get a complete picture of the Table Profile and Column Profile details.

Profiler Workflow

Configure and run the Profiler Workflow to extract Profiler data.

Metrics

Learn about the supported profiler metrics.

Auto PII Tagging

Automatically detect and tag columns containing sensitive PII data during profiling.

Spark Engine

Use distributed processing with Apache Spark for large-scale data profiling.