Pagination

The Collate API uses cursor-based pagination for list endpoints. This ensures consistent results even when data changes between requests.

Pagination Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| limit | integer | 10 | Number of results per page. Main list endpoints commonly allow up to 1,000,000; individual endpoints can define lower caps. |
| before | string | - | Cursor for the previous page |
| after | string | - | Cursor for the next page |
| offset | integer | 0 | Number of results to skip. Available on specific sub-resource endpoints, for example /v1/tables/{id}/columns. Not supported on main list endpoints; use before/after cursors instead. |
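As a rough illustration of how these parameters combine in a request, the sketch below builds a list-endpoint URL with the Python standard library. The helper name `build_list_url` is hypothetical, and the base URL is the placeholder host used elsewhere on this page. Rejecting calls that set both cursors is an assumption about sensible client behavior, not a documented server rule.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://your-company.getcollate.io/api"  # placeholder host


def build_list_url(
    resource: str,
    limit: int = 10,
    after: Optional[str] = None,
    before: Optional[str] = None,
) -> str:
    """Build a cursor-paginated list URL (hypothetical helper)."""
    if after is not None and before is not None:
        # Assumption: a single request moves in one direction only.
        raise ValueError("pass either 'after' or 'before', not both")
    params = {"limit": limit}
    if after is not None:
        params["after"] = after
    if before is not None:
        params["before"] = before
    return f"{BASE_URL}/v1/{resource}?{urlencode(params)}"


print(build_list_url("tables", limit=20))
print(build_list_url("tables", limit=20, after="abc123"))
```

The cursor value `abc123` is made up; real cursors are opaque strings taken from a previous response.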

Response Fields

The exact paging fields vary by endpoint and pagination mode:
| Field | Type | Description |
|---|---|---|
| total | integer | Total count of matching resources |
| before | string | Cursor for the previous page when available |
| after | string | Cursor for the next page when available |
| offset | integer | Current offset position when the endpoint returns offset-based paging |
| limit | integer | Page size when the endpoint returns offset-based paging |
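To make these fields concrete, here is a minimal sketch that reads the paging block of a list response into a small dataclass. The sample payload is invented for illustration, and the `data`/`paging` envelope is assumed from the field names used on this page. Every field is treated as optional because the exact set varies by endpoint.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Paging:
    total: Optional[int] = None
    before: Optional[str] = None
    after: Optional[str] = None
    offset: Optional[int] = None
    limit: Optional[int] = None

    @property
    def has_next(self) -> bool:
        # Cursor mode: a non-empty 'after' value means another page exists.
        return bool(self.after)


def parse_paging(response: Dict[str, Any]) -> Paging:
    """Extract the known paging fields, leaving absent ones as None."""
    raw = response.get("paging") or {}
    known = ("total", "before", "after", "offset", "limit")
    return Paging(**{key: raw.get(key) for key in known})


# Invented sample payload, not real API output.
sample = {"data": [{"name": "orders"}], "paging": {"total": 42, "after": "abc123"}}
paging = parse_paging(sample)
print(paging.total, paging.has_next)
```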

Examples

Basic Pagination

from metadata.ingestion.ometa.ometa_api import OpenMetadata
from metadata.generated.schema.entity.data.table import Table

# `config` is assumed to be an OpenMetadataConnection defined elsewhere
metadata = OpenMetadata(config)

# Get first page
first_page = metadata.list_entities(
    entity=Table,
    limit=20
)

print(f"Total tables: {first_page.paging.total}")
for table in first_page.entities:
    print(table.fullyQualifiedName)

# Get next page using after cursor
if first_page.paging.after:
    next_page = metadata.list_entities(
        entity=Table,
        limit=20,
        after=first_page.paging.after
    )
    for table in next_page.entities:
        print(table.fullyQualifiedName)

Offset-Based Pagination

Use offset on sub-resource endpoints such as /v1/tables/{id}/columns. Endpoint-specific defaults and caps apply; for example, the table columns endpoint returns 50 columns per request by default and at most 1,000.
TABLE_ID="00000000-0000-0000-0000-000000000000"

# Get first 20 columns
curl -X GET "https://your-company.getcollate.io/api/v1/tables/${TABLE_ID}/columns?limit=20&offset=0" \
  -H "Authorization: Bearer $TOKEN"

# Get next 20 columns
curl -X GET "https://your-company.getcollate.io/api/v1/tables/${TABLE_ID}/columns?limit=20&offset=20" \
  -H "Authorization: Bearer $TOKEN"
Some endpoints that accept offset still return paging.before and paging.after as previous and next page markers. Follow the response fields returned by the endpoint you are calling.
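The two curl calls generalize to a loop that advances offset by the page size. The sketch below is one way to write that loop in Python. `fetch_page` is a hypothetical callable standing in for the HTTP request, so the logic can be shown without a live server, and the fake column data is invented. The loop stops on a short or empty page.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = Dict[str, Any]


def iter_by_offset(
    fetch_page: Callable[[int, int], Page],
    limit: int = 20,
) -> Iterator[Any]:
    """Yield every record, advancing the offset by the page size."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        records: List[Any] = page.get("data", [])
        yield from records
        if len(records) < limit:
            # A short (or empty) page means there is nothing left.
            break
        offset += limit


# Stand-in for GET /v1/tables/{id}/columns, serving 45 fake columns.
COLUMNS = [f"col_{i}" for i in range(45)]


def fake_fetch(limit: int, offset: int) -> Page:
    return {"data": COLUMNS[offset : offset + limit]}


columns = list(iter_by_offset(fake_fetch, limit=20))
print(len(columns))
```

If an endpoint also returns paging.after as a next marker, checking that marker is an alternative stopping condition.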

Iterating Through All Results

from metadata.ingestion.ometa.ometa_api import OpenMetadata
from metadata.generated.schema.entity.data.table import Table

# `config` is assumed to be an OpenMetadataConnection defined elsewhere
metadata = OpenMetadata(config)

# Iterate through all tables
for table in metadata.list_all_entities(entity=Table, limit=100):
    print(table.fullyQualifiedName)
    # Process each table...
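For clients that call the REST endpoints directly rather than through the SDK, the same iteration can be sketched as a generator that follows paging.after until it is missing. `fetch_page` is a hypothetical stand-in for the HTTP call, and the fake pages and cursor values are invented for the demonstration.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_by_cursor(
    fetch_page: Callable[[Optional[str]], Page],
) -> Iterator[Any]:
    """Yield every record, following paging.after until it is absent."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("data", [])
        after = (page.get("paging") or {}).get("after")
        if not after:
            break


# Invented pages, keyed by the cursor that requests them.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"data": ["a", "b"], "paging": {"total": 5, "after": "c1"}},
    "c1": {"data": ["c", "d"], "paging": {"total": 5, "after": "c2", "before": "c0"}},
    "c2": {"data": ["e"], "paging": {"total": 5, "before": "c1"}},
}

names = list(iter_by_cursor(lambda cursor: FAKE_PAGES[cursor]))
print(names)
```

Because it is a generator, records are processed as each page arrives instead of being held in memory all at once.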

Filtering with Pagination

Combine pagination with filters for efficient data retrieval:
from metadata.ingestion.ometa.ometa_api import OpenMetadata
from metadata.generated.schema.entity.data.table import Table

# `config` is assumed to be an OpenMetadataConnection defined elsewhere
metadata = OpenMetadata(config)

# List tables from a specific database
tables = metadata.list_entities(
    entity=Table,
    params={"database": "prod.analytics"},
    limit=50
)

for table in tables.entities:
    print(table.fullyQualifiedName)

Include Fields

Control which fields are returned in the response using the fields parameter:
# Request specific fields
curl -X GET "https://your-company.getcollate.io/api/v1/tables?fields=owner,tags,columns&limit=20" \
  -H "Authorization: Bearer $TOKEN"
Common field options for tables:
  • owner - Include owner information
  • tags - Include tags and classifications
  • columns - Include column definitions
  • followers - Include followers
  • tableConstraints - Include constraints
  • usageSummary - Include usage statistics
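A small sketch of assembling the fields value in Python, using only the standard library. The helper name `with_fields` is hypothetical. `urlencode` is called with `safe=","` so the comma-separated list stays unescaped, matching the curl example.

```python
from typing import Iterable
from urllib.parse import urlencode


def with_fields(url: str, fields: Iterable[str], limit: int = 20) -> str:
    """Append a comma-separated fields list and a limit to a list URL."""
    query = urlencode({"fields": ",".join(fields), "limit": limit}, safe=",")
    return f"{url}?{query}"


url = with_fields(
    "https://your-company.getcollate.io/api/v1/tables",
    ["owner", "tags", "columns"],
)
print(url)
```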

Best Practices

1. Use reasonable page sizes

   Start with a limit between 50 and 100. Larger pages reduce the number of API calls but increase memory usage. Check the endpoint’s maximum limit before increasing the page size.

2. Use the pagination mode from the response

   For cursor-based list endpoints, use before and after sequentially. For endpoints that expose offset, advance by the page size or follow the next marker returned in paging.

3. Handle empty results

   Check if the data array is empty. For cursor-based pagination, stop when after is null. For offset-based requests, stop when the response returns fewer records than requested or no next marker is present.

4. Request only needed fields

   Use the fields parameter to reduce response size and improve performance.

5. Implement retry logic

   Handle transient errors gracefully when paginating through large datasets.
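The retry advice can be sketched as a small wrapper that retries a page fetch with exponential backoff. This is a sketch under stated assumptions: `TransientError` is a hypothetical exception that a client would raise for retryable failures such as timeouts, and the attempt count and delays are arbitrary defaults, not recommended values from the API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying."""


def with_retry(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on TransientError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")


# Demo: a fetch that fails twice, then succeeds.
# Delays are recorded in a list instead of actually sleeping.
calls = {"count": 0}
delays = []


def flaky_fetch() -> dict:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return {"data": ["orders"], "paging": {}}


page = with_retry(flaky_fetch, sleep=delays.append)
print(page["data"], delays)
```

Wrapping each page request this way lets a long pagination run resume from the current cursor or offset instead of starting over after a single failed call.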
For finding specific resources, consider using the Search API instead of paginating through all results:
# Search is faster for finding specific resources
curl -X GET "https://your-company.getcollate.io/api/v1/search/query?q=customers&index=table_search_index" \
  -H "Authorization: Bearer $TOKEN"

Search API

Learn about searching and filtering metadata