Distinct

What This Node Does

The Distinct node removes duplicate rows from your dataset, keeping only unique rows. It compares all columns by default or specific columns you choose, and supports first/last occurrence selection for deduplication.

[SCREENSHOT: Distinct node on canvas showing “10,000 → 8,543 rows (1,457 duplicates removed)”]
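To make the behavior concrete, here is a minimal sketch in plain Python of what a distinct operation with first/last occurrence selection could look like. It is not the node's actual implementation; the function name `distinct`, its `columns` and `keep` parameters, and the sample rows are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Sequence


def distinct(rows: Sequence[dict], columns: Optional[Iterable[str]] = None,
             keep: str = "first") -> list:
    """Hypothetical helper: return rows with duplicates removed.

    columns=None compares every column; otherwise only the named columns.
    keep="first" or keep="last" chooses which occurrence survives.
    """
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    cols = list(columns) if columns is not None else None
    kept = {}  # uniqueness key -> (position, row)
    for i, row in enumerate(rows):
        if cols is not None:
            key = tuple(row[c] for c in cols)
        else:
            key = tuple(sorted(row.items()))
        # "first": never overwrite; "last": always overwrite with the later row
        if keep == "last" or key not in kept:
            kept[key] = (i, row)
    # Return survivors in their original relative order
    return [row for _, row in sorted(kept.values(), key=lambda pair: pair[0])]


# Illustrative data only
rows = [
    {"id": 1, "email": "a@x.com", "plan": "free"},
    {"id": 2, "email": "b@x.com", "plan": "free"},
    {"id": 1, "email": "a@x.com", "plan": "free"},  # exact duplicate of row 1
    {"id": 3, "email": "a@x.com", "plan": "pro"},   # same email, different id
]

all_columns = distinct(rows)                          # drops only the exact duplicate
by_email_first = distinct(rows, ["email"])            # keeps ids 1 and 2
by_email_last = distinct(rows, ["email"], keep="last")  # keeps ids 2 and 3
```

Comparing on every column removes only the exact repeat, while comparing on `email` alone collapses all rows that share an address; `keep` decides which of those rows is retained.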

When to Use This Node

Use the Distinct node when you need to:

Remove duplicate records - Clean datasets with accidentally duplicated rows
Find unique values - Get a list of unique customers, products, or categories
Deduplicate before joins - Remove duplicates to prevent cartesian explosions in joins
Clean imported data - Remove duplicates from CSV uploads or sync connectors
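The join case is worth a small illustration. The sketch below uses plain Python with a hypothetical nested-loop `inner_join` and a hypothetical `distinct_on` helper (neither is part of the product) to show how one duplicated row on the lookup side multiplies the joined output.

```python
def inner_join(left, right, key):
    """Hypothetical inner join: one output row per matching (left, right) pair."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def distinct_on(rows, key):
    """Hypothetical helper: keep the first row seen for each value of `key`."""
    seen = set()
    out = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out


# Illustrative data only
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 1},
    {"order_id": 102, "customer_id": 2},
]
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 1, "name": "Ada"},  # accidental duplicate
    {"customer_id": 2, "name": "Grace"},
]

joined_raw = inner_join(orders, customers, "customer_id")
joined_clean = inner_join(orders, distinct_on(customers, "customer_id"), "customer_id")

print(len(joined_raw), len(joined_clean))
```

With the duplicate customer row in place, each order for customer 1 appears twice in the join; deduplicating the lookup table first restores one output row per order.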

Step-by-Step Usage Guide

Add Distinct node to canvas

Connect to upstream data

Choose All Columns or Specific Columns mode

All Columns: A row is a duplicate only if ALL columns match.

Specific Columns: A row is a duplicate if the selected columns match.

[SCREENSHOT: Distinct mode selection]

Select columns (if using Specific Columns mode)

Check the columns that define uniqueness.

[SCREENSHOT: Column selection for distinct]

Preview deduplicated results

Tips and Best Practices

Sort Before Distinct: To control which duplicate row is kept, sort first. Sort by date (DESC) → Distinct → Keeps most recent.
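The sort-then-distinct pattern can be sketched in plain Python as follows. This assumes the deduplication step keeps the first occurrence it encounters; the field names and records are invented for the example.

```python
from datetime import date

# Illustrative data only
records = [
    {"customer_id": 1, "status": "trial", "updated": date(2024, 1, 5)},
    {"customer_id": 2, "status": "paused", "updated": date(2024, 2, 1)},
    {"customer_id": 1, "status": "active", "updated": date(2024, 3, 9)},
    {"customer_id": 2, "status": "churned", "updated": date(2024, 1, 20)},
]

# Step 1: sort by date, newest first (the "Sort DESC" step)
newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)

# Step 2: distinct on customer_id, keeping the first row seen per customer
latest = {}
for row in newest_first:
    latest.setdefault(row["customer_id"], row)  # ignores later (older) rows
latest_rows = list(latest.values())

for row in latest_rows:
    print(row["customer_id"], row["status"], row["updated"])
```

Because the newest row for each customer comes first after sorting, the first-occurrence rule retains the most recent record.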

Specific Columns for Unique Values: To find unique values in one column, use Specific Columns mode with just that column selected.
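As a rough Python analogue, extracting the unique values of a single column might look like this. The `products` data is made up, and `dict.fromkeys` is used only because it preserves first-seen order.

```python
# Illustrative data only
products = [
    {"sku": "A1", "category": "tools"},
    {"sku": "B7", "category": "garden"},
    {"sku": "C3", "category": "tools"},
]

# Unique values of one column, in order of first appearance
unique_categories = list(dict.fromkeys(p["category"] for p in products))
print(unique_categories)
```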

All Columns for Exact Duplicates: Use All Columns mode to remove only exact duplicate rows (every column matches).

Distinct Before Aggregation: Remove duplicates before aggregation to ensure accurate COUNT results.
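A small sketch of why this matters, using plain Python and invented order events: counting rows per region before and after removing exact duplicates gives different totals.

```python
from collections import Counter

# Illustrative data only: (date, order, region)
events = [
    ("2024-05-01", "order-1", "EU"),
    ("2024-05-01", "order-2", "US"),
    ("2024-05-01", "order-2", "US"),  # duplicated on import
    ("2024-05-02", "order-3", "EU"),
    ("2024-05-02", "order-3", "EU"),  # duplicated on import
]

# COUNT per region on the raw rows (inflated by the duplicates)
count_raw = Counter(region for _, _, region in events)

# Remove exact duplicate rows first, then COUNT again
unique_events = list(dict.fromkeys(events))
count_clean = Counter(region for _, _, region in unique_events)

print(dict(count_raw), dict(count_clean))
```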

Check Duplicate Count: After running, check how many duplicates were removed. If the count is 0 when you expected duplicates, you may have selected the wrong columns.
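One common way to end up with a count of 0 is including a column that is unique per row, such as a row ID. The sketch below uses a hypothetical `duplicates_removed` helper and made-up data to show the effect of the column choice.

```python
def duplicates_removed(rows, columns):
    """Hypothetical helper: number of rows a distinct on `columns` would drop."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) - len(set(keys))


# Illustrative data only
rows = [
    {"row_id": 1, "email": "a@x.com"},
    {"row_id": 2, "email": "a@x.com"},
    {"row_id": 3, "email": "b@x.com"},
]

# row_id is unique per row, so including it hides the repeated email
removed_with_id = duplicates_removed(rows, ["row_id", "email"])

# Comparing on email alone exposes the duplicate
removed_by_email = duplicates_removed(rows, ["email"])

print(removed_with_id, removed_by_email)
```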

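Use with Select: Distinct on fewer columns is faster. Use Select before Distinct to keep only the columns you need. In All Columns mode, removing a column also changes which rows count as duplicates, so drop only columns that should not affect uniqueness.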
