What This Node Does
The Distinct node removes duplicate rows from your dataset so that only unique rows remain. By default it compares all columns; you can instead compare only the columns you choose. When duplicates are found, you can keep either the first or the last occurrence.
[SCREENSHOT: Distinct node on canvas showing “10,000 → 8,543 rows (1,457 duplicates removed)”]
When to Use This Node
Use the Distinct node when you need to:
- Remove duplicate records - Clean datasets with accidentally duplicated rows
- Find unique values - Get list of unique customers, products, or categories
- Deduplicate before joins - Remove duplicates so that joins do not multiply matching rows (a Cartesian explosion)
- Clean imported data - Remove duplicates from CSV uploads or sync connectors
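To illustrate why deduplicating before a join matters, here is a minimal sketch in plain Python. It is not the node's implementation. The sample data and the helper names `inner_join` and `distinct_all_columns` are hypothetical and exist only for this example.

```python
# Hypothetical sample data: customer 1 was accidentally imported twice.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 1, "name": "Ada"},    # accidental duplicate
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 1},
    {"order_id": 102, "customer_id": 2},
]


def inner_join(left, right, key):
    """Naive inner join: one output row per matching (left, right) pair."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def distinct_all_columns(rows):
    """Keep the first occurrence of each fully identical row.

    Assumes every cell value is hashable.
    """
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


joined_raw = inner_join(customers, orders, "customer_id")
joined_clean = inner_join(distinct_all_columns(customers), orders, "customer_id")

print(len(joined_raw))    # 5: both orders for customer 1 appear twice
print(len(joined_clean))  # 3: one row per order
```

In this sketch a single duplicated customer row turns three orders into five joined rows. Removing the duplicate first restores one row per order.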
Step-by-Step Usage Guide
1. Add the Distinct node to the canvas
2. Connect it to upstream data
3. Choose All Columns or Specific Columns mode
   - All Columns: A row is a duplicate only if ALL columns match
   - Specific Columns: A row is a duplicate if the selected columns match
   [SCREENSHOT: Distinct mode selection]
4. Select columns (if using Specific Columns mode)
   Check the columns that define uniqueness
   [SCREENSHOT: Column selection for distinct]
5. Preview the deduplicated results
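The two modes and the first/last choice can be sketched in plain Python as follows. This is an illustrative approximation under stated assumptions, not the node's actual code. The function `distinct`, its `columns` and `keep` parameters, and the sample rows are hypothetical. The sketch also assumes that every row has the same columns and that cell values are hashable.

```python
def distinct(rows, columns=None, keep="first"):
    """Deduplicate a list of dict rows.

    columns=None compares every column (All Columns mode); otherwise only
    the listed columns define uniqueness (Specific Columns mode).
    keep selects which occurrence survives: "first" or "last".
    Output order follows the first appearance of each unique key.
    """
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    kept = {}
    for row in rows:
        cols = columns if columns is not None else sorted(row)
        key = tuple(row[c] for c in cols)
        # "first": store a key only once; "last": later rows overwrite.
        if keep == "last" or key not in kept:
            kept[key] = row
    return list(kept.values())


# Hypothetical sample rows.
rows = [
    {"email": "a@example.com", "plan": "free", "updated": "2024-01-01"},
    {"email": "a@example.com", "plan": "pro", "updated": "2024-03-01"},
    {"email": "b@example.com", "plan": "free", "updated": "2024-02-01"},
    {"email": "b@example.com", "plan": "free", "updated": "2024-02-01"},
]

all_columns = distinct(rows)                              # All Columns mode
by_email_first = distinct(rows, columns=["email"])        # keep first occurrence
by_email_last = distinct(rows, columns=["email"], keep="last")

print(len(all_columns))                     # 3
print([r["plan"] for r in by_email_first])  # ['free', 'free']
print([r["plan"] for r in by_email_last])   # ['pro', 'free']
```

All Columns mode removes only the exact repeat of the last row. Comparing on `email` alone collapses each address to one row, and the first/last choice decides which version of that row survives.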

