What This Node Does
The Join node combines data from two datasets based on matching values in specified columns (join keys). Use it to enrich data by bringing in related information from another table—like adding customer details to orders or product information to transactions. [SCREENSHOT: Join node on canvas with two input connections showing “Joined 10,000 orders with 500 customers → 9,850 rows”]When to Use This Node
Use the Join node when you need to:- Enrich data - Add customer details to orders, product info to sales, category names to transactions
- Combine related tables - Merge orders with customers, transactions with accounts
- Lookup values - Add descriptions, labels, or metadata from reference tables
- Validate data - Verify records exist in reference table (inner join) or find missing matches (left join + filter)
Step-by-Step Usage Guide
1
Add Join node to canvas
2
Connect left input
Connect your primary dataset (e.g., orders) to the top input handle[SCREENSHOT: Orders node connected to top of Join node]
3
Connect right input
Connect your reference dataset (e.g., customers) to the bottom input handle[SCREENSHOT: Customers node connected to bottom of Join node]
4
Choose join type
Select Inner Join (only matches), Left Join (all left rows), Right Join, or Full Outer Join[SCREENSHOT: Join type dropdown with options]
5
Configure join keys
Select matching columns from left and right datasets. Add multiple key pairs for compound joins.[SCREENSHOT: Join key configuration showing left and right column selection]
6
Preview joined results
Tips and Best Practices
LEFT JOIN for Enrichment: When enriching your main dataset with lookup data, use LEFT JOIN to keep all main rows even if lookup fails.
Deduplicate Right Dataset: If the right dataset has duplicate join keys, deduplicate first with a Distinct node to avoid cartesian products and row explosion.
Check Row Counts: Note input row counts before joining. If output is 10x or 100x larger, you likely have a many-to-many relationship causing duplication.
Small Right = Fast: Keep the right dataset small by filtering and selecting only needed columns before joining. This dramatically improves performance.
Handle NULLs After Left Join: Use COALESCE in a Formula node after LEFT JOIN to replace NULLs with defaults:
COALESCE(customer_name, 'Unknown').Column Suffixes: When both datasets have columns with the same name, use the Suffix resolution option to add _left and _right suffixes for clarity.

