Join

What This Node Does

The Join node combines data from two datasets based on matching values in specified columns (join keys). Use it to enrich data by bringing in related information from another table—like adding customer details to orders or product information to transactions. [SCREENSHOT: Join node on canvas with two input connections showing “Joined 10,000 orders with 500 customers → 9,850 rows”]

When to Use This Node

Use the Join node when you need to:

Enrich data - Add customer details to orders, product info to sales, category names to transactions
Combine related tables - Merge orders with customers, transactions with accounts
Lookup values - Add descriptions, labels, or metadata from reference tables
Validate data - Verify records exist in reference table (inner join) or find missing matches (left join + filter)

Step-by-Step Usage Guide

Add Join node to canvas

Connect left input

Connect your primary dataset (e.g., orders) to the top input handle[SCREENSHOT: Orders node connected to top of Join node]

Connect right input

Connect your reference dataset (e.g., customers) to the bottom input handle[SCREENSHOT: Customers node connected to bottom of Join node]

Choose join type

Select Inner Join (only matches), Left Join (all left rows), Right Join, or Full Outer Join[SCREENSHOT: Join type dropdown with options]

Configure join keys

Select matching columns from left and right datasets. Add multiple key pairs for compound joins.[SCREENSHOT: Join key configuration showing left and right column selection]

Preview joined results

Tips and Best Practices

LEFT JOIN for Enrichment: When enriching your main dataset with lookup data, use LEFT JOIN to keep all main rows even if lookup fails.

Deduplicate Right Dataset: If the right dataset has duplicate join keys, deduplicate first with a Distinct node to avoid cartesian products and row explosion.

Check Row Counts: Note input row counts before joining. If output is 10x or 100x larger, you likely have a many-to-many relationship causing duplication.

Small Right = Fast: Keep the right dataset small by filtering and selecting only needed columns before joining. This dramatically improves performance.

Handle NULLs After Left Join: Use COALESCE in a Formula node after LEFT JOIN to replace NULLs with defaults: COALESCE(customer_name, 'Unknown').

Column Suffixes: When both datasets have columns with the same name, use the Suffix resolution option to add _left and _right suffixes for clarity.

Getting started

Centralize your data

Workflows

Dashboards

Collaboration

What This Node Does

When to Use This Node

Step-by-Step Usage Guide

Tips and Best Practices

Getting started

Centralize your data

Workflows

Dashboards

Collaboration

​What This Node Does

​When to Use This Node

​Step-by-Step Usage Guide

​Tips and Best Practices

What This Node Does

When to Use This Node

Step-by-Step Usage Guide

Tips and Best Practices