What This Node Does
The Formula node creates new columns based on calculations, transformations, and expressions using existing columns. Use it for mathematical operations, text manipulation, date calculations, conditional logic, and complex data transformations with SQL-like syntax.
[SCREENSHOT: Formula node on canvas showing “Added 3 calculated columns: total_price, full_name, discount_rate”]
When to Use This Node
Use the Formula node when you need to:
- Calculate derived values: total price = price × quantity, profit = revenue - cost, margin percentages
- Combine text columns: full name = first_name + " " + last_name, extract email domains
- Apply business logic: tiered discounts, status flags, conditional calculations
- Date calculations: days since order, age from birthdate, extract year/month/quarter
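As a rough illustration of the use cases above, the sketch below evaluates comparable expressions with Python's built-in sqlite3 module. SQLite is only a stand-in here: the Formula node's own function names and operators may differ (for example, SQLite concatenates text with || rather than +), and the orders table, its columns, the sample rows, the 100 discount threshold, and the fixed reference date are all hypothetical.

```python
import sqlite3

# In-memory table standing in for the upstream data (hypothetical columns and rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "first_name TEXT, last_name TEXT, email TEXT, "
    "price REAL, quantity INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Ada", "Lovelace", "ada@example.com", 10.0, 3, "2024-01-01"),
        ("Alan", "Turing", "alan@example.org", 250.0, 1, "2024-01-21"),
    ],
)

rows = conn.execute(
    """
    SELECT
        price * quantity                           AS total_price,      -- derived value
        first_name || ' ' || last_name             AS full_name,        -- combined text
        substr(email, instr(email, '@') + 1)       AS email_domain,     -- text extraction
        CASE WHEN price * quantity >= 100
             THEN 0.10 ELSE 0.0 END                AS discount_rate,    -- tiered business rule
        CAST(julianday('2024-01-31')
             - julianday(order_date) AS INTEGER)   AS days_since_order  -- date difference
    FROM orders
    ORDER BY rowid
    """
).fetchall()

for row in rows:
    print(row)
```

Each selected expression corresponds to one calculated column that would be defined in a Formula node; the fixed date '2024-01-31' keeps the date difference reproducible.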
Step-by-Step Usage Guide
1. Add Formula node to canvas
2. Connect to upstream data
3. Name your new column
   Enter a descriptive name for the calculated column (e.g., total_price, profit_margin).
   [SCREENSHOT: Column name input field]
4. Write your formula
   Use SQL-like syntax with column names, operators (+, -, *, /), and functions (SUM, CONCAT, IF, etc.).
   [SCREENSHOT: Formula editor with example “price * quantity”]
5. Test and preview
   Use the Test button to preview results on sample rows before running on the full dataset.
   [SCREENSHOT: Test preview showing calculated results]
Tips and Best Practices
Test with Preview: Always test formulas on sample rows with the Test button before running on the full dataset, so errors are caught early.
Handle NULLs: Use COALESCE or IFNULL to handle NULL values. Remember: any math with NULL returns NULL (5 + NULL = NULL).
Parentheses for Clarity: Use parentheses to make the order of operations explicit in complex formulas. (revenue - cost) / revenue * 100 computes a margin percentage, whereas revenue - cost / revenue * 100 divides cost by revenue first and gives a different result.
Column Name Brackets: If column names have spaces or special characters, use brackets: [Order Date], [Customer Name].
Divide by Zero Protection: Protect against division by zero with IF(quantity = 0, NULL, price / quantity) or price / NULLIF(quantity, 0).
Batch Multiple Formulas: Add multiple related calculations in one Formula node. It’s faster than using multiple separate Formula nodes.
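The NULL, parentheses, and divide-by-zero tips above can be checked with a small sketch. It again uses Python's built-in sqlite3 module as a stand-in, so it shows standard SQL behavior rather than the Formula node's exact dialect. The evaluate helper and the sample values are hypothetical, and the IF function is left out because its availability varies between SQL engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expression, **columns):
    """Evaluate one SQL expression against a single row of named column values."""
    # Build a one-row subquery such as: SELECT :price AS price, :quantity AS quantity
    row = ", ".join(f":{name} AS {name}" for name in columns)
    return conn.execute(f"SELECT {expression} FROM (SELECT {row})", columns).fetchone()[0]


# NULL propagates through arithmetic; COALESCE substitutes a default first.
null_sum = evaluate("5 + discount", discount=None)               # None (SQL NULL)
safe_sum = evaluate("5 + COALESCE(discount, 0)", discount=None)  # 5

# Parentheses change the result, not just the readability.
margin_pct = evaluate("(revenue - cost) / revenue * 100", revenue=200.0, cost=50.0)  # 75.0
unintended = evaluate("revenue - cost / revenue * 100", revenue=200.0, cost=50.0)    # 175.0

# NULLIF turns a zero divisor into NULL, so the division yields NULL.
unit_price = evaluate("price / NULLIF(quantity, 0)", price=9.0, quantity=3)  # 3.0
no_units = evaluate("price / NULLIF(quantity, 0)", price=9.0, quantity=0)    # None

print(null_sum, safe_sum, margin_pct, unintended, unit_price, no_units)
```

SQLite happens to return NULL for a division by zero even without the guard, while some other SQL engines raise an error instead, so the NULLIF form is the safer habit when the underlying engine is not known.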

