Advanced Join Techniques: Lateral Joins, Semi Joins, Anti Joins

# Introduction
INNER JOIN and LEFT JOIN handle most SQL queries. A small category of problems requires other join types: calling a set-returning function row by row, filtering rows by existence in another table, and returning rows that have no match in another table.
Three less common joins handle these cleanly. A LATERAL join allows a subquery in the FROM clause to reference columns from earlier items in the same FROM clause. Semi joins return rows where a match exists in another table, without duplicating those rows. Anti joins return rows when there is no match.
Let's explore how we can use these patterns in practice.

# LATERAL Joins
A LATERAL subquery in a FROM clause can refer to columns from preceding tables in the same FROM clause. Without LATERAL, each FROM subquery is evaluated independently and cannot see those columns.
This matters most when calling a set-returning function (one that returns multiple rows per input). Such a function can be called in a SELECT list, but applying it row by row to a column from another table inside the FROM clause relies on LATERAL. PostgreSQL treats the keyword as optional for function calls in FROM, though writing it makes the dependency explicit.
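As a minimal sketch of the row-by-row behavior, the snippet below expands an array column with `unnest()`. The `articles` table and its columns are hypothetical and exist only for illustration; they are not part of the examples in this article.

```sql
-- Hypothetical table, created only for this illustration.
CREATE TEMP TABLE articles (
    article_id int,
    tags       text[]
);

INSERT INTO articles VALUES
    (1, ARRAY['sql', 'joins']),
    (2, ARRAY['postgres']);

-- unnest() references a.tags, so it is evaluated once per row of articles
-- and yields one output row per array element.
SELECT a.article_id, t.tag
FROM articles a,
     LATERAL unnest(a.tags) AS t(tag)
ORDER BY a.article_id, t.tag;
```

The query should return three rows, one for each array element, each paired with the `article_id` it came from.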
Common situations:
- Calling `unnest()` on an array column to get one row for each element of the array
- Calling `regexp_matches()` with the `'g'` flag to extract all matches in each row
- Computing a top-N-per-group result with a correlated FROM subquery
- Expanding a JSON array into one row per element
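The top-N-per-group case can be sketched as follows. This borrows the online_store_customers and online_store_orders tables from the semi join example later in this article; the choice of N = 2 and the alias `top_orders` are assumptions made for illustration.

```sql
-- Sketch: the two largest orders for each customer.
SELECT
    c.customer_id,
    c.customer_name,
    top_orders.order_id,
    top_orders.amount
FROM online_store_customers c
CROSS JOIN LATERAL (
    SELECT o.order_id, o.amount
    FROM online_store_orders o
    WHERE o.customer_id = c.customer_id  -- refers to the preceding FROM item
    ORDER BY o.amount DESC
    LIMIT 2                              -- N = 2 is an arbitrary choice
) AS top_orders;
```

With CROSS JOIN LATERAL, customers who have no orders drop out of the result. Writing `LEFT JOIN LATERAL (...) AS top_orders ON true` instead keeps them, with NULLs in the order columns.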
// Example: Counting Word Occurrences
This Google interview question asks us to count how many times the words “bull” and “bear” appear in the contents column. Matching should be case-insensitive, and strings like bullish or bearing should be excluded.
The data: The google_file_store table:
| filename | contents |
|---|---|
| draft1.txt | The stock exchange predicts a bull market that will make many investors happy. |
| draft2.txt | The stock exchange predicts a bull market … but analysts warn … we expect a bear market. |
| last.txt | The stock exchange predicts a bull market … a bear market. As always predicting the future market is uncertain… |
The code: regexp_matches() with the 'g' flag returns one row per match. Placing it in the FROM clause runs it once per row of google_file_store and enumerates every match in the whole table. The `\m` and `\M` anchors are PostgreSQL word boundaries (start of word and end of word), which exclude “bullish” and “bearing”.
```sql
-- One output row per regex match; COUNT(*) totals them across the table.
SELECT 'bull' AS word,
       COUNT(*) AS nentry
FROM google_file_store,
     LATERAL regexp_matches(LOWER(contents), '\m(bull)\M', 'g')
UNION ALL
SELECT 'bear' AS word,
       COUNT(*) AS nentry
FROM google_file_store,
     LATERAL regexp_matches(LOWER(contents), '\m(bear)\M', 'g');
```
// Output
| word | nentry |
|---|---|
| bull | 3 |
| bear | 2 |
# Semi Joins
A semi join returns rows from the left table where at least one match exists in the right table, with each left-table row appearing at most once. INNER JOIN duplicates left-table rows where the right side has multiple matches. Semi joins do not.
Two SQL implementations:
- WHERE EXISTS (SELECT 1 FROM ...)
- WHERE col IN (SELECT col FROM ...)

EXISTS is the more common form because it handles join conditions on multiple columns and correlated subqueries without rewriting the query.
// Example: Finding High Value Customers
This question asks us to find customers who have placed at least one order over $100 and return their customer ID and name.
The data: Preview of online_store_customers and online_store_orders:
| customer_id | customer_name |
|---|---|
| 1 | Alice Johnson |
| 2 | Bob Smith |
| 3 | Carol Williams |
| … | … |
| 10 | Jack Anderson |

| order_id | customer_id | amount | status |
|---|---|---|---|
| 101 | 1 | 150 | paid |
| 102 | 1 | 200 | paid |
| 103 | 1 | 75 | paid |
| … | … | … | … |
| 115 | 9 | 450 | paid |
The code: The EXISTS subquery checks, for each customer, whether an order over $100 exists. SELECT 1 is the convention because EXISTS only cares whether a row is returned, not what it contains.
```sql
SELECT
    c.customer_id,
    c.customer_name
FROM online_store_customers c
WHERE EXISTS (
    SELECT 1
    FROM online_store_orders o
    WHERE o.customer_id = c.customer_id
      AND o.amount > 100
);
```
If we used an INNER JOIN instead, customer 1 would appear twice in the result because two of their orders qualify. EXISTS returns customer 1 once.
// Output
| customer_id | customer_name |
|---|---|
| 1 | Alice Johnson |
| 2 | Bob Smith |
| 3 | Carol Williams |
| … | … |
| 9 | Ivy Taylor |
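For comparison, here is a sketch of the same question written with IN and with an INNER JOIN. It assumes the same two tables and that customer_id is unique in online_store_customers; under that assumption both queries should return the same customers as the EXISTS version.

```sql
-- Semi join written with IN.
SELECT c.customer_id, c.customer_name
FROM online_store_customers c
WHERE c.customer_id IN (
    SELECT o.customer_id
    FROM online_store_orders o
    WHERE o.amount > 100
);

-- INNER JOIN version: produces one row per qualifying order,
-- so DISTINCT is needed to collapse the duplicates.
SELECT DISTINCT c.customer_id, c.customer_name
FROM online_store_customers c
JOIN online_store_orders o
  ON o.customer_id = c.customer_id
WHERE o.amount > 100;
```

The DISTINCT in the second query is extra work that the semi join forms avoid, because they stop caring about an order once one qualifying match is found.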
# Anti Joins
An anti join returns rows from the left table where there is no match in the right table. It's the opposite of a semi join.
Two SQL implementations:
- LEFT JOIN ... WHERE right_table.col IS NULL
- WHERE NOT EXISTS (SELECT 1 FROM ...)

Both produce the same result. NOT EXISTS generally produces a better query plan in modern versions of PostgreSQL and is more readable. The LEFT JOIN + IS NULL pattern is older and useful when you need columns from the right table to find the rows that don't match.
// Example: Free Users Without April Calls
This question asks us to return free users who did not make any calls in April 2020.
The data: Preview of rc_calls and rc_users:
| user_id | call_id | call_date |
|---|---|---|
| 1218 | 0 | 2020-04-19 01:06:00 |
| 1554 | 1 | 2020-03-01 16:51:00 |
| 1857 | 2 | 2020-03-29 07:06:00 |
| 1525 | 3 | 2020-03-07 02:01:00 |
| … | … | … |
| 1910 | 39 | 2020-03-11 08:33:00 |

| user_id | status | company_id |
|---|---|---|
| 1218 | free | 1 |
| 1554 | inactive | 1 |
| 1857 | free | 2 |
| … | … | … |
| 1884 | free | 1 |
The code: The date filter lives in the ON clause, not the WHERE clause. That difference is what makes this an anti join. Putting the date filter in WHERE would drop the rows where the LEFT JOIN produced NULLs, effectively turning it back into an INNER JOIN. With the filter in ON, free users without a qualifying April call still generate a row, with NULLs on the right side, and the IS NULL check keeps only those rows.
```sql
SELECT DISTINCT u.user_id
FROM rc_users u
LEFT JOIN rc_calls c
  ON u.user_id = c.user_id
  -- Half-open range: call_date is a timestamp, so BETWEEN ... '2020-04-30'
  -- would miss calls made after midnight on April 30.
  AND c.call_date >= '2020-04-01'
  AND c.call_date <  '2020-05-01'
WHERE u.status = 'free'
  AND c.user_id IS NULL;
```
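The same anti join can be sketched with NOT EXISTS, followed by the WHERE-filter variant that the explanation above warns against. Both assume the rc_users and rc_calls tables shown in the preview, that user_id is unique in rc_users, and that call_date is a timestamp, which is why a half-open date range is used to cover all of April 30.

```sql
-- Anti join with NOT EXISTS: free users with no call in April 2020.
SELECT u.user_id
FROM rc_users u
WHERE u.status = 'free'
  AND NOT EXISTS (
      SELECT 1
      FROM rc_calls c
      WHERE c.user_id = u.user_id
        AND c.call_date >= '2020-04-01'
        AND c.call_date <  '2020-05-01'
  );

-- For contrast: moving the date filter into WHERE breaks the anti join.
-- Whenever c.user_id IS NULL, c.call_date is NULL as well, so the date
-- predicate can never be true and this query returns no rows.
SELECT DISTINCT u.user_id
FROM rc_users u
LEFT JOIN rc_calls c
  ON u.user_id = c.user_id
WHERE u.status = 'free'
  AND c.call_date >= '2020-04-01'
  AND c.call_date <  '2020-05-01'
  AND c.user_id IS NULL;
```

The NOT EXISTS form needs no DISTINCT, because each user row is tested once and either kept or discarded.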
// Output
# Conclusion
These three joins solve situations where INNER JOIN and LEFT JOIN are awkward or incorrect:
- LATERAL lets you call a set-returning function row by row within FROM.
- EXISTS gives you “matched rows” without the duplication caused by INNER JOIN.
- NOT EXISTS or LEFT JOIN + IS NULL gives you “unmatched rows” cleanly.

The pattern to remember is short. If an INNER JOIN repeats rows you don't want, use EXISTS. If you need rows that do not match, use NOT EXISTS or LEFT JOIN + IS NULL. If a FROM subquery needs to reference columns from an earlier table, add LATERAL.
Practice these on real SQL interview questions and the syntax becomes automatic.
Nate Rosidi is a data scientist and product strategist. He is also an adjunct professor of statistics, and the founder of StrataScratch, a platform that helps data scientists prepare for their interviews with real interview questions from top companies. Nate writes about the latest trends in the job market, provides interview advice, shares data science projects, and covers all things SQL.



