SQL SELF JOIN
What is SQL SELF JOIN?
A SQL SELF JOIN is a type of join operation where a table is joined with itself. It allows you to combine data from a single table by creating a virtual copy of the table and establishing relationships between the original and virtual tables. Self joins are used to compare or combine data within the same table, often by creating relationships between rows within the table.
When you would use it
You would use a SQL SELF JOIN when you want to perform comparisons or retrieve data from a single table that has some form of hierarchy, relationships, or dependencies. Common use cases for SELF JOIN include:
-
Hierarchical data: When dealing with hierarchical structures, such as organizational charts or category trees, to retrieve parent-child relationships.
-
Versioning or tracking changes: For tracking changes in data, like version history or order tracking, by comparing records in the same table.
-
Recursive queries: To perform recursive operations, like finding all descendants or ancestors of a particular record.
-
Network or graph data: For working with data where nodes or entities have connections to other nodes in the same table.
Syntax
The syntax for a SQL SELF JOIN operation is as follows:
SELECT t1.column, t2.column
FROM table AS t1
JOIN table AS t2 ON t1.related_column = t2.related_column;
t1
andt2
: Aliases for the same table to differentiate between the original and virtual copies.related_column
: The column(s) used to establish relationships within the same table.
Parameter values
t1
andt2
: Aliases for the same table.related_column
: The column(s) used to create relationships or identify dependencies within the same table.
Example query
Suppose you have an "employees" table with a hierarchical structure, and you want to retrieve a list of employees and their respective managers. You can use a SQL SELF JOIN like this:
SELECT e.employee_id, e.employee_name, m.employee_name AS manager_name
FROM employees AS e
LEFT JOIN employees AS m ON e.manager_id = m.employee_id;
Example table response
Assuming the "employees" table contains the following data:
| employee_id | employee_name | manager_id |
| ----------- | ------------- | ---------- |
| 1 | John Smith | NULL |
| 2 | Mary Johnson | 1 |
| 3 | Sam Brown | 1 |
| 4 | Alice White | 2 |
The query mentioned earlier would return the following result:
| employee_id | employee_name | manager_name |
| ----------- | ------------- | ------------ |
| 1 | John Smith | NULL |
| 2 | Mary Johnson | John Smith |
| 3 | Sam Brown | John Smith |
| 4 | Alice White | Mary Johnson |
This result shows employees and their respective managers. Employees with no managers (like John Smith) have NULL values in the "manager_name" column.
Use cases
- Hierarchical data: Managing organizational charts, category hierarchies, and parent-child relationships.
- Versioning and change tracking: Tracking changes and version histories in a table.
- Recursive queries: Performing operations that require traversing a hierarchy or network structure.
- Network or graph data: Analyzing relationships between entities or nodes within a table.
SQL languages this is available for
SQL SELF JOIN is a standard SQL feature and is available in most relational database management systems (RDBMS), including but not limited to:
- MySQL
- PostgreSQL
- Oracle Database
- SQL Server
- SQLite
- IBM Db2
- MariaDB
The syntax and behavior of SELF JOIN are consistent across these database systems.