SQL: hierarchical recursive query

This post is merely a self reminder. Using hierarchical data in a SQL database is very common. It’s complexity clearly points that the management of hierarchical data is not what a relational database is intended for. The tables of a relational database are not hierarchical, but are simply a flat list. But practical usage of parent-child relationship data structure is quite useful.

I’ve used MySQL to do this. Any other RDBMS would also follow the same kind of approach.

Let’s assume I’ve regional data in a table where data types are Global-region, Region and Country. There is parent id in the table to maintain the relations; parent id of global regions are NULL. Let’s have a sample data set to understand the relations.

+-----+-----------+---------------------------+------+---------------+
| id  | parent_id | name                      | code | type          |
+-----+-----------+---------------------------+------+---------------+
|  14 |      NULL | Greater Americas          | zxc  | global_region |
|  15 |      NULL | Greater Asia              | zxs  | global_region |
|  18 |      NULL | Unclassified              | zxu  | global_region |
|  24 |        14 | North America             | xza  | region        |
|  25 |        14 | Central America           | xzb  | region        |
|  45 |        15 | Asia                      | xzs  | region        |
|  46 |        45 | Hong Kong                 | hn   | country       |
|  49 |        45 | China                     | cn   | country       |
|  50 |        42 | New Zealand               | nz   | country       |
|  51 |        42 | Norfolk Island            | nf   | country       |
|  52 |        45 | Bangladesh                | bd   | country       |
|  53 |        45 | India                     | in   | country       |
|  54 |        45 | Nepal                     | np   | country       |
+-----+-----------+---------------------------+------+---------------+

Let’s assume I’ve another simple table to store exchange data; where region_id is the foreign key-

+-----------+-----------------------------+------+
| region_id | name                        | code |
+-----------+-----------------------------+------+
|        60 | New York Stock Exchange     | NYQ  |
|        60 | OTC Markets Group           | OTC  |
|        49 | Shanghai Stock Exchange     | SHH  |
|        53 | Bombay Stock Exchange       | BSE  |
|        52 | Dhaka Stock Exchange        | DSE  |
+-----------+-----------------------------+------+

Now for the sake of simplicity let’s have a simple query- Get all exchange codes mapped to global regions, along with its respective regions etc.


SELECT r1.name global_region, r2.name region, r3.name country, e.name exchange, e.code code FROM region r1
LEFT JOIN region r2 ON r2.parent_id = r1.id
LEFT JOIN region r3 ON r3.parent_id = r2.id
INNER JOIN exchange e ON e.region_id = r3.id
WHERE r1.parent_id IS NULL
ORDER BY global_region, region, country;

Output is something like-

+------------------+---------------+---------------+--------------------------+------+
| global_region    | region        | country       | exchange                 | code |
+------------------+---------------+---------------+--------------------------+------+
| Greater Americas | North America | United States | New York Stock Exchange  | NYQ  |
| Greater Americas | North America | United States | OTC Markets Group        | OTC  |
| Greater Asia     | Asia          | Bangladesh    | Dhaka Stock Exchange     | DSE  |
| Greater Asia     | Asia          | China         | Shanghai Stock Exchange  | SHH  |
| Greater Asia     | Asia          | India         | Bombay Stock Exchange    | BSE  |
+------------------+---------------+---------------+--------------------------+------+

To get complete data set in csv-

SELECT r1.name global_region, r2.name region, r3.name country, e.name exchange, e.code code FROM region r1
LEFT JOIN region r2 ON r2.parent_id = r1.id
LEFT JOIN region r3 ON r3.parent_id = r2.id
INNER JOIN exchange e ON e.region_id = r3.id
WHERE r1.parent_id IS NULL
ORDER BY global_region, region, country
INTO OUTFILE '/tmp/region_exchange.csv'
FIELDS ENCLOSED BY '"' TERMINATED BY ';' ESCAPED BY '' LINES TERMINATED BY '\n';

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s