Modeling a Tree of Data in Rails

By Robert Boone | Mar 07, 2014

When writing web applications, the abstractions you use to model the data are very important. Most of the time a simple relational table structure will do the job you need without much of a problem but, there are times where you have to do certain programming tasks that need to relate to rows in its own table. The most common example of this type of problem is a threaded discussion, where you need to know all the children of a node or the parent of any particular node. Luckily, this is a problem that can be easily solved and you simply have to choose the best option for your data.

Approaches


Adjacency List

The Adjacency List is the most basic and popular type of tree structure. It adds one column to the database to store the parent of that row usually called parent_id.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ID  Post                          Parent ID
─────────────────────────────────────────────
  1  Hello, This is my first post
  2  Welcome to the forum                  1
  3  I second that. Welcome                1
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Gems that use this model

acts_as_tree

Nested Sets

Nested Sets are the next most popular and are a little more complex than the Adjacency List. An example table would look like this:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ID  Post                          LEFT  RIGHT
────────────────────────────────────────────
  1  Hello, This is my first post    1    8
  2  Welcome to the forum            2    4
  3  I second that. Welcome          5    7
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

The LEFT and RIGHT columns define the boundaries for that row’s children. So for ROW ID 1, anything between IDs 1 through 8 are its children. This is more efficient than the Adjacency List for finding all the children because it only takes one query. When the Adjacency List needs to make a query to find all the children, it must make a query at every node to see if it has any children. The downside is that inserting a node can be slow especially as the database gets larger.

Gems that use this model

Awesome Nested Set

Path Enumeration

Path Enumeration brings back the simplicity of a single column to represent the tree without the need to do as many queries to find all the children.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ID  Post                          PATH
────────────────────────────────────────
  1  Hello, This is my first post     1
  2  Welcome to the forum           1.2
  3  I second that. Welcome         1.3
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Path Enumeration can use a simple SQL LIKE statement to determine children. For example: LIKE '1%'. The query result would return every path that began with the ID one. If ROW ID 2 had children it would be something like this: LIKE '1.2%'. The PATH column is sort of like a file path that always shows the ancestors for that row. In databases without the direct support for Path Enumeration, the max size of a string can be a problem for deep hierarchies. Postgresql has support for this type of tree structure with a column type of ltree.

Gems that use this model

Ancestry

Closure Table

So far all the methods of storing the relationships between the parent and child nodes have been in the same table. The Closure Table uses a second table to store this data.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ID  Post
──────────────────────────────────
  1  Hello, This is my first post
  2  Welcome to the forum
  3  I second that. Welcome
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
━━━━━━━━━━━━━━━━━━━━━━━━━━
 ID  ANCESTOR  DESCENDANT
──────────────────────────
  1         1           2
  2         1           3
━━━━━━━━━━━━━━━━━━━━━━━━━━

This method uses the strengths of a relational database to store the tree. While there is some redundancy in the ancestor column, this method seems very flexible and efficient.

Gems that use this model

Closure Tree

Adjacency List with recursive query

This last type of tree structure is a return to the beginning. This is the same Adjacency List we talked about before, but with database support. Instead of having to run queries for each of the nodes to check for children, the database can do this for you making things much more efficient. Not all databases have this ability, but postgresql is one of them.

Gems that use this model

acts_as_sane_tree

Wrap up

As you can see there are many ways to model a tree in a database. Each has its own strong points and weaknesses. It’s up to the developer to see which is best for their application.

building_mobile_teams

Get in touch

Marketo Form