Trees in MySQL 5.0

by chromatic

Related link: http://jan.kneschke.de/projects/mysql/#3



I've seen two other ways of storing and retrieving hierarchical data structures into relational tables. One of them is using multiple queries to retrieve children, recursing in the application. Another way is to add metadata to each row to allow value comparisons when retrieving children.



The problem with the first approach is that it requires multiple trips to the database. The problem with the second approach is the bookkeeping required to modify the metadata when you add a child to the left of other children. (Think of inserting into a b-tree, for example. I suspect there's an algorithmic improvement lurking somewhere in the choice of numbers used in the metadata so as to reduce the need to rebalance frequently.)



The technique Jan demonstrated defines two stored procedures and calls one recursively to handle fetching children of the current node. That's one SQL query, which is nice, and it avoids the need to insert and adjust metadata on table writes.



That seems like a nice solution to me.



What do you think?


3 Comments

adamsj
2005-04-19 09:04:16
As a programmer, I like it
As an administrator, I'd want to compare its performance to that of making multiple trips.
chromatic
2005-04-19 09:17:43
As a programmer, I like it
The other session I attended, about performance and tuning, made the point that benchmarking is tricky. Sure, there's less network traffic, which is nice, but there are things to consider like holding read locks for longer (or is it longer?) and the amount of space taken up by the temporary table. (If it grows too large, it needs to write to the disk.)


Still, I look forward to trying it.

adamsj
2005-04-19 10:07:26
Yeah, it's a nice hack
Typically, when I've had to do this, I was working with the data dictionary, which means just one pull from the database. Then I've used Class::Struct to build a simple tree. That puts the simple processing on the desktop or the app server, not on the database. (Maybe it's just that I've typically worked with systems that were pretty well redlined, but I always want to take work off the RDBMS.)


If you're doing something else, the question of multiple pulls becomes more important.


I guess that's just a long-winded way of saying I want to try it, too.