Graying MySQL, and MySQL learns a second language (early conference report)


Andy Oram
Apr. 19, 2005 06:27 AM

URL: http://www.mysqluc.com/...

MySQL is graying in several metaphorical ways. Of course, it is simply getting older--aren't we all? But it is by no means over the hill. More significantly, its adherents are getting less colorful and reflect instead the grayness of the corporate settings it is conquering. Finally, MySQL is graying the distinctions that separated it from Oracle and other heavy-duty database engines. MySQL, in short, is becoming conventional.

The early achievements of this disruptive technology were to bring a high-performance relational database down from the top shelf where only those of means could afford it, and put it in the hands of students, entrepreneurs working out of their homes, and modest web site developers. This was a revolution dubbed "situated software" by Clay Shirky. Although MySQL was already being used by sites that could afford more expensive databases (and the computer systems and expert administrators who came in tow), these did not drive its initial popularity.

Now MySQL AB has built a formidable marketing machine and carried their product into the database mainstream, following a path similar to Linux. Their trappings are starting to evince familiar themes. They have salespeople in at least a dozen cities around North America. Their new support and update mechanism, MySQL Network, reminds me of a similarly named support system from Red Hat. MySQL's development of an online FAQ called a Knowledge Base, and the slogan "MySQL Everywhere" plastered all around this conference, are reminiscent of another large software vendor.

But MySQL AB has not forgotten the little guys who want a DBMS that runs lean and fast, with near-zero administration. These users will probably continue to be its largest base. Significantly, under the conventional trappings I mentioned, I believe MySQL AB is still structured in a fundamentally different way from a conventional proprietary vendor, and is still behaving like a network of brilliant independent software developers. They have always listened closely to their users--you can see that at their conferences, where dozens of developers turn up in distinctive shirts and attract flocks of petitioners for new features--but they now are listening to paying customers in the same intense, investigative manner.

For instance, I saw one of their leading engineers walk around an evening reception recruiting representatives from international customers to sit in on a session about internationalization, just so he could hear their perspective on some problems that other customers had reported to him.

It takes a certain financial and time commitment to attend a conference, so for those who pony up the money to do so, the theme at this one is "bigger and better." Sessions on Java interfaces, clustering, scaling, high availability, and replication decorate the calendar for the next few days. One panel is even called "Challenges in the Enterprise."

And what are the newest features MySQL is pushing hardest? There are no breakthroughs here (and I wouldn't expect any, because relational databases are a mature area in a research sense). The announcements focus on things that competitors have had for years: stored procedures, triggers, views. MySQL is not leading the conquest of new territories. Rather, MySQL is catching up. That's something they're proud of, and rightfully so.

I attended one session last night on a feature that will implement a tiny snippet of the SQL standard, XPath support. In effect, MySQL, which has always understood the SQL language, is learning a second language--not a natural language (although MySQL offers increasing support for character sets and other internationalization features) but the complex world of XPath.

I find this feature an odd way to support XML. Most XML users carry out XML/database interaction by using Java or some other programming tool to break down the XML into constituent pieces of text and store those pieces in a database structure that mirrors the XML. But SQL's XPath support buries the XML without alteration in a field in the database.
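To make the contrast concrete, here is a minimal sketch of the two storage approaches. It uses Python's standard sqlite3 and xml.etree.ElementTree modules as stand-ins rather than MySQL or Java, and the table names (node, doc) and the shred helper are hypothetical, chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = "<p>Why do <em>you</em> want to represent <em>structured text</em>?</p>"

conn = sqlite3.connect(":memory:")

# Approach 1: break the XML into rows that mirror its structure.
# (Simplified: text that follows a child element is not stored here.)
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
)

def shred(elem, parent=None):
    """Insert one row per element, remembering which row is its parent."""
    cur = conn.execute(
        "INSERT INTO node (parent, tag, text) VALUES (?, ?, ?)",
        (parent, elem.tag, elem.text),
    )
    for child in elem:
        shred(child, cur.lastrowid)

shred(ET.fromstring(DOC))

# Approach 2: store the document unaltered in a single text column.
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO doc (body) VALUES (?)", (DOC,))

shredded_tags = [row[0] for row in conn.execute("SELECT tag FROM node ORDER BY id")]
stored_body = conn.execute("SELECT body FROM doc").fetchone()[0]
```

The first approach lets ordinary SQL reach each element but requires code to take the document apart and reassemble it; the second keeps the document intact and leaves all structure to be recovered by XPath at query time.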

The idea of XPath support in the database is that you start by storing a string such as <p>Why do <em>you</em> want to represent <em>structured text</em>?</p> bodily in a text column. This text column can be any standard text datatype in SQL (although MySQL will add a special XML tag eventually, to support validation and some optimizations).

In itself, this doesn't help deal with XML. But MySQL will also provide a couple of functions, such as ExtractValue and UpdateXML, that manipulate the XML with XPath queries. You could tell it to extract or change, for example, the second <em> element in the string just shown. Full-text searches can reduce the time it takes to search large collections of XML by two orders of magnitude, in comparison to database queries without indexes.
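The kind of operation described here can be sketched outside the database. The example below uses Python's xml.etree.ElementTree, which supports a subset of XPath, on the sample string above. The helper names extract_value and update_text are hypothetical and only mimic the idea of extracting or changing a node; they are not MySQL's functions and make no claim about their exact behavior.

```python
import xml.etree.ElementTree as ET

DOC = "<p>Why do <em>you</em> want to represent <em>structured text</em>?</p>"

def extract_value(xml_text, path):
    """Return the text of the first node matching path, or None."""
    node = ET.fromstring(xml_text).find(path)
    return None if node is None else node.text

def update_text(xml_text, path, new_text):
    """Return the document with the matched node's text replaced."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    if node is not None:
        node.text = new_text
    return ET.tostring(root, encoding="unicode")

# Paths are relative to the root <p> element; em[2] is the second <em> child.
second_em = extract_value(DOC, "em[2]")   # "structured text"
updated = update_text(DOC, "em[2]", "XML")
```

Here updated holds the original paragraph with only the second <em> element's text changed; the rest of the stored string is untouched.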

The design of the XPath support is oddly disconnected from the traditional structures of a relational database. As already shown, the storage model jams all the XML into a single column, so that the XML structure is handled independently from the schema of the table. Furthermore, an XPath query that returns multiple strings from different parts of the XML document concatenates them together, space-separated, in a single row. I would have expected them to be granted individual rows in the results.
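A small sketch of the difference, again using Python's xml.etree.ElementTree as a stand-in for the database: the first result imitates the space-separated concatenation described above, and the second shows the one-row-per-match shape I would have expected. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

DOC = "<p>Why do <em>you</em> want to represent <em>structured text</em>?</p>"

root = ET.fromstring(DOC)
matches = [em.text for em in root.findall("em")]

# Behavior as described: all matches joined into one space-separated value.
single_row = " ".join(matches)

# The alternative: each match becomes its own row in the result set.
one_per_row = [(m,) for m in matches]
```

Note that in the concatenated form the boundary between "you" and "structured text" can no longer be recovered, since the second match itself contains a space.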

There are many uses for XPath support in a database. One could extract and display all the titles of different documents. One could run a traditional SELECT to retrieve data from other columns or tables and join it to XML content. One could find everything within <price> tags and let the database perform some calculations such as averaging. The more XML processing you can do in the database, the less data has to be sent over the wire to the client.
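As a rough illustration of these uses, the sketch below stores a few made-up XML fragments in a text column and lets the database engine do the extraction and averaging. It uses SQLite through Python's sqlite3 module, with a user-defined SQL function standing in for built-in XPath support; the item table, the first_text function, and the sample data are all assumptions invented for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

def first_text(xml_text, path):
    """Text of the first node matching path (relative to the root), or None."""
    node = ET.fromstring(xml_text).find(path)
    return None if node is None else node.text

conn = sqlite3.connect(":memory:")
# Expose the Python helper to SQL so queries can call it on a column.
conn.create_function("first_text", 2, first_text)

conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, vendor TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO item (vendor, body) VALUES (?, ?)",
    [
        ("acme", "<item><title>Widget</title><price>10.00</price></item>"),
        ("acme", "<item><title>Gadget</title><price>20.00</price></item>"),
        ("zeta", "<item><title>Gizmo</title><price>30.00</price></item>"),
    ],
)

# Extract every title, alongside an ordinary relational column.
titles = conn.execute(
    "SELECT vendor, first_text(body, 'title') FROM item ORDER BY id"
).fetchall()

# Let the database average the prices found inside the XML.
avg_price = conn.execute(
    "SELECT AVG(CAST(first_text(body, 'price') AS REAL)) FROM item"
).fetchone()[0]
```

Only the titles and a single average cross to the client; the XML bodies themselves never leave the database, which is the point about sending less data over the wire.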

This new MySQL feature--not planned until 5.1 or even later--is probably less useful with data-crunching XML (which has many small pieces of text within multiple tags) than with documents, which are flatter and have a high ratio of content to tagging. However, one participant in last night's BOF suggested the feature could be applied to storing SOAP queries too.

MySQL's turn to the mainstream is being reciprocated by its intended audience. Attendance at yesterday's tutorials was impressive; a couple of tutorials sold out, and the halls were filled with people at break time. Today's sessions and exhibitors will draw even more.

Andy Oram is an editor for O'Reilly Media, specializing in Linux and free software books, and a member of Computer Professionals for Social Responsibility. His web site is www.praxagora.com/andyo.