The path to unified interaction

by Jono Bacon

Something I always find fascinating is watching people experience a product, technology or service for the first time, with no idea how it works. I will always remember, when I was younger, trying to explain to my Nanna how to use the (then amazing) Grolier Encyclopedia on CD-ROM. I wanted to show her that I could learn things for school without necessarily having my nose stuck in a book. I sat there and explained how you type in what you want to know, it searches for it on the CD, and it presents you with some content to view. Being 13 and not a usability expert, I never even considered that she didn't know how to use a mouse; it seemed so obvious to me.


I sat there and moved the mouse back and forth, explaining that when you move the mouse forward the pointer goes up, and when you move it back the pointer goes down. My careful instructions appeared to go in one ear and out the other, and as she re-arranged her teeth she moved the physical mouse up and down in the air. My Nanna could not map the forward movement of the mouse to the upward movement of the pointer, and she knew nothing of how the mouse worked; she could not perceive that a mouse only works on a flat surface. My Nanna was an intelligent person who had read more books than you can imagine, but the mouse was fundamentally unintuitive to her.


This story jumped to the front of my mind when I was reading Preston Gralla's O'Reilly Weblog entry about using Linux for the first time. Preston is obviously an intelligent chap and he knows Windows XP inside out, but when he used Linux it just didn't sit right with him. Although it probably made technical sense to him, there was something about the system that did not enthuse him to dump Windows XP, burn all copies of his book and tattoo a penguin on his posterior. This was the same kind of effect I witnessed all those years back when the mouse just did not sit right with my Nanna. It makes me wonder what determines people's preferences and choices.


The Linux desktop


Five years ago I remember reading predictions that 2000 would be the year of the Linux desktop. These prophets foresaw the demise of the mighty Microsoft, with Linux becoming a dominant system on the corporate and even consumer desktop. They were wrong. The Linux desktop is still in a heady time of rampant development, refocusing and innovation. I have been particularly impressed with many of the developments on the desktop side of the fence, such as freedesktop.org, X.org, GNOME and Project Utopia, but we still have a way to go before all of this integrates into a single unified vision that sits right for the user.


Something I have rambled on about before is how great the Linux community are at creating frameworks; you only have to look at KDE for an example of this. There are DCOP, KParts, IOSlaves and aRts, to name a few. What is interesting about KDE is that the developers have a single-shot vision of how the desktop should work: when you install KDE from tarballs, there is little difference between that KDE and the KDE included with distributions. The KDE team set out to create a single unified desktop that works the same across distributions. This makes sense, but is it the best way forward?


We have already established that people react to technology in different ways. If we take away any kind of political agenda and put a perfect installation of KDE in front of people on a nice fast machine, the desktop will be great for some but not for others, who may prefer GNOME, AfterStep, Windows, Mac OS X, a command line, a VR headset and glove, or who may simply not take to computers at all. The problem is that the entire KDE desktop and all of its software is designed around this single-shot approach to usability; if someone doesn't like KDE, that software is pretty much useless to them. Yes, you can run KDE applications without KDE, but they are slower to load, look different and behave differently.


I am not picking on KDE for any particular reason; all of the concepts here apply to every interface in question, and I just picked KDE randomly. The point is that the functionality of an application should be broken down into a series of interactions that are fundamentally based upon the interface you are accessing the application from. The current desktops are not actually that integral to using the applications that run on them. If you load the GIMP on GNOME, there is no functional difference from running the GIMP on KDE. If you load an image file, you still use the file open dialog box that belongs to the GIMP rather than the KDE equivalent - the application still looks and behaves differently from its host desktop.


If you look at a typical application, you can extract the fundamental concepts of interaction from it and put those interactions into clearly defined theoretical boxes. These interactions and visual representations should be abstracted out of the application into a generic means of recreating that application in the desktop of choice. This way you could run any application built with this technology in any interface, and the application would adjust to the native desktop you are familiar with. This could include using the native font, theme and icons, using native dialog boxes, responding to interactions in ways common to the host environment, and respecting its usability guidelines. You could theoretically apply this concept to Project Looking Glass too: a 2D representation of a button could be natively represented in Project Looking Glass as a 3D button. The kind of functionality I am discussing needs to be implemented at the toolkit level; this would ensure that the native binding of interaction and visual representation could be applied to all software written in that toolkit.
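To make this concrete, here is a minimal sketch of the idea in Python. Every class and name here is invented for illustration, not a real toolkit API: the application describes its interactions abstractly, and a desktop-specific backend decides how to realise them.

```python
# A purely illustrative sketch: the application describes an interaction
# abstractly, and a desktop-specific backend decides how to render it.
# None of these classes correspond to a real toolkit.

class Button:
    """An abstract interaction: a clickable action with a label."""
    def __init__(self, label, on_click):
        self.label = label
        self.on_click = on_click

class GnomeBackend:
    """Would render with GTK widgets, the GNOME theme and HIG rules."""
    def render(self, widget):
        print(f"[GTK] button '{widget.label}' drawn with the GNOME icon theme")

class KdeBackend:
    """Would render with Qt widgets, the KDE style and icon set."""
    def render(self, widget):
        print(f"[Qt] button '{widget.label}' drawn with the KDE style")

class LookingGlassBackend:
    """Would realise the same interaction as a 3D object."""
    def render(self, widget):
        print(f"[3D] button '{widget.label}' presented as a rotatable 3D object")

# The application is written once against the abstraction...
save = Button("Save", on_click=lambda: print("saving"))

# ...and the host environment supplies the backend at run time.
for backend in (GnomeBackend(), KdeBackend(), LookingGlassBackend()):
    backend.render(save)
```

The important property is that the application code never mentions GTK or Qt; the host environment decides how the button looks and behaves.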


Making it happen


In many ways, this kind of flexibility is exactly what the freedesktop.org project is there for, and it would greatly reduce the sheer amount of redundancy between applications developed for different desktops. Many people seem to stick to the applications that fit into their desktop environment because those applications feel more integrated. As an example, I love Quanta and I also love Bluefish, but I should be able to have both fit into my native desktop and look and feel like native applications. As a user, why should I care that one is written in Qt and the other in GTK?


There is a lot of rhetoric and discussion about the merits of open source usability at the moment, and if we don't try to abstract applications out to merge into native desktops, much of that usability work is going to be lost. This is not Windows or Mac OS X; we don't have a single graphical interface for our operating system, and as such we need to adjust our software to support these different graphical environments. This would not only make all of the software more integrated, but it would put software into people's hands in an environment that is familiar to them. Sure, there are some serious technical challenges to this approach, but if there is one thing we have in this community, it is technical ability.

Valid points or pointless rubbish? Share your thoughts and discuss the merits of unified interaction...


5 Comments

pdamoc
2004-07-20 13:16:50
how about this
OSS developers get together and create a new toolkit. This toolkit provides easy skinning by default. With this toolkit they create some mega widgets, let's say something like Gecko, Scintilla or the GIMP's canvas. Then they create an environment where scripters could easily glue all the components together using something like Python. Then the stage would be set for true unified interaction, and people could contribute on different levels... some may start with Python scripting and move to C... others may start with a background image and move to full themes.
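A rough sketch of the kind of glue scripting this suggests, with the toolkit, its widgets and the skin format all invented purely for illustration:

```python
# Purely illustrative: a hypothetical skinnable toolkit being driven from
# a glue script. Nothing here is a real library; it only shows the layering.

class Skin:
    """A theme contribution: a background image plus basic colours."""
    def __init__(self, name, background, foreground):
        self.name, self.background, self.foreground = name, background, foreground

class EditorWidget:
    """Stands in for a Scintilla-like 'mega widget'."""
    def __init__(self):
        self.text = ""
    def insert(self, text):
        self.text += text

class Window:
    """Container that applies the skin to whatever it holds."""
    def __init__(self, title, skin):
        self.title, self.skin, self.children = title, skin, []
    def add(self, widget):
        self.children.append(widget)
    def show(self):
        print(f"'{self.title}' using skin '{self.skin.name}' "
              f"({len(self.children)} widget(s))")

# The glue script: wiring components together at the scripting level,
# without touching any C code. A themer could swap the Skin alone.
skin = Skin("slate", background="slate.png", foreground="#e0e0e0")
window = Window("My Editor", skin)
editor = EditorWidget()
editor.insert("hello, world")
window.add(editor)
window.show()
```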
mondalaci
2004-07-20 16:17:00
The lack of abstraction and cooperation
I think there are primarily two barriers to overcome:


The lack of abstraction: Every desktop has its own more or less well-defined standards for how things should work. Yes, I'm talking about policy. As I see this issue, most developers think in terms of policy instead of mechanisms.


Another significant problem seems to be the lack of cooperation between desktops in defining standard interfaces for doing things. It's like every project wants to create its own universe while excluding or ignoring others.


I'm irritated by UI inconsistency also. This is an issue we'll need to address in the future.

l0b0
2004-07-21 04:23:49
The XForms alternative
Perhaps what is needed is something like XForms, where the user agent (in the OS's case, the window manager) determines how to display abstract UI content.
Take the standard menus, for example: these can be coded as a hierarchical structure, some entries (like bookmark folders) referring to external sources, others specifying things like a _preferred_ keyboard shortcut (which the OS should override if it is already in use), a text label, icon, target, etc.
I'm not pushing for XForms here, just the idea that the application should specify as little as possible how it should be displayed, and leave the rest to the OS / window manager.
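To illustrate, here is what such an abstract menu declaration might look like; the field names and the host-side shortcut pass are invented for this example:

```python
# Illustrative only: a menu declared as abstract data, leaving rendering
# decisions (fonts, icons, final shortcut bindings) to the host desktop.

file_menu = {
    "label": "File",
    "items": [
        {"label": "Open...", "icon": "document-open",
         "preferred_shortcut": "Ctrl+O",  # host may override on conflict
         "action": "app.open"},
        {"label": "Recent", "source": "external:recent-documents"},
        {"label": "Quit", "preferred_shortcut": "Ctrl+Q",
         "action": "app.quit"},
    ],
}

def bind_shortcuts(menu, taken):
    """Host-side pass: honour preferred shortcuts unless already in use."""
    for item in menu["items"]:
        wanted = item.get("preferred_shortcut")
        if wanted and wanted not in taken:
            item["shortcut"] = wanted
            taken.add(wanted)

# Pretend the host desktop already uses Ctrl+O for something else, so the
# "Open..." entry gets no binding and the host would assign another one.
bind_shortcuts(file_menu, taken={"Ctrl+O"})
print([(item["label"], item.get("shortcut")) for item in file_menu["items"]])
```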
shogun70
2004-07-21 05:13:59
separation of powers...
Hmmm.... sounds like you are suggesting the separation of the engine from the interface, content from presentation, church from state (oops, I meant state from church).


Isn't there already lots of research and development in this area? By a few different software and standards organizations?


cheers,
SDH

jonobacon
2004-07-22 01:46:01
The lack of abstraction and cooperation
(I blogged about this at http://www.jonobacon.org/viewcomments.php?id=368)


Some good feedback on "The path to unified interaction" has come through, and I was interested to read the comments when it was linked on OSNews. Although I was pretty much expecting this to turn into a KDE vs GNOME flamewar, some people did get the spirit of my article and discussed it where they could.


One of the comments was posted by mondalaci and cited two core barriers that we need to overcome in creating a system such as this:


  • The lack of abstraction: Every desktop has its own more or less well-defined standards for how things should work. Yes, I'm talking about policy. As I see this issue, most developers think in terms of policy instead of mechanisms.

  • Another significant problem seems to be the lack of cooperation between desktops in defining standard interfaces for doing things. It's like every project wants to create its own universe while excluding or ignoring others.

I agree here. Policy and co-operation are key problems that need to be focused on, but I am confident that there are real and technically viable ways in which these issues can be overcome.


If we are looking at a technical solution to this problem, we have two core options to allow this kind of integration between the toolkits:


  • Standardise on middleware. To do this you would need to manage the drawing engine, configuration management, resource management (such as dialog boxes and icons) and other common entities. Graphical environments would need to rely on this middleware to be compliant.

  • Build software bridges. Another option is to build a middleware bridge that maps from one system to another; an example would be mapping configuration from the GNOME configuration store to the KDE configuration store (see the sketch below). This method strikes me as error-prone, and it would still rely on API stability on both sides of the bridge.
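As a toy sketch of that second option (every key, store and mapping below is made up for illustration; these are not real GConf or KConfig paths):

```python
# Toy illustration of the bridge option: copying settings from one
# desktop's configuration store to another's. The keys and mapping are
# invented; they are not real GConf or KConfig paths.

GNOME_TO_KDE = {
    "/desktop/interface/font": "General/font",
    "/desktop/interface/icon_theme": "Icons/Theme",
}

def bridge(gnome_store, kde_store):
    """Copy every mapped setting across; report the keys we cannot map."""
    unmapped = []
    for key, value in gnome_store.items():
        target = GNOME_TO_KDE.get(key)
        if target:
            kde_store[target] = value
        else:
            # A key added or renamed on either side silently falls through
            # here, which is exactly why bridges are fragile.
            unmapped.append(key)
    return unmapped

gnome = {"/desktop/interface/font": "Sans 10",
         "/desktop/interface/icon_theme": "Crystal",
         "/desktop/interface/new_setting": "?"}
kde = {}
print("unmapped:", bridge(gnome, kde))
print(kde)
```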

This is an area where freedesktop.org have the chance to play a critical role. If some form of middleware technology is developed to manage these abstracted layers, there is no excuse for the different desktops not to use it. In all honesty, though, I can foresee that they will not. If there is one thing that annoys me about the free software world, it is that people preach endlessly about not re-inventing the wheel and then go and develop yet another framework. Why do we have so much duplication? KParts and Bonobo, DCOP and dbus, GNOME Panel and Kicker, aRts and ESD - framework after framework after framework. The only people who care about this are developers, not users. I am all for choice and competition, but we do need at least a modicum of co-operation, and I don't just mean drag and drop.


I think these problems are critical to the adoption of Linux. I would be interested to hear more thoughts on the subject and less KDE vs GNOME bitching.