The path to unified interaction
by Jono Bacon
I sat there, moved the mouse back and forth, and explained that when you move the mouse forward the pointer goes up, and when you pull it back the pointer goes down. My patient instructions seemed to go in one ear and out the other, and as she re-arranged her teeth she moved the physical mouse up and down in the air. My Nanna could not map the forward movement of the mouse to the upward movement of the pointer; she knew nothing of how a mouse worked and could not perceive that mice only work on a flat surface. She was an intelligent person who had read more books than you can imagine, but the mouse was fundamentally unintuitive to her.
This story jumped to the front of my mind when I was reading Preston Gralla's O'Reilly weblog entry about using Linux for the first time. Preston is obviously an intelligent chap and he knows Windows XP inside out, but when he used Linux it just didn't sit right with him. Although it probably made technical sense to him, there was something about the system that did not enthuse him to dump Windows XP, burn all copies of his book, and tattoo a penguin on his posterior. This was the same kind of effect I saw many years back when that mouse just did not sit right with my Nanna. It makes me wonder what determines people's preferences and choices.
The Linux desktop
Five years ago, I remember reading predictions that 2000 would be the year of the desktop. These prophets foresaw the demise of the mighty Microsoft and Linux becoming dominant on the corporate and even the consumer desktop. They were wrong. The Linux desktop is still in a heady time of rampant development, refocusing, and innovation. I have been particularly impressed with many of the developments on the desktop side of the fence, such as freedesktop.org, X.org, GNOME, and Project Utopia, but we still have a way to go before all of this integrates into a single unified vision that sits right with the user.
Something I have rambled on about before is how good the Linux community is at creating frameworks; you only have to look at KDE for an example. There is DCOP, KParts, IOSlaves, and aRts, to name a few. What is interesting about KDE is that its developers seem to have a single-shot vision of how the desktop should work: when you install KDE from tarballs, there is little difference between that KDE and the KDE included with distributions. The KDE team set out to create a single unified desktop that works the same across distributions. This makes sense, but is it the best way forward?
We have already established that people react to technology in different ways. Take away any political agenda and put a perfect installation of KDE in front of people on a nice fast machine, and the desktop will be great for some but not for others. Those others may prefer GNOME, AfterStep, Windows, Mac OS X, a command line, a VR headset and glove, or may simply not take to computers at all. The problem is that the entire KDE desktop and all of its software are designed around this single-shot approach to usability; if someone doesn't like KDE, that software is pretty much useless to them. Yes, you can run KDE applications without KDE, but they are slower to load, look different, and behave differently.
I am not picking on KDE for any particular reason; all of the concepts here apply to every interface, and I just chose KDE at random. The point is that the functionality of an application should be broken down into a series of interactions that are fundamentally based upon the interface from which you are accessing the application. The current desktops are not actually that integral to using the applications that run on them. If you load the GIMP in GNOME, there is no functional difference from running the GIMP in KDE. If you load an image file, you still use the GIMP's own file-open dialog rather than the KDE equivalent; the application still looks and behaves differently.
If you look at a typical application, you can extract the fundamental concepts of interaction from that application and put these interactions into clearly defined theoretical boxes. These interactions and visual representations should really be abstracted out of the application into a generic means of recreating that application in the desktop of choice. This way you could run any application with this technology in any interface and the application would adjust to the native desktop that you are familiar with. This could include changing the native font, using the native theme, changing the icons, using native dialog boxes, responding to interactions in ways that are common with the host environment and respecting usability guidelines. You could theoretically take this concept and apply it to Project Looking Glass too; a 2D representation of a button could be natively represented in Project Looking Glass as a 3D button. The kind of functionality I am discussing here needs to be implemented at a toolkit level; this would ensure that the native binding of interaction and visual representation of the application could be applied to all software written in that toolkit.
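As a very rough sketch of what this toolkit-level abstraction might look like (every class and backend name here is invented for illustration, not a real API), an application could describe a control by its intent and leave the presentation to whichever desktop backend happens to be active:

```python
# Hypothetical sketch: the application describes *what* a control is,
# not how it looks; each desktop's backend decides the presentation.

class AbstractButton:
    """Describes intent (a labelled, clickable action), not pixels."""
    def __init__(self, label, action):
        self.label = label
        self.action = action

class GnomeBackend:
    """Would draw with GTK widgets, the GNOME theme, and HIG spacing."""
    def render(self, widget):
        return f"[gtk-button '{widget.label}']"

class KdeBackend:
    """Would draw with Qt widgets and the current KDE style."""
    def render(self, widget):
        return f"[qt-button '{widget.label}']"

# The same abstract description, rendered natively by each desktop.
open_button = AbstractButton("Open", action=lambda: None)
for backend in (GnomeBackend(), KdeBackend()):
    print(backend.render(open_button))
```

The same abstract description could equally feed a 3D backend for Project Looking Glass; only the backend changes, never the application.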
Making it happen
In many ways, this kind of flexibility is what the freedesktop.org project is there for, and it would greatly reduce the sheer amount of redundancy between applications developed for different desktops. Many people stick to the applications that fit into their desktop environment because those applications feel more integrated. As an example, I love Quanta and I also love Bluefish, but I should be able to have both fit into my native desktop and look and feel like native applications. As a user, why should I care that one is written in Qt and the other in GTK?
There is a lot of rhetoric and discussion about the merits of open source usability at the moment, and if we don't abstract applications so that they merge into native desktops, much of that usability work will be lost. This is not Windows or Mac OS X; we don't have a single graphical interface for our operating system, and as such we need to adjust our software to support these different graphical environments. This will not only make all of our software more integrated, but it will also put software into people's hands in an environment that is familiar to them. Sure, there are serious technical challenges to this approach, but if there is one thing we have in this community, it is technical ability.
Valid points or pointless rubbish? Share your thoughts and discuss the merits of unified interaction...
how about this
OSS developers get together and create a new toolkit. This toolkit provides easy skinning by default. With this toolkit they create some mega widgets, let's say something like Gecko, Scintilla, or the GIMP's canvas. Then they create an environment where scripters could easily glue all the components together using something like Python. Then the stage would be set for true unified interaction, and people could contribute on different levels: some might start with Python scripting and move to C, while others might start with a background image and move to full themes.
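A hedged sketch of that glue layer (the widget names below are stand-ins I invented, not real components): the large reusable widgets come from the toolkit, and a few lines of script assemble them into an application:

```python
# Invented stand-ins for toolkit-provided "mega widgets".

class HtmlView:
    """Stand-in for a Gecko-like rendering component."""
    def load(self, url):
        return f"rendering {url}"

class CodeEditor:
    """Stand-in for a Scintilla-like editing component."""
    def open(self, path):
        return f"editing {path}"

class Window:
    """Minimal container the scripter glues widgets into."""
    def __init__(self):
        self.widgets = []
    def add(self, widget):
        self.widgets.append(widget)

# The "glue" level: a scripter builds a crude HTML editor in a few lines.
win = Window()
view, editor = HtmlView(), CodeEditor()
win.add(view)
win.add(editor)
print(view.load("http://example.org"), "|", editor.open("index.html"))
```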
The lack of abstraction and cooperation
I think there are primarily two barriers to overcome:
The XForms alternative
Perhaps what is needed is something like XForms, where the user agent (in the OS's case, the window manager) determines how to display abstract UI content.
Take standard menus, for example: these can be coded as a hierarchical structure, with some entries (like bookmark folders) referring to external sources and others specifying things like a _preferred_ keyboard shortcut (which the OS should override if it is already in use), a text label, an icon, a target, etc.
I'm not pushing for XForms here, just the idea that the application should specify as little as possible about how it should be displayed and leave the rest to the OS / window manager.
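The menu idea above can be sketched roughly as follows (the data layout and function are hypothetical, chosen only to illustrate the principle): the application declares a preferred shortcut as abstract data, and the host environment keeps or rejects it against shortcuts already in use:

```python
# Hypothetical: the application ships its menu as abstract data,
# not as toolkit-specific widget calls.
menu = {
    "label": "File",
    "items": [
        {"label": "Open", "preferred_shortcut": "Ctrl+O", "target": "open_file"},
        {"label": "Quit", "preferred_shortcut": "Ctrl+Q", "target": "quit"},
    ],
}

def assign_shortcuts(menu, taken):
    """Host environment honours a preferred shortcut unless it is taken;
    a rejected preference is left as None for the OS to remap."""
    assigned = {}
    for item in menu["items"]:
        want = item["preferred_shortcut"]
        assigned[item["label"]] = None if want in taken else want
        taken.add(want)
    return assigned

# Suppose the desktop already uses Ctrl+Q globally: Open keeps its
# preference, while Quit's is rejected and left to the OS to reassign.
print(assign_shortcuts(menu, taken={"Ctrl+Q"}))
```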
separation of powers...
Hmmm.... sounds like you are suggesting the separation of the engine from the interface, content from presentation, church from state (oops, I meant state from church).
The lack of abstraction and cooperation
(I blogged about this at http://www.jonobacon.org/viewcomments.php?id=368)
I agree here. Policy and co-operation are key problems that need to be focused on, but I am confident that there are real and technically viable methods by which these issues can be overcome.
This is an area where freedesktop.org has the chance to play a critical role. If some form of middleware technology is developed to manage these abstracted layers, there is no excuse for the different desktops not to use it. In all honesty, though, I can certainly foresee that they will not. If there is one thing that annoys me about the free software world, it is that people preach endlessly about not reinventing the wheel and then go and develop yet another framework. Why do we have so much duplication? KParts and Bonobo, DCOP and D-Bus, GNOME Panel and Kicker, aRts and ESD: framework after framework after framework. The only people who care about these differences are developers, not users. I am all for choice and competition, but we do need at least a modicum of co-operation, and I don't just mean drag and drop.