Beyond the dimension

by Jono Bacon

We are facing an interesting time in the Open Source desktop world. Not only are a number of interesting technologies being developed to make our computers work more transparently, but a technology has recently been Open Sourced that provides a new playground for a new way of thinking about the Open Source desktop; this technology is Project Looking Glass.

Project Looking Glass (PLG) is a technology that was created by Sun to create a 3D desktop environment. The environment gives you the ability to perform simple operations such as flipping windows, changing the perspective and view of an object and other functions. The software was created by Sun to explore the possibilities for 3D based applications and a 3D based desktop, and although fairly useless in its current incarnation, the prototype provides a level of usable framework to create 3D applications and experiment with a new way of interacting with software.

The aim of this article is to discuss some ideas and concepts for making use of a 3D environment. Before I continue, there should be a few disclaimers however. First of all, I am no usability expert, and I am actually fairly cynical about certain aspects of usability theory. As such, you should take my ideas here as simply ideas - they were in no way researched and are not backed up with data to prove their usefulness. Secondly, the ideas here can apply to any 3D environment or software, and not specifically PLG. Feel free to make use of these ideas in your own 3D environment.

I believe that a 3D environment could be useful. There has been much discussion on the net about the worth of a 3D interface, particularly considering that it is confined within the remit of your 2D screen and typical 2D input devices; keyboard and mouse. Although I share some cynicism to a point, I also do believe that people can perceive 3D sufficiently on a screen to interact with it. You only have to look at how we perceive 3D in video games and movies to see this. I think the biggest challenges that we face are not with perception, but with regards to the input and architecture of the environment.


I think it is fair to say that it is unreasonable to expect users of a 3D interface to go out and buy a special input device for their computer. We are not aiming to build a Minority Report type system here; the aim is to create a level of useful 3D interaction that is as familiar and intuitive as possible. I do believe that the mouse is useful here.

3D interfaces are based around three axes:

  • X axis. This is a width line from left to right and vice versa

  • Y axis. This is a height line from top to bottom and vice versa

  • Z axis. This is a depth line from far to near and vice versa

When considering our input mechanism, we need to take into account these axis requirements. In addition to this we need to consider the selection requirements. I believe that selection will be as simple in the 3D space as it is in a 2D space; you need to be able to select something (such as loading an application when double clicking an icon) and you need to be able to hold something (such as dragging an icon by single clicking, dragging and releasing). The only other possible requirement is a context menu, but then I am rather skeptical of these, and I think a better solution can be achieved in the 3D space with semi-transparent overlays.

With these considerations, one choice of input could be:

  • Left Mouse Button. Click and hold to move the X axis as if a camera was panning. Double click to flip to the other side (as if you had a mirror image looking back at you)

  • Right Mouse Button. Click and hold to move the Y axis as if a camera was panning. Double click to flip to the other side (as if you had a mirror image looking back at you)

  • Middle Mouse Button. Selection button. Double click to select an object to work with, and single click to select an object and move it

  • Left+Right Buttons together/Scroll Wheel. Click and hold to move the Z axis as if a camera was panning. Double click to flip to the other side (as if you had a mirror image looking back at you). The scroll wheel will move you forward when you scroll forward and back when you scroll back

Although I have suggested which button can do what, these combinations can obviously be changed. The main point I am making is that you need a selection button and a means to control each axis. Some people have suggested using a Shift/Ctrl/Alt key in combination with the mouse, but I think this feels a little clumsy.
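
As a rough illustration of the bindings above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Camera` class, the `DRAG_AXIS` table and the handler names are hypothetical and have nothing to do with the actual Project Looking Glass API. It only shows that the scheme reduces to one lookup table plus a selection button that leaves the camera alone.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical camera state: a position on each axis plus a flip flag."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    flipped: bool = False


# Assumed binding table: which axis each button pans while it is held.
# The middle button is deliberately absent because it is the selection button.
DRAG_AXIS = {"left": "x", "right": "y", "left+right": "z"}


def on_drag(camera, button, delta):
    """Pan the camera along the axis bound to the held button, if any."""
    axis = DRAG_AXIS.get(button)
    if axis is not None:
        setattr(camera, axis, getattr(camera, axis) + delta)
    return camera


def on_double_click(camera, button):
    """Flip to the other side when an axis button is double clicked."""
    if button in DRAG_AXIS:
        camera.flipped = not camera.flipped
    return camera


def on_scroll(camera, steps):
    """Scroll forward (positive steps, by assumption) to move forward on Z."""
    camera.z += steps
    return camera
```

Because the bindings live in a single table, swapping buttons around (or adding a modifier key for those who prefer one) means changing `DRAG_AXIS` and nothing else.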

3D representation of objects

The 3D interface will never amount to anything if we don't consider some specific use cases and how the interface can be best used. I think the key to defining 'best used' is to clearly separate out 2D and 3D functions. I see no point in making everything 3D; some things are inherently 2D (such as creating a word processed document) and the interface should allow you to edit your document in a 2D window as if you were using KDE/GNOME.

I think the true value of 3D comes in when we consider how we interact with objects. A while back a friend of mine told me about John Siracusa's analysis of the spatial finder, and I found his commentary on how we interact with objects interesting. A 3D interface really allows us to take this concept and raise it to the next bar - in the 3D space we can truly interact with the object and not simply interact with iconic representations of objects.

Let us take, for example, a file. In most current GUIs, a file is represented by an icon. This icon can be interacted with in the sense of moving it to different locations and clicking on it to load the file into a viewer. In the 3D space this file could literally be an accurate representation of the file itself. In this sense we could represent some of the following types of file:

  • Paper Document. A document would look like a paper document, complete with the text of the document displayed on the front. We could use the 3D space to visualise the depth of the document with the number of pages; a document with 600 pages would look a lot fatter than a 3 page document. This way you can visualise what kind of document you are looking for and how long it will take to read. We could also use the paper size and shape to represent the kind of document - a business card will look different to an A3 poster for example

  • Video. A video file could be represented by a 3D TV icon that is actually playing the video while you are navigating your files. I see no point in distinguishing between one video format and another. I did consider the value of having a Quicktime video playing on a Mac icon, but the user should not need to care about this. A video is a video and it is up to the player to care about file formats

  • Devices. Devices could be really interesting. If you plug a digital camera into your USB port, you should see a digital camera appear on the screen. The digital camera icon should then be able to be selected (it will then zoom into the back of the camera) where you can flick through the pictures on the desktop's camera screen. This is tying together the concept of taking pictures off a device - the user will naturally want to look at the 3D representation of the device to make this happen. The current method of selecting a drive or the pictures coming off automatically is rather unintuitive; the user needs to interact with a virtual representation of the device

Some icons will obviously be 2D by their very nature. A .png or .jpg image is obviously a flat 2D image and is represented as such, but the key is in providing a realistic representation to the user of what type of content the object is. As an example, the user needs to see an intrinsic link between the document they type into and the document that comes out of their printer.
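
To make the paper document idea a little more concrete, here is a small Python sketch of how the on-screen box for a document might be sized. The function name, the per-sheet thickness and the business card dimensions are my own assumptions for illustration; only the A3 and A4 sizes are standard values.

```python
# Paper sizes in millimetres as (width, height). A3 and A4 are the ISO sizes;
# the business card size is an assumed, commonly used value.
PAPER_SIZES_MM = {
    "A3": (297, 420),
    "A4": (210, 297),
    "business_card": (85, 55),
}

# Assumed thickness of one sheet, in millimetres.
SHEET_THICKNESS_MM = 0.1


def document_box(pages, paper="A4"):
    """Return (width, height, depth) in mm for a document's 3D representation.

    The depth grows with the page count, so a long document looks fatter
    than a short one, and the paper type sets the face of the box.
    """
    width, height = PAPER_SIZES_MM[paper]
    depth = max(pages, 1) * SHEET_THICKNESS_MM
    return (width, height, depth)
```

With this, `document_box(600)` is visibly deeper than `document_box(3)`, and `document_box(1, "business_card")` has a much smaller face than `document_box(1, "A3")`, which is all the visual cue needs.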

Application use cases

Before we can consider any kind of development effort, we need to come up with some ideas for how 3D applications will work. We need to formulate these ideas into use cases that can be clearly discussed and debated over. Here are some ideas:

File management

If there is one thing that humans seem to have no problem understanding, it is drawers, cupboards, fridges and other square boxes with a door on the front. We also understand pigeon holes, boxes, containers and other methods of putting one object in another. We also innately understand that if you put two objects in a box you only need to move the box to move both objects. This can be useful for dealing with directories and moving files around.

I think what we need to create in this kind of interface is a number of visual representations of real world storage containers. As an example, a hard disk could be represented as an office/storage room (we need to visually suggest that the hard disk is bigger than anything on it, so we need to visually represent the actual disk as a larger room). Within this room we then have a number of storage cabinets (directories) in which the files can be stored. Moving a cabinet from one room to another should be as simple as dragging it over from one room to the other. With the metaphor of cabinets we can also have different types of storage container for different types of information. A typical My Documents type directory could be a filing cabinet for example.

With this kind of metaphor I want to steer clear of someone walking into a 3D room in a Doom III style manner and moving a hand around to pick up files. This whole metaphor is based on iconic meaning tied in with a real world relationship between the objects. Here is a use case:

  • The user clicks on the file room icon and we see two rooms (for two hard disks) appear on the main body of the desktop

  • The user clicks on one room and the two main rooms shrink to a larger size. Inside this room we have a 3D representation of a normal room with file cabinets. If the user clicks on a cabinet the full room goes very transparent and the cabinet increases in size and we can look inside at the contents
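
The room and cabinet metaphor above maps onto a plain tree of containers. The following Python sketch is only an illustration of that idea, with a hypothetical `Container` class of my own; it shows that dragging a cabinet to another room carries its contents with it, just as moving a box moves everything in the box.

```python
class Container:
    """A room, cabinet or file: anything that can sit inside something else."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, item):
        """Put item inside this container, removing it from its old home."""
        if item.parent is not None:
            item.parent.children.remove(item)
        item.parent = self
        self.children.append(item)
        return item

    def path(self):
        """Walk up through the parents to build a room/cabinet/file path."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))
```

A short usage example: create two rooms (`disk_a`, `disk_b`), add a `documents` cabinet holding a `report` to `disk_a`, then call `disk_b.add(cabinet)`. The report was never touched directly, yet its path now starts with `disk_b`.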

Creating content. E.g. burning a CD

The concept of burning a CD follows my ideas for creating any type of simple content. For this we need to identify the core components of the object we are creating, and put on the screen a simple template that allows the user to click on the relevant part of the object to change it. For a burnable CD we will typically have the CD itself and a cover. We may also have a cover for the back of a CD case. Here is the use case:

  • The user wants to create a new CD, so he/she opens up the media store cabinet and drags a blank CD onto the desktop (the media cabinet will classify data and audio CDs as separate disks - the user just selects the relevant type of disk)

  • When the CD appears on the desktop it is in an isometric view so the user can click on the extruded cover or the CD itself

  • When the user clicks on the CD, he/she will be taken to the file store if a data CD is being created, or to the music library if an audio CD is being created. The user can then drag files onto the CD and an overlay box will say what files are on the CD and how much space is left (a visual representation of space should be used)

  • With the CD layout ready, the user can then drag the CD into the CD press machine icon which will then burn it. When the CD is finished the computer will do some checks to see if the same files that were requested to be burned on the CD can be read; if they cannot be read the computer will mark the CD as damaged and ask if another CD from the media store can be used

What could be useful for this case is that when the user buys some new CDs he/she is encouraged to add them to the media store - this way the computer can let the user know when he/she is running out of media. This is particularly useful with the computer checking if the CDs are working or damaged when the burning process is finished.
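
The check-after-burn step and the media store count could be sketched roughly as follows. This is a minimal Python illustration under my own assumptions: file contents are passed in as bytes rather than read from a real disc, SHA-256 is simply one reasonable way to compare what was requested with what can be read back, and `MediaStore` with its `low_water` threshold is a hypothetical class, not part of any real burning tool.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Fingerprint a file's contents so two copies can be compared."""
    return hashlib.sha256(data).hexdigest()


def verify_burn(requested, read_back):
    """Return the names of files that are missing or differ on the disc.

    Both arguments map file name -> bytes. An empty result means the
    disc is good; anything else means it should be marked as damaged.
    """
    return [
        name
        for name, data in requested.items()
        if name not in read_back or checksum(read_back[name]) != checksum(data)
    ]


class MediaStore:
    """Tracks how many blank discs the user has told the computer about."""

    def __init__(self, blanks, low_water=2):
        self.blanks = blanks
        self.low_water = low_water  # assumed threshold for the warning

    def take_blank(self):
        """Remove one blank disc from the store and return how many remain."""
        if self.blanks == 0:
            raise RuntimeError("no blank media left")
        self.blanks -= 1
        return self.blanks

    def running_low(self):
        """True when the user should be nudged to buy more media."""
        return self.blanks <= self.low_water
```

If `verify_burn` returns a non-empty list, the interface would mark the disc as damaged and call `take_blank()` again for the retry, which is also the natural moment to check `running_low()`.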

Device handling

When a user plugs in a device, it should be visually represented on the screen. This will make an intrinsic link between the physical device and the virtual device, although they may look different physically (this is the biggest problem). With this device on screen, the user should be able to interact with it in a similar way to the real device. Let us assume we are plugging in a digital camera:

  • The user plugs the camera into the USB port. In the top right corner of the screen a 3D representation of the camera shows up and starts spinning around

  • When the user clicks on the camera, the camera appears in the body of the desktop much larger

  • The view will now zoom into the camera screen and the user can view images in a series of thumbnails

  • The user can then drag the pictures to their photo album (this photo album will be a visual representation of a book and the user can add pictures to the different pages of the book)

This system is not radically different to the current method of viewing pictures on a drive, but we are connecting together the concept of pictures on the device and actually dragging them to somewhere useful.

These use cases are not necessarily the right way to do things, but they provide a starting point for discussion. With more consideration and some prototypes we can better target the 3D aspects of the interface in the applications and make these use cases more representative of how we physically interact with the world.


I firmly believe that the 3D desktop environment has some great potential, but it needs to combine the best elements of the 2D methods we currently use and the innovative 3D ideas we will consider in the desktop of the future. This article has been written to hopefully pique the interest and ideas of people to think about how we can create an interface that is far easier to use and more representative of the real world.

The biggest challenge when implementing an interface such as this is how far you represent reality. As an example, when you plug your camera in and look at the pictures on the virtual screen, you should really be able to use the functions on the camera as if it was the physical device, but the software limits this potential to merely grabbing pictures and maybe taking a few shots. In this sense the physical representation cannot be fully imitated - we simply need to get a good batting average.

I would love to hear your thoughts on all of this, so feel free to get in touch with me or scribe your thoughts down in the comments box below. I am as interested in learning new ideas as in coming up with them; this could really mark a new wave in the Open Source desktop revolution.

What do you think? Feel free to share your ideas, views, opinions and money below...


2004-07-07 04:54:35
Practical Issues
While the idea of a 3D desktop is attractive at first, I'm highly skeptical.

The metaphor of real-world objects has been tried many times before, albeit not in 3D. I remember using a document library program that had an optional interface that was a picture of a real library, with clickable zones such as the bookshelf, stack of notes on a desk, index cupboard, etc. It was a complete pain in the arse to use, and other similar attempts I've seen were equally dismal. I don't think real 3D, as against pseudo-3D art, will add much, and it will more likely reduce the accessibility of such interfaces.

The fact is that displays are inherently 2D devices, so using 3D transformations is playing against the inherent strengths, and even the nature, of the device. Angling a window away from the user in 3D reduces the space it takes up, sure, but it also distorts the content of the window in a way that simply scaling it in 2D to a smaller size doesn't.

I remember in the early 90s reading an article saying that maybe game pads would replace keyboards and mice because they seem more intuitive controls to younger users. Of course 10 years later we're still using the same keyboards and mice. In fact any serious FPS player will tell you that no game pad ever invented beats a keyboard and mouse as a control interface. Yet we play 3D games not because they are easy, but precisely because they are hard. Enemies hide behind environmental objects, or sneak up behind us, yet some games 'cheat' using 2D radar displays and HUDs because 2D is a better way to access critical information.

Simon Hibbs

2004-07-07 08:13:17
3D interfaces
Extrapolate a square screen into a 3D cube and you get eight corners which will fill up with the dust-bunnies of daily use. How many people actually keep an orderly and maintained home or office that makes efficient use of that entire volume of space? Three dimensions allow for uniquely useful presentation of information, but it's not a good metaphor for long term storage. People's memories are simply not that good. 2D puts visual cues and reminders right in your face; 3D will provide more ways to misplace things. Search mechanisms are the future. Presenting search results in 3D is intriguing, but 3D as a desktop paradigm is a toy until there are actual 3D interface devices to manipulate them.
2004-07-07 10:50:05
Some constructive criticism
"With this device on screen, the user should be able to interact with it in a similar way to the real device."

I'd strongly disagree with this concept. If plugging in my iPod opened up a virtual floating 3D iPod, complete with wheel and miniature screen, somehow I can't help but think that would be a vastly inferior interaction to iTunes simply opening and showing me my music in a good old 2D list.

And again, the camera example seems very inefficient when the computer screen can easily allow you to view all the pictures on the camera as tiled thumbnails in a boring old 2D window. A camera works the way it does, "flicking" through photos one by one, because its display is small, and it's meant to be held in your hand. I don't want a representation of the physical camera device on my screen that's harder to interact with because I now have one cursor in place of ten fingers.

I think both of these are an example of the common fallacy of trying too hard to mimic a real-world device in the name of making things "intuitive." (Ever use QuickTime 3's terrible volume wheel?)


To my way of thinking at least, your CD burning use case is backwards. When I want to burn a CD, I start with, "I have some items and I want to put them on a CD." Needing to go somewhere else (a media cabinet) and make decisions about CD format, etc. before I can select my items to burn, grates against my normal workflow. It seems more natural to say, "put this file, and this file, and that file" -- which are likely already sitting in front of me because I'm doing stuff with them -- on a new CD that I am going to make; worrying about format, inserting physical blank media, etc. should all be delayed until the last possible moment.

"I have some items -> now put them on a CD" just feels more psychologically natural in most cases than "I want to burn a CD -> now what should I put on it?".

2004-07-07 15:29:10
Application Preferences and Pipes (Links)
I'm not sure if this is relevant in this context, but I was wondering if you might also include additional use cases...

For a given application, which in normal operation would be a 2D application, would it also be relevant to keep the main view as the main window, while moving the focus to another side displays the preferences? For example, if I have a web browser interface, and I am interested in configuring the proxy settings, I turn the item to a specific side which contains the interface for setting the proxy. You could have a flat surface on one side representing the focus, and on the back have all the possible alternatives (proxy server settings, mail server settings, presentation level settings, etc). This could potentially turn into a very large, almost spherical item requiring some other means of moving around, though.

Another idea I would like to propose is the concept of linking or piping output from one application to another, like command line pipes or Windows Object Linking and Embedding. This could potentially be a pipe or some form of link (hose, wire, etc) which has the output of a video screen linked to a video recorder. You could link a document to a publishing application so that when a local change is made, the file automatically gets passed to the publisher and you have an updated web site.
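
To give a feel for the linking idea, here is a very small Python sketch. The `Link` and `Document` classes are hypothetical names of my own and are not based on any existing desktop API; the point is only that a visible hose or wire could be backed by something as simple as a list of receivers that get called whenever the source changes.

```python
class Link:
    """A hose or wire: forwards whatever is sent into it to every sink."""

    def __init__(self):
        self.sinks = []

    def connect(self, sink):
        """Attach a receiver; sink is any callable taking one payload."""
        self.sinks.append(sink)

    def send(self, payload):
        """Pass the payload on to every connected receiver, in order."""
        for sink in self.sinks:
            sink(payload)


class Document:
    """A document whose saves are pushed down an attached link."""

    def __init__(self, name, link):
        self.name = name
        self.link = link
        self.text = ""

    def save(self, text):
        """Store the new text locally, then notify whatever is linked."""
        self.text = text
        self.link.send((self.name, text))
```

For example, connecting a publisher's upload function with `link.connect(...)` and then calling `Document("index.html", link).save("new text")` would hand the publisher the file name and its new contents without the user doing anything further.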

I think the possibilities here are interesting.

2004-07-07 20:13:49
What we need is a state of "balance"
It's a state of "balance" we're after. Making a "3D everything" desktop would serve as much purpose as putting 20 gas pedals in your car...(one for each direction you might want to go)...instead we put 1 gas pedal and a steering wheel ;)

Any idea for an improved desktop should try to strike a balance between intuitiveness and efficiency. Obviously the definition of "ease of use" will vary greatly between users and therefore the desktop should be adjustable enough to accommodate users on both ends of the "techno" scale.

Perhaps the best attribute for a 3D desktop will be the more efficient use of "real estate" on the same old desktop, which, in turn, will give the perception of a larger and more user friendly desktop....not to mention that it looks pretty slick ;) My guess is that any intuitiveness will come from the creativity of the software developers utilizing the new 3D technology...well, some things never change.

2004-07-07 21:12:08
Re: comments on Jono's ideas and Simon's response
I think Jono has some good ideas. It would be great to have someone try to prototype some of them and see how they actually play out in practice. That is one of the reasons why we at Sun made the source of Project Looking Glass public.

In response to Simon's concerns, I agree that in a 3D interface there is an additional degree of freedom for screwing up the user's experience. So in designing a 3D interface, one must be extra careful to "do no harm" (as they say in the medical profession). But I also feel that we are on the cusp of an incredible time of brainstorming and experimentation. It's a time of trying out weird ideas and allowing creativity to flow.

I also have a comment about the appearance of slanted text. The reason why slanted text doesn't look nice is that it is using simple linear filtering. There are better types of filtering which can be used when the window is slanted, such as anisotropic filtering, which does a better job of filtering high frequency content (such as text characters) when it is displayed slanted in perspective. Many graphics cards support this type of filtering today. Also, because this type of filtering is more expensive, it is best not to use it for windows which are parallel to the image plane, but instead to use it only for windows which are slanted.
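
The "only pay for it when the window is slanted" rule can be sketched in a few lines of Python. This is purely illustrative: the function names, the one degree threshold and the convention that the viewer looks along the Z axis are assumptions of mine, and a real implementation would set the corresponding texture filtering state in the graphics API rather than return a string.

```python
import math


def slant_angle_deg(window_normal, view_dir):
    """Angle in degrees between a window's normal and the view direction."""
    dot = sum(n * v for n, v in zip(window_normal, view_dir))
    norm = math.sqrt(sum(n * n for n in window_normal)) * math.sqrt(
        sum(v * v for v in view_dir)
    )
    # Clamp to guard against tiny floating point overshoot outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def choose_filter(window_normal, view_dir=(0.0, 0.0, 1.0), threshold_deg=1.0):
    """Pick the cheap filter for face-on windows, the costly one otherwise."""
    angle = slant_angle_deg(window_normal, view_dir)
    return "anisotropic" if angle > threshold_deg else "linear"
```

A window facing the viewer head on, with normal `(0, 0, 1)`, stays on linear filtering, while one turned 45 degrees, with normal `(1, 0, 1)`, gets the anisotropic filter.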

As for Simon's comment that displays are inherently 2D, there are indications that this may be starting to change. I just saw a glass-less stereo LCD display on a Sharp laptop that was very cool. It has a "sweet spot" about 1 inch wide. When your eyes are in that position, you can focus in and see whatever is displayed in actual and true 3D, without having to wear any glasses. It still has a way to go, though. I found that viewing a medical image of a skeleton was awesome, but trying to play a car racing game was hard on the eyes, because when you begin to go around a turn your body tends to lean and your eyes drift out of the sweet spot. So your eyes are constantly working to try to refocus. The guy from Sharp said that they are going to be working on improvements to widen the sweet spot in the future. The point of my mentioning this is that true 3D displays are not so far off in the future as I had previously thought.

2004-07-08 00:02:06
User interaction is key
I work with maps and mapping data day-in and day-out. I even use products to visualize landscape in 3D perspectives and for real-time "fly-throughs". It's fine to do, but I find that the hardware side of things is more limiting than the software side. The 3D desktop concept is necessary, to be sure, to open up some types of applications. But I think it's the cart before the horse. Really good alternatives to inherently 2D devices (mouse, pointer, even gaming steering wheels) need to get more exposure.

I think it was about 8 years ago I learned about a friend using the "space ball" 3D mouse. A ball you can move in 4 degrees of freedom or something like that. Have I ever seen one? No.

I suggest that the linear thinking that our minds appear to be used to with 2D displays also appears to appreciate linear input devices.

I'm not suggesting space balls here (neither the mouse nor the movie :) but think that 3D interaction tools are lagging behind the 3D desktop type comments.

I'd like to see the 3D desktop, coupled with some kind of more "natural" (not "intuitive") tools for the purpose of navigating through spatial data and maps.

In case you missed my last weblog, this site has an interesting and pretty real example of a user interface that gives some of the freedom needed for truly 3D tasks.


2004-07-08 02:37:15
Borrow ideas from PDAs
2 mouse buttons are actually hard to understand for some people, let alone 3 buttons! I'm more into a tabbed window and pulldown applets interface, where everything can be seen and tools are a click away.

Mike Austin

2004-07-09 12:06:02
Re: Beyond the dimension
The first I read about 3D desktop environments was the 3Dwm project. Most of the reactions to it were negative. And if all you are going to do is display a 2D environment in 3D, I would agree that is pointless. While I am sure that the kinds of interfaces Jono describes will be implemented on Project Looking Glass, I don't see that it adds much value to the user. It is basically the difference between a simple progress bar and a piece of paper flying between 2 folders. However, I respect his at least making the effort to think about it.

My own thoughts on the subject are that it would be easier to develop 3D applications in a true 3D environment (which is sort of where Jono was heading). One such app would be a national trucking dispatch center. If the trucks had computers that included GPS, then the central dispatch center could keep track of their location and current inventory. A 3D app would allow a dispatcher to see the location of the trucks from above, but by tilting the display, it could provide visual information such as the available space, how long the driver has been on the road, etc. This would help a dispatcher to quickly route the most appropriate truck to pick up new orders making more efficient use of the fleet. This kind of app might also be adaptable to large warehouses, expanding the market. (Of course, these kinds of apps already exist and are probably adequate without being 3D. But it does feel like the kind of app that would make sense in 3D.)

The most basic issue (which Jono alluded to) is the widget set. What kind of widgets would be necessary to build 3D applications? If you go for specific widgets that represent real objects, you are limited to building apps for those objects. However, if you have more general widgets that provide at-a-glance information about the real objects they represent, such as quantity available, status, type, access, etc. using visual values such as skins, colors, shapes, contents, etc. then many more apps, that can't even be imagined yet, will be built.

I believe that a few specific applications will have to be successful before any significant development is aimed at a 3D platform. But once this happens, the flood-gates will be opened, and the early adopters will be riding the crest of the wave. Could this be the "killer app" for Linux?

2004-07-09 17:03:04
Misc comments on Jono's ideas and others response
Great discussions, Jono and others! While reading them all, some comments popped into my mind. Please let me share them (although it became long)...

Here is short summary of my thoughts:
- "3-D" doesn't imply Virtual Reality experience
- We need to free ourselves from the traditional WIMP metaphor when thinking about 3-D UI
- Visual richness/affection to UI matters

I totally agree with gaussian's point in his comments about "what we need is a state of 'balance'". It is very easy and tempting to think about making a 3-D desktop environment like virtual reality, or to mimic what first person shooter games do, but I believe it is not the right direction. As simon_hibbs indicated in his comments, there are lots of such attempts that didn't work.

Interaction with a real life environment has completely different constraints than a 3-D desktop environment. A computer UI usually has tight restrictions on input and output (or interaction with the environment) compared to real life, whereas a computer-based (3-D) environment offers things that are almost impossible in real life, like searching through all the files in a reasonable amount of time and scaling/morphing things on the fly, as monkeyt touched on a bit in his comments (and search is one big feature in Microsoft Longhorn). I think we need to use our brains a bit harder to identify the right metaphor for a 3-D desktop environment, paying attention to its constraints.

You may say 3-D is hard to navigate. It is true in general. We can give up the idea by just saying that, but my interest is to come up with idioms/metaphor that helps us to achieve operation we want in reasonable manner. Those idioms may not reflect the real life at all... well, I hope I know the answer for such a metaphor, but I don't have anything concrete yet...

Here is a related article, FYI:

Ben Shneiderman (2003),
Why Not Make Interfaces Better than 3D Reality?,
November/December 2003, IEEE Computer Graphics and Applications,

I guess you have heard about the WIMP (windows, icons, menus and pointer) metaphor, which is the basis of today's 2-D desktop. This is a great metaphor. Without it, we would just have a blank sheet of paper on the screen. We could do whatever we liked with it, but I'd have just gotten lost. The WIMP metaphor provides some restrictions on the way we interact with the environment, but eases our interaction significantly. I guess my interest is to find a 3-D version of it.

WIMP was designed based on the computational constraints we had more than 20 years ago. Now, we have completely different constraints. In particular, 3-D graphics is cheap. Requirements are also different. These days we have more things (e.g. files, applications, etc) we need to operate on. I think we need to overhaul the metaphor a bit (a deep cascading application start menu is an example of pushing too hard on WIMP's "menu", which was designed when we had a relatively small number of things to operate on).

But it is very difficult for me to think about a 3-D metaphor, since our (or, at least, my) way of thinking is all contaminated by today's WIMP UI. In that regard, I appreciated and enjoyed Jono's attempt to think in an out-of-the-box way. SANSing asks in his comments "What kind of widgets would be necessary to build 3D applications?" and gives some great ideas. I also guess we will come up with a very different metaphor than today's WIMP, especially leveraging the expressive capabilities of today's computer graphics.

I guess I went too long already, so I'll wrap up quickly.

SANSing also says that "if all you are going to do is display a 2D environment in 3D, I would agree that is pointless." Well, if we display the entire desktop on one quad (which is what 3Dwm does), it is less exciting. But if we display each 2-D window in 3-D space, and try to add some additional value leveraging 3-D-ness, I believe the thing becomes much more interesting (this is what Project Looking Glass is doing).

Getting back to visual richness/eye candy discussion, you might find the following article interesting:

Norman, D. A. (2002),
Emotion and design: Attractive things work better.
Interactions Magazine, ix (4), 36-42.

I liked the following quote:

Wash and polish your car: doesn't it drive better?

Also, I found the story of the initial reaction to color displays encouraging for pursuing the possibilities of a 3-D desktop environment (even though I hear lots of disagreement with it ;)

Lastly, let me express my thanks to Jono and the other folks who provided comments in this interesting discussion. I enjoyed reading them very much :)

2004-07-12 19:29:59
3D interfaces
I would go further than that.

In the home you can organize things however you want. You can even be disorganized and rely on your memory to find most things (except that book you borrowed three years ago before your last move).

In an office a responsible employee is organized.
What if he is sick, and others have to look for that document he was working on?
What if he leaves and someone else has to take over his project?
If he has been organized then these things will be possible.
They will be straight-forward if the office is organized, has reasonable procedures, and staff who have been trained in them.

Computers could make the reasonable task of searching an organised information store easy, and searching a disorganized mess possible.

The last thing we want is a way to make a disorganized mess even more so.
A more intuitive interface should merely be a part of helping people to understand structure, and of facilitating progress towards better organization of information.