Linux and Audio Production: Simplicity Required

by Jono Bacon

I am a musician. I spend a reasonably large proportion of my life creating and recording music. My home studio has all the marks of a musician - guitars, drums, mics, mixers and of course a computer. Despite the fact that pretty much all of my computers run Linux, the studio box is running Windows 2000 so I can use my sound recording tool of choice: Cubase.

From an Open Source consultant and advocate's perspective, that computer is obviously a chink in the armour. To replace it with a Linux box and achieve the same results is a real challenge though. There are simply no multi-tracking applications on Linux that provide a comparable experience in terms of functionality and integration. Don't get me wrong, there are certainly efforts going into this area, with applications such as Ardour, Wired and Rosegarden, but these tools face a number of uphill battles in winning me over. The interesting point is that the challenge is not focused so much on features but on usability and integration.

It is fair to say that the requirements for audio engineering are fairly complex. The need to record audio at different levels of quality, layer on further tracks, mix them, apply effects, edit waves, perform overdubs and mix down are all essential requirements for the sound engineer and musician. Each of these features is no more or less important than the others, and they all play a key role in creating a quality recording. When you read through the feature list for many of these tools, they offer the kind of features that I am talking about above. They certainly allow you to record tracks, cut them, adjust their volume and EQ in the mixer, apply some effects and mix them down. On a hard technical level, the feature list is largely satisfied - it is the 'soft' requirements that are the issue.

When you are identifying the requirements in any kind of software development, it is always essential to prioritise both the hard technical issues and also the 'soft' social issues. As much as supporting the above features is essential, it is also essential to match the mental mode of operation that the user operates in. When people are recording music, this mode is creative, and technology is typically relegated to unimportance - it should just work. When I am making music, I don't care for technology. I don't care about the spec of amps and guitars, I don't care about the technical characteristics of the mixer; I just want to plug in and record. The time between the birth of a song and getting it down on disk must be short - the creative mind is hampered by tiny technical issues, and these issues are unacceptable. As such, any technical barrier in front of creativity is a real issue, and this is where the Open Source solution really needs focus. The problems here are not just for those who create the multi-track software applications, but for the entire software stack from the kernel up.

The two flaws: integration and usability

Integration is a key problem in the current Open Source offering, and this is a responsibility of both the application developers and the distributors. If you try to run one of the many multi-track applications, it will need to talk to one of any number of sound systems. This not only requires me to understand what a sound system is, but I also need to dig through the documentation and determine which one it is, how I run it and in which mode. As I am sat there with my guitar resting on my lap, this is one of those frustrating technical barriers.

The integration issue is proportionate throughout the entire system. If I plug in a USB sound card, I want all of my applications to make use of it. Not only that, but I want to be able to configure the sound card from within my application. If you have a simple single channel sound card, the audio mixer will suffice, but if you have a complex card with 10 ins and 10 outs and multiple recording modes, you need an application to manage this. This is where cards such as the M-Audio Delta range fly on Windows - they come with a little control application to manage these parameters. You can certainly control levels with the ALSA mixer, but it will not allow you to deal with the many other options for the card.

The components in the system that do not affect the production of a recording from an interaction perspective need to be fundamentally invisible to the user.


The second issue is usability. Multi-track tools are renowned for being complex to use. This complexity is not necessarily an issue with the concept of recording audio into tracks, but the issue of having the requisite knowledge to spit shine the track with EQ, dynamics and effects to get the best out of it. This knowledge sits outside of the application. The same can be said for IDEs - creating a project in an IDE is fairly straightforward; the challenge lies with understanding the code - an entirely separate issue.

The solution to this problem is presets. The vast majority of users who record music are recording within the established remit of a genre. As an example, I record a lot of metal. This genre has some common traits when recording - the guitars are present and fairly scooped, the vocals are up front but slightly recessed in the mix, the bass drum plays a prominent role and requires a 'clicky' tone with high mid-range. I also record the entirely opposite ambient/classical style, which also has modes of practice - warm acoustic stringed instruments that are layered and panned throughout the stereo field, very present and up front vocals, plenty of reverb and delay etc.

Each of these modes of practice can be reasonably implemented in sensible defaults throughout the entire application. This not only applies to effects, but to other areas. Some ideas:

  • You could create a new song based on genre. As an example, for a rock band there are typically two guitars, a bass, vocals and drums. This feature would create the tracks, name them and apply the default effects and panning.

  • All effects need sensible defaults. The common effects such as chorus, reverb, compression, limiting, wah, flanger and others can all have reasonable defaults, and tools such as Cubase do include some impressive defaults.

  • Mixing can also have reasonable defaults. EQ is a science that many don't understand, and a solid set of defaults can satisfy both common mixing needs and special effects such as simulating AM radio and phone lines.
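
To make the genre idea concrete, here is a minimal Python sketch of how a "new song from genre" feature might work. Everything in it is an assumption for illustration: the Track class, the GENRE_TEMPLATES table, the new_song function and the specific pan and effect choices are hypothetical and not taken from Cubase, Ardour or any other tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    """One track in a song, with the defaults a preset would fill in."""
    name: str
    pan: float = 0.0  # -1.0 is hard left, 0.0 is centre, 1.0 is hard right
    effects: List[str] = field(default_factory=list)


# Hypothetical genre templates: (track name, pan, default effects).
# The values are illustrative guesses, not recommendations from any real tool.
GENRE_TEMPLATES = {
    "rock": [
        ("Guitar 1", -0.6, ["compression"]),
        ("Guitar 2", 0.6, ["compression"]),
        ("Bass", 0.0, ["compression"]),
        ("Vocals", 0.0, ["compression", "reverb"]),
        ("Drums", 0.0, ["limiting"]),
    ],
}


def new_song(genre: str) -> List[Track]:
    """Create named tracks with default panning and effects for a genre."""
    try:
        template = GENRE_TEMPLATES[genre]
    except KeyError:
        raise ValueError(f"no template for genre {genre!r}") from None
    # Copy each effects list so editing one song never alters the template.
    return [Track(name, pan, list(effects)) for name, pan, effects in template]


if __name__ == "__main__":
    for track in new_song("rock"):
        print(track.name, track.pan, ", ".join(track.effects))
```

The point of the sketch is that the musician makes one choice (the genre) and gets a session that is ready to record; every default remains an ordinary value that can be changed later at the mixing stage.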

Many of the issues of usability can be easily solved by identifying the kind of steps required to achieve a common goal. For many people who record, they are often stood up holding an instrument in a small room filled by a band. Interactions with the computer need to be kept to a minimum. The kind of visual interface requirements for recording and the requirements for mixing are entirely different. Recording is simple - you need to manage the stream of audio coming into the computer and assign it to a track, with some minimum level management. Mixing is an entirely different beast in which the entire range of features in the tool needs to be readily at hand. Mixing is a process that you conduct on your own with a beer, recording is a process you conduct with amps, guitars and band members to contend with.

The application should also hook into the desktop and be intuitive. Although Ardour has been touted as one of the tools with the greatest potential, a real sticking issue is the fact that it looks so drastically different from the rest of my GNOME driven desktop (Ardour uses GTK) and is rather unintuitive. With some experience behind me of using Cubase, Cakewalk and Magix Audio Studio, I suspected Ardour would be a cinch to pick up - unfortunately I found it impossible to be productive straight away. If I can't use it, how is someone with no knowledge of audio recording supposed to use it? Ardour is certainly not the only offender here, and this seems to apply to a number of tools.

Finding a solution

The solution to the problem is integrating key, predictable components and making them work flawlessly. In all honesty, if I cannot download the software and make it work straight away without tinkering around with sound servers and such, it will not get a look in. Period. When you download and use Firefox it just works, when you use it just works, when you use The GIMP it just works - when you use Cubase it just works.

Part of this challenge is using comprehensive frameworks for building applications. It seems that GStreamer is becoming a very prominent framework with good support from a range of different applications for different desktops. In addition to this, HAL and DBUS are becoming the de-facto solution for managing hardware. With this in mind, hardware specific issues should really be directed to the kernel and HAL/DBUS teams. This will ensure that changes will propagate upwards through the stack and ease integration. From some discussions with the GStreamer and HAL teams, it seems that the kind of plug and play philosophy regarding hardware and software is becoming reality. With GStreamer and HAL shipping with all distributions, there is the opportunity for the application to just work. The work can then concentrate on being a great multi-tracker.

I am convinced that the problems discussed here have readily available solutions, but I think opening some dialog with the providers of different parts of the stack needs to happen to allow the solution to develop. Creating an integrated and usable system for audio engineering is something that will require cooperation from different parts of the community. This has worked elsewhere with other problem domains, and I see no reason why it cannot work here. Let's see how the story pans out...

What are your thoughts and experiences? Can audio production on Linux get easier? Can we achieve the simplicity experienced on other systems? Are there any interesting developments occurring that will solve these problems? Share your thoughts below...


2005-07-22 13:10:51
Sound Servers
Great article. I just started working with audio and video on Linux. I have been using Linux for years. The issue that I think was not mentioned here is the sound server. Linux is not going to make it as a desktop let alone an audio workstation if something doesn't change with regards to a reliable multi-channel sound server. Just getting your instant messenger to notify you of a message while you are playing music from XMMS can be a huge challenge in and of itself. Applications like Skype and Real Player don't even talk to the sound server and cause other issues.

To complicate things more, you need a low latency sound server for audio recording. Jackd seems to be the front runner at this point. It is good once it is running, but patching the kernel and configuring it can be tough. Then on top of that to get your other programs to have sound when running Jackd you have to map them through Arts or ESD. Now you have two sound servers running on top of your sound drivers. GStreamer is making headway for codecs, but its Jackd support is not complete yet.

Like many areas of Linux, until a standard sound server is accepted it is going to be difficult to achieve the ease of use that you have discussed here, further limiting Linux's acceptance as a workstation at any level.

2005-07-22 13:56:46
Thank you
This is the ONLY area left to windows in my computing world. Wired looks very promising but still has a way to go.
I'm totally wishing that Cubase or even Cakewalk would code up a Linux version. Of course until the audio subsystems unify and get better then what can be done.

Thanks for the article!

2005-07-22 20:25:47
Linux for Audio
Yes! This is exactly why I have not been able to migrate to Linux. I have tried several times in the last 7 years to see if Linux can fully meet my needs and one of the two reasons I can't is the lack of "musical" audio applications. Your article speaks well to the specific problems with audio apps on Linux.

2005-07-23 03:15:11
Professional Music
" When I am making music, I don't care for
technology. I don't care about the spec of amps and guitars, I don't care about the technical characteristics of the mixer"

This seems a strange comment.

As a studio engineer we spend time sourcing just these very things for a studio!

As regards getting instant messenger and Skype, use a desktop machine not an Audio workstation.

Ardour and JAmin are there as regards sound and usability without a doubt.

Yes you have to work with it and learn the system.
Fervent software has a live CD you can try and XMMS and such work as is.

You can load this to the HD and you have a fully working Audio workstation.

If your engineers are worried about their IM maybe they should be working from home!



2005-07-23 06:08:48
some useful links Jono forgot to mention
(major resources list)
(AGNULA/Demudi distro)
(Planet CCRMA packages)
(Studio To Go live CD)
(mail list for Linux audio developers)
(mail list for Linux audio users)

2005-07-23 06:29:35
another POV
If anyone reading this article would like to read how someone does make music with Linux audio software, feel free to cruise to and read my on-going series of articles (At The Sounding Edge). Column #18 recently went on-line, it's a brief profile of the neat FreeWheeling looper.

O'Reilly really needs someone with Linux audio savvy to write for them. Nothing personal, Jono, I understand that it isn't your main domain. I also agree with some of your points, though I think you might have got a better idea of the current state of usability issues if you had consulted the Linux audio mail list archives.

I should also have added this URL to my previous list: (Jan Weil's nice Wiki/blog for Linux audio software users)

Finally, I must note that there's a lot of music being made with Linux these days, as a search through the Linux audio users mail list archives will show. Major tools include Ardour, JAMin, Rosegarden, MusE, ecasound, and a host of other audio/MIDI apps and utilities.

Best regards,

Dave Phillips

2005-07-23 06:39:29
Linux Audio
I think the objectiveness of this article is tinged with familiarity with the product used by the author. Someone new to audio software on a windows/mac computer would face the same learning curve as using a linux based computer.

Also possibly more research was called for as there are some omissions I feel.
Missing are the audio distributions , which is installed on a Redhat or Fedora core, Studio To Go from Fervent and just started 64Studio plus audio related groups at Gentoo and PCLinuxOS. These all include integrated audio packages and low-latency kernels.
Jackd does not get a mention and envy24control (as an alternative to alsamixer) should be used with ice1712 based cards, MAudio among them.
Windows needs to be configured properly to achieve the best performance.
Some studios run linux now.

Regarding hardware this is an ongoing issue for linux in general with the majority? of manufacturers still not providing drivers or specifications so that drivers can be written.

"when you use Cubase it just works" is not exactly accurate as some of the related mailing lists reflect. This also applies to the other win/mac audio software products. There are people whose experience with the win/mac audio software has led them to decide to move to linux.

Finally linux audio is getting there. It may not have the presentation of paid-for products but I for one prefer function over form.

2005-07-23 16:15:26
Professional Music
I think that the issue about me not caring about the technical specs of gear is possibly because I fall into both the musician and sound engineer camps.

A lot of it is simply to do with creativity. When I play and record music, I am in a creative mode, and this mode is different to when I am deeply involved with technology.

As for the comment on IM, I don't get what you are referring to?


2005-07-23 16:16:05
some useful links Jono forgot to mention
Thanks for adding these. I am sure they will be useful for readers of the article. :)

2005-07-23 16:24:57
another POV
Don't get me wrong, I am not denying the fact that there is a lot of effort going into music software on Linux.

The point I am making is regarding the usability of out of the box music software today. If I take a typical Linux distribution and install a package of Ardour and Jack, it does take some knowledge and poking around mailing lists and documentation to get it in a state that will record. This is my point. With other types of application there is no pre-requisite poking to be performed.

I am certainly keen to learn how these issues are being approached, and although I have read up on the subject fairly well, I understand there is still lots more to learn.

As I stated in my article, feature-wise, much of it is there. I think the problems with usability still stand though. I did give Ardour a serious try, and I just found it unintuitive. I think there is plenty of scope for improvements, and I would be keen to help feed back some of these thoughts to the Ardour hackers. :)

I am very, very open to hear discussion on all of this. I understand that many will agree and many will disagree, and I am more than happy to engage in some constructive discussion and look at all aspects of the discussion. Thanks for your interesting comments.

PS - Thanks for the link to the Audio Blog. Subscribed in Bloglines now. :)

2005-07-23 16:30:53
Linux Audio
Thanks for your interesting comments.

I think much of your post is getting to the heart of what I consider the issue. By saying there needs to be more research, you demonstrate that a typical user does need to perform some fairly considerable reading to understand how to get cracking with a Linux based audio workstation. There is a pre-requisite of research that needs to occur with Linux, and this research is not required with other systems - this is my point.

I am well aware of pretty much all the tools you mentioned, but the user should not need to dig through sites, mailing list archives and documentation to discover this. The tools should really be ready to run, and the custom Linux distros for music production do go a long way to solving this, but this seems to be an all or none solution. Should I really run an entire system customised for music production and nothing else?

Sure, this is fine for a dedicated studio computer, but this may be more impractical for the bedroom amateur.


2005-07-25 07:51:33
Imagine this X servers!
To me you're partly right. Linux really needs a unique sound server at the basic OS level. Just imagine if each desktop had its own X server: you could not run any KDE app inside Gnome or Xfce. Moreover, if this server would accept one app only at the same time, you could not display several windows in your screen!

With sound servers, the reality is not so far... You're running esd? No KDE app is able to play a sound. You're listening to an Internet radio in Gnome? Don't expect to watch (hear, I should say) small videos simultaneously. Perhaps this is possible, but this certainly requires lots of Internet searches to get the right system tuning... Want to edit sounds? Then you catch an error because the app's default sound server cannot access the hardware: esd already handles it.

It's a shame when you consider the numerous existing high end audio apps. Nevertheless I remain confident about the future of Linux audio: ALSA was integrated in the kernel only last year, so be patient and allow the time needed to make things better!

2005-07-25 09:16:02
Linux Audio
"The tools should really be ready to run, and the custom Linux distros for music production do go a long way to solving this, but this seems to be an all or none solution. Should I really run an entire system customised for music production and nothing else?

Sure, this is fine for a dedicated studio computer, but this may be more impractical for the bedroom amateur."

That's the good thing about Studio to Go! Because it's a live CD you can use the computer for whatever else you want, then you pop in the CD and reboot when you want to make music, changing your everyday computer into a dedicated studio computer.

2005-09-24 07:41:31
Sound Servers
I come from an Apple-centric background and what I can say is this in regards to Linux as a pro-audio platform: What Linux has going for it is the possibility of radically modifying its infrastructure for a specific purpose. This Linux can do. Windows can not and neither can Mac. I've run real-time audio workstations in all three environments and while the unix underbelly of mac OS X drives amazing speed and reliability, (way beyond Windows) it is still held back by the fact that it's doing so through the constraints of a full featured operating system with bouncy colorful animations. The operating system itself is governing resources, and very smartly no less, but... If a custom audio build or distribution of Linux could be engineered, for the sole and total purpose of driving ONE audio suite, and that used ONE universal audio server, that would be ideal - and I'm sure it's possible. No musician or sound engineer would complain about having to set up a separate install of Linux if they knew it was a streamlined sound machine utilizing every "bit" of hardware resource for real-time audio recording and playback. I would stare at a static black and white screen if I thought it would outperform the Mac systems. The only thing near to this so far I've seen is a commercial sampler for windows only (ironically) called Giga. Giga can loop, process and fire a tremendous number of simultaneous samples with very little latency, simply because the programmers found a way to grant the application direct access to the hardware, punching a hole through Windows so it could not control. So, if there are any Linux developers out there I take off my hat to you for your patience and dedication but to build a pro workstation please sacrifice everything else. Strip off the multi-server/multi-user/multi-color/multi-GUI, and multi-purpose stuff.
Simple but detailed, integrated software that does not require bugfix updates every seven minutes or constant references to as many user forums to troubleshoot something every week; This is what a professional audio engineer/recording artist needs. A system that just works and is not Beta version .0003. Build a Linux engine that drives audio, like a f***ing unix racehorse and does not try to do anything but and you'll find a whole lot more friends in the Linux universe (who already know unix code).

2006-02-24 12:34:36
Try Dyne:bolic
You may find the bootable Linux CD available at the cure that ails your integrated Linux digital audio workstation woes. It is painless to download the iso image of the disk, burn it to CD and reboot your windows box from this disk to test its abilities.

It creates a low latency kernel Linux OS in a RAM drive without touching your hard drive. It comes with a suite of integrated opensource audio, video, image and text generators, recorders, editors and mixers for your creative endeavors.

The audio tools are mostly "Jack aware". The Jackd server creates a virtual patch bay between jack aware tools and your sound card. A jack tutorial is available here
This allows very flexible connectivity between a variety of audio tools and in my humble opinion greatly increases the possibilities for studio creativity.

Enjoy! - Dr Funn -