Published on O'Reilly (http://oreilly.com/)

ETel and Your Second Language

by Matthew Chmiel

Computer Assisted Language Learning (CALL) is a term linguists use for programs like Rosetta Stone. It also happens to be a wonderfully apt acronym for this article. I recently created a CALL program with Asterisk called the Language Dialer. The Language Dialer uses the telephone to record practice conversations for language students in any language. All you gotta do is call.

I built the Language Dialer (LD) because I started teaching myself Bengali a year ago (my wife's language). My lessons come from a "Teach Yourself Bengali" book and CD I bought soon after we got engaged. I was a student at New York University's Interactive Telecommunications Program (ITP) at the time, taking Shawn Van Every's Producing Participatory Media class, when I was inspired to turn my Bengali learning experience into a new kind of language acquisition tool.

I developed this project from the perspective of a current student, wondering what kind of interaction would enhance the lessons of my "Teach Yourself Bengali" book. As a result, the first few versions of this project were quite self-involved. I used a web camera to film myself reading a Bengali children's book, expecting viewers to use the comment section of my Movable Type blog to point out errors I made in the reading (who would watch this?). This first version set a very limited context for a specific target audience (my wife). And even she wasn't interested.

I enhanced this model in the second version, using SMIL coding to link a QuickTime recording of me reading a Bengali headline in an embedded player. I improved user commenting by incorporating Asterisk — providing a phone number for users to call in and record a comment, critique, or reaction to my shaky Bengali. These projects were educational failures that motivated me to expand the original concept. I still needed help with my Bengali, but I had to open up this process to other students.

Asterisk facilitated better user interaction and allowed me to expand the idea of creating a unique language-learning tool on the Web. I decided to enhance the Asterisk-powered interactions by emphasizing audio over text. Rather than reading Bengali script, I was going to simulate Bengali conversations. Leaving text aside would also eventually let this project support any and every language on the planet, but I will get to that in a minute.


The third iteration of this concept was BanglaBollo, an all-audio prototype I introduced at the 2007 O'Reilly ETel Conference. This was a Bengali-centric Asterisk program for students interested in practicing three conversations stored on the site. Each conversation followed a specific scene, like ordering food in a restaurant, where one side of the conversation was pre-recorded and played back when a student called a number. A student who called the restaurant scene would hear my wife's voice playing the part of a waiter asking questions about his order. Between questions there was a four- or five-second pause, just long enough to respond. It was up to the student to listen to each question and reply as if it were a real conversation. The whole exchange was then recorded and stored as a .wav file on the page, making it a collaboration of sorts: other students could participate and also listen to the recordings other callers made with the same audio file.

I ran into a problem with BanglaBollo. I had hoped to get critical user feedback from Bengali students using the site. Online resources for Bengali students are rare despite the fact that (depending on who you ask) Bengali is between the fifth and seventh most spoken language on Earth. I aggressively solicited use and comments from Bengali Associations across the country. Bengali Associations are dedicated to preserving Bengali culture, and many members send their children to Bengali language classes on weekends. I got several responses to my solicitations, all of them saying basically the same polite thing: Thank you, this is nice. But no one used the site. I can only assume the reasons are: 1) They did not trust me. Who is to say I am not recording their conversations for disreputable reasons? Why would anyone heed the solicitations of a random student (who isn't Bengali) trying to get students speaking better Bengali? 2) They did not like or understand the scenes I recorded. The pre-recorded audio samples were poorly thought out and in no way meaningful to any teachers. I remained too involved in the process, depriving potential users of the personal context that is necessary to motivate use.

The Language Dialer

The Language Dialer was built out of the lessons from my BanglaBollo experience. It offers the same Asterisk-powered audio interactions, where students call a phone number to record practice conversations that are divided into scenes. The LD updates the BanglaBollo package by making all content user-generated and offering the service for every language on the planet. The LD is a social network of registered students and their tutors. A tutor is anyone a student knows who speaks the student's target language. My wife and in-laws are all my Bengali tutors on the Language Dialer. The web site harnesses this relationship to offer content to everybody, whether registered or not. If anyone reading this is struggling to develop conversational Bengali, he can benefit from the scenes provided by my tutors. This goes for any language.

The Language Dialer divides every language into 10 scenes. The scenes are empty shells; they are contexts for tutor-generated samples. It is up to tutors to record the parts. I designed the scene themes based on samples I would like to practice as a Bengali student. These scenes follow everyday conversations, from simple introductory dialogues to complex social interactions like attending a wedding. Each scene consists of four or five questions following the theme of the scene. Between questions is a pause long enough to record a student's response. Each scene is recorded by tutors registered with the site. A student asks his tutor to record a scene via email. The email gives detailed instructions on the scene's theme as well as how to record. This recording is done in Asterisk, and is as simple as leaving a voicemail. Once the recording is made, the web site offers that scene to any interested user. There can be up to five different versions of any scene on the site. This means that a fully populated language on the Language Dialer can offer 50 different conversations to practice.
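The arithmetic behind that figure (10 scenes times up to five versions each) can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the class name `SceneLibrary`, its methods, and the file names are all hypothetical.

```python
SCENES_PER_LANGUAGE = 10      # from the article: every language has 10 scenes
MAX_VERSIONS_PER_SCENE = 5    # from the article: up to five versions per scene


class SceneLibrary:
    """Hypothetical in-memory stand-in for the site's scene storage."""

    def __init__(self):
        # (language, scene_number) -> list of recording names
        self._versions = {}

    def add_version(self, language, scene_number, recording):
        """Store a tutor's recording; return False if the scene is already full."""
        if not 1 <= scene_number <= SCENES_PER_LANGUAGE:
            raise ValueError("scene number out of range")
        versions = self._versions.setdefault((language, scene_number), [])
        if len(versions) >= MAX_VERSIONS_PER_SCENE:
            return False
        versions.append(recording)
        return True

    def conversation_count(self, language):
        """Count every recorded version across all scenes of one language."""
        return sum(
            len(versions)
            for (lang, _), versions in self._versions.items()
            if lang == language
        )
```

Filling every scene of one language to its limit gives a count of 50, and a sixth recording for any scene is turned away.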

The scene-recording platform is designed to protect against vandalism. I am wary of letting anyone record scenes for public consumption. Such a framework would be an easy target for people interested in recording inappropriate content, and that content would be hard to referee. The fact that these recordings can be made in any language makes it even harder to stop. As a result, there are no public instructions on how to record scenes. If a student visits an empty scene, he can enter his PIN to request that his tutor(s) record some content. That request is emailed to the tutor with detailed instructions.

The relationship between students and tutors also protects against vandalism. This relationship existed before either person visited the site; it is designed to match grandparents and grandchildren, wives and husbands, friends and friends, etc. I hope a tutor responds to a request because it is from her grandchild, friend, or student. The desire to help someone dear will not only help protect against nefarious recordings, but also introduce people who do not otherwise use social networks to this one. On a personal note, I offer the recordings made by my in-laws in India as an example; I can assure you they would not contribute to such a network without a personal reason. All that said, a voting system must be integrated so that other users can root out unintelligible or inappropriate content. There is always room for more safeguards.

The Language Dialer framework is a product of my inexperience as a coder. It is a network of three separate accounts: a Junction Networks SIP account, a DreamHost web hosting and database account, and a Lylix.net Asterisk server. This is not the most efficient model for such a network. If any one of these services is down when someone calls, there will be no answer. I plan on integrating as many of these parts as possible, which should considerably reduce the number of places where a single failure can take the whole system down.

Considering my inexperience using and managing these services and integrating all these parts, I would like to thank the tech staff at Lylix.net for helping me resolve the bugs that interfered with this complicated setup. I faced serious permissions issues linking DreamHost database material to the Lylix Asterisk server, and vice versa. The web site links directly to the server in order to play back the audio samples recorded over the phone. This was harder than I expected.

The Language Dialer Dialplan

The LD Dialplan is relatively simple. Every call into the system is filtered through a phone tree that takes unregistered students, registered students, tutors, and phone pals to their specific destination. Each choice in this tree leads to a specific extension that determines what information is required from the caller, as well as what information is available. The information on the web site is organized to accommodate inexperienced, unregistered callers. If a first-time visitor who happens to be learning French wants to try a call, she is instructed to dial the LD phone number and press 1 at the phone tree to indicate she is not registered. The only other number she is required to enter is the scene code number that activates the audio file she would like to practice.
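As a rough illustration, the phone tree can be thought of as a lookup from the key a caller presses to a destination and the input that destination still needs. The Python sketch below is not the actual dialplan. Only "press 1 if you are not registered" and the scene code prompt come from the description above; the other digits, the destination names, and the inputs listed for tutors and phone pals are assumptions made for the example.

```python
# Key pressed at the phone tree -> destination.
# Only "1" is taken from the article; "2" to "4" are assumed for illustration.
PHONE_TREE = {
    "1": "unregistered_student",
    "2": "registered_student",
    "3": "tutor",
    "4": "phone_pal",
}

# What each destination asks the caller to enter next.
# The tutor and phone pal entries are guesses, not documented behavior.
REQUIRED_INPUT = {
    "unregistered_student": ["scene_code"],
    "registered_student": ["pin", "scene_code"],
    "tutor": ["pin", "scene_code"],
    "phone_pal": ["pin"],
}


def route(digit):
    """Return (destination, required_input), or None for an unknown key."""
    destination = PHONE_TREE.get(digit)
    if destination is None:
        return None
    return destination, REQUIRED_INPUT[destination]
```

A first-time caller who presses 1 is routed to the unregistered branch and asked only for a scene code; an unrecognized key yields `None`, which a real phone tree would presumably handle by repeating the menu.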

Each scene is organized with its own ID number — these numbers are stored in a MySQL database. The Dialplan accesses this database information through Asterisk Gateway Interface (AGI) scripting done in PHP. The audio files are recorded on the phone and stored on the Lylix Asterisk server, but named and linked in the DreamHost database. When a file needs to be retrieved, its name is found in the DreamHost database and linked directly to the Asterisk server. Each selection on the phone tree leads to a specific AGI script that accesses a specific table from the database.
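To make the lookup concrete, here is a minimal sketch of what such an AGI script does. It is written in Python rather than the PHP used on the site, and it uses an in-memory SQLite table in place of the MySQL database. The table name `scenes`, its columns, and the channel variable `SCENE_FILE` are hypothetical. The parts that follow the AGI protocol itself are the `agi_name: value` header lines ending in a blank line, script arguments arriving as `agi_arg_1`, and the `SET VARIABLE` command.

```python
import io
import sqlite3


def read_agi_environment(stream):
    """Parse the 'agi_name: value' lines Asterisk sends before a blank line."""
    env = {}
    while True:
        line = stream.readline().strip()
        if not line:  # blank line (or end of input) ends the header block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def lookup_recording(db, scene_id):
    """Return the stored file name for a scene ID, or None if there is none."""
    row = db.execute(
        "SELECT filename FROM scenes WHERE id = ?", (scene_id,)
    ).fetchone()
    return row[0] if row else None


def set_variable_command(name, value):
    """Format the AGI command that hands a value back to the dialplan."""
    return f'SET VARIABLE {name} "{value}"\n'


def handle_call(agi_in, agi_out, db):
    """Read the scene ID passed as the first script argument and look it up."""
    env = read_agi_environment(agi_in)
    arg = env.get("agi_arg_1", "")
    if not arg.isdigit():
        return None
    filename = lookup_recording(db, int(arg))
    if filename is not None:
        agi_out.write(set_variable_command("SCENE_FILE", filename))
        agi_out.flush()
        agi_in.readline()  # Asterisk replies to each command, e.g. "200 result=1"
    return filename
```

In a live script `agi_in` and `agi_out` would be standard input and standard output; passing them in as parameters makes the sketch easy to try with `io.StringIO` objects.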

Registered users are given a 4-digit Personal Identification Number (PIN) when they register with the site. Registration takes little information: first and last names, an email address, a PIN, and the native and target languages of the user. The PIN ensures that the Language Dialer can identify each caller. When a registered student calls and practices a specific scene, he is identified by the web site, as are his tutors. This information is passed into the AGI script using the "Read" command in the Dialplan. As a result, his call will be stored on his own personal LD page where he can keep track of all his calls.
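The identification step can be sketched as follows, again in Python rather than the site's PHP and with hypothetical names throughout (`RegisteredStudent`, `identify_caller`, `log_practice_call`). The sketch also assumes that a PIN alone is enough to pick out one student, which the description implies but does not spell out.

```python
import re
from dataclasses import dataclass, field

# A PIN is exactly four keypad digits.
PIN_PATTERN = re.compile(r"[0-9]{4}")


@dataclass
class RegisteredStudent:
    """Hypothetical record for a registered student."""
    name: str
    tutor_emails: list
    calls: list = field(default_factory=list)  # (scene_code, recording) pairs


def identify_caller(students_by_pin, digits):
    """Return the student who owns this PIN, or None if it is malformed or unknown."""
    if PIN_PATTERN.fullmatch(digits) is None:
        return None
    return students_by_pin.get(digits)


def log_practice_call(students_by_pin, digits, scene_code, recording):
    """Attach a finished recording to the caller's personal call history."""
    student = identify_caller(students_by_pin, digits)
    if student is None:
        return False
    student.calls.append((scene_code, recording))
    return True
```

Keeping the format check separate from the lookup means a mistyped entry such as three digits is rejected before any student record is consulted.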

Additionally, he can email his recording to his tutors for review. They will receive an email that links to the .wav file of the call and includes a phone number and instructions on how to record a critique of the recording. This critique will also be stored on the student's personal page, associated with that recording. This information also lets the student track his own progress.

The final choice in the original phone tree is for phone pals. Phone pals are a two-way collaboration between two registered students on the site. These relationships are formed when one student's target language is the other's native language, and vice versa. If I am a native German speaker learning Russian, and another student is a native Russian speaker learning German, we can be linked by the site for some phone pal exercises. The web site will supply a theme for the conversation — something simple for novice speakers like introducing themselves to each other in a series of voicemail exchanges. The only rule for phone pal collaborations is that only the target language can be spoken. This arrangement utilizes the native expertise of each student. At the end of the exchange, each student will record a critique of the other's performance. More advanced speakers will be given puzzles to solve, like reading a map together where one side cannot see the other and has to give detailed directions.
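The pairing rule described above is simple enough to state in code: two students are potential phone pals when each one's native language is the other's target language. The Python sketch below is illustrative only, and the `Student` record and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Hypothetical record holding only what the pairing rule needs."""
    name: str
    native: str
    target: str


def are_phone_pals(a, b):
    """True when each student's native language is the other's target language."""
    return a is not b and a.native == b.target and a.target == b.native


def find_phone_pals(student, others):
    """List every registered student who could be paired with this one."""
    return [other for other in others if are_phone_pals(student, other)]
```

With a native German speaker learning Russian and a native Russian speaker learning German in the list, each one shows up as a match for the other, while a French speaker learning German matches neither.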

Looking Forward

The Language Dialer is currently up and running. It needs to be used to be useful, so it is up to me to get the word out about this project. There are many updates and improvements I will be working on in the future. I will create a dedicated account page for language teachers so they can create classroom accounts, assign PINs to students, and record classroom-specific scenes. I will lay the groundwork so that these recordings become available to the general population of LD users. Additionally, I will keep this site online, accommodating any critiques or suggestions I get from users.

Asterisk provides a framework for truly innovative interactions to complement our everyday lives. The Language Dialer was built to assist students currently learning a new language. It does not reinvent the wheel, nor does it try to replace the classroom. It can complement any process by which students learn a new language, whether in school, with Rosetta Stone, Earworms, "Teach Yourself," or any other teaching method. It links students with tutors in a way that can span the language gap and geographic distance.

I was looking for a way to get my in-laws to help me speak Bengali, even though they live 7,000 miles away. The LD helps them help me learn to speak not only Bengali, but the Bengali they speak at their kitchen table. Each call lasts no more than three minutes. It can be made with a Skype account, an international calling card, or any other phone connection. If this project catches on, I hope to get more Asterisk server space throughout the world, giving more people access to local call rates.

Using recorded conversations with Asterisk supports the kind of practice that embraces dialect, slang, and individual language quirks, and that emphasizes meaning over vocabulary tests. Additionally, it offers the same services for global languages like Bengali, Hindi, Mandarin, and English as it does for regional dialects like Ilonggo, Pali, and more. I am excited by this project and hope I can turn this idea into something more. In the meantime, I will keep working on this, listening to any feedback from users, and developing new ideas that make people talk more.

Matthew Chmiel is an interactive designer and graphic artist.


Copyright © 2009 O'Reilly Media, Inc.