23rd April 17:18
Thought Eddie and Alex might be interested...
Could I Get That Song in Elvis, Please?
By BILL WERDE
Published: November 23, 2003
Imagine having a singer with a world-class voice at your disposal, any time
of any day. She's just standing at the ready, game to perform whatever
silly song you might make up for her: a ballad about her love for you, a
tribute to your best friend's golf game, a stirring rendition of the
evening's dinner menu.
Close friends of Madonna or Mariah may already have had that pleasure, but
for everyone else a new technology called Vocaloid may offer the next best
thing. Developed at Pompeu Fabra University in Spain and financed by the
Yamaha Corporation, the software, which is due to be released
in January, allows users to cast their own (or anyone else's) songs in a
disembodied but exceedingly life-like concert-quality voice. Just as a
synthesizer might be programmed to play a series of notes like a violin one
time and then like a tuba the next, a computer equipped with Vocaloid will
be able to "sing" whatever combination of notes and words a user feeds it.
The first generation of the software will be available for $200. But its
arrival raises the prospect of a time when anyone with a laptop will be
able to repurpose any singer's voice or even bring long-gone virtuosos back
to life. In an era when our most popular singers are marketed in every
conceivable way - dolls, T-shirts, notebooks, make-up lines - the voice could
become one more extension of a pop-star brand.
The human voice has proven the most difficult of all sounds to synthesize.
Digital technology can produce something clear enough to convey meaning,
but only in a clipped monotone that sounds more like a robot than a real
live person. A convincing human voice, spoken or sung, with all its
complex, flowing articulations and quivering uncertainties has been
unattainable. Yamaha has not yet made Vocaloid available for scrutiny, but
judging by some early samples and demonstrations, the company seems to have
made that quantum leap.
You can think of the software as a kind of audio font: musical notation and
lyrics can be translated into the chosen voice, then saved for replay, just
as a word processor might translate a text into Helvetica or Times New
Roman and print it out as many times as you like.
These fonts are made up of a database of phonemes, the basic sounds that
make up any language. To create the database, technicians record a singer
performing as many as 60 pages of scripted articulations (like "epp, pep,
lep"). Assorted pitches and techniques like glissandos and legatos are
thrown in the mix; with all the combinations, the process takes a week of
five-hour singing days. The resultant font is "reminiscent" of the singer's
voice, says Ed Stratton, the managing director of Zero-G Limited, a
based company that has licensed the Vocaloid technology.
Zero-G is using Vocaloid to create the first of these fonts: Leon,
described as a "Virtual Soul Vocalist," and Lola, his female counterpart.
The digitized duo will make their debut in January at the International
Music Products Association conference in Anaheim, Calif.
The technology first attracted attention in March at Musikmesse, an
music technology conference in Germany. Paul White, the editor of the
British audio gear magazine Sound on Sound, was there for the
demonstration. "A few simple tools were used to adjust inflection, tone,
Grace" - recorded with prototype technology, yet still more human
than any previous vocal synthesis - was released on Yamaha's Web site
shortly after the conference. Quickly, that sample drew links from sites in
the Netherlands, Germany, France, Japan, Russia and the United States,
setting Internet message boards and chat rooms buzzing.
In the case of Leon and Lola, session singers were hired to record what Mr.
Stratton calls "generic soul-singing voices." The decision to start with
soul was purely a marketing calculation: Mr. Stratton figured that the most
common use of Vocaloid, at least in its early stages, would be to serve as
background singers. With a soulful sound, the company could target a
commercial market that ranges from Justin Timberlake to Jay-Z.
But Mr. Stratton has many more plans. Soon, he said: "You'll buy new
and then any song you write, you can hear it sung a number of ways. You
might hear what it sounds like sung by a soul singer, and then an
voice or a choir boy."
Hit music producers like Dan (The Automator) Takemura (a creator of the
Gorillaz, a band that appeared only in an animated form, but sold several
million albums anyway) and the Matrix (the trio of Scott Spock, Graham
Edwards and his wife, Lauren Christy, that produced the three No. 1 hits
from Avril Lavigne's last album) say they are likely at least to try
recording with Vocaloid instead of backup singers. "As producers, you run
into some artists and oh god, it's so hard to get the right vocal," Mr.
Spock said. "It's intriguing, this idea of `O.K., just give me all your
vowels and all your consonants and I'll see you later.' "
Mr. Takemura says he would want to use the software to create sounds that
human voices could not. "The first producers to work with this are
going to have a hit just based on the novelty factor," he said. But, he
warns, "it's the imperfections in a voice, the happy accidents, the
ness that are often what's best in a song."
The market for synthesized voices extends well beyond recorded music. For
example, cell phone ring tones - a rapidly expanding field - already use
synthesized voices to personalize incoming calls. The DA Group, a
company, uses patented technologies to animate several popular virtual
stars, including Ananova, the British newscaster who exists solely online
as a lifelike, digital countenance, and Maddy, the bank teller avatar who
is being tested on ATM's in several markets around the United States. After
listening to some Vocaloid samples online, Mike Antliff, the company's
chief executive, said, "I'm going to have my research team look into this
as soon as I get off the phone."
Vocaloid's next application will be Miriam, a third font that Zero-G
expects to release later in 2004. (A Japanese company, Crypton, expects to
release its own font - "Japanese Pops," a bubbly female voice - in
Miriam is based on recordings of Miriam Stockley, a singer for the new age
group Adiemus, which has worldwide album sales in excess of several
million. "At first I was quite horrified by the idea," Ms. Stockley said.
"People tend to pay a lot of money to get my sound, and here I am putting
it on a font."
She changed her mind, she said, because "you can't fight progress, no
matter how strange it sounds." She also negotiated an undisclosed
percentage for each copy of Miriam that sells. But once Miriam the vocal
font is out there in the public, Ms. Stockley the actual singer has no
control of how it will be used. Anyone who legally purchases the font is
entitled to use it to write songs for commercial purposes, though they're
not allowed to market them as Ms. Stockley's own recordings.
Mr. Stratton reiterated the point: "When vocal fonts are used, the
performer is the user and Vocaloid is an instrument."
In the long term, Mr. Stratton is aware that the true killer application
will be recognizable celebrity fonts - the Elton, say, or the Aretha. But
so far, none of the world's most famous voices have volunteered.
Michael Stipe of R.E.M. heard a Vocaloid version of "Amazing Grace"
and he said he was impressed. (The Yamaha Corporation includes samples in
a recent press release at http://www.global.yamaha.com/news/20030304b.html.)
But he wasn't ready
to rush out and have a font created. "I would hate to think that 250 years
from now Altria would use the Michael Stipe voice to sell organic soy to
Mars landing," he said. "It's intriguing in 2003. I'm not sure about
If Napster and other online file-trading programs have taught the world
anything, it's that once a technological cat is out of the bag, it can be
difficult to control. What's to stop dilettantes from creating their own
fonts? Could it be long before falsified but entirely convincing clips of
Britney Spears begging for Justin's forgiveness circulate on the Web - to
say nothing of George Bush conspiring with Tony Blair about weapons of
mass destruction?
"It is a matter of time before Yamaha makes this technology available for
consumers to make their own fonts," Mr. Stratton said. But at present, the
process, which requires a deep knowledge of phonetics and audio
engineering, is too complex for ordinary consumers. Even if an ingenious
audiophile were to untangle the process, however, he would still need a
database of thousands of articulations - more than someone would be able
to cobble together from available recordings. As for famous voices now lost
to time, if they left behind a substantial enough catalog, it might be
possible to produce at least a portion of the required phoneme database.
The rest of the required vocals could come from a sound-alike singer.
Elvis seems like an obvious candidate for vocal reanimation. Recently (and
for the first time), his estate licensed a couple of his songs for dance-
floor remixes; one of them became a No. 1 single in England. Licensing
Elvis for Vocaloid would be a different matter, though, says Gary Hovey,
vice-president of entertainment for Elvis Presley Enterprises. "If
came to us and said, `We want Elvis to sing this new song,' we'd have a
to contemplate," he said. "We tried to retain the integrity of his
song with the remixes. Now you're talking about a whole new vocal
performance of a song he never sang or knew? How do we know he'd want to
"Believe me, that would go all the way to Lisa," he added, referring to
Elvis's daughter, Lisa Marie Presley, who owns Elvis's estate.
Still, there is the potential for enormous money to be made, even by
standards. How much would an advertiser pay to have Elvis sing a new
jingle? How easily would a new "Elvis" song climb the pop charts - if only
for the novelty value? Mr. Stratton is optimistic about the prospect. "No
font comes out of the box with a singer's timing and expressions," he said.
"It's just the tone of his voice and his pronunciations. The finer bits of
expression - timing, pitch bend, the sorts of things that add real
character - would have to be added by the user working with the font. It
would take a great deal of effort to make it sound just like Elvis. But
could do it."
Once a full palette of vocal fonts is available (or once Yamaha allows
users to create their own), the possibilities become mind-boggling: a
chorus of Billie Holiday, Louis Armstrong and Frank Sinatra; Marilyn
singing show tunes and Barbra Streisand covering Iron Maiden. And how long
before a band takes the stage with no human at the mike, but boasting an
amazing voice, regardless?
In fact, in today's world of computer-produced music, who needs humans at
all? Vocaloid could be used as part of an integrated music-generating
machine. Start with any number of existing programs that randomly generate
music. Run those files through Hit Song Science, the software that has
analyzed 3.5 million songs to determine mathematic patterns in hit music.
(Major labels are already taking suggestions from it - "Slower tempo,
please, and a little more melody at the bridge.") Throw in a lyric-
generating program, several of which can be found free online, and then
route the notes and lyrics through Vocaloid to give the song a voice. It
might not be a hit, but the process could provide inspiration for a lot
At this early stage of its development, the future life of this technology
is as much fun to think about as the almost-human voices could be to play
with. At the very least, Vocaloid promises to bring a whole new
infringing definition to the phrase "losing one's voice." We may soon
if an unmanned computer could produce hit singles or the voice of
tomorrow's virtual pop hero. Lisa Marie, anything to say about that? And
really, can we even be certain it's you?
Bill Werde writes about the arts and technology.