Imagine this scenario: You hear the buzz of conversation as you near the
local pub, so you walk in. Immediately, since you’re a 7’9”, 300-pound
herculean adventurer, the locals gawk at you and temporarily hush up.
Then, once they figure you’re not going to start slaughtering anyone, you
walk up to the bar and sit down. All around you, locals are making small
talk, about the price of wheat, the thief that was lynched the other
night… typical backwater drudgery.
The bartender approaches and engages you in conversation. Since he is
directly within your targeting range (imagine a 2.5 ft radius), the chat
also appears in your chat reader. You can clearly hear his words over
Basically, the tech design here would require realistic 3D sound
emitters, a text-to-speech engine, and a localized AIML set (dynamic, of
course, so that the NPCs aren’t totally stupid). Real-time phonetics
emulation would be an option to turn on/off based on performance… but
having realistic conversations with NPCs (not perfect, AIML is dumb
sometimes) would be awesome.
It seems to me that running an AI server to handle all the ‘Chat’ AI
functionality would offer one helluva lot to the immersion. All that’s
needed is a script that handles efficient conversation targeting and a
“Text In/Sound Out” script that reads the chat from the AI server and
tags the source. Everything else would be fairly standard for MMO NPCs.
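
To make the targeting and tagging part concrete, here is a rough Python sketch of what I have in mind. Everything in it is an assumption for illustration: the `Speaker` and `ChatLine` types, the `route_chat` function, and the idea that the AI server hands back a plain dict of replies keyed by NPC name are all made up. Only the 2.5 ft radius comes from the scenario above.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

TARGET_RADIUS_FT = 2.5  # the radius from the tavern scenario


@dataclass
class Speaker:
    name: str
    x: float  # position in feet (hypothetical 2D coordinates)
    y: float


@dataclass
class ChatLine:
    source: str  # the tag: who said it
    text: str


def in_range(listener: Speaker, speaker: Speaker,
             radius: float = TARGET_RADIUS_FT) -> bool:
    """True if the speaker stands within the listener's targeting radius."""
    distance = math.hypot(speaker.x - listener.x, speaker.y - listener.y)
    return distance <= radius


def route_chat(listener: Speaker, speakers: List[Speaker],
               replies: Dict[str, str]) -> List[ChatLine]:
    """Keep only replies from speakers in range and tag each with its source.

    `replies` stands in for whatever the AI server returns; here it is
    assumed to be a dict mapping NPC name to reply text.
    """
    lines = []
    for speaker in speakers:
        if speaker.name in replies and in_range(listener, speaker):
            lines.append(ChatLine(source=speaker.name,
                                  text=replies[speaker.name]))
    return lines


player = Speaker("player", 0.0, 0.0)
bartender = Speaker("bartender", 1.5, 1.0)
farmer = Speaker("farmer", 6.0, 2.0)
for line in route_chat(player, [bartender, farmer],
                       {"bartender": "What'll it be?",
                        "farmer": "Wheat's up again."}):
    print(f"[{line.source}] {line.text}")
```

The tagged lines are what would feed both the chat reader and the sound side; the out-of-range farmer would only contribute to the background buzz.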
I’m thinking of adding phonetics-based “Voice Styles” and different
voice types, for accents, customization, and a real feeling of having A
Person on the other side of the pixels.
I understand the limitations of AIML, but the overall structure makes it
ideal for NPC chat, and I was wondering if anyone has looked into this
at a serious level. Are there games out there utilizing this approach,
or is it too bandwidth-hungry or CPU-draining?
I’m still in a conceptual process, but the idea seems sound. Anyone see
any glaring errors?
You could also allow players to submit ideas for additions to the AIML.
Combine this with a relatively large community (5000+) and you’re
guaranteed to snag some obsessive-compulsive gamer who will devote
dozens of hours to developing perfect conversational AIML sets for your
OK, that would be taking unwarranted advantage of a sick person for
corporate gain. :whistle:
Anyway, what are some thoughts?
An XML format for defining automated answers to questions. (Like a
Embedding an ALICE-like bot into the game sounds fun, as long as getting
all the data for all the different NPCs is not a challenge.
Text-to-speech really isn’t that wonderful, so unless the theme is
‘robot world’ maybe leave that out (which would make development a lot
easier, free up CPU, etc.).
Make a little avatar demo and post it :)
Text to speech just lacks polish. I’m not looking for a universal
system, just a very broad, closed system approach to add depth to a
gameworld where it was previously lacking. And yeah, I’ll do up a simple
avatar over the next few weeks as a kind of demo.
My goal is, at this point, a realistic virtual tavern. Something like
John’s, if you’ve ever read Feist’s Serpentwar series.
Anyway, AIML seemed to me to be the easiest route to realistic
conversation. On an MMO scale, it would require a huge database and a
server all its own, but for my purposes the CPU usage shouldn’t be bad
at all :)
Check out www.Alicebot.org to learn more about AIML.
Sounds like a cool idea. I’d like to see a demo of it too when you’re
AT&T Natural Voices and some other company sell very high quality voice
packs. I honestly thought they were very convincing from their demos. It
might be worth investigating if you go along with this.
Blah. A friend of mine showed me a chatbot demo the other day, and I
thought the text-to-speech was computer generated. It was actually a
library of recorded responses, which is where the tone and inflection
really came through and outshone the CG stuff (like the AT&T pack).
I ran a basic bot voice into one of the Microsoft avatars just to test
out the voices I could find (I’m not yet willing to put money into this
if there’s no good voice collections out there.)
It’s standard bot text-to-speech, and it r sux.
Hiring people and recording their recitals of AIML sets with proper
inflection is the only way to really get this idea to work.
Word to the wise: 15 chat bots talking in a virtual room with
text-to-speech sounds like crap. Especially if you are unfamiliar with
sound programming :P
A project like this requires more time and effort than is worthwhile to
me at this point. I’m still keen on the idea of AIML-driven NPCs, just
not on having them speak. Btw, AIML set conversations are really stupid, as
How are you?
I’m great, how are you?
I’m great, how are you?
The day is looking fine.
The day is.
Is the day is what?
It’s probably my implementation that sucks, and I only used the standard
ALICE brain, although if you’ve played with AIML at all, you’ll have
come to the same conclusion I have: ALICE is pretend-AI.
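
For anyone who hasn't played with it, here is a stripped-down Python stand-in for the kind of matching that produces exchanges like the one above. To be clear about the assumptions: this is not real AIML (which is XML) and not ALICE's actual matcher. The `CATEGORIES` list, the replies in it, and the first-match-wins rule are all invented for illustration. It only shows the general shape: normalize the input, match it against a pattern with a `*` wildcard, fill in a canned template.

```python
import re

# (pattern, template) pairs, loosely modelled on AIML categories.
# Patterns and replies are made up; "*" matches one or more words.
CATEGORIES = [
    ("HOW ARE YOU", "I'm great, how are you?"),
    ("THE DAY IS *", "Is the day really {star}?"),
    ("*", "Tell me more."),  # catch-all fallback
]


def normalize(text: str) -> str:
    """Uppercase, strip punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", text.upper())
    return " ".join(cleaned.split())


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [r"(.+)" if token == "*" else re.escape(token)
             for token in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")


def respond(text: str) -> str:
    """Return the template of the first category whose pattern matches."""
    cleaned = normalize(text)
    for pattern, template in CATEGORIES:
        match = pattern_to_regex(pattern).match(cleaned)
        if match:
            star = match.group(1).lower() if match.groups() else ""
            return template.format(star=star)
    return ""


for said in ["How are you?", "I'm great, how are you?",
             "The day is looking fine."]:
    print(f"> {said}\n{respond(said)}")
```

Notice there is no memory anywhere in it: the same input always gets the same reply, and anything that misses a pattern falls through to the catch-all. Two of these talking to each other will loop or drift into nonsense very quickly.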
Maybe I should leave this idea in the attic to collect dust until such a
time as we have real (or real enough) AI to deal with something of this
Or until I have $20,000 to shell out to voice actors to record huge
AIML sets :P
Btw, I discovered CyN, which is an AIML integration with OpenCyc… very
interesting case for developing a so-called strong AI out of AIML.
@ Ooka: I’m really glad I found this thread. I’ve been dreaming of
seeing this hugely important feature in a future MMOG. Unfortunately,
sound never seems to get the attention it deserves, so I’m not gonna
hold my breath.
Do you guys think we’ll have talking NPCs/mobs/pets in MMOGs by 2010?
Actually, I think that it would take several dozen voice actors to make
this a doable idea. You’d need to set up scripted responses to almost
any type of question in-game, which would be a very large effort, and
then you’d give each actor a recorder and have them speak each phrase
numerous times, covering any of the contexts in which the phrase could
be spoken. Different accents (Scottish accent for dwarves, British for
elves, etc.) and different tones would need to be covered as well.
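
As a rough idea of how such a library might be indexed, here is a small Python sketch. The `VoiceLibrary` class, the (phrase, accent, tone) key, the "neutral" fallback tone, and the file paths are all hypothetical; nothing here comes from an existing tool.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: which phrase, in which accent, in which tone.
ClipKey = Tuple[str, str, str]


class VoiceLibrary:
    """Maps (phrase_id, accent, tone) to the path of a recorded clip."""

    def __init__(self) -> None:
        self._clips: Dict[ClipKey, str] = {}

    def add(self, phrase_id: str, accent: str, tone: str, path: str) -> None:
        self._clips[(phrase_id, accent, tone)] = path

    def find(self, phrase_id: str, accent: str, tone: str,
             default_tone: str = "neutral") -> Optional[str]:
        """Return the exact clip, else the same accent in the default tone.

        Returns None when the actor never recorded the phrase in this
        accent, so the caller can fall back to plain text chat.
        """
        for key in ((phrase_id, accent, tone),
                    (phrase_id, accent, default_tone)):
            if key in self._clips:
                return self._clips[key]
        return None


library = VoiceLibrary()
library.add("greeting", "scottish", "neutral", "clips/dwarf_greeting.wav")
library.add("greeting", "scottish", "angry", "clips/dwarf_greeting_angry.wav")
print(library.find("greeting", "scottish", "angry"))
print(library.find("greeting", "scottish", "cheerful"))  # falls back to neutral
print(library.find("greeting", "british", "neutral"))    # not recorded yet
```

The fallback matters because the number of recordings is phrases times accents times tones, and the library will have holes in it for a long time.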
Once you had the library of responses built up, you could build up a
believable conversational system. It would probably cost upwards of
$15,000 to complete it, but once you did, you could not only use it
directly in a game, but you could apply learning algorithms to it and
have the computer generate its own voices.
It’s an aspect of immersive game technology that hasn’t yet been
developed because of its complexity. The question comes down to this:
is it worth the time and money to invest in the technology, or do we let
it advance on its own until voice libraries are as developed and freely
available as graphics libraries?
Linguistics is far more complex than graphics. In fact, I’d say that
it’s by far the most complex system short of strong AI that relates to
2010 might be possible, but I’d say we will definitely have it by 2015.
Check out Cyc, for an example. Each piece of “common sense” knowledge is
hand-fed into the knowledge base to verify its accuracy and pertinence.
The system is very specifically designed to allow the engine to infer
things about context. A similar system would have to be employed in
order for a voice-chat program to be believable and accurate.
Anyway, the first company/person that develops it is going to make a
killing, because once one game has it, every game will need it. Just
like 3D graphics. And there will be lots of crappy spinoffs and
imitations that will lead to further discoveries and improvements in the
This has given me another idea for a thread, I’m going to develop it in
You’re right, Ooka, whoever develops this thing will make a killing… the
whole development team could retire!
Well, it’s good to know that someone’s moving on it.
Check out MegaHAL The Ultimate
definitely some funny stuff, there. Hal can seem almost human, at times.
I prefer CyN, however, and have actually had some decent conversations
with it, which it remembers. Hal always forgets me :(
I mused about replacing the typical adventure game dialogue tree with
code that actually simulates the conversation, taking into account
factors such as the NPC’s emotional state and knowledge base, and
generates his/her sentences on the basis of the actual grammar, but
then I realised that:
a) The algorithm would be too complex for such an amateur as myself to
ever conceive, and
b) I’d have to create at least one full vocabulary of any given language
for the thing to work. Given that a typical vocabulary for any given
language is several thousand nouns at the very least, good luck typing
them all into the computer.
As you might imagine, I no longer even consider this possibility.
That’s why you use things like WordNet, or the OpenCyc database. Millions
upon millions of terms that have been given a standardized format for
use in computer/AI related apps.
Fun stuff :)