A couple of days ago, a friend sent me an article about the current state of the art in chatbots – artificially intelligent assistants, if you like. The article focused on those few bots which are particularly convincing at sustaining a sense of relationship with the person they talk to.
Now, as regular readers will know, I quite often talk about the Alexa skills I develop. In fact I have also experimented with chatbots, using both Microsoft’s and Amazon’s frameworks. The coding style, and the flow of information and logic, are very similar between skills and chatbots, so there’s a natural crossover. Alexa, of course, is predominantly a voice platform, whereas chatbots are more diverse. You can speak to, and listen to, bots, but they are more often encountered as part of a web page or mobile app.
Beyond the day job and my coding hobby, I also write fiction about artificially intelligent entities – the personas of Far from the Spaceports and related stories (Timing and the in-progress The Liminal Zone). Although I present these stories as occurring in the “near-future”, by which I mean vaguely some time in the next century or two, the personas are substantially more capable than anything we have now. There’s a lot of marketing hype about AI, but also a lot of genuine excitement and undoubted advancement.
So, what are the main areas where tomorrow’s personas vastly exceed today’s chatbots?
First and foremost, a wide-ranging awareness of the context of a conversation and a relationship. Alexa skills and chatbots retain a modest amount of information during use, called session attributes, or context, depending on the platform you are using. So if the skill or bot doesn’t track through a series of questions, and remember your previous answers, that’s disappointing. The developer’s decision is not whether it is possible to remember, but rather how much to remember, and how to make appropriate use of it later on.
Equally, some things can be remembered from one session to the next. Previous interactions and choices can be carried over into the next time. Again, the question is not how to preserve them, but what should be preserved like this.
But… the volume of data you can carry over is limited. It’s fine for everyday purposes, but not enough once you want an intelligent and sympathetic individual to converse with. If this other entity is going to be convincing, it needs to retain knowledge of a lot more than just some past decisions.
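As a rough illustration of the two kinds of memory described above, here is a minimal sketch in plain Python. It does not use either Amazon’s or Microsoft’s framework: the class name ConversationMemory, its methods, and the JSON file used as a store are all hypothetical, chosen only to show the difference between what is held during a session and what is deliberately carried over afterwards.

```python
import json
from pathlib import Path


class ConversationMemory:
    """Hypothetical helper: short-lived session memory plus a small persistent store."""

    def __init__(self, store_path, user_id):
        self.store_path = Path(store_path)
        self.user_id = user_id
        self.session = {}                # forgotten when the conversation ends
        self.persistent = self._load()   # carried over to the next conversation

    def _read_all(self):
        # The store is assumed to be one JSON file keyed by user id.
        if self.store_path.exists():
            return json.loads(self.store_path.read_text(encoding="utf-8"))
        return {}

    def _load(self):
        return self._read_all().get(self.user_id, {})

    def end_session(self, keep):
        """Copy only the chosen keys into the persistent store, then clear the session."""
        for key in keep:
            if key in self.session:
                self.persistent[key] = self.session[key]
        all_users = self._read_all()
        all_users[self.user_id] = self.persistent
        self.store_path.write_text(json.dumps(all_users), encoding="utf-8")
        self.session = {}
```

During a conversation the code would write answers into memory.session, and at the end call something like memory.end_session(keep=["favourite_planet"]). The keep list is where the developer’s real decision lives: everything else is thrown away, which is exactly why this kind of memory is fine for everyday purposes but falls well short of what a persona would need.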
Secondly, a real conversational partner does other things with their time outside of the chat specifically between the two of you. They might tell you about places, people, or things they have seen, or ideas that have occurred to them in the meantime. But currently, almost all skills and chatbots stay entirely dormant until you invoke them. In between times they do essentially nothing. I’m not counting cases where the same skill is activated by different people – “your” instance, meaning the one that holds any record of your personal interactions, simply waits for you to get involved again. The lack of any sense of independent life is a real drawback. Sure, Alexa can give you a “fact of the day” when you say hello, but we all know that this is just fished out of an internet list somewhere, and does not represent actual independent existence and experience.
Finally (for today – there are lots of other things that might be said), today’s skills and bots have a narrow focus. They can typically assist with just one task, or a cluster of closely related tasks. Indeed, at the current state of the art this is almost essential. The algorithms that seek to understand speech can only cope with a limited and quite structured set of options. If you write code that tries to offer too wide a spectrum of choice, the chances are that the number of misunderstandings will become unacceptably high. To give the impression of talking with a real individual, the success rate needs to be pretty high, and the entity needs some way of clarifying and homing in on what you really wanted.
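To make the point about a limited set of options and the need for clarification a little more concrete, here is a toy sketch in Python. It is emphatically not how Alexa or any bot framework understands language (those use trained language models, not keyword lists); the INTENTS table and the respond function are invented for illustration. What it does show is the shape of the problem: a small, structured set of choices, and a fallback to a clarifying question when two choices look equally plausible.

```python
import re

# Hypothetical intents, each with a handful of trigger words.
INTENTS = {
    "weather": {"weather", "rain", "forecast", "sunny"},
    "music": {"play", "music", "song", "album"},
    "timer": {"timer", "remind", "minutes", "alarm"},
}


def respond(utterance):
    """Return (status, detail): a matched intent, a clarifying question, or a fallback."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    # Score each intent by how many of its trigger words appear in the utterance.
    scores = {name: len(words & keywords) for name, keywords in INTENTS.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]

    if best_score == 0:
        # Nothing recognised: say what the bot can actually do.
        return ("fallback", "Sorry, I can help with: " + ", ".join(sorted(INTENTS)) + ".")
    if best_score == second_score:
        # Two intents are equally likely: ask rather than guess.
        return ("clarify", f"Did you mean {best} or {second}?")
    return ("matched", best)
```

With only three intents, ties and misses are rare. Add thirty, with overlapping trigger words, and the clarify and fallback branches start to fire far more often, which is a small-scale version of why today’s skills and bots keep their focus narrow.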
Now, I’m quite optimistic about all this. The capabilities of AI systems have grown dramatically over the last few years, especially in the areas of voice comprehension and production. My own feeling is that some of the above problems are simply software ones, which will get solved with a bit more experience and effort. But others will probably need a creative rethink. I don’t imagine that I will be talking to a persona at Slate’s level in my lifetime, but I do think that I will be having much more interesting conversations with one before too long!