- The key differentiator between a bot and most technology is the word "conversation" -- bots can have one.
- Bots (like humans) need to be able to understand that different keywords mean the same thing (such as "2-bed" and "two-bdrm").
- Machine learning algorithms help bots extract meaning from sentences and data from queries to eventually fulfill a user request.
A huge topic of interest at this year’s Inman Connect New York event was bots: digital assistants that can do anything from qualifying leads to helping buyers with their home search to giving a homeowner a voice-prompted home valuation estimate.
OK, bots are hot — but how do you build one, and how do they work? Pete Walsh, director of artificial intelligence at Structurely, and Nate Joens, CEO at Structurely, gave a rundown at the inaugural Hacker Connect event on Monday, breaking down the steps to building a bot.
What’s a bot?
A bot is an application that lives inside a messaging interface — and there are developers out there who believe bots will replace websites and apps entirely at some point.
The key differentiator between a bot and most technology is the word “conversation” — bots can have one.
And building a digital entity that can talk to a person is probably a lot more complicated than you’d expect.
“While most engineers know how to build a mobile app and a website,” those standard components don’t necessarily exist for a conversational interface, noted Joens.
Components of a conversational interface
User behavior or event data that’s gathered by or for the bot before the user even starts a conversation is “context” — and it can help the bot help the user.
Behavior data might include pages on your website that the user has visited or the user’s current location, for example.
Understanding what context you might be able to give the bot and providing the bot with that available context is the first step.
Bots are designed to respond to intents — a goal that the user wants to accomplish.
One obvious example is searching for a home. The user’s intent is, perhaps, to find a home in his or her desired ZIP code and price range.
The bot will need to understand the user’s intent (however it is expressed) and respond appropriately to it.
When the user has expressed an intent, the bot will typically need to prompt the user for more information. This is called, well, a prompt.
In the home search example, a prompt might be, “What features are you looking for?” or “How many bedrooms?”
A slot is a piece of data that is needed to execute the prompt. (Execution, in developer language, is the function performed by an application using either contextual data — which is gathered without the user providing it — or slot data — which is provided directly by the user.)
In the home search example, a slot might be the number of bedrooms that a property has or whether or not it has extra storage on the lower level (also known as “a basement”).
A goal is a piece of business application data that’s needed to continue the exchange.
For a real estate agent with a home search bot, the goal is probably obtaining the user’s email address or phone number, which the agent (or bot) can use to follow up with the user.
Giving a bot a goal also means that instead of delivering the search results directly to the user, it will provide the results via email or text message.
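The five components above can be sketched as a small data structure. This is a minimal illustration, not Structurely's design: the class names, the `next_prompt` method, the ZIP code, the email address and the wording of the questions are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Intent:
    """A user goal plus the questions that fill its slots (hypothetical shape)."""
    name: str
    prompts: Dict[str, str]  # slot name -> prompt that asks for it


@dataclass
class Conversation:
    intent: Intent
    context: Dict[str, str] = field(default_factory=dict)  # gathered without asking
    slots: Dict[str, str] = field(default_factory=dict)    # supplied by the user
    goal_slot: str = "email"                                # the business goal

    def next_prompt(self) -> Optional[str]:
        """Return the next question to ask, or None when nothing is missing."""
        for slot, question in self.intent.prompts.items():
            # Context can pre-fill a slot, so the bot does not ask for it again.
            if slot not in self.slots and slot not in self.context:
                return question
        if self.goal_slot not in self.slots:
            return "Where should I send the results?"
        return None


home_search = Intent(
    name="home_search",
    prompts={
        "zip_code": "Which ZIP code are you searching in?",
        "bedrooms": "How many bedrooms?",
    },
)

# Placeholder context: pretend the ZIP code was inferred from browsing behavior.
chat = Conversation(intent=home_search, context={"zip_code": "80204"})
first = chat.next_prompt()
chat.slots["bedrooms"] = "2"
second = chat.next_prompt()
chat.slots["email"] = "buyer@example.com"
done = chat.next_prompt()
```

Because the ZIP code arrives as context, the bot skips that question, asks for bedrooms, and then asks for the contact details that make up its goal.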
How do you build a bot?
The problem with conversational interfaces, whether typed or spoken, is that humans don’t structure their sentences in exactly the same way every time, and no two people phrase things identically.
So one sentence intended to express a simple idea could be rendered dozens of different ways, with countless variations.
Think of all the different ways you could type “two-bedroom two-bathroom” into a search field:
- two-bedroom two-bathroom
- 2 bedroom 2 bathroom
- two bed two bath
- 2 bed 2 bath
- two bdrm two bthrm
- 2 bdrm 2 bthrm
- two bd two bth
- 2 bd 2 bth
- 2 br 2 bt
The bot (like a human) needs to be able to understand that all of those different keywords mean the same thing.
This requires hand-trained data — in-house artificial intelligence data trainers who teach the bot about the endless variations in human language.
(And the really smart bots are designed to be self-teaching — like Amazon’s Alexa, who’s supposed to improve on her own language skills as you interact with her.)
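To make the variation problem concrete, here is a rough sketch that maps every spelling in the list above to one canonical form. It assumes a hand-built lookup table as the simplest possible stand-in for hand-trained data; the table contents and the `normalize` function are illustrative only, and a production bot would learn these equivalences rather than hard-code them.

```python
import re

# Hand-built lookup tables (assumed for this example).
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4"}
ROOM_SYNONYMS = {
    "bedroom": "bed", "bed": "bed", "bdrm": "bed", "bd": "bed", "br": "bed",
    "bathroom": "bath", "bath": "bath", "bthrm": "bath", "bth": "bath", "bt": "bath",
}


def normalize(query: str) -> str:
    """Map each spelling variant to a canonical form such as '2 bed 2 bath'."""
    # Lowercase and keep runs of letters or digits; this also splits on hyphens.
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    canonical = []
    for token in tokens:
        token = NUMBER_WORDS.get(token, token)   # "two" -> "2"
        token = ROOM_SYNONYMS.get(token, token)  # "bdrm" -> "bed"
        canonical.append(token)
    return " ".join(canonical)


variants = [
    "two-bedroom two-bathroom", "2 bedroom 2 bathroom", "two bed two bath",
    "2 bed 2 bath", "two bdrm two bthrm", "2 bdrm 2 bthrm",
    "two bd two bth", "2 bd 2 bth", "2 br 2 bt",
]
normalized = {normalize(v) for v in variants}
```

All nine variants collapse to the single string `2 bed 2 bath`. The weakness is obvious: any spelling missing from the table slips through, which is why the trainers and self-teaching models matter.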
Teaching a bot to talk
There are five steps to building a bot worth talking about, said Walsh.
1. Preprocessing: Every piece of text that comes into a bot is preprocessed.
- Words are clustered into groups that mean the same thing.
- Developers then use “neural network vector embedding” to represent words numerically, so that different expressions of “bedroom” and “bathroom,” for example, can be related to one another.
- Stemming reduces those words to base stems or roots. (You might use “mov” for “moving” and “mover,” for example.)
- Stop words (like “in,” “for,” “a”) are removed, but stop words can sometimes contain useful information, so be careful.
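Two of the preprocessing steps above, stop-word removal and stemming, can be sketched in a few lines. This is a deliberately naive illustration: the stop-word set and the suffix list are assumptions, and the suffix stripper is not the Porter algorithm or any other published stemmer. It will over-stem some words.

```python
import re
from typing import List

# Tiny illustrative stop-word set (assumed); real lists are much longer.
STOP_WORDS = {"in", "for", "a", "the"}
# Suffixes are tried in this order, so "ers" is checked before "er" and "s".
SUFFIXES = ("ing", "ers", "er", "ed", "s")


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str, drop_stop_words: bool = True) -> List[str]:
    """Lowercase, tokenize, optionally drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return [stem(t) for t in tokens]


stems = preprocess("Looking for movers in the area")
kept = preprocess("Looking for movers in the area", drop_stop_words=False)
```

The `drop_stop_words` flag reflects the caution in the last bullet: a word like “in” often signals that a location follows, so a bot may want to keep it for some intents.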
2. Intent classification: Figure out what the user is trying to do.
N-gram feature generation gives the bot the ability to decode user intent.
Every sentence generates a set of n-grams, contiguous sequences of n items from text or speech, which break the sentence into overlapping word groups. Tree-based machine learning algorithms then use those groups to help the bot extract meaning from the sentence and correctly interpret the user’s intent.
Here’s an example of how you might break down a real estate-related intent using n-grams.
Intent: I am looking for a two-bed two-bath apartment in Denver’s Golden Triangle neighborhood.
If n=2, then your n-grams would be:
- I am
- am looking
- looking for
- for a
- a two
- two bed
- bed two
- two bath
- bath apartment
- apartment in
- in Denver’s
- Denver’s Golden
- Golden Triangle
- Triangle neighborhood
If n=3, then your n-grams would be:
- I am looking
- am looking for
- looking for a
- for a two …
The tree-based algorithm helps the bot relate words to each other within the sentence to help it ascertain what the user wants it to do.
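The n-gram breakdown shown above can be reproduced with a short function. This sketch covers only the feature-generation half of the step; the tokenizer (which splits on hyphens, as the lists above do) and the function names are assumptions, and the tree-based classifier that would consume these features is left out.

```python
import re
from typing import List


def tokenize(sentence: str) -> List[str]:
    """Split on whitespace, hyphens and punctuation; keep apostrophes in words."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return every contiguous run of n tokens, in sentence order."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = (
    "I am looking for a two-bed two-bath apartment "
    "in Denver's Golden Triangle neighborhood."
)
tokens = tokenize(sentence)
bigrams = ngrams(tokens, 2)   # n=2
trigrams = ngrams(tokens, 3)  # n=3
```

A sentence of 15 tokens yields 14 bigrams and 13 trigrams, since each n-gram starts one token later than the previous one.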
3. Entity extraction: The “entity” might be the address or other specific data extracted from a query.
Basically, this is the data that’s needed to fulfill the user request. For example, the bot would use address extraction for a user who wants to see properties on a specific block or street in order to return results on that block or street.
Ideally, the bot will have a “neural network with LSTM (long short-term memory)” for extracting subtle features as well as obvious ones. LSTM helps the bot “remember” the words it “read” at the beginning of a sentence and relate them to any appropriate words at the end of the sentence.
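For a feel of what entity extraction produces, here is a sketch that pulls bedroom count, bathroom count and ZIP code out of a query. Note the swap: it uses regular expressions, not an LSTM, because patterns are easy to read and run anywhere. The patterns, the entity names and the sample ZIP code are assumptions for illustration, and a pattern-based extractor will miss the subtle phrasings a trained network is meant to catch.

```python
import re
from typing import Dict, Union

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
# A count is either digits or a small number word, starting at a word boundary.
COUNT = r"\b(\d+|one|two|three|four|five)"
BED_RE = re.compile(COUNT + r"[\s-]*(?:bedrooms?|beds?|bdrm|bd|br)\b", re.IGNORECASE)
BATH_RE = re.compile(COUNT + r"[\s-]*(?:bathrooms?|baths?|bthrm|bth|bt)\b", re.IGNORECASE)
ZIP_RE = re.compile(r"\b\d{5}\b")


def to_int(raw: str) -> int:
    """Convert '2' or 'two' to the integer 2."""
    raw = raw.lower()
    return NUMBER_WORDS[raw] if raw in NUMBER_WORDS else int(raw)


def extract_entities(query: str) -> Dict[str, Union[int, str]]:
    """Return whichever of bedrooms, bathrooms and zip_code appear in the query."""
    entities: Dict[str, Union[int, str]] = {}
    bed = BED_RE.search(query)
    if bed:
        entities["bedrooms"] = to_int(bed.group(1))
    bath = BATH_RE.search(query)
    if bath:
        entities["bathrooms"] = to_int(bath.group(1))
    zip_code = ZIP_RE.search(query)
    if zip_code:
        entities["zip_code"] = zip_code.group(0)
    return entities


found = extract_entities("I am looking for a two-bed two-bath apartment in 80204")
```

The extracted values are exactly the slot data the bot needs before it can execute the request.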
4. Execution: What the bot does to fulfill the request.
This will vary depending on what type of bot you have. A recommendation for finding specific properties might require an external API (application programming interface) to the MLS or a public data lookup in order to give the user the desired results.
Market statistics or points of interest will probably also involve APIs — but more likely internal ones, Walsh said.
Predictive days on market or automated valuations are other types of requests that real estate professionals might want to fill, and execution will require some kind of machine learning to give the bot the ability to respond.
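One way to picture the execution step is a table that routes each classified intent to the function that fulfills it. Everything in this sketch is hypothetical: the handler names, the intent label, and especially the listings, which are placeholder rows standing in for whatever an MLS API would return.

```python
from typing import Callable, Dict, List

# Placeholder rows standing in for an external data source; not real listings.
SAMPLE_LISTINGS = [
    {"id": "A1", "bedrooms": 2, "bathrooms": 2, "zip_code": "00001"},
    {"id": "A2", "bedrooms": 3, "bathrooms": 2, "zip_code": "00001"},
    {"id": "A3", "bedrooms": 2, "bathrooms": 1, "zip_code": "00002"},
]


def search_listings(slots: Dict[str, object]) -> List[dict]:
    """Return listings whose fields match every slot the user supplied."""
    return [
        listing for listing in SAMPLE_LISTINGS
        if all(listing.get(key) == value for key, value in slots.items())
    ]


# Each intent name maps to the function that fulfills it.
HANDLERS: Dict[str, Callable[[Dict[str, object]], List[dict]]] = {
    "home_search": search_listings,
}


def execute(intent: str, slots: Dict[str, object]) -> List[dict]:
    """Look up the handler for an intent and run it with the collected slots."""
    handler = HANDLERS.get(intent)
    if handler is None:
        raise ValueError(f"No handler registered for intent: {intent}")
    return handler(slots)


results = execute("home_search", {"bedrooms": 2, "zip_code": "00001"})
```

In a real bot, `search_listings` would call out to an API, and other handlers (market statistics, valuations) would be registered alongside it.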
5. Response composer
You want to humanize the bot — but not too much, cautioned Walsh; humans can still get freaked out by the “uncanny valley.”
The bot should be given the ability to generate random messages from a predefined set of structured or templated messages. (And a little bit of self-deprecation, artificial intelligence style, couldn’t hurt.)
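Picking at random from a set of templated messages takes only a few lines. The templates and the `compose` function below are invented for illustration; the optional `rng` argument is there so the choice can be made repeatable in tests.

```python
import random
from typing import Optional

# Hypothetical templates; each one expects the same two fields.
TEMPLATES = [
    "I found {count} homes with {bedrooms} bedrooms. Want the list by email?",
    "Good news: {count} homes with {bedrooms} bedrooms match. Should I text them to you?",
    "My circuits turned up {count} homes with {bedrooms} bedrooms. Not bad for a bot, right?",
]


def compose(count: int, bedrooms: int, rng: Optional[random.Random] = None) -> str:
    """Fill a randomly chosen template with the search results."""
    chooser = rng if rng is not None else random
    return chooser.choice(TEMPLATES).format(count=count, bedrooms=bedrooms)


reply = compose(count=12, bedrooms=2, rng=random.Random(7))
```

Varying the wording keeps the bot from sounding like a form letter, while the fixed templates keep it from saying anything unexpected.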
After all, the Structurely team concluded: AI is about creating the illusion of intelligence — not simply intelligence.