How does predictive text work?


Background: I’ve just started working with handwriting recognition software, and I’ve read that some of it uses a model similar to the predictive text on your phone to guess what an unidentified word in a sequence would be. How exactly does this work? And, specifically with modern phones, how do the predictions become so personalised?

The “predictive text” you are talking about is perhaps a misnomer, as predictive text is the system used back when mobile phones had buttons, to guess (predict) words based on number key presses, rather than having to press the number 7 four times to produce the letter ‘S’. It was a system to drastically cut down the number of button presses needed to write text messages. What you may be thinking of in more modern terms is autocorrect and autocomplete.
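To make the old keypad idea concrete, here is a minimal sketch of how such a lookup could work, assuming the standard letter layout on phone keys. The word list and the names `to_digits` and `build_index` are made up for illustration; this is not how any particular phone implemented it.

```python
from collections import defaultdict

# Standard phone keypad letter groups.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

def to_digits(word):
    """Translate a word into the key sequence that types it, one press per letter."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())

def build_index(words):
    """Group words by their key sequence so one sequence maps to all its candidates."""
    index = defaultdict(list)
    for w in words:
        index[to_digits(w)].append(w)
    return index

WORDS = ["good", "home", "gone", "hello", "text"]  # toy word list (assumption)
index = build_index(WORDS)

print(to_digits("hello"))  # 43556
print(index["4663"])       # ['good', 'home', 'gone']
```

Note that one key sequence can match several words, which is why those phones let you cycle through the candidates.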

Imagine having a giant dictionary with all the words in it. As soon as you type the first letter of a word, the system jumps to that letter's section of the dictionary. With the second letter it moves down the list to the words beginning with those two letters, then three letters, and so on. That’s the very basic autocomplete, which works on a per-word basis.
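A rough sketch of that dictionary lookup, assuming the words are kept in a sorted list and using a binary search to jump to the first match. The word list and the function name `complete` are hypothetical; real systems more often use a trie or something similar.

```python
from bisect import bisect_left

# Toy dictionary (assumption), kept in alphabetical order.
WORDS = sorted(["tea", "team", "the", "then", "there", "thesaurus", "this", "toy"])

def complete(prefix, words=WORDS, limit=5):
    """Return up to `limit` words that start with `prefix`, in alphabetical order."""
    start = bisect_left(words, prefix)  # jump to the first word >= prefix
    matches = []
    for w in words[start:]:
        if not w.startswith(prefix):
            break  # sorted order means no later word can match either
        matches.append(w)
        if len(matches) == limit:
            break
    return matches

print(complete("t"))    # ['tea', 'team', 'the', 'then', 'there']
print(complete("the"))  # ['the', 'then', 'there', 'thesaurus']
```

Each extra letter narrows the range, which is the "moving down the list" described above.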

In reality, most autocomplete systems today are primed to suggest commonly used words first, ahead of more specialized and seldom-used ones. The word “the” is used far more often than “thesaurus”, for instance. These systems are built on vast analysis of written texts, conversations, emails, etc., and they look at the words previously written to suggest the most likely word to follow. Some systems even start suggesting the next word before you begin typing it, based only on the words you have already written. It’s basically a statistics game, simply explained as a vast database of words and sentences cross-linked with the likelihood of one following another. Since most sentences could continue in many different ways, you’re often offered several possible words, and, combined with the dictionary lookup from the previous paragraph, typing the first letter of the following word narrows down the selection even more.
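Here is a minimal sketch of the statistics game, assuming the simplest possible model: count which word follows which in some training text (a so-called bigram model), then rank the candidates by count and filter by whatever letters have been typed so far. The tiny corpus and the names `train` and `suggest` are made up for illustration; real keyboards use far more data and more sophisticated models.

```python
from collections import Counter, defaultdict

# Toy training text (assumption); a real system would learn from huge corpora.
CORPUS = "the cat sat on the mat . the cat ate the fish . the dog sat on the cat"

def train(text):
    """Count, for every word, how often each other word directly follows it."""
    following = defaultdict(Counter)
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        following[prev][nxt] += 1
    return following

def suggest(following, prev_word, typed="", k=3):
    """Suggest up to k likely next words after prev_word that start with `typed`."""
    candidates = following.get(prev_word, Counter())
    ranked = [w for w, _ in candidates.most_common() if w.startswith(typed)]
    return ranked[:k]

model = train(CORPUS)
print(suggest(model, "the")[0])    # cat  (follows "the" three times)
print(suggest(model, "the", "m"))  # ['mat']
```

With nothing typed, the suggestion comes purely from the previous word; typing "m" then narrows it down, just as described above.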

The next level up is systems that learn from your writing style and from how you use language, which is somewhat unique to you. That’s why autocorrect can learn how you misspell a word and keep suggesting that spelling, even though it’s wrong and goes against what the dictionary says.
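One way to picture the personalisation, as a sketch only: keep a general frequency table that ships with the keyboard, plus a separate count of the words you have actually typed, and mix the two when ranking. The frequencies, the `user_weight` value, and the `PersonalModel` class are all assumptions for illustration, not how any specific phone does it.

```python
from collections import Counter

# Made-up general-purpose word frequencies (assumption).
BASE_FREQ = {"the": 1000, "there": 300, "thesaurus": 2, "thx": 1}

class PersonalModel:
    def __init__(self, base, user_weight=100):
        self.base = base                # shared, general statistics
        self.user = Counter()           # words this particular user has typed
        self.user_weight = user_weight  # how strongly personal usage counts

    def accept(self, word):
        """Record that the user typed or accepted this word."""
        self.user[word] += 1

    def score(self, word):
        return self.base.get(word, 0) + self.user_weight * self.user[word]

    def suggest(self, prefix, k=3):
        """Rank matching words by combined score, breaking ties alphabetically."""
        vocab = set(self.base) | set(self.user)
        matches = [w for w in vocab if w.startswith(prefix)]
        return sorted(matches, key=lambda w: (-self.score(w), w))[:k]

m = PersonalModel(BASE_FREQ)
print(m.suggest("th"))  # ['the', 'there', 'thesaurus']
for _ in range(12):
    m.accept("thx")     # the user keeps typing "thx"
print(m.suggest("th"))  # ['thx', 'the', 'there']
m.accept("teh")         # a misspelling gets learned like any other word
print(m.suggest("te"))  # ['teh']
```

The last line shows the side effect mentioned above: the model has no idea "teh" is wrong, it only knows you typed it.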