I can’t believe I’ve got myself into this, really. The more I look into what I have to get to grips with, the more I realise I might as well be right back where I started… even explaining it is difficult, but I’ll have a go.
The development of Artificial Intelligence continues to be a headline-grabbing discipline. Alan Turing’s famous ‘Turing Test’ suggests criteria by which a computer can convince a human that it’s human by the way it responds to questions using language. In order to respond coherently to increasingly complex questions (and ‘complex’ doesn’t necessarily mean ‘long’, with more than one clause), computers have to understand the language being used, which is often far from straightforward.
To illustrate my point, let’s take a short and well known poem, ‘Not Waving but Drowning’ by Stevie Smith:
Not Waving but Drowning
Nobody heard him, the dead man,
But still he lay moaning:
I was much further out than you thought
And not waving but drowning.
Poor chap, he always loved larking
And now he’s dead
It must have been too cold for him his heart gave way,
They said.
Oh, no no no, it was too cold always
(Still the dead one lay moaning)
I was much too far out all my life
And not waving but drowning.
Even if you’re unfamiliar with the poem, by the end of the first read-through you should realise that the drowning is a metaphor. The man isn’t literally drowning; he is metaphorically overwhelmed by life, and his waving has been mistaken for larking about rather than recognised as a cry for help.
How would you go about teaching that level of understanding to a computer? How does our knowledge of language enable us to reach this understanding? Of course, it’s not just language but also our knowledge of other things that helps us reach an interpretation; but if we were simply to say to a computer ‘I’m drowning’, how would you get it to understand the non-literal interpretation of that two-word statement? And of course humans don’t always agree on the meaning of a sentence. Without visual and contextual clues, it can be difficult to understand what’s being said, as anyone who has inadvertently caused offence in an email will tell you.
As a result, various approaches have been developed to help computers interpret meaning. One of these is sentiment analysis, sometimes referred to as opinion mining. There are different ways of extracting meaning from a text, something I will try to explain in future blog posts. It’s also possible to combine methods: if you were using sentiment analysis to evaluate, for example, the likelihood that an individual was a suicide risk based on their posts on Twitter, you would also need to write an algorithm (put simply, a recipe) that takes other things into account before reaching the conclusion that suicide was imminent. In addition, your proposed solution has to be tested against human classifiers, with each stage of the process separately tested and assessed (using statistics) to establish its reliability and accuracy.
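To give a flavour of what the very simplest form of sentiment analysis looks like, here’s a toy lexicon-based scorer in Python. The word lists are made up for illustration (real systems use large, carefully built lexicons or trained models), but the idea is the same: count the positive and negative words and compare.

```python
# A toy lexicon-based sentiment scorer. The word lists below are
# illustrative assumptions, not a real sentiment lexicon.
POSITIVE = {"love", "loved", "happy", "great", "larking"}
NEGATIVE = {"drowning", "dead", "cold", "moaning"}

def sentiment_score(text):
    """Return (#positive words - #negative words).

    A score above zero suggests positive sentiment, below zero negative.
    """
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg

print(sentiment_score("He always loved larking"))   # 2
print(sentiment_score("Not waving but drowning"))   # -1
```

Notice that this approach would score ‘I’m drowning’ as negative, but for the wrong reason: it has no idea whether the drowning is literal or metaphorical, which is exactly the gap the poem exposes.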
There’s another problem as well, and that’s the gathering of data in the first place. The WWW is awash with words. Twitter, Facebook, blogs, comments on news feeds and online articles… all of these sources are freely available, but difficult to collect. Of course a few things can be copied and pasted into a single document for further analysis, but what if you need to analyse a series of blog posts from a variety of different authors? Or comments left on several articles posted on a news site? Or perhaps you want to collect every tweet made over the course of a week… even if you restrict this to tweets made in English, that’s still tens or possibly hundreds of thousands of items. How are you going to process them? How are you even going to weed out the irrelevant ones, such as those promoting products or porn sites? This is a classic ‘big data’ problem, which can only be solved by writing code to collect the data for you and dump it into a file or series of files until such time as you can use some other code to process it. Happily, this process is the bit I now know something about, and can tackle with the aid of a couple of good books and some time to experiment.
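The collect-filter-dump stage can be sketched in a few lines of Python. Everything here is an illustrative assumption: the spam markers are made up, and the posts come from a hard-coded list where a real collector would pull them from an API or a scraper. The shape of the pipeline is the point, not the details.

```python
# A minimal sketch of collecting text data, weeding out obviously
# irrelevant items, and dumping the rest to a file for later analysis.
import json

# Hypothetical promotional phrases used to filter out spam.
SPAM_MARKERS = {"buy now", "click here", "xxx"}

def is_relevant(text):
    """Crude filter: discard posts containing obvious promotional phrases."""
    lowered = text.lower()
    return not any(marker in lowered for marker in SPAM_MARKERS)

def collect(posts, path):
    """Write relevant posts to `path`, one JSON record per line.

    Returns the number of posts kept.
    """
    kept = 0
    with open(path, "w", encoding="utf-8") as f:
        for post in posts:
            if is_relevant(post):
                f.write(json.dumps({"text": post}) + "\n")
                kept += 1
    return kept

posts = [
    "I feel like I'm drowning in work this week",
    "Buy now! Amazing deals, click here",
]
print(collect(posts, "corpus.jsonl"))  # 1
```

One record per line (so-called JSON Lines) is a handy format here, because a later processing script can read the file a line at a time without loading the whole corpus into memory.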
And so, I have to learn a whole new branch of Computer Science, Machine Learning; and I have to learn how to collect, analyse and present data, which is Data Science. Frankly, it’s taken me several weeks just to isolate what I need to know and to read a handful of research papers that cover the fields. My first job, then, is to locate and complete a couple of online courses (which are available through FutureLearn and Coursera) and go from there. I seem to have spent more time finding out what I don’t know and need to find out than actually consolidating what I know already, but that’s progress I suppose!
And still, I’m so far out of my comfort zone I can’t even see it in the rear view mirror….