Before I get started with NLP, I would like to tell you that the purpose of this blog is to give you a fair understanding of NLP if you are new to this domain and have just started exploring it.
Let's break the term Natural Language Processing into two parts. The first part, "Natural Language", refers to human language (written and spoken), and the second part refers to the "Processing" of that language. Taken together, natural language processing (NLP) is the automatic computational processing of human language in its various forms. It is a challenging task, as human language is inherently ambiguous, ever-changing, and not well defined.
You might wonder: why do we have to say "Natural"? We already know we are working with human-generated text. Is there any unnatural language?
Yes, you are absolutely correct. There are many formal languages, such as the programming languages (e.g., C, Python) that we may have heard of in the computer science domain. On the other hand, we humans speak natural languages to communicate with each other. Human language was not artificially created; it emerged and evolved naturally over thousands of years. Hence, for clarity, "natural language" simply means human language: the language that you and I speak, read, or write.
What Natural Language Processing is should now be clear, but why do we need a separate process for understanding human language? Don't computers understand Python, C, and C++ without the help of NLP?
To understand formal languages, computers have interpreters and compilers. An interpreter translates just one statement of the program at a time into machine code. A compiler scans the entire program and translates the whole of it into machine code at once. A formal language has specific rules and conditions that have to be followed, and if we break them, the computer will not be able to process the program.
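To see how strict a formal language is, here is a minimal sketch that uses Python's built-in ast module to check whether a piece of text follows Python's grammar. The helper name is_valid_python is my own, chosen only for this illustration. Notice that the answer is always a clean yes or no; there is no "maybe it means this, maybe that".

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` follows Python's grammar, False otherwise.

    `is_valid_python` is a hypothetical helper name used for illustration.
    """
    try:
        ast.parse(source)  # builds a syntax tree or raises SyntaxError
        return True
    except SyntaxError:
        return False


# A complete statement is accepted.
print(is_valid_python("total = 1 + 2"))
# The same statement with a missing operand is rejected outright.
print(is_valid_python("total = 1 +"))
```

A human reader could probably guess what the second line was trying to say. The Python parser does not guess; it simply refuses the input.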
Human languages are not only complex but also ambiguous, which makes them unpredictable and extremely hard for computers to understand.
Do you remember a conversation with a friend or a colleague in which they told you that you were being ambiguous? It means that you were using ambiguous speech that made the conversation difficult to understand.
We, humans, are mysterious in communication. Sometimes some simple sentences or words become so difficult to understand that we might end up blaming the speaker or ourselves for not understanding them clearly. Sometimes, there are people who are not even aware that they are using ambiguous speech. Other times, people use verbal ambiguity on purpose. There are also times when people prefer being unclear to make others feel as if they are solving a mystery.
I often ask interviewees this very interesting question:
What is the biggest challenge in processing human language? Or, what is the most complex thing about NLP?
Some of them say it's the data, and some of them say it's the business problem they are solving. The real problem here is ambiguity.
So the next question comes: What Is Ambiguity?
The word ‘ambiguity’ actually originates from Latin, meaning “wandering about.”
Ambiguity is the kind of meaning in which a statement, resolution, or phrase is not clearly defined. Ambiguity in writing occurs when the meaning of some part of the text is uncertain, and there can be more than one meaning to it.
Let's look at an example to make this clearer.
For instance:
‘Mrs. Nicole was proved guilty of keeping an endangered species in the Athens Magistrates Court after being charged with stealing a rattlesnake from her neighbour’s property.’
The ambiguity:
It can’t be determined whether "in the Athens Magistrates Court" describes where Mrs. Nicole was proved guilty or where she was keeping the endangered species. It is also not stated that the endangered species is the rattlesnake she was charged with stealing.
The correct sentence:
In the Athens Magistrates Court, Mrs. Nicole was proved guilty of keeping an endangered species, a rattlesnake, after stealing it from a neighbour’s property.
Examples of Ambiguous Sentences
"The lecturer said on Friday she would take a pop quiz."
It can either mean that it was on Friday that the lecturer told the students about the pop quiz, or that the pop quiz would be held on Friday.
"The goat is ready to eat."
It can either mean that the goat is cooked and ready for everyone to eat, or that the goat is ready to be fed some food.
"The burglar robbed the woman with a knife."
It can either mean that a knife-wielding burglar robbed a woman, or that the woman the burglar robbed was holding a knife.
"Visiting friends can be annoying."
It can either mean that the act of visiting one’s friends can be annoying, or that friends who come to visit can be annoying.
By now I hope you are clear about what ambiguity is and what kind of problems it can create in understanding human languages. Of course, there are ways to deal with them.
We will get to know them one by one. Before that, let's get a minimal understanding of the types of ambiguity. A couple of months back I posted about the semantic and syntactic nature of human language. If you haven't seen it yet, have a look here.
Ambiguity is not just a lack of clarity in writing and speech. It is more than that. Ambiguous sentences are a bit more focused than just being unclear. In both of its types, ambiguity offers two or more plausible interpretations of a passage or a single word.
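To make "two or more plausible interpretations" concrete, here is a minimal Python sketch that counts how many ways a tiny grammar can parse the burglar sentence from the examples. The grammar rules, the word list, and the function name count_parses are all my own assumptions for illustration. It borrows the chart idea from CYK parsing on a toy grammar, so please treat it as a demonstration and not as a real parser.

```python
from collections import defaultdict

# Toy grammar in binary form: (parent, left child, right child).
# These rules are an assumption made up for this example.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # "the woman with a knife"
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # "robbed ... with a knife"
    ("PP", "P", "NP"),
]

# Toy lexicon: each word gets exactly one part of speech.
LEXICON = {
    "the": "Det", "a": "Det",
    "burglar": "N", "woman": "N", "knife": "N",
    "robbed": "V", "with": "P",
}


def count_parses(sentence: str) -> int:
    """Count the distinct parse trees of `sentence` under the toy grammar."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(start, end)][symbol] = number of trees for words[start:end]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        tag = LEXICON.get(word)
        if tag is not None:  # unknown words simply get no tree
            chart[(i, i + 1)][tag] += 1
    # Build longer spans from shorter ones, trying every split point.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    ways = chart[(start, split)][left] * chart[(split, end)][right]
                    chart[(start, end)][parent] += ways
    return chart[(0, n)]["S"]


print(count_parses("the burglar robbed the woman with a knife"))  # 2
print(count_parses("the burglar robbed the woman"))               # 1
```

The first sentence gets two parse trees because "with a knife" can attach either to "robbed" or to "the woman", which are exactly the two readings a human sees. The second sentence has only one tree, so there is nothing to disambiguate.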
There are two types of ambiguity in speech and writing. The two types are:
In my next blog, I will talk about the solution and further processing of Natural Language.