Before we get started on vectors and how to understand them, let me ask a few basic questions that generally occur in our minds.
We will go deeper into this section gradually as we progress through the basics.
Why do we make vectors out of text ❓
💁♀️The answer is simple: to communicate with machines, because a machine understands nothing but numbers. To build any Natural Language Processing application, we need to convert text into numbers, that is, into vectors of some shape or form. Let's go a little deeper into this explanation.
👉Vectorising text is an essential step because it enables machines to work with textual content by converting it into meaningful numerical representations. By transforming textual data into meaningful vectors, we can communicate with machines, perform Natural Language Processing tasks, and solve problems mathematically.
🔲So the next question that comes to mind is: how do we do that ❓
▪️Over decades of research in the NLP domain, researchers have developed many sophisticated methods for vectorising text. A straightforward method is to take the words from a sentence and simply map each one to an arbitrary integer value, as in the example below (a small code sketch follows it).
✔️Example:
🟢cat ——009
🟢dog ——080
🟢man ——010
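To make this concrete, here is a minimal sketch in plain Python of assigning each unique word its own integer ID; the sentence and variable names are made up purely for illustration:

```python
# A minimal sketch of mapping each unique word in a sentence to an integer ID.
# The sentence below is an invented toy example.
sentence = "the cat chased the dog and the man watched"

word_to_id = {}
for word in sentence.split():
    # Assign the next free integer to any word we have not seen before.
    if word not in word_to_id:
        word_to_id[word] = len(word_to_id)

print(word_to_id)
# e.g. {'the': 0, 'cat': 1, 'chased': 2, 'dog': 3, 'and': 4, 'man': 5, 'watched': 6}
```

Of course, such arbitrary integers carry no meaning on their own; the methods listed further below were designed to produce vectors that actually capture something about the text.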
Vectorisation can be applied to any form of text, be it a word, a sentence, a paragraph, or a document.
Some popularly known vectorisation methods are listed below (a short TF-IDF example follows the list):
⭕️TF-IDF
⭕️Cosine Similarity (strictly a similarity measure computed over vectors, often used alongside the methods here)
⭕️GloVe
⭕️Word2Vec
⭕️Sentence2Vec
⭕️Doc2Vec
⭕️ELMo
⭕️BERT
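To give a feel for how the first two items fit together, here is a minimal sketch of TF-IDF vectorisation followed by cosine similarity, assuming scikit-learn is installed; the toy corpus is invented for the example:

```python
# A minimal sketch: turn a tiny corpus into TF-IDF vectors,
# then compare the sentences with cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus, used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "a man walked his dog",
]

# Learn the vocabulary from the corpus and transform each sentence
# into a sparse TF-IDF vector.
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(corpus)

print(vectorizer.get_feature_names_out())  # vocabulary learned from the corpus
print(tfidf_matrix.shape)                  # (number of sentences, vocabulary size)

# Pairwise cosine similarity between the sentence vectors;
# values closer to 1 mean the sentences share more heavily weighted terms.
print(cosine_similarity(tfidf_matrix))
```

Each row of the printed similarity matrix compares one sentence against all three, which is exactly the kind of mathematical comparison that becomes possible once text has been vectorised.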