Text similarity analysis tries to determine how “close” different texts are to each other. There are two different kinds of text similarity:
- Lexical text similarity: Aims to identify how similar two texts are on a word level, that is, how many words they share
- Semantic text similarity: Aims to identify how similar two texts are in meaning, regardless of the exact words used
To give an example, the two sentences “my house is empty” and “there is nobody at mine” have a low lexical similarity but a high semantic similarity.
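These similarities can be identified using Natural Language Processing (NLP) techniques: lexical similarity is typically computed from word overlap, while semantic similarity is typically computed by comparing vector representations (embeddings) of the texts.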
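As a minimal sketch of the lexical side, the example sentences can be compared with the Jaccard index, which is the number of shared words divided by the number of distinct words across both texts. The Jaccard index is only one possible lexical measure, chosen here for illustration. The helper names `tokenize` and `jaccard_similarity` are hypothetical, and the simple regex tokenizer is an assumption, not a standard.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct word tokens.

    Assumption: a word is a run of letters, digits, or apostrophes.
    """
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Return |shared words| / |all distinct words| for two texts."""
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a and not words_b:
        # Assumption: two empty texts are treated as identical.
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


# The two example sentences share only the word "is" out of 8 distinct words.
score = jaccard_similarity("my house is empty", "there is nobody at mine")
print(score)  # 0.125
```

The low score reflects the low lexical similarity of the two sentences. A word-overlap measure like this cannot detect that they mean nearly the same thing; capturing that would need a semantic approach, which is not shown here.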
Tags: concept