Monday, November 27, 2017

NLP Basics - 1

Tokenization - Process of segmenting running text into words and sentences.
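For example, sentence and word tokenization could look like this with NLTK (one library choice among many, not named in the post; assumes nltk and its punkt models are installed):

import nltk
nltk.download("punkt", quiet=True)  # sentence/word tokenizer models, needed once
from nltk.tokenize import sent_tokenize, word_tokenize

text = "NLP is fun. Tokenization splits running text into units."
print(sent_tokenize(text))  # ['NLP is fun.', 'Tokenization splits running text into units.']
print(word_tokenize(text))  # ['NLP', 'is', 'fun', '.', 'Tokenization', 'splits', ...]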

Normalization - Steps taken to reduce surface variants of a term (case, punctuation, spelling) to a single lexical unit.
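A rough sketch of normalization in plain Python, here just lowercasing and stripping punctuation (the exact steps depend on the application):

import string

def normalize(token):
    # map surface variants like "Apple," / "APPLE." to one lexical unit
    return token.lower().strip(string.punctuation)

print({normalize(t) for t in ["Apple,", "apple", "APPLE."]})  # {'apple'}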

Named Entity Extraction - Locate and classify named entities in text into pre-defined categories such as person, organization, location, and date.
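A small example with spaCy (again just one possible library; assumes spacy and its en_core_web_sm model are installed):

import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline with an NER component
doc = nlp("IBM opened a new office in Bangalore in November 2017.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. IBM ORG, Bangalore GPE, November 2017 DATE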

Stemming - Reducing inflected words to their stem, usually by stripping affixes; the result need not be a valid dictionary word.
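A sketch with NLTK's Porter stemmer, which also shows that stems need not be real words:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["running", "runs", "easily", "studies"]:
    print(word, "->", stemmer.stem(word))
# running -> run, runs -> run, easily -> easili, studies -> studi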

Lemmatization - Grouping together the inflected forms of a word so that they can be analyzed as a single item, identified by the word's lemma (dictionary form).
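For contrast with stemming, a sketch with NLTK's WordNet lemmatizer (assumes the wordnet corpus is downloaded); it returns dictionary forms and can take the part of speech into account:

import nltk
nltk.download("wordnet", quiet=True)  # lexical database used by the lemmatizer
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("studies"))           # study
print(lemmatizer.lemmatize("running", pos="v"))  # run
print(lemmatizer.lemmatize("better", pos="a"))   # good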




API Links:

1. Watson - https://www.ibm.com/watson/services/natural-language-classifier/
2. Google Cloud Platform - https://cloud.google.com/natural-language/
3. Amazon Lex - https://aws.amazon.com/lex/
4. Facebook Deep Text - https://code.facebook.com/posts/181565595577955/introducing-deeptext-facebook-s-text-understanding-engine/
5. Microsoft Cognitive Services - https://azure.microsoft.com/en-in/services/cognitive-services/directory/lang/



References:

https://www.ibm.com/developerworks/community/blogs/nlp/entry/tokenization?lang=en
