How to add stop word removal, lemmatization and stemming features in Rasa NLU

vignesh amudha
1 min read · Apr 21, 2019

Hi everyone, I have been working with the Rasa Stack for the past 4 months, building a chatbot for a wedding card website. For that chatbot, I planned to add a stop word remover so that it can predict intents and entities more easily.

What is a stop word?

Stop words are words that web search engines and other enterprise search and indexing platforms filter out. They are natural language words that carry very little meaning, such as “and”, “the”, “a”, “an”, and similar words.
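To make the idea concrete, here is a minimal sketch of stop word removal in plain Python. The `STOP_WORDS` set and the `remove_stop_words` helper are my own illustrative names with a deliberately tiny word list; they are not part of Rasa.

```python
# Hypothetical, deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "i", "to", "for", "of", "is"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the text with stop words dropped, compared case-insensitively."""
    kept = [word for word in text.split() if word.lower() not in stop_words]
    return " ".join(kept)


# Prints: want order wedding card reception
print(remove_stop_words("I want to order a wedding card for the reception"))
```

What is left is mostly the words that tell the intents apart, which is the reason for filtering in the first place.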

What are lemmatization and stemming?
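
Stemming cuts suffixes off a word by rule, so the result is not always a real word. Lemmatization maps a word to its dictionary form (its lemma), usually through a vocabulary lookup. The toy sketch below shows the difference; the `SUFFIXES` list, the `LEMMAS` table and both functions are hypothetical and far from complete, and a real project would more likely rely on a library such as NLTK or spaCy.

```python
# Toy suffix list and lemma table, both hypothetical and far from complete.
SUFFIXES = ("ing", "ies", "ed", "es", "s")
LEMMAS = {"studies": "study", "studying": "study", "was": "be", "cards": "card"}


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


def lemmatize(word):
    """Look the word up in the lemma table; fall back to the lowercased word."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


for word in ["studies", "studying", "was", "cards"]:
    print(word, "| stem:", stem(word), "| lemma:", lemmatize(word))
```

With this toy data, “studies” stems to “stud” but lemmatizes to “study”, and “was” is left alone by the stemmer but lemmatizes to “be”.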

Please download the file from the link below, then copy it over the existing file.

Note: The new Rasa Stack plans to merge Rasa NLU and Rasa Core, so I don’t know the exact path. Please search for whitespace_tokenizer.py under the path where the Rasa Stack is installed and overwrite that file. This only works for the tensorflow embedding pipeline, because that is the only place where we use the whitespace tokenizer. For any other tokenizer, you can write your own stop word remover code.
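
If you would rather write the change yourself, the rough idea is a whitespace tokenizer that skips stop words. The sketch below is standalone Python and is not the actual contents of whitespace_tokenizer.py; the `Token` class, the `STOP_WORDS` set and the `tokenize` function are hypothetical stand-ins. It assumes each token should keep its character offset in the original text, so that entity positions still line up after filtering.

```python
import re
from typing import List, NamedTuple, Set


class Token(NamedTuple):
    """Hypothetical stand-in for a token: the word and where it starts."""
    text: str
    offset: int


# Hypothetical stop word list, for illustration only.
STOP_WORDS: Set[str] = {"a", "an", "and", "the", "i", "to", "for"}


def tokenize(text: str, stop_words: Set[str] = STOP_WORDS) -> List[Token]:
    """Split on whitespace and skip stop words, keeping original offsets."""
    tokens = []
    for match in re.finditer(r"\S+", text):
        word = match.group()
        if word.lower() not in stop_words:
            tokens.append(Token(word, match.start()))
    return tokens


# Prints "want 2", "card 9" and "wedding 22", one per line.
for token in tokenize("I want a card for the wedding"):
    print(token.text, token.offset)
```

The offsets refer to the unfiltered sentence, which is why the function filters while scanning instead of deleting the stop words from the text first.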
