The term “text mining” is used in two different senses in computational statistics:
Using predictive modeling to label many documents (e.g. legal documents might be “relevant” or “not relevant”). This is what we call text mining.
Using grammar and syntax to parse the meaning of individual documents. For this we use the term natural language processing (NLP).
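To make the first sense concrete, here is a minimal sketch of labeling documents with a predictive model. It is only an illustration, not the method taught in the courses: the choice of a naive Bayes classifier, the whitespace tokenizer, the function names and the four training documents are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; real projects use richer tokenizers.
    return text.lower().split()


def train(labeled_docs):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of documents
    for text, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical, made-up training documents for illustration only.
TRAINING_DOCS = [
    ("contract breach damages claim", "relevant"),
    ("settlement agreement contract dispute", "relevant"),
    ("office lunch menu friday", "not relevant"),
    ("parking garage schedule update", "not relevant"),
]

model = train(TRAINING_DOCS)
print(predict(model, "contract dispute claim"))
print(predict(model, "lunch schedule"))
```

The model never looks at grammar or sentence structure; it only counts which words tend to appear under each label, which is what separates this approach from the parsing-based NLP sense.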
Both approaches are taught in courses here at the Institute for Statistics Education. Nitin Indurkhya, co-author of Fundamentals of Predictive Text Mining (Wiley) and a senior data scientist with experience at eBay, Samsung and elsewhere, teaches the NLP courses. His colleague from his eBay days, Anurag Bhardwaj, now a data scientist at QuadAnalytix, teaches the Text Mining class.
Registration options:
Sign up for any individual course using the above links
Sign up for all three courses for just $399 each to earn a Specialization in Text Analytics and save $450; use the code “text-specialization”
Need better grounding in Python first? Add our May 11 Python for Analytics course for the same $399.
The courses take place online at Statistics.com in a series of weekly lessons and assignments, and require about 15 hours per week. Participate at your own convenience; there are no set times when you are required to be online.
We hope to see you in one or more of our text analytics courses!