In this week’s blog, we discuss our recent acquisition by Elder Research Inc. We also look at the “Canary Trap” and its connection to text mining.
Our course spotlight is on:
- Jan 31 to Feb 28: Text Mining using Python (still open for registrations, first assignment due in a week)
- Feb 28 to Mar 27: Natural Language Processing
See you in class!
Going Beyond the Canary Trap
Statistics.com Acquired by Elder Research
In last week’s Brief I described how the Institute’s courses, and its Mastery, Certificate, and degree programs, would continue without interruption following our acquisition by Elder Research, Inc. Now I’d like to talk about how the Institute’s students stand to gain […]
Course Spotlight
Jan 31 to Feb 28: Text Mining using Python
(still open for registrations, first assignment due in a week)
This course covers the “bag-of-words” approach to text mining, in which documents are considered as collections of terms and predictive models are trained to classify documents. This course is taught by Anurag Bhardwaj, Senior Manager, Data Scientist at Apple. You should be familiar with Python and with predictive modeling; you will learn how to:
- Prepare documents by tokenizing, creating dictionaries, and converting text to numerical vectors
- Build classifiers with decision trees, Naive Bayes, and linear models using training and validation data
- Perform “tagging” of text data
- Cluster documents using the k-means algorithm
- Generate predicted Twitter hashtags for text data
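To make the “bag-of-words” idea concrete, here is a minimal sketch of the first step in the list above: tokenizing documents, creating a dictionary, and converting text to numerical vectors. It uses only the Python standard library. The function names (`tokenize`, `build_dictionary`, `to_vector`) and the two sample sentences are illustrative assumptions, not material from the course.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def build_dictionary(docs):
    """Map each distinct term in the corpus to a column index, in sorted order."""
    terms = sorted({token for doc in docs for token in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def to_vector(doc, dictionary):
    """Count how often each dictionary term occurs in one document."""
    counts = Counter(tokenize(doc))
    # Dicts preserve insertion order, so columns follow the dictionary's indices.
    return [counts.get(term, 0) for term in dictionary]


# Hypothetical two-document corpus, for illustration only.
docs = ["The cat sat on the mat", "The dog chased the cat"]
dictionary = build_dictionary(docs)
vectors = [to_vector(doc, dictionary) for doc in docs]

print(dictionary)
print(vectors)
```

Word order is discarded: each document becomes a row of term counts, and rows like these are what a classifier or a clustering algorithm would take as input.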
Feb 28 to Mar 27: Natural Language Processing
In contrast to the broad classification goal of the “bag-of-words” approach, natural language processing aims at parsing the meaning of individual sentences and documents. This course is taught by Nitin Indurkhya, co-author of Predictive Text Mining (Springer) and the Handbook of Natural Language Processing (CRC). The course is conceptual in nature and does not involve software. You will learn how to:
- Understand and give examples of N-grams and their role in probabilistic language prediction
- Understand and write regular expressions correctly
- Assign (tag) parts of speech to words in a corpus
- Parse sentences and use semantic analysis to understand meaning
- Disambiguate word meanings
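The course itself does not involve software, but for readers who like to see an idea in code, here is a small sketch of how N-grams support probabilistic language prediction. It counts bigrams (pairs of adjacent words) and estimates the probability of each possible next word. The function names and the toy sentence are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict


def bigram_counts(tokens):
    """For each word, count the words that immediately follow it."""
    counts = defaultdict(Counter)
    for previous, following in zip(tokens, tokens[1:]):
        counts[previous][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Estimate P(next word | word) as relative frequencies of observed bigrams."""
    following = counts.get(word, Counter())
    total = sum(following.values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {candidate: n / total for candidate, n in following.items()}


# Hypothetical toy text, for illustration only.
tokens = "the cat sat on the mat the cat ran".split()
counts = bigram_counts(tokens)
probabilities = next_word_probabilities(counts, "the")

print(probabilities)
```

In this toy text, “the” is followed by “cat” twice and by “mat” once, so the estimate favors “cat” as the next word.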
See you in class!