


Introduction to Support Vector Machines in R

Dr. Lutz Hamel

Aim of Course:

Support vector machines (SVMs) have established themselves as one of the preeminent machine learning models for classification and regression over the past decade or so, frequently outperforming artificial neural networks in tasks such as text mining and bioinformatics.

The aim of this course is to give you an understanding of what is going on "under the hood" when using SVMs. After completing this course, you will be able to interpret the performance of SVM models and make appropriate choices for model parameters during the model evaluation and selection cycle. You will understand the difference between linear, polynomial, and Gaussian kernels and know how to tune their parameters. In addition, you will have a deep understanding of how the cost constant "C" affects the quality of your models.
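As a rough illustration of the three kernel families named above, here is a minimal sketch in plain Python (the course itself uses R). The function names and the default values for `degree`, `offset`, and `gamma` are assumptions chosen for the example, not settings taken from the course materials.

```python
import math

def linear_kernel(x, y):
    # Plain dot product: the decision surface stays linear in the input space.
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, offset=1.0):
    # Dot product shifted by an offset and raised to a power;
    # `degree` and `offset` are the tunable parameters (illustrative defaults).
    return (linear_kernel(x, y) + offset) ** degree

def gaussian_kernel(x, y, gamma=0.5):
    # Similarity decays with squared Euclidean distance;
    # a larger `gamma` makes the kernel more local (illustrative default).
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x = [1.0, 2.0]
y = [3.0, 0.5]
print(linear_kernel(x, y))      # 4.0
print(polynomial_kernel(x, y))  # 25.0
print(gaussian_kernel(x, x))    # 1.0 (a point is maximally similar to itself)
```

Each function returns a similarity score for a pair of points. An SVM library would evaluate such a function in place of the dot product, with the cost constant "C" set separately when the model is trained.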

The course is based on the R statistical computing environment. However, the knowledge gained here is easily transferred to other knowledge discovery environments.

Who Should Take This Course:

Statisticians and data miners who need to know a variety of methods for classification.

For those enrolled in a Program of Advanced Statistical Studies, this is a required or elective course in the following Programs:

  • Data Mining - elective

Course Program:

The course is structured as follows:

SESSION 1: The Foundations
  • What is Knowledge Discovery?
  • Describing Data Mathematically
  • Linear Decision Surfaces and Functions
  • Perceptron Learning
    • Duality
  • Maximum Margin Classifiers
    • Quadratic Programming

SESSION 2: Support Vector Machines

  • The Lagrangian Dual
  • Dual Maximum Margin Optimization
  • Linear/Non-Linear SVMs
    • "The Kernel Trick"
  • Soft-margin Classifiers

SESSION 3: Model Evaluation and Selection

  • Performance metrics
    • the Confusion Matrix
  • Model Evaluation
    • Hold-out
    • Leave-one-out
    • N-fold Cross-validation
  • Confidence Intervals
  • Elements of Statistical Learning Theory
    • the VC-dimension
    • Empirical Risk Minimization
    • VC-confidence
    • Structural Risk Minimization

SESSION 4: Extensions to the Basic Model

  • Multi-class Classification
    • One-versus-the-rest Classification
    • Pairwise Classification
  • Regression with SVMs
    • Regression with Maximum Margin Machines
    • Regression with Support Vector Machines
    • Model Evaluation

The Instructor:

Dr. Lutz Hamel teaches at the University of Rhode Island and was the founder of the machine learning and data mining group there. He is the author of Knowledge Discovery with Support Vector Machines (Wiley, 2009). Before becoming an academic, Dr. Hamel was Director of Software Development at Thinking Machines Corporation, and Vice President of R&D for Bluestreak, where he oversaw the development of advanced technologies for online ad delivery and optimization, and directed the building of a next-generation, data warehouse-driven system for campaign analysis and design tools.

Organization of the Course:

The course takes place over the internet, at statistics.com. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor. The course is scheduled to take place over 4 weeks, and typically requires 15 hours per week. At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials and work through exercises. Discussion among participants is encouraged. The instructor will provide answers and comments.

Certificates and Grades:

You may be interested only in learning the material presented, and not be concerned with grades or certificates. Or you may be enrolled in a statistics.com Program in Advanced Statistical Studies that requires demonstration of proficiency in the subject, in which case your work will be assessed for purposes of issuing a grade. Or you may require only a "Certificate of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). As you begin the class, you will be asked to specify your category.

Credit:

This course offers continuing education units (CEUs). Participants who successfully complete the course (generally, this means marks of 50% or better on the homework) can request 5.0 CEUs and a certificate from statistics.com.

Dates:

Nov. 20 - Dec. 18, 2009
Click here to be notified of future course offerings.

Participants gain access to the online materials on the first day of the course, and typically spend about 15 hours per week (at their convenience). You retain full access to course materials, including discussion board, for two weeks after the course closing date.

Level:

Advanced

Prerequisite:

The equivalent of Introduction to Statistics 1: Inference for a Single Variable, and Introduction to Statistics 2: Working with Bivariate Data (and, if necessary before these courses, Introduction to Statistics for Beginners or Survey of Statistics for Beginners).

Familiarity with calculus and matrix algebra.

Introduction to R - Data Handling and Data Mining 1 is helpful but not required.

Course Text:

The required text for this course is Knowledge Discovery with Support Vector Machines, by Lutz Hamel, from Wiley, and it can be ordered from Wiley by clicking here. Wiley typically offers statistics.com customers a discount of up to 15% on this book (and all other statistics titles): enter the code aff15 in the Promotion Code field when prompted during checkout and click the Apply Discount button. (If you are located in Asia, the web procedure for your location may not accept this discount; try calling your regional Wiley representative.) PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.

Software:

You must have a copy of R for the course. Click Here for information on obtaining a free copy.

Registration:

Register Online - $469
Register Online (academic) - $369 (you must be affiliated with a college, university or high school)

Add a $50 service fee if you require a prior invoice, need to submit a purchase order or voucher, pay by wire transfer or EFT, or need a prior payment refunded and reprocessed. Please use this printed registration form for these and other special orders.

Note: Courses may fill up at any time and registrations are processed in the order in which they are received. Your registration will be confirmed for the first available course date, unless you specify otherwise.