Machine Learning – May 2023

Event Phone: 610-715-0115


Cancellation Policy: If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50).
In the unlikely event that Statistical Horizons LLC must cancel a seminar, we will do our best to inform you as soon as possible of the cancellation. You would then have the option of receiving a full refund of the seminar fee or a credit towards another seminar. In no event shall Statistical Horizons LLC be liable for any incidental or consequential damages that you may incur because of the cancellation.

A 4-Day Livestream Seminar
Taught by Kevin Grimm, Ph.D.

Machine learning has emerged as a major field of statistics and data analysis where the goal is to create reliable and flexible predictive models. This seminar offers a thorough introduction to machine learning methods. Topics covered include: cross-validation; multiple regression; basic variable selection methods; an overview of the R statistical framework; and advanced variable selection methods for regression analysis.

Machine learning methods have gained much attention for analyzing large datasets that may contain several hundred variables and many thousands (perhaps millions) of participants. In these situations, machine learning algorithms attempt to identify the key variables needed in the predictive model, and several techniques also search for nonlinear associations and interactive effects.

While machine learning techniques have been most attractive for large datasets, they can be useful in smaller datasets for the same reasons: to create simpler and more reliable predictive models, and to search for nonlinear and interactive effects. These techniques are also a natural follow-up to standard hypothesis-driven statistical analyses (e.g., multiple regression) when searching for additional important patterns in the data.

The seminar begins by introducing machine learning and cross-validation, the standard approach to model selection in machine learning. Next, we will focus on variable selection algorithms for multiple regression, including lasso regression and multivariate adaptive regression splines. We will also cover machine learning techniques for categorical outcomes, including logistic regression, decision theory, naïve Bayes, k-nearest neighbors, and support vector machines. Finally, we will focus on recursive partitioning (classification and regression trees) and ensemble methods such as bagging, random forests, and boosting. Throughout the course, you will gain experience with these methods through hands-on exercises.
