Machine Learning Spring 2017

Event Phone: 1-610-715-0115

Cancellation Policy: If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50).
In the unlikely event that Statistical Horizons LLC must cancel a seminar, we will do our best to inform you as soon as possible of the cancellation. You would then have the option of receiving a full refund of the seminar fee or a credit towards another seminar. In no event shall Statistical Horizons LLC be liable for any incidental or consequential damages that you may incur because of the cancellation.
A 2-day seminar taught by Stephen Vardeman, Ph.D. 

Modern researchers increasingly find themselves facing a new paradigm where data are no longer scarce and expensive, but rather abundant and cheap. Both numbers of cases/instances and numbers of variables/features are exploding. This new reality raises important issues in effective data analysis.

Of course, the basic statistical objective–discovery and quantitative description of simple structure–remains unchanged. But new possibilities for applying highly flexible methods (not practical in “small data” contexts) must be reconciled with the inherent sparsity of essentially any data set composed of a large number of features–and the corresponding danger of overfitting and unwarranted generalization from the data in hand. Modern statistical machine learning methods rationally and effectively address these new realities.

This course first describes and explains the new context, formulates issues that it raises, and points to cross-validation as a fundamental tool for matching method flexibility/complexity to data set information content in predictive problems. Then a variety of modern squared error loss prediction methods (modern regression methods) will be discussed, related to optimal prediction, and illustrated using standard R packages. These will include:

  • smoothing methods
  • shrinkage for linear prediction (ridge, lasso, and elastic net predictors)
  • regression trees
  • random forests, and
  • boosting
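The course illustrates these methods with standard R packages; as a language-agnostic sketch of the cross-validation idea mentioned above, here is k-fold cross-validation in plain Python (standard library only — the synthetic data and the two candidate predictors are invented for illustration):

```python
import random

random.seed(0)

# Synthetic data: y is roughly linear in x plus noise (for illustration only).
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.5) for x in xs]

def fit_mean(x, y):
    """'Null' predictor: always predict the training mean."""
    m = sum(y) / len(y)
    return lambda x_new: m

def fit_linear(x, y):
    """Simple least-squares line y = a + b*x (closed form)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return lambda x_new: a + b * x_new

def cv_mse(fit, x, y, k=5):
    """k-fold cross-validated mean squared prediction error."""
    idx = list(range(len(x)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total, count = 0.0, 0
    for fold in folds:
        hold = set(fold)
        xtr = [x[i] for i in idx if i not in hold]
        ytr = [y[i] for i in idx if i not in hold]
        model = fit(xtr, ytr)  # fit on the other k-1 folds
        for i in fold:         # score on the held-out fold
            total += (y[i] - model(x[i])) ** 2
            count += 1
    return total / count

mse_mean = cv_mse(fit_mean, xs, ys)
mse_line = cv_mse(fit_linear, xs, ys)
print(mse_mean, mse_line)  # the more flexible (linear) fit should win here
```

The same mechanism scales up to choosing tuning parameters for ridge, lasso, trees, and the other methods listed above: fit each candidate on k-1 folds, score it on the held-out fold, and keep the candidate with the smallest cross-validated error.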

Next a variety of modern classification methods will be introduced, related to optimal classification, and illustrated using standard R packages. These will include:

  • linear methods for classification (linear discriminant analysis, logistic regression, support vector classifiers)
  • kernel extensions of support vector classifiers
  • classification trees
  • adaboost, and
  • other ensemble classifiers
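In the course these classifiers are run through standard R packages; purely as an illustrative sketch of one entry on the list, here is one-variable logistic regression fit by gradient ascent in plain Python (the toy two-class data are invented for illustration):

```python
import math
import random

random.seed(1)

# Two-class toy data in one dimension: class 1 tends to have larger x.
data = [(random.gauss(0, 1), 0) for _ in range(200)] + \
       [(random.gauss(2, 1), 1) for _ in range(200)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Model p(y=1|x) = sigmoid(a + b*x); maximize the log-likelihood
# by (batch) gradient ascent on the coefficients a and b.
a, b = 0.0, 0.0
rate = 0.05
for _ in range(500):
    ga = gb = 0.0
    for x, y in data:
        err = y - sigmoid(a + b * x)  # residual on the probability scale
        ga += err
        gb += err * x
    a += rate * ga / len(data)
    b += rate * gb / len(data)

# Classify at the 0.5 probability threshold and report training accuracy.
accuracy = sum((sigmoid(a + b * x) > 0.5) == (y == 1)
               for x, y in data) / len(data)
print(b, accuracy)
```

The fitted slope is positive (larger x pushes toward class 1), and thresholding the fitted probability at 0.5 gives a linear decision boundary — the same structure shared by linear discriminant analysis and support vector classifiers, which differ in how that boundary is chosen.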

Finally, we’ll discuss some methods of modern “unsupervised” statistical machine learning, where the objective is not prediction of a particular response variable but rather discovery of relations among features or natural groupings of either cases or features. These will include principal components and clustering methods.
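To give a concrete flavor of clustering without a response variable, here is a minimal plain-Python sketch of k-means (Lloyd's algorithm) on invented one-dimensional data with two well-separated groups — the course itself uses standard R packages for this:

```python
import random

random.seed(2)

# Unlabeled data drawn from two well-separated groups (for illustration).
points = [random.gauss(0, 0.5) for _ in range(100)] + \
         [random.gauss(5, 0.5) for _ in range(100)]

def kmeans(xs, k, iters=20):
    """Lloyd's algorithm: alternate cluster assignment and centroid update."""
    centers = random.sample(xs, k)
    for _ in range(iters):
        # Assign each point to its nearest current center.
        clusters = [[] for _ in range(k)]
        for x in xs:
            j = min(range(k), key=lambda c: (x - centers[c]) ** 2)
            clusters[j].append(x)
        # Move each center to the mean of its assigned points.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

centers = kmeans(points, k=2)
print(centers)  # should sit near the two group means
```

No labels enter the algorithm at any point: the grouping emerges from the data alone, which is exactly what distinguishes these unsupervised methods from the prediction and classification methods above.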

The course will consist of both lectures and hands-on R sessions.

Venue:
1515 Market Street, Philadelphia, Pennsylvania, 19103, United States