Project Report Due: 9:00am Monday June 2
Project Presentations: Week of June 2
The goals of this project are:
To work as a group to apply an appropriate machine learning technique to a dataset from the UCI Machine Learning Repository. Machine learning methods fall, broadly, into three categories: classification, clustering, and prediction. Each group will be assigned its own dataset and one of these broad categories. These datasets do not necessarily qualify as “Big Data,” but machine learning techniques are generally scalable to data of any size.
To produce a short report describing the method you used (with appropriate references) and summarizing your findings.
To give a presentation to the entire class with more details about the method you used (with appropriate references) and your findings.
Data for this project come from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.html). Each group will be assigned one dataset and task from http://stat599.cwick.co.nz/assignments/proj3-data.html
CRAN task views list R packages relevant to a specific task. You might find these two helpful for finding methods implemented in R:
This site gives a pretty good list of methods too:
http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/
By Monday, June 2 at 9:00am: Submit a two-page (reasonable margins and font size) document that summarizes the following:
By the week of June 2: Make a 20-minute presentation of your project. The order in which teams present will be determined randomly during the week of June 2. Each presentation must consist of four sections: (1) Introduction and Overview; (2) Detailed description of the machine learning method(s) you used; (3) Summary of findings from applying your method(s) to your data; (4) Discussion including assumptions/limitations of the method(s) and scalability. These sections need not be of equal length, but the total presentation must not exceed 20 minutes. All members of your team must be prepared to deliver all four sections; which team member presents which section will be determined randomly immediately before the presentation begins.
Presentation slides must be made available in PDF format so that they can be posted on the course website—these should be sent to Charlotte or Alix before 9:00am on the day your group presents.
By Friday, June 6 at 9:50am: You must give Charlotte and Alix access to your Git repository. It must contain well-documented R code with which we can completely reproduce the content of your summary report and your presentation.
By Friday, June 6 at 9:50am: Each group member must turn in a completed Group Member Evaluation Form for every other member of their team.