General Course Info


  • Instructor:
    Roberto Corizzo [rcorizzo@american.edu]
  • First Class: Jan 16
  • Location: DMTI 121
  • Office Hours:
    Schedule a time to meet with me through Acuity

Course abstract

This course presents the main machine learning/data mining algorithms and evaluation methods developed to date in an intuitive way suitable for a non-specialized audience. It also covers current research developments in the field and introduces students to solving applied problems in innovative ways, using existing machine learning and data mining tools.

AU Core Quantitative Literacy II (Q2) Outcomes:

  1. Translate real-world questions or intellectual inquiries into quantitative frameworks.
  2. Select and apply appropriate quantitative methods or reasoning.
  3. Draw appropriate insights from the application of a quantitative framework.
  4. Explain quantitative reasoning and insights using appropriate forms of representation so that others could replicate the findings.

Course Schedule

Date Topic Module / Book Chapter Deadlines
Week 1
Jan 16 Introduction to Data Mining 1
Jan 19 Conceptual Overview 2+3
Week 2
Jan 23 Data Manipulation 2+3 Assignment 0 Release
Jan 26 Linear Regression 4
Week 3
Jan 30 Logistic Regression 4 Assignment 0 Deadline
Feb 02 Naïve Bayes /
Week 4
Feb 06 Instance Based Learning /
Feb 09 Support Vector Machines 5
Week 5
Feb 13 Evaluation Techniques 3
Feb 16 Decision Trees 6 Assignment 1 Release
Week 6
Feb 20 Ensemble Learning (Soft/Hard Voting, Bagging; Tree Ensembles: Random Forest) 7
Feb 23 Ensemble Learning (Boosting, Stacking, ECOC; Tree Ensembles: Gradient Boosting) 7 Assignment 1 Deadline
Week 7
Feb 27 Dimensionality Reduction 8 Pool of Papers Release
Mar 01 Neural Networks 4+10
Week 8
Mar 05 Neural Networks + Midterm Review 4+10
Mar 08 Midterm
Week 9
Mar 12 Spring Break
Mar 15 Spring Break
Week 10
Mar 19 Deep Learning 11-14 Assignment 2 Release
Mar 22 Deep Learning 11-14 Project Announcement
Week 11
Mar 26 Feature Selection 14 Assignment 2 Deadline
Mar 29 Clustering 9
Week 12
Apr 02 Class Imbalance Paper Critiques Deadline
Apr 05 One Class Learning 17
Week 13
Apr 09 Time Series 15
Apr 12 NLP 16
Week 14
Apr 16 Paper Presentations /
Apr 19 Paper Presentations /
Week 15
Apr 23 Final Project Presentation /
Apr 26 Final Project Presentation /

Syllabus

Grading

CSC-480


Component Weight
Homework Assignments (3) 25%
Midterm Exam 30%
Critiques of 5 research papers 10% = 5 x 2%
Presentation of 1 research paper 5%
Final Project + Presentation 30% (25% + 5%)


CSC-680


Component Weight
Homework Assignments (3) 20%
Midterm Exam 30%
Critiques of 10 research papers 10% = 10 x 1%
Presentation of 2 research papers 5% = 2 x 2.5%
Final Project + Presentation 35% (30% + 5%)
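
As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is first averaged to a 0-100 score; the syllabus does not state how individual items within a component are averaged, and the names `WEIGHTS`, `course_score`, and the example student are hypothetical.

```python
# Component weights copied from the two grading tables (as fractions of the total).
WEIGHTS = {
    "CSC-480": {
        "homework": 0.25,
        "midterm": 0.30,
        "critiques": 0.10,
        "paper_presentation": 0.05,
        "final_project": 0.30,
    },
    "CSC-680": {
        "homework": 0.20,
        "midterm": 0.30,
        "critiques": 0.10,
        "paper_presentation": 0.05,
        "final_project": 0.35,
    },
}


def course_score(section, scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each component name to a 0-100 score for that component.
    Assumption: per-component averaging happens before this function is called.
    """
    weights = WEIGHTS[section]
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical student, for illustration only.
example = {
    "homework": 90,
    "midterm": 80,
    "critiques": 100,
    "paper_presentation": 100,
    "final_project": 90,
}
print(round(course_score("CSC-480", example), 2))
print(round(course_score("CSC-680", example), 2))
```

The two sections differ only in how much weight moves from homework to the final project, so the same component scores can produce different totals in CSC-480 and CSC-680.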

Attendance

Students are encouraged to attend all lectures. Prolonged absences must be discussed with the instructor. If you cannot attend lectures regularly because of work or other obligations during remote learning, please let the instructor know.


Exams

Exams cover the material from the lectures, projects, and readings. While not necessarily cumulative, each exam will require understanding many of the concepts covered in the preceding exams. Exams consist of multiple-choice, short-answer, and long-answer questions.

For the Final Project, students will propose their own topic in consultation with the instructor. Project proposals are due mid-semester.

Late Submissions

Late submissions incur a penalty of 5% per day. Extensions on homework, lab, or project deadlines are not granted unless you have an extremely compelling reason, such as observance of a religious holiday (in which case you need to let me know in advance).
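
For a rough idea of what the 5% per day penalty means in practice, here is a small Python sketch. It rests on several assumptions the policy does not spell out: the penalty is taken as 5 percentage points of the maximum score, every started day counts as a full day, and the score never drops below zero. The function name `apply_late_penalty` is hypothetical; check with the instructor for the exact rule.

```python
import math

PENALTY_PER_DAY = 5.0  # percent per day, from the late-submission policy


def apply_late_penalty(score, hours_late):
    """Return a 0-100 score after the late penalty.

    Assumptions (not stated in the syllabus): the penalty is 5 percentage
    points per day, a started day counts as a full day, and the result
    is floored at zero.
    """
    if hours_late <= 0:
        return float(score)
    days_late = math.ceil(hours_late / 24)
    return max(0.0, score - PENALTY_PER_DAY * days_late)


# Hypothetical submission scored 90 and handed in 30 hours late (2 started days).
print(apply_late_penalty(90, 30))
```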

Letter Grades

Range Letter
>=93 A
>=90 A-
>=87 B+
>=83 B
>=80 B-
>=77 C+
>=73 C
>=70 C-
>=60 D
<60 F
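
The table above can be read as a simple lookup from a course score to a letter. The Python sketch below shows one way to express it, assuming scores are compared to the cutoffs without rounding (the syllabus does not say whether rounding is applied). The names `CUTOFFS` and `letter_grade` are hypothetical.

```python
# (minimum score, letter) pairs from the Letter Grades table, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (60, "D"),
]


def letter_grade(score):
    """Map a 0-100 course score to a letter grade.

    Assumption: no rounding is applied before comparing to the cutoffs.
    """
    for minimum, letter in CUTOFFS:
        if score >= minimum:
            return letter
    return "F"


# Hypothetical course scores, for illustration only.
for s in (95, 88.5, 59.9):
    print(s, letter_grade(s))
```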

Academic Integrity

Even though we encourage collaboration with a partner, sharing code between groups is strictly forbidden; it is a form of plagiarism, as is showing your work to other students, even for a second. There is rarely one single correct way to write code that solves a problem. While we want you to feel free to discuss your approach with a partner, you should know that there are often many solutions to a given problem, and it is typically obvious when one student shares code with another. If you directly copy and paste code from the Internet (or even the textbook), cite your source in your comments (but also ensure that you understand what the code is doing - not all code on the web is good!). Assignments will be checked using plagiarism detection software and by hand to ensure the originality of the work.

Do not share your code with anyone other than a partner. Do not let someone look at your screen. You may fall behind, or a friend may ask for help, but the consequences of plagiarism are far worse than those of an incomplete submission, which will likely still earn some points. If I suspect that you have purposely shared code with another student or presented someone else's work as your own, the matter will be referred to the Academic Integrity Code Administrator for adjudication. If you are found responsible for an academic integrity violation, sanctions can include a failing grade for the course, suspension for one or more academic terms, dismissal from the university, or other measures as deemed appropriate by the Dean.

All students are expected to adhere to the American University Honor Code. If you have a question about whether or not something is permissible, ask the instructor or the TA first.


Textbook

This course adopts the textbook "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow", 2nd Edition by Aurélien Géron.
The online version of the book may be accessible for free through AU’s online Library. After selecting "O’Reilly Online Learning" from the list and logging in with your AU account, you should be able to search for the book by name, or try accessing it from this link.



Acknowledgments

Course design by Roberto Corizzo at American University.

Thanks to Leah Ding and Nathalie Japkowicz at American University for discussions and contributions that inspired the design and the materials of this course. Thanks to Alex Godwin at American University for designing this syllabus template.