CMSC848K Multimodal Foundation Models - Fall 2024

Course Description

Recent advances in machine learning are driven by training scalable models on Internet-scale data (e.g., billions of image-text pairs or trillions of text tokens). This gives rise to foundation models that demonstrate strong capabilities across diverse tasks. In this course, we will study the techniques that enable such machine learning systems. We will cover foundation models for language, vision, and other modalities.

People

Course instructor

Jia-Bin Huang (jbhuang@umd.edu)
Office: 4234 IRB building

Teaching assistants

Hadi Alzayer (hadi@umd.edu), Yi-Ting Chen (ytchen@umd.edu), Yue Feng (yuefeng@umd.edu), Ji-Ze Jang (gjang@umd.edu), Yao-Chih Lee (yclee@umd.edu)

Schedule

Coursework

Prerequisites

College calculus, linear algebra, and probability and statistics. Prior courses in machine learning, natural language processing, and computer vision are helpful but not required.

Midterm (30%)

We will have two in-class midterm exams during the semester. Detailed information will be made available.

Final project (30%)

Students will work in groups of 2-3 on a project on the topic of multimodal foundation models.

Paper review (40%)

We will have a list of recommended paper readings starting from the third lecture. For each lecture, students will turn in a one-page paper review. The review should have two sections: 1) a paper summary and 2) your critique (strengths and weaknesses of the paper, interesting insights, or questions worth discussing). Each paper review is due before class (11:00 AM on Tuesday or Thursday). No late submissions are allowed. Students need to submit at least 20 paper reviews to receive the full score (40%).

Course logistics

Lectures

Tuesday/Thursday 11:00 AM - 12:15 PM at IRB 0318

Lecture Videos

Lectures will not be recorded. The instructor will post edited/summarized videos on selected topics for students to review. These will be posted shortly after the lectures.

Piazza

We will be using Piazza as the primary platform for communication. Please do not send individual emails to the TAs or the instructor, as they are difficult to track.

Additional Resources

Courses at UMD

Several courses offered at UMD also overlap with this course.

External
