CSI 4116: Computer Vision

Course Information

Lecture Dates: Monday 15:00 - 15:50, Wednesday 11:00 - 12:50

Class Location: Engineering Hall (공학관) D504

Instructor: Seong Jae Hwang (seongjae@yonsei.ac.kr)

Office: Engineering Research Park (공학원) 439b

TAs:

Youngjun Jun (youngjun@yonsei.ac.kr)
Jiwoo Park (wldn1677@yonsei.ac.kr)
Sujung Hong (sujung0914@yonsei.ac.kr)
Tae Eun Choi (tiachoiwppq@yonsei.ac.kr)

Course Description

In this class, students will learn the basics of modern computer vision. The course will first cover low-level computer vision topics such as image filtering, edge detection, feature extraction, description and matching, grouping, and clustering. Then, we will cover high-level topics such as object detection, object recognition, segmentation, unsupervised learning, and generative models. We will cover recently popular techniques such as convolutional and recurrent neural networks. We will also discuss a few topics from recent computer vision conferences.

After the course, you should be able to:

Understand how machines see and understand photos and images
Have basic knowledge about some classic to modern computer vision techniques
See why and how modern computer vision techniques (e.g., deep learning) have evolved in such ways
Identify common pitfalls and mistakes ML/CV practitioners may make and learn how to avoid them
Identify or at least guess the family of ML/CV models that modern products could be using
Find an appropriate set of techniques and methods for the task of your interest
Start getting into advanced computer vision and deep learning techniques
Identify the subareas/topics in CV that interest you

Pre-requisites

Required: Object-Oriented Programming and Data Structures. Recommended: Linear Algebra basics.

Programming languages

Recommended: MATLAB (R2019a or later) by MathWorks, free to download from the Yonsei Portal IT Service.
Alternative: Octave

Textbooks

There are recommended readings from the following textbooks:

Computer Vision: Algorithms and Applications by Richard Szeliski (available for free on the author's page). The first edition is more "complete", while the second edition is a draft.
Visual Object Recognition by Kristen Grauman and Bastian Leibe (free from Semantic Scholar)

You can also refer to the following textbooks for additional explanations:

Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville
Pattern Recognition and Machine Learning by Christopher Bishop
Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
Computer Vision: Models, Learning, and Inference by Simon Prince

Course Policies

Grading (Tentative)

Programming HW assignments (6~7 assignments x 10% each = 60~70%)
Attendance (10%)
Final Exam (20%)

Homework Submission

When: Homework is due at 23:59 on the due date.
Where: You will submit your homework on LearnUs.
How: You should submit all your files on the corresponding HW submission page. Do NOT zip the files. Upload them separately.

Attendance

10% of total course grade.
A 4-digit attendance code is given about 20 minutes into the lecture.
You may miss 1 lecture without penalty. This covers personal reasons (doctor's appointments, on-site interviews, etc.). No justification is necessary.
You lose 0.5% per missed lecture. You must attend at least 2/3 of the lectures, meaning you cannot miss more than 8 lectures.
If you miss a lecture due to an unusual circumstance (e.g., medical emergency, family emergency, etc.), contact me ASAP with clear reasons and proof.
Note: first two lectures (add/drop period): attendance NOT taken
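
The attendance arithmetic above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator. The function name `attendance_credit` is hypothetical, and the sketch assumes that `missed` counts unexcused absences after the add/drop period, that the one free miss still counts toward the limit of 8, and that the credit never drops below zero. The syllabus does not state what happens beyond 8 misses, so the sketch only flags that case.

```python
ATTENDANCE_WEIGHT = 10.0  # percent of the course grade (from the syllabus)
PENALTY_PER_MISS = 0.5    # percent lost per missed lecture (from the syllabus)
FREE_MISSES = 1           # one lecture may be missed without penalty (from the syllabus)
MAX_MISSES = 8            # missing more than 8 lectures breaks the 2/3 rule (from the syllabus)


def attendance_credit(missed):
    """Return (credit in percent, whether the 2/3 attendance rule is met).

    Assumption: `missed` excludes the first two lectures and excused absences.
    Assumption: the free miss counts toward MAX_MISSES.
    """
    if missed < 0:
        raise ValueError("missed must be non-negative")
    penalized = max(0, missed - FREE_MISSES)
    # Assumption: the attendance credit is floored at zero.
    credit = max(0.0, ATTENDANCE_WEIGHT - PENALTY_PER_MISS * penalized)
    meets_requirement = missed <= MAX_MISSES
    return credit, meets_requirement


if __name__ == "__main__":
    for missed in (0, 1, 3, 9):
        print(missed, attendance_credit(missed))
```

For example, under these assumptions, missing 3 lectures leaves 2 penalized absences and 9.0% of the 10% attendance credit.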

Homework Late Policy

Late policy: -25% of the assignment's total per late day.
- Late days are counted by submission date. Example: Due 3/1 at 23:59. Submitted anytime on 3/2 = 1 day late. Submitted anytime on 3/3 = 2 days late.
- -25% means you lose 25% of the total possible HW grade per late day. Example: Your HW1 was initially scored 40/50 pts, but it was submitted 2 hours late (1 late day). The final HW1 grade is (40 - 12.5)/50 = 27.5/50.
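
As a rough check of the late-penalty arithmetic, here is a minimal Python sketch. The helper names `late_days` and `final_score` are hypothetical, the year in the usage example is arbitrary, and flooring the result at zero is an assumption that the policy does not state. Dates are treated as calendar dates because the deadline is 23:59 on the due date.

```python
from datetime import date

LATE_PENALTY_PER_DAY = 0.25  # from the syllabus: -25% of the assignment's total per late day


def late_days(due, submitted):
    """Number of late days, counted by calendar submission date."""
    return max(0, (submitted - due).days)


def final_score(raw_score, max_score, due, submitted):
    """Apply the per-day penalty to a raw homework score."""
    penalty = LATE_PENALTY_PER_DAY * max_score * late_days(due, submitted)
    # Assumption: the final score is floored at zero.
    return max(0.0, raw_score - penalty)


if __name__ == "__main__":
    due = date(2024, 3, 1)  # the year is arbitrary, chosen only for illustration
    # The syllabus example: 40/50 submitted the day after the due date.
    print(final_score(40, 50, due, date(2024, 3, 2)))
```

With the syllabus example (40/50, submitted on 3/2 for a 3/1 deadline), the sketch gives 27.5, matching the stated result.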

Collaboration Policy and Academic Honesty

You will do your work (exams and homework) individually.
The work you turn in must be your own work.
You are allowed to discuss the assignments with your classmates, but do not look at the code they might have written for the assignments, or at their written answers.
You are not allowed to search for code on the internet or use solutions posted online unless you are explicitly permitted to do so. You are also not allowed to use a built-in MATLAB implementation when you are asked to write your own code.
When in doubt about what you can or cannot use, ask the instructor!
Posting and asking HW questions online (e.g., Stack Overflow) is NOT allowed. This will be considered cheating.
A first offense will cause you to get 0% credit on the assignment. A report will be filed with the school.
A second offense will cause you to fail the class and receive a disciplinary penalty.
Again, do not cheat. In previous years, I have identified cheaters, and they received an F.

Note on Disabilities

If you have a disability for which you are or may be requesting accommodation, you are encouraged to contact your instructor ASAP.

Note on Medical Conditions

If you have a medical condition which will prevent you from doing a certain assignment, you must inform the instructor of this before the deadline. You must then submit documentation of your condition within a week of the assignment deadline.

Statement on Classroom Recording

To ensure the free and open discussion of ideas, students may not record classroom lectures, discussion and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student's own private use.

Course Schedule

CSI 4116 (2024-2) - Computer Vision

Homework Assignments Instructions

Related files are on HW submission pages @ LearnUs.

HW1 - Due 10/7 at 11:59 PM

HW2 - Due 10/21 at 11:59 PM

HW3 - Due 11/4 at 11:59 PM

HW4 - Due 11/15 at 11:59 PM

HW5 - Due 11/29 at 11:59 PM

HW6 - Due 12/13 at 11:59 PM (Instruction included in the provided file)

Academic integrity

All assignment submissions must be the sole work of each individual student. Students may not read or copy another student's solutions or share their own solutions with other students. Students may not review solutions from students who have taken the course in previous years. Submissions that are substantively similar will be considered cheating by all students involved, and as such, students must be mindful not to post their code publicly. The use of books and online resources is allowed, but must be credited in submissions, and material may not be copied verbatim. Any use of electronics or other resources during an examination will be considered cheating.

If you have any doubts about whether a particular action may be construed as cheating, ask the instructor for clarification before you do it. The instructor will make the final determination of what is considered cheating.

Cheating in this course will result in a grade of F for the course and may be subject to further disciplinary action.

Using an open-source codebase is accepted, but you must explicitly cite the source and follow the owner's guidelines if any exist. For any writing involved in the project, plagiarism is strictly prohibited. If you are unclear whether your work will be considered plagiarism, ask the instructor before submitting or presenting the work.

Resources

This course is consistent with the previous Intro to Computer Vision courses at Pitt by Adriana Kovashka and Seong Jae Hwang which were inspired by the following courses:

Computer Vision by Kristen Grauman, UT Austin, Spring 2011
Computer Vision by Derek Hoiem, UIUC, Spring 2015
Convolutional Neural Networks for Visual Recognition by Fei-Fei Li, Andrej Karpathy, Justin Johnson, and Serena Yeung, Stanford University, Spring 2017

Tutorials:

Matlab tutorial
Linear algebra review by Fei-Fei Li
Brief machine learning intro by Aditya Khosla and Joseph Lim
Resources list (including code and data, tutorials, and other related courses) compiled by Devi Parikh

Some computer vision datasets:

Microsoft COCO (Common Objects in Context) (object recognition, segmentation, image description)
ImageNet (object recognition)
SUN Database (scenes)
Caltech-UCSD Birds 200 (fine-grained object recognition)
MSRC Annotations (active learning)
Animals with Attributes (attribute-based recognition)
a-Pascal + a-Yahoo (attribute-based recognition)
Shoes (attribute-based search)
INRIA Movie Actions (action recognition)
ADL (ego-centric action recognition)
Action Quality (evaluating action quality)
CarDb Historical Cars (style classification of cars)
Recognizing Image Style (photographic style classification)
Judd gaze (visual saliency prediction)
Visual Persuasion (predicting subtle messages in images)
Advertisements: Images and Videos (understanding what the ad prompts the viewer to do and why)
VQA (visual question-answering)
Recognition datasets list compiled by Kristen Grauman
Human activity datasets list compiled by Chao-Yeh Chen

Some code and frameworks of interest:

LIBSVM (by Chih-Chung Chang and Chih-Jen Lin)
SVM Light (by Thorsten Joachims)
VLFeat (feature extraction, tutorials and more, by Andrea Vedaldi)
TensorFlow (deep learning framework by Google)
Caffe (deep learning framework by Yangqing Jia et al.)
PyTorch (another popular deep learning framework)
Keras (deep learning library)