Data Science Research Center - Self-Supervised Multimodal Learning from Videos

Jun. 16, 2021

13:00-14:00

Speaker: Dr. Rami Ben-Ari

Title: Self-Supervised Multimodal Learning from Videos

Abstract:

In this talk, I will touch on two of my previous works that address multimodal learning. I'll begin with a work that we started back in 2019, in which we tried to learn object detection from instructional videos on YouTube without a single manual annotation. We called this method self-supervised object detection. This work yielded some interesting results that I'll share.

In follow-up work, we tackled the problem of self-supervised representation learning of video snippets, again without any labeling. Previous work showed that this type of self-supervised training can significantly boost results on downstream tasks using the limited annotated data of the specific task. We specifically addressed the noisy nature of the labels in self-supervised learning and came up with a new approach that handles label noise and ultimately improves performance.

Bio:

Rami Ben-Ari is a Senior Research Scientist at Vision AI Research, a newly established research institute (currently in stealth mode). He has held several senior research positions in industry, including technical lead for computer vision and deep learning at Video-AI technologies and research scientist in medical imaging at IBM Research. He also serves as an adjunct professor at Bar-Ilan University, in the faculty of electrical and computer engineering.

Rami holds a BSc and an MSc in Aerospace Engineering from the Technion and a Ph.D. in Applied Mathematics from Tel-Aviv University. His research interests cover various topics in computer vision and deep learning, including medical image analysis and video understanding.
