Visual Learning and Recognition (Deep Learning for Computer Vision)

● Trained multi-label image classification models on the FashionMNIST and PASCAL VOC 2007 datasets.
● Worked on weakly supervised object detection using a Robust AlexNet backbone on the PASCAL VOC dataset and obtained mAP comparable to that of supervised networks.
● Deployed GAN architectures, including LSGAN and WGAN-GP, on the CUB 2011 dataset to generate realistic images of birds.
● Worked on open-ended Visual Question Answering on the MSCOCO VQA dataset using a simple Bag of Words baseline with a GoogLeNet feature extractor, as well as Co-attention networks.
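
The classification and detection projects above are reported in terms of mAP. As a rough illustration of what that metric measures, here is a minimal sketch in plain Python. It is not the evaluation code used in the course: the function names are hypothetical, and it computes the simple non-interpolated average precision, whereas the official PASCAL VOC 2007 protocol uses an 11-point interpolated variant.

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one class (an assumed, simplified variant).

    scores: predicted confidence per image; labels: 1 if the class is present, else 0.
    """
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each positive example.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, label_rows):
    """Mean of per-class AP; rows are images, columns are classes."""
    num_classes = len(score_rows[0])
    aps = []
    for c in range(num_classes):
        class_scores = [row[c] for row in score_rows]
        class_labels = [row[c] for row in label_rows]
        aps.append(average_precision(class_scores, class_labels))
    return sum(aps) / num_classes
```

For a multi-label classifier, `score_rows` would hold one row of per-class confidences per image and `label_rows` the matching 0/1 ground truth. Detection mAP additionally requires matching predicted boxes to ground-truth boxes by overlap, which this sketch leaves out.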

Learning Outcomes:

Visualizing and Understanding Neural Nets.
Basics of Image Segmentation, Object Detection and 3D Image Understanding.
Generative Models - GANs, Autoencoders and VAEs.
Few-Shot and Transfer Learning.
Action Recognition and Videos.
Combining Vision and Language models - VQA.

Programming Language and Framework: Python, PyTorch

Vishnu M H
