Visual Learning and Recognition (Deep Learning for Computer Vision)
● Trained multi-label image classification models on the FashionMNIST and PASCAL VOC 2007 datasets.
● Worked on weakly supervised object detection using a Robust AlexNet backbone on the PASCAL VOC dataset, obtaining mAP comparable to that of supervised networks.
● Deployed GAN architectures, including LSGAN and WGAN-GP, on the CUB-200-2011 dataset to generate realistic images of birds.
● Worked on open-ended Visual Question Answering on the MSCOCO VQA dataset using a simple bag-of-words baseline with a GoogLeNet feature extractor, and co-attention networks.
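Illustrative sketch (not the project code): the question side of a bag-of-words VQA baseline can be pictured as a word-count vector over a fixed vocabulary. The helper names tokenize, build_vocab and bag_of_words and the two sample questions are hypothetical, chosen only for this example. In such a baseline the vector would typically be combined with image features from the CNN before answer classification; that step is omitted here.

```python
import re
from collections import Counter


def tokenize(question):
    """Lowercase the question and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())


def build_vocab(questions, min_count=1):
    """Map each word seen at least min_count times to a fixed index."""
    counts = Counter(tok for q in questions for tok in tokenize(q))
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(words)}


def bag_of_words(question, vocab):
    """Return a word-count vector; out-of-vocabulary words are ignored."""
    vec = [0] * len(vocab)
    for tok in tokenize(question):
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] += 1
    return vec


# Hypothetical sample questions, used only to build a tiny vocabulary.
questions = ["What color is the bird?", "Is the bird flying?"]
vocab = build_vocab(questions)
vec = bag_of_words("What is the bird?", vocab)
```

Word order is discarded by design, which is what makes this a simple baseline compared with attention-based models.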
Learning Outcomes:
- Visualizing and Understanding Neural Nets.
- Basics of Image Segmentation, Object Detection and 3D Image Understanding.
- Generative Models - GANs, Autoencoders and VAEs.
- Few-Shot and Transfer Learning.
- Action Recognition and Videos.
- Combining Vision and Language models - VQA.
Programming Language and Framework: Python, PyTorch