I am a PhD student at Michigan State University in the Department of Computer Science and Engineering, working in the Computer Vision Lab advised by Dr. Xiaoming Liu.
My primary interests are deep learning, computer vision, and autonomous driving. I focus on object recognition in urban scenes, which involves work in pedestrian detection, object detection, semantic forecasting, and efficient binary-weight CNNs, among other topics.
Selected projects conducted as part of my research within the Michigan State University CSE doctoral program.
We propose an autoregressive cascaded phase network designed to progressively improve precision. The framework utilizes a lightweight, stackable de-encoder module with inner-lateral convolutions that encourage features to both refine and diversify. We explicitly encourage increasing levels of precision by assigning strict labeling policies to each consecutive phase, such that early phases develop features focused primarily on achieving high recall and later phases on accurate precision. In consequence, the final feature maps form peakier radial gradients emanating from the centroids of unique pedestrians. Collectively, our framework achieves new state-of-the-art performance on challenging settings of the Caltech benchmark.
The preprint is available to read here.
In this work, we decompose the challenging semantic forecasting task into two subtasks: current-frame segmentation and future optical flow prediction. We build an efficient, effective, low-overhead model with three core components: a flow prediction network, a feature-flow aggregation LSTM, and an end-to-end learnable warp layer. Our proposed method achieves state-of-the-art accuracy on short-term and moving-objects semantic forecasting while simultaneously reducing model parameters by up to 95% and improving efficiency by more than 40x.
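The role of the warp layer can be illustrated with a minimal nearest-neighbor sketch (the function and variable names here are illustrative, not the model's actual implementation): each future-frame pixel samples the current segmentation at the location indicated by the predicted flow.

```python
def warp_segmentation(seg, flow):
    """Backward-warp a segmentation map using a predicted flow field.

    seg:  H x W list of lists of class ids for the current frame.
    flow: H x W list of lists of (dx, dy) flow vectors, current -> future.
    Each future pixel looks up the class at its flow-displaced source
    location (nearest neighbor, clamped at image borders).
    """
    h, w = len(seg), len(seg[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(int(round(x - dx)), 0), w - 1)
            sy = min(max(int(round(y - dy)), 0), h - 1)
            out[y][x] = seg[sy][sx]
    return out

# Example: a vertical object at column 1 moving right by 2 pixels.
seg = [[0, 5, 0, 0] for _ in range(4)]
flow = [[(2.0, 0.0)] * 4 for _ in range(4)]
future = warp_segmentation(seg, flow)
```

A trainable version would use differentiable bilinear sampling so gradients can flow through the warp end-to-end; nearest neighbor is shown here only for clarity.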
Publication is available to read here.
In this work, we explore how semantic segmentation can be used to boost pedestrian detection accuracy while having little to no impact on network efficiency.
We propose a segmentation infusion network to enable joint supervision on semantic segmentation and pedestrian detection.
Find out more at the project website, publication, and source code.
A monocular urban pedestrian detection system featuring a cascade of networks and a 30-class semantic segmentation side task, jointly trained using Caltech and Cityscapes.
The classes utilized for segmentation are based on classes commonly seen in urban street scenes, including roads, buildings, vehicles, pedestrians, traffic signs, and trees. Video was captured on the Michigan State University campus in conjunction with the CANVAS autonomous driving effort.
Selected projects from various coursework at Michigan State University as part of the CSE doctoral program.
We implemented a GAN loss to generate residual images that augment urban driving scenes with two primary goals: photorealism and maximum confusion for the pedestrian detector. In essence, we aim to fool and attack a state-of-the-art detector. We further investigate the effects of using the synthetic data for alternating training, thus pitting the GAN against the detector. We find that state-of-the-art systems are not naturally robust to such attacks: the miss rate quadruples (4x) unless the network is trained directly on both the real and synthetic image data.
We build a convolutional word-embedding network using a series of fully-connected layers followed by two LSTMs, and utilize attribute labeling as an auxiliary task, resulting in a natural language person retrieval system functional on in-the-wild traffic surveillance data (manually collected at MSU). The system works as a proof of concept and, in our experience, performs only modestly due to apparently poor generalization between the training and test data domains. We merge the CUHK person description dataset with the PETA pedestrian attribute dataset for training the respective loss functions.
We build a proof-of-concept project for generating synthetic pedestrians. We use the MakeHuman tool to generate synthetic 3D pedestrian models with highly variable pose, shape, race, hair, clothing, etc. In addition to the synthetic models, we use a simple cut-and-paste method based on pixel-level segmentation masks to extract real pedestrians. Finally, we place synthetic or real pedestrians onto arbitrary background images, learning depth cues to place them at proper scales on sidewalk or road regions only. In reflection, this method, while a useful proof of concept, could be drastically improved by using a GAN to enforce realistic shadows, color consistency, and borders.
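The depth-aware scaling follows standard pinhole geometry: an object of real height H at depth Z projects to f·H/Z pixels for a focal length of f pixels. A minimal sketch (all names and values here are illustrative, not the project's actual code):

```python
def pedestrian_pixel_height(real_height_m, depth_m, focal_px):
    """Pinhole projection: on-image pixel height of an object of real
    height real_height_m standing at depth depth_m from the camera,
    for a camera with focal length focal_px (in pixels)."""
    return focal_px * real_height_m / depth_m

# A 1.75 m pedestrian at 10 m depth, with a 700 px focal length,
# should span 700 * 1.75 / 10 = 122.5 pixels in the image.
h = pedestrian_pixel_height(1.75, 10.0, 700.0)
```

Given a depth estimate at each candidate placement location, this relation fixes the paste scale so inserted pedestrians appear plausibly sized.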
Selected computer science projects from recent years, conducted for little to no reason.
The project contains a large built-in collection of profane words. Each of these words was run through the English dictionary to generate a list of potential false positives, which are then checked against before the filtering finishes.
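The false-positive idea can be sketched as follows (in Python for brevity, although blasphemy.js itself is JavaScript; all names here are illustrative): dictionary words that merely contain a profane substring are whitelisted so they survive filtering.

```python
def build_false_positives(profane_words, dictionary):
    """Collect dictionary words that merely contain a profane word as a
    substring (e.g. 'class' contains 'ass'); these are whitelisted."""
    return {w for bad in profane_words
            for w in dictionary if bad in w and w != bad}

def filter_text(text, profane_words, whitelist):
    """Censor tokens containing profane substrings unless whitelisted."""
    out = []
    for token in text.split():
        bare = token.lower().strip(".,!?")
        if bare in whitelist:
            out.append(token)          # innocent word, keep as-is
        elif any(bad in bare for bad in profane_words):
            out.append("*" * len(token))  # censor the whole token
        else:
            out.append(token)
    return " ".join(out)

dictionary = {"class", "classic", "assistant"}
profane = {"ass"}
fp = build_false_positives(profane, dictionary)
censored = filter_text("the class ass", profane, fp)
```

Precomputing the whitelist once at load time keeps the per-message filtering pass cheap.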
Simply import blasphemy.js and start having fun!
Multipi is a simple Unity game based on an imaginary operation in mathematics called multipiecation.
The only objective of the game is to throw pies at other pies.
When pies collide they multiply in size or quantity depending on the circumstance.
An in-game man creeps around eating pies, eventually multiplying himself. Score is kept in terms of pi and pie.
The game is playable using Unity's web player below.
Uses a combination of basic authentication, reliable REST calls, and unfortunate web scraping. Documentation is available.
Created as a way to abstract the functions used in the KUMobile project (listed below) so that others could make use of them as well. The latest build is available through the RawGit CDN service here.
The Oakland Scheduler is a tool created specifically for Oakland University in order to make choosing classes easier.
Features: visual representation, filter by day, filter by sections, sort by time, export schedule, and print schedule.
Created for fun, using old quick-and-dirty algorithms left over from very early prototypes of KUMobile.
Simulates up to two 3D Lightsabers from the user's webcam feed in real-time.
The application uses OpenCV to track target props and OpenGL to draw the 3D Lightsabers on top. Flicker and blur effects are also implemented.
This project was lots of fun, and the hard work was very rewarding! Completed as part of a team for a virtual reality course at Kettering University.