TreeHacks 2019: Best computer vision prize and IBM’s #1 favorite health hack. Upscaling low-resolution videos with a novel super-resolution GAN architecture.
- Friday, Feb 15, 10:00pm
- Plan project
- Make timeline
- Saturday, Feb 16, 1:00am
- Download and pre-process data (Kian)
- SRGAN (Tyler, Grant)
- Research realtime, low res video (Priyank)
- Saturday, Feb 16, 1:00pm
- SRGAN implementation and integration with YouTube dataset
- RCAN implementation
- Saturday, Feb 16, 5:00pm
- Have an upscaled image generated from a CNN
- Saturday, Feb 16, 11:00pm
- Have an upscaled image generated from a GAN
- Sunday, Feb 17, 3:00am
- Finish devpost
- Upscale a whole video
- Sunday, Feb 17, 8:00am
- Write quick frontend
- Optimize video data upscaling
- Sunday, Feb 17, 10:00am
- Finish data pipeline so user can upscale any YouTube video of their choosing in realtime.
- Hacking stops!
- Clean up GitHub
- Clean up devpost
- Create experimental pipeline
- Write arXiv article
- Figure out what graphs / sample videos to use
- Market research (Disney/ESPN, GoPro, USC ITS, Axon, IBM, Waymo, Uber ATG, Defense contracts)
- Are we upgrading image quality?
- Are we upgrading video quality?
- Are we upgrading live video streams?
- Build out attention model
- Build out time-series, 3D convolution over last 10 frames
- Make it generalizable: take input resolution and output resolution as parameters, backed by a library of models
- Page write up on what we created
- People from these companies were excited and we won this prize. We want to figure out whether this would be interesting to people, and I'd love to get your advice.
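The "3D convolution over last 10 frames" item above needs its input grouped into overlapping clips. A minimal sketch of that grouping step, assuming frames arrive as NumPy arrays of equal shape; the helper name `frame_windows` and the synthetic video are hypothetical and not part of the project code:

```python
from collections import deque

import numpy as np

WINDOW = 10  # number of past frames, taken from the to-do item


def frame_windows(frames, window=WINDOW):
    """Yield arrays of shape (window, H, W, C) holding the most recent frames."""
    buffer = deque(maxlen=window)  # the oldest frame drops out automatically
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == window:
            # Stack along a new leading time axis so a 3D convolution
            # can slide over (time, height, width).
            yield np.stack(list(buffer), axis=0)


# Synthetic stand-in video: 12 RGB frames of 4x6 pixels, frame i filled with i.
video = [np.full((4, 6, 3), i, dtype=np.float32) for i in range(12)]
clips = list(frame_windows(video))
print(len(clips), clips[0].shape)
```

Each clip overlaps the previous one by nine frames, so a live stream would produce one clip per incoming frame once the buffer is full.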
- Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
- They set a new state of the art for image SR with high upscaling factors (4×), as measured by PSNR and structural similarity (SSIM), with a 16-block-deep ResNet (SRResNet) optimized for MSE.
- They propose SRGAN which is a GAN-based network optimized for a new perceptual loss. Here they replace the MSE-based content loss with a loss calculated on feature maps of the VGG network, which are more invariant to changes in pixel space.
- They confirm with an extensive mean opinion score (MOS) test on images from three public benchmark datasets that SRGAN is the new state of the art, by a large margin, for the estimation of photo-realistic SR images with high upscaling factors (4×).
- Deep Residual Learning for Image Recognition
- Efficient Super Resolution For Large-Scale Images Using Attentional GAN
- Very Deep Convolutional Networks for Large-Scale Image Recognition
- Generative Adversarial Nets
- Self-Attention Generative Adversarial Networks
- Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding
- tempoGAN: A Temporally Coherent, Volumetric GAN for Super-resolution Fluid Flow
- Are GANs Created Equal? A Large-Scale Study
- A study of GAN evaluation metrics
- We provide a fair and comprehensive comparison of the state-of-the-art GANs, and empirically demonstrate that nearly all of them can reach similar values of FID, given a high enough computational budget.
- We provide strong empirical evidence that to compare GANs it is necessary to report a summary of distribution of results, rather than the best result achieved, due to the randomness of the optimization process and model instability.
- Going deeper with convolutions
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
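The SRGAN notes above refer to MSE and PSNR as the pixel-space quality measures. A minimal sketch of how the two relate, assuming 8-bit images (peak value 255); the function names and the toy image are illustrative only and do not come from the project code:

```python
import numpy as np


def mse(reference, estimate):
    """Mean squared error between two images of the same shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((ref - est) ** 2))


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))


# Toy example: an 8x8 black image with a single pixel off by 16.
reference = np.zeros((8, 8), dtype=np.uint8)
estimate = reference.copy()
estimate[0, 0] = 16

print(mse(reference, estimate), psnr(reference, estimate))
```

Because PSNR is a monotone function of MSE, a network trained on MSE is directly optimized for PSNR, which is why the SRGAN paper adds a mean opinion score test to judge perceptual quality separately.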