Research Blog: Google at ICLR 2018

This week, Vancouver, Canada hosts the 6th International Conference on Learning Representations (ICLR 2018), a conference focused on how one can learn meaningful and useful representations of data for machine learning. ICLR includes conference and workshop tracks, with invited talks along with oral and poster presentations of some of the latest research on deep learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization.

At the forefront of innovation in cutting-edge technology in neural networks and deep learning, Google focuses on both theory and application, developing learning approaches to understand and generalize. As Platinum Sponsor of ICLR 2018, Google will have a strong presence with over 130 researchers attending, contributing to and learning from the broader academic research community by presenting papers and posters, in addition to participating on organizing committees and in workshops.

If you are attending ICLR 2018, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people. You can also learn more about our research being presented at ICLR 2018 in the list below.

Senior Program Chairs include:
Tara Sainath

Steering Committee includes:
Hugo Larochelle

Oral Contributions
Wasserstein Auto-Encoders
Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schölkopf

On the Convergence of Adam and Beyond (Best Paper Award)
Sashank J. Reddi, Satyen Kale, Sanjiv Kumar

Ask the Right Questions: Active Question Reformulation with Reinforcement Learning
Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

Beyond Word Importance: Contextual Decompositions to Extract Interactions from LSTMs
W. James Murdoch, Peter J. Liu, Bin Yu

Conference Posters
Boosting the Actor with Dual Critic
Bo Dai, Albert Shaw, Niao He, Lihong Li, Le Song

MaskGAN: Better Text Generation via Filling in the _______
William Fedus, Ian Goodfellow, Andrew M. Dai

Scalable Private Learning with PATE
Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Úlfar Erlingsson

Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Yujun Lin, Song Han, Huizi Mao, Yu Wang, William J. Dally

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches
Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, Roger Grosse

Latent Constraints: Learning to Generate Conditionally from Unconditional Generative Models
Adam Roberts, Jesse Engel, Matt Hoffman

Multi-Mention Learning for Reading Comprehension with Neural Cascades
Swabha Swayamdipta, Ankur P. Parikh, Tom Kwiatkowski

QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Adams Wei Yu, David Dohan, Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, Quoc V. Le

Sensitivity and Generalization in Neural Networks: An Empirical Study
Roman Novak, Yasaman Bahri, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein

Action-dependent Control Variates for Policy Optimization via Stein Identity
Hao Liu, Yihao Feng, Yi Mao, Dengyong Zhou, Jian Peng, Qiang Liu

An Efficient Framework for Learning Sentence Representations
Lajanugen Logeswaran, Honglak Lee

Fidelity-Weighted Learning
Mostafa Dehghani, Arash Mehrjou, Stephan Gouws, Jaap Kamps, Bernhard Schölkopf

Generating Wikipedia by Summarizing Long Sequences
Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, Noam Shazeer

Matrix Capsules with EM Routing
Geoffrey Hinton, Sara Sabour, Nicholas Frosst

Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Sergey Levine, Shixiang Gu, Murtaza Dalal, Vitchyr Pong

Deep Neural Networks as Gaussian Processes
Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel L. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein

Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence at Every Step
William Fedus, Mihaela Rosca, Balaji Lakshminarayanan, Andrew M. Dai, Shakir Mohamed, Ian Goodfellow

Initialization Matters: Orthogonal Predictive State Recurrent Neural Networks
Krzysztof Choromanski, Carlton Downey, Byron Boots

Learning Differentially Private Recurrent Language Models
H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang

Learning Latent Permutations with Gumbel-Sinkhorn Networks
Gonzalo Mena, David Belanger, Scott Linderman, Jasper Snoek

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine

Meta-Learning for Semi-Supervised Few-Shot Classification
Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Josh Tenenbaum, Hugo Larochelle, Richard Zemel

Thermometer Encoding: One Hot Way to Resist Adversarial Examples
Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow

A Hierarchical Model for Device Placement
Azalia Mirhoseini, Anna Goldie, Hieu Pham, Benoit Steiner, Quoc V. Le, Jeff Dean

Monotonic Chunkwise Attention
Chung-Cheng Chiu, Colin Raffel

Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples
Kimin Lee, Honglak Lee, Kibok Lee, Jinwoo Shin

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans

Ensemble Adversarial Training: Attacks and Defenses
Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel

Stochastic Variational Video Prediction
Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy Campbell, Sergey Levine

Depthwise Separable Convolutions for Neural Machine Translation
Lukasz Kaiser, Aidan N. Gomez, Francois Chollet

Don’t Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le

Generative Models of Visually Grounded Imagination
Ramakrishna Vedantam, Ian Fischer, Jonathan Huang, Kevin Murphy

Large Scale Distributed Neural Network Training through Online Distillation
Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton

Learning a Neural Response Metric for Retinal Prosthesis
Nishal P. Shah, Sasidhar Madugula, Alan Litke, Alexander Sher, EJ Chichilnisky, Yoram Singer, Jonathon Shlens

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks
Shankar Krishnan, Ying Xiao, Rif A. Saurous

A Neural Representation of Sketch Drawings
David Ha, Douglas Eck

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Carlos Riquelme, George Tucker, Jasper Snoek

Generalizing Hamiltonian Monte Carlo with Neural Networks
Daniel Levy, Matthew D. Hoffman, Jascha Sohl-Dickstein

Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis
Rudy Bunel, Matthew Hausknecht, Jacob Devlin, Rishabh Singh, Pushmeet Kohli

On the Discrimination-Generalization Tradeoff in GANs
Pengchuan Zhang, Qiang Liu, Dengyong Zhou, Tao Xu, Xiaodong He

A Bayesian Perspective on Generalization and Stochastic Gradient Descent
Samuel L. Smith, Quoc V. Le

Learning how to Explain Neural Networks: PatternNet and PatternAttribution
Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne

Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks
Víctor Campos, Brendan Jou, Xavier Giró-i-Nieto, Jordi Torres, Shih-Fu Chang

Towards Neural Phrase-based Machine Translation
Po-Sen Huang, Chong Wang, Sitao Huang, Dengyong Zhou, Li Deng

Unsupervised Cipher Cracking Using Discrete GANs
Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser

Variational Image Compression With A Scale Hyperprior
Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, Nick Johnston

Workshop Posters
Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values
Julius Adebayo, Justin Gilmer, Ian Goodfellow, Been Kim

Stochastic Gradient Langevin Dynamics that Exploit Neural Network Structure
Zachary Nado, Jasper Snoek, Bowen Xu, Roger Grosse, David Duvenaud, James Martens

Towards Mixed-initiative generation of multi-channel sequential structure
Anna Huang, Sherol Chen, Mark J. Nelson, Douglas Eck

Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?
Maithra Raghu, Alex Irpan, Jacob Andreas, Robert Kleinberg, Quoc V. Le, Jon Kleinberg

GILBO: One Metric to Measure Them All
Alexander Alemi, Ian Fischer

HoME: a Household Multimodal Environment
Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville

Learning to Learn without Labels
Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein

Learning via Social Awareness: Improving Sketch Representations with Facial Feedback
Natasha Jaques, Jesse Engel, David Ha, Fred Bertsch, Rosalind Picard, Douglas Eck

Negative Eigenvalues of the Hessian in Deep Neural Networks
Guillaume Alain, Nicolas Le Roux, Pierre-Antoine Manzagol

Realistic Evaluation of Semi-Supervised Learning Algorithms
Avital Oliver, Augustus Odena, Colin Raffel, Ekin Cubuk, Ian Goodfellow

Winner’s Curse? On Pace, Progress, and Empirical Rigor
D. Sculley, Jasper Snoek, Alex Wiltschko, Ali Rahimi

Meta-Learning for Batch Mode Active Learning
Sachin Ravi, Hugo Larochelle

To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression
Michael Zhu, Suyog Gupta

Adversarial Spheres
Justin Gilmer, Luke Metz, Fartash Faghri, Sam Schoenholz, Maithra Raghu, Martin Wattenberg, Ian Goodfellow

Clustering Meets Implicit Generative Models
Francesco Locatello, Damien Vincent, Ilya Tolstikhin, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks
Vitalii Zhelezniak, Dan Busbridge, April Shen, Samuel L. Smith, Nils Y. Hammerla

Learning Longer-term Dependencies in RNNs with Auxiliary Losses
Trieu Trinh, Quoc Le, Andrew Dai, Thang Luong

Graph Partition Neural Networks for Semi-Supervised Classification
Alexander Gaunt, Danny Tarlow, Marc Brockschmidt, Raquel Urtasun, Renjie Liao, Richard Zemel

Searching for Activation Functions
Prajit Ramachandran, Barret Zoph, Quoc Le

Time-Dependent Representation for Neural Event Sequence Prediction
Yang Li, Nan Du, Samy Bengio

Faster Discovery of Neural Architectures by Searching for Paths in a Large Model
Hieu Pham, Melody Guan, Barret Zoph, Quoc V. Le, Jeff Dean

Intriguing Properties of Adversarial Examples
Ekin Dogus Cubuk, Barret Zoph, Sam Schoenholz, Quoc Le

PPP-Net: Platform-aware Progressive Search for Pareto-optimal Neural Architectures
Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun

The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine

Learning to Organize Knowledge with N-Gram Machines
Fan Yang, Jiazhong Nie, William W. Cohen, Ni Lao

Online variance-reducing optimization
Nicolas Le Roux, Reza Babanezhad, Pierre-Antoine Manzagol