Google Research Blog
The latest news from Research at Google
Get moving with the new Motion Stills
Thursday, December 15, 2016
Posted by Matthias Grundmann and Ken Conley, Machine Perception
Last June, we released Motion Stills, an iOS app that uses our video stabilization technology to create easily shareable GIFs from Apple Live Photos. Since then, we integrated Motion Stills into Google Photos for iOS and thought of ways to improve it, taking into account your ideas for new features. Today, we are happy to announce a major new update to the Motion Stills app that will help you create even more beautiful videos and fun GIFs using motion-tracked text overlays, super-resolution videos, and automatic cinemagraphs.
Motion Text
We’ve added motion text so you can create moving text effects, similar to what you might see in movies and TV shows, directly on your phone. With Motion Text, you can easily position text anywhere over your video to get the exact result you want. It only takes a second to initialize while you type, and tracks at 1000 FPS throughout the whole Live Photo, so the process feels instantaneous.
To make this possible, we took the motion tracking technology that we run on YouTube servers for “Privacy Blur”, and made it run even faster on your device. How? We first create motion metadata for your video by leveraging machine learning to classify foreground/background features as well as to model temporally coherent camera motion. We then take this metadata and use it as input to an algorithm that can track individual objects while discriminating them from others. The algorithm models each object’s state, which includes its motion in space, an implicit appearance model (described as a set of its moving parts), and its centroid and extent, as shown in the figure below.
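To make that state concrete, here is a minimal sketch of what a tracked object might look like; the field names and the simple equal-weight update are illustrative assumptions, not the production tracker:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class TrackedObject:
    """Illustrative per-object state: motion in space, an implicit
    appearance model (a set of moving parts), and centroid/extent."""
    centroid: np.ndarray                        # (x, y) center of the object
    extent: np.ndarray                          # (width, height) of the object
    velocity: np.ndarray                        # motion between frames
    parts: list = field(default_factory=list)   # appearance model: moving parts

    def step(self, part_positions: np.ndarray, part_motions: np.ndarray):
        # Aggregate the parts' motions (weighted equally here, for
        # simplicity) to update the object's velocity and centroid ...
        self.velocity = part_motions.mean(axis=0)
        self.centroid = self.centroid + self.velocity
        # ... and refit the extent to the spread of the tracked parts.
        self.extent = part_positions.max(axis=0) - part_positions.min(axis=0)
```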
Enhance! your videos with better detail and loops
Last month, we published the details of our state-of-the-art RAISR technology, which employs machine learning to create super-resolution detail in images. This technology is now available in Motion Stills, automatically sharpening every video you export.
We are also going beyond stabilization to bring you fully automatic cinemagraphs. After freezing the background into a still photo, we analyze our result to optimize for the perfect loop transition. By considering a range of start and end frames, we build a matrix of transition scores between frame pairs. A significant minimum in this matrix reflects the perfect transition, resulting in an endless loop of motion stillness.
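A minimal sketch of that loop search, assuming the frames are already stabilized and stored in a NumPy array; the mean-squared-difference score here stands in for whatever transition metric the app actually uses:

```python
import numpy as np

def find_loop(frames: np.ndarray, min_length: int = 10):
    """Score every (start, end) frame pair by how closely the end frame
    matches the start frame, then return the pair with the lowest score,
    i.e. the smoothest loop transition."""
    n = frames.shape[0]
    flat = frames.reshape(n, -1).astype(np.float64)
    scores = np.full((n, n), np.inf)
    for start in range(n - min_length):
        for end in range(start + min_length, n):
            scores[start, end] = np.mean((flat[end] - flat[start]) ** 2)
    start, end = np.unravel_index(np.argmin(scores), scores.shape)
    return start, end
```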
Continuing to improve the experience
Thanks to your feedback, we’ve additionally rebuilt our navigation and added more tutorials. We’ve also added Apple’s 3D Touch to let you “peek and pop” clips in your stream and movie tray. Lots more is coming to address your top requests, so please download the new release of Motion Stills and keep sending us feedback with #motionstills on your favorite social media.
App Discovery with Google Play, Part 2: Personalized Recommendations with Related Apps
Wednesday, December 14, 2016
Posted by Ananth Balashankar & Levent Koc, Software Engineers, and Norberto Guimaraes, Product Manager
In Part 1 of this series on app discovery, we discussed using machine learning to gain a deeper understanding of the topics associated with an app, in order to provide a better search and discovery experience on the Google Play Apps Store. In this post, we discuss a deep learning framework to provide personalized recommendations to users based on their previous app downloads and the context in which they are used.
Providing useful and relevant app recommendations to visitors of the Google Play Apps Store is a key goal of our apps discovery team. An understanding of the topics associated with an app, however, is only one part of creating a system that best serves the user. In order to create a better overall experience, one must also take into account the tastes of the user and provide personalized recommendations. If one didn’t, the “You might also like” recommendation would look the same for everyone!
Discovering these nuances requires both an understanding of what an app does and the context of the app with respect to the user. For example, to an avid sci-fi gamer, similar game recommendations may be of interest, but if a user installs a fitness app, recommending a health recipe app may be more relevant than five more fitness apps. As users may be more interested in downloading an app or game that complements one they already have installed, we provide recommendations based on the relatedness of apps to each other (“You might also like”), in addition to providing recommendations based on the topic associated with an app (“Similar apps”).
Suggestions of similar apps and apps that you also might like shown both before making an install decision (left) and while the current install is in progress (right).
One particularly strong contextual signal is app relatedness, based on previous installs and search query clicks. As an example, a user who has searched for and plays a lot of graphics-heavy games likely has a preference for apps that are also graphically intense rather than apps with simpler graphics. So, when this user installs a car racing game, the “You might also like” suggestions include apps that relate to the “seed” app (because they are graphically intense racing games), ranked higher than racing apps with simpler graphics. This allows for a finer level of personalization, where the characteristics of the apps are matched with the preferences of the user.
To incorporate this app relatedness in our recommendations, we take a two-pronged approach: (a) offline candidate generation, i.e., generating the potential related apps that other users have downloaded in addition to the app in question, and (b) online personalized re-ranking, where we re-rank these candidates using a personalized ML model.
Offline Candidate Generation
The problem of finding related apps can be formulated as a nearest neighbor search problem. Given an app X, we want to find the k nearest apps. In the case of “You might also like”, a naive approach would be one based on counting: if many people installed both apps X and Y, then app Y would be used as a candidate for seed app X. However, this approach is intractable, as it is difficult to learn and generalize effectively in the huge problem space. Given that there are over a million apps on Google Play, the total number of possible app pairs is over ~10¹².
To solve this, we trained a deep neural network to predict the next app installed by the user given their previous installs. Output embeddings at the final layer of this deep neural network generally represent the types of apps a given user has installed. We then apply the nearest neighbor algorithm to find related apps for a given seed app in the trained embedding space. Thus, we perform dimensionality reduction by representing apps using embeddings to help prune the space of potential candidates.
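As a rough sketch of this candidate-generation step (the `embeddings` matrix stands in for the trained network's final-layer representations, and cosine similarity is one reasonable choice of distance):

```python
import numpy as np

def related_candidates(seed_app: str, app_ids: list,
                       embeddings: np.ndarray, k: int = 20):
    """Return the k nearest apps to `seed_app` in the learned embedding
    space. `embeddings[i]` is the vector for `app_ids[i]`."""
    seed_vec = embeddings[app_ids.index(seed_app)]
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(seed_vec)
    sims = embeddings @ seed_vec / np.maximum(norms, 1e-12)
    ranked = np.argsort(-sims)  # most similar first
    return [app_ids[i] for i in ranked if app_ids[i] != seed_app][:k]
```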
Online Personalized Re-ranking
The candidates generated in the previous step represent relatedness along multiple dimensions. The objective is to assign scores to the candidates so they can be re-ranked in a personalized way, in order to provide an experience that is crafted to the user’s overall interests and yet maintains relevance for the user installing a given app. In order to do this, we take the characteristics of the app candidates as input to a separate deep neural network, which is then trained in real time with user-specific context features (region, language, app store search queries, etc.) to predict the likelihood of a related app being specifically relevant to the user.
Architecture for personalized related apps
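A sketch of how that re-ranking step might be wired together; `relevance_model` stands in for the trained network described above, and its interface is an illustrative assumption:

```python
def rerank_candidates(candidates, user_context, relevance_model):
    """Score each candidate app with a model that sees both the app's
    characteristics and the user's context (region, language, recent
    search queries, ...), then sort by predicted relevance."""
    scored = [
        # relevance_model is hypothetical: any scorer taking
        # (app features, user context) and returning a probability.
        (relevance_model.predict(app_features, user_context), app_features)
        for app_features in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [app for _, app in scored]
```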
One of the takeaways from this work is that re-ranking content, like related apps, is one of the critical ways of improving app discovery in the store, and can bring great value to the user without hurting perceived relevance. Compared to the control (where no re-ranking was done), we saw a 20% increase in the app install rate from the “You might also like” suggestions, with no user-perceivable change in latency.
In Part 3 of this series, we will discuss how we employ machine learning to defend against bad actors who try to manipulate the signals we use for search and personalization.
Acknowledgements
This work was done within the Google Play team in collaboration with Halit Erdogan, Mark Taylor, Michael Watson, Huazhong Ning, Stan Bileschi, John Kraemer, and Chuan Yu Foo.
Open sourcing the Embedding Projector: a tool for visualizing high dimensional data
Wednesday, December 07, 2016
Posted by Daniel Smilkov and the Big Picture group
Recent advances in Machine Learning (ML) have shown impressive results, with applications in image recognition, language translation, medical diagnosis, and more. With the widespread adoption of ML systems, it is increasingly important for research scientists to be able to explore how the data is being interpreted by the models. However, one of the main challenges in exploring this data is that it often has hundreds or even thousands of dimensions, requiring special tools to investigate the space.
To enable a more intuitive exploration process, we are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data recently shown as an A.I. Experiment, as part of TensorFlow. We are also releasing a standalone version at projector.tensorflow.org, where users can visualize their high-dimensional data without the need to install and run TensorFlow.
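For example, the standalone projector can load embeddings supplied as tab-separated values, optionally with a metadata file of per-point labels; a minimal sketch (the file names and random data are placeholders):

```python
import numpy as np

# 100 example points in a 64-dimensional space, one row per point.
vectors = np.random.randn(100, 64)
labels = ["item_%d" % i for i in range(100)]

# vectors.tsv: one embedding per line, dimensions separated by tabs.
np.savetxt("vectors.tsv", vectors, delimiter="\t")

# metadata.tsv: one label per line, in the same order as the vectors.
with open("metadata.tsv", "w") as f:
    f.write("\n".join(labels))
```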
Exploring Embeddings
The data needed to train machine learning systems comes in a form that computers don't immediately understand. To translate the things we understand naturally (e.g. words, sounds, or videos) into a form that the algorithms can process, we use embeddings: mathematical vector representations that capture different facets (dimensions) of the data. For example, in this language embedding, similar words are mapped to points that are close to each other.
With the Embedding Projector, you can navigate through views of data in either a 2D or a 3D mode, zooming, rotating, and panning using natural click-and-drag gestures. Below is a figure showing the nearest points to the embedding for the word “important” after training a TensorFlow model using the word2vec tutorial. Clicking on any point (which represents the learned embedding for a given word) in this visualization brings up a list of nearest points and distances, which shows which words the algorithm has learned to be semantically related. This type of interaction represents an important way in which one can explore how an algorithm is performing.
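The nearest-points list boils down to ranking every embedding by its distance to the selected point; a sketch using cosine distance, where the vocabulary and vectors would come from your own trained model:

```python
import numpy as np

def nearest_points(word: str, vocab: list, vectors: np.ndarray, k: int = 8):
    """Return the k words closest to `word` in embedding space, with
    cosine distances, mimicking the projector's nearest-points panel."""
    query = vectors[vocab.index(word)]
    sims = vectors @ query / (
        np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    )
    ranked = np.argsort(-sims)
    # Skip index 0, which is the query word itself (distance 0).
    return [(vocab[i], float(1.0 - sims[i])) for i in ranked[1:k + 1]]
```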
Methods of Dimensionality Reduction
The Embedding Projector offers three commonly used methods of data dimensionality reduction, which allow easier visualization of complex data: PCA, t-SNE, and custom linear projections.
PCA is often effective at exploring the internal structure of the embeddings, revealing the most influential dimensions in the data. t-SNE, on the other hand, is useful for exploring local neighborhoods and finding clusters, allowing developers to make sure that an embedding preserves the meaning in the data (e.g. in the MNIST dataset, seeing that the same digits are clustered together). Finally, custom linear projections can help discover meaningful "directions" in data sets, such as the distinction between a formal and casual tone in a language generation model, which would allow the design of more adaptable ML systems.
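Both of the first two projections are easy to reproduce offline; a sketch with scikit-learn, using random vectors as a stand-in for real learned embeddings:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

embeddings = np.random.randn(500, 128)  # stand-in for learned embeddings

# PCA keeps the directions of greatest variance: good for a first look
# at the global, internal structure of the embedding space.
pca_3d = PCA(n_components=3).fit_transform(embeddings)

# t-SNE preserves local neighborhoods instead: good for finding
# clusters, at the cost of distorting global distances.
tsne_2d = TSNE(n_components=2, perplexity=30).fit_transform(embeddings)
```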
A custom linear projection of the 100 nearest points of "See attachments." onto the "yes" - "yeah" vector (“yes” is right, “yeah” is left) of a corpus of 35k frequently used phrases in emails.
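A custom linear projection like the one above amounts to projecting each point onto the axis between two chosen embeddings; a minimal sketch, where the two endpoint vectors are assumed to be the embeddings of "yes" and "yeah":

```python
import numpy as np

def project_onto_axis(points: np.ndarray, right_vec: np.ndarray,
                      left_vec: np.ndarray) -> np.ndarray:
    """Coordinate of each point along the axis from `left_vec` ("yeah")
    to `right_vec` ("yes"); positive values land on the "yes" side."""
    axis = right_vec - left_vec
    axis = axis / np.linalg.norm(axis)
    midpoint = (right_vec + left_vec) / 2.0
    return (points - midpoint) @ axis
```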
The Embedding Projector website includes a few datasets to play with. We’ve also made it easy for users to publish and share their embeddings with others (just click on the “Publish” button on the left pane). It is our hope that the Embedding Projector will be a useful tool to help the research community explore and refine their ML applications, as well as enable anyone to better understand how ML algorithms interpret data. If you'd like to get the full details on the Embedding Projector, you can read the paper here. Have fun exploring the world of embeddings!
NIPS 2016 & Research at Google
Sunday, December 04, 2016
Posted by Doug Eck, Research Scientist, Google Brain Team
This week, Barcelona hosts the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), a machine learning and computational neuroscience conference that includes invited talks, demonstrations and oral and poster presentations of some of the latest in machine learning research. Google will have a strong presence at NIPS 2016, with over 280 Googlers attending in order to contribute to and learn from the broader academic research community by presenting technical talks and posters, in addition to hosting workshops and tutorials.
Research at Google is at the forefront of innovation in Machine Intelligence, actively exploring virtually all aspects of machine learning, including classical algorithms as well as cutting-edge techniques such as deep learning. Focusing on both theory and application, much of our work on language understanding, speech, translation, visual processing, ranking, and prediction relies on Machine Intelligence. In all of those tasks and many others, we gather large volumes of direct or indirect evidence of relationships of interest, and develop learning approaches to understand and generalize.
If you are attending NIPS 2016, we hope you’ll stop by our booth, chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for billions of people, and see demonstrations of some of the exciting research we pursue. You can also learn more about our work being presented at NIPS 2016 in the list below (Googlers highlighted in blue).
Google is a Platinum Sponsor of NIPS 2016.
Organizing Committee
Executive Board includes: Corinna Cortes, Fernando Pereira
Advisory Board includes: John C. Platt
Area Chairs include: John Shlens, Moritz Hardt, Navdeep Jaitly, Hugo Larochelle, Honglak Lee, Sanjiv Kumar, Gal Chechik
Invited Talk
Dynamic Legged Robots
Marc Raibert
Accepted Papers:
Boosting with Abstention
Corinna Cortes, Giulia DeSalvo, Mehryar Mohri
Community Detection on Evolving Graphs
Stefano Leonardi, Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Mohammad Mahdian
Linear Relaxations for Finding Diverse Elements in Metric Spaces
Aditya Bhaskara, Mehrdad Ghadiri, Vahab Mirrokni, Ola Svensson
Nearly Isometric Embedding by Relaxation
James McQueen, Marina Meila, Dominique Joncas
Optimistic Bandit Convex Optimization
Mehryar Mohri, Scott Yang
Reward Augmented Maximum Likelihood for Neural Structured Prediction
Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans
Stochastic Gradient MCMC with Stale Gradients
Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, Lawrence Carin
Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn*, Ian Goodfellow, Sergey Levine
Using Fast Weights to Attend to the Recent Past
Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Leibo, Catalin Ionescu
A Credit Assignment Compiler for Joint Prediction
Kai-Wei Chang, He He, Stephane Ross, Hal Daumé III
A Neural Transducer
Navdeep Jaitly, Quoc Le, Oriol Vinyals, Ilya Sutskever, David Sussillo, Samy Bengio
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey Hinton
Bi-Objective Online Matching and Submodular Allocations
Hossein Esfandiari, Nitish Korula, Vahab Mirrokni
Combinatorial Energy Learning for Image Segmentation
Jeremy Maitin-Shepard, Viren Jain, Michal Januszewski, Peter Li, Pieter Abbeel
Deep Learning Games
Dale Schuurmans, Martin Zinkevich
DeepMath - Deep Sequence Models for Premise Selection
Geoffrey Irving, Christian Szegedy, Niklas Een, Alexander Alemi, François Chollet, Josef Urban
Density Estimation via Discrepancy Based Adaptive Sequential Partition
Dangna Li, Kun Yang, Wing Wong
Domain Separation Networks
Konstantinos Bousmalis, George Trigeorgis, Nathan Silberman, Dilip Krishnan, Dumitru Erhan
Fast Distributed Submodular Cover: Public-Private Data Summarization
Baharan Mirzasoleiman, Morteza Zadimoghaddam, Amin Karbasi
Satisfying Real-world Goals with Dataset Constraints
Gabriel Goh, Andrew Cotter, Maya Gupta, Michael P Friedlander
Can Active Memory Replace Attention?
Łukasz Kaiser, Samy Bengio
Fast and Flexible Monotonic Functions with Ensembles of Lattices
Kevin Canini, Andy Cotter, Maya Gupta, Mahdi Fard, Jan Pfeifer
Launch and Iterate: Reducing Prediction Churn
Quentin Cormier, Mahdi Fard, Kevin Canini, Maya Gupta
On Mixtures of Markov Chains
Rishi Gupta, Ravi Kumar, Sergei Vassilvitskii
Orthogonal Random Features
Felix Xinnan Yu, Ananda Theertha Suresh, Krzysztof Choromanski, Dan Holtmann-Rice, Sanjiv Kumar
Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee
Structured Prediction Theory Based on Factor Graph Complexity
Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Scott Yang
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely, Roy Frostig, Yoram Singer
Demonstrations
Interactive musical improvisation with Magenta
Adam Roberts, Sageev Oore, Curtis Hawthorne, Douglas Eck
Content-based Related Video Recommendation
Joonseok Lee
Workshops, Tutorials and Symposia
Advances in Approximate Bayesian Inference
Advisory Committee includes: Kevin P. Murphy
Invited Speakers include: Matt Johnson
Panelists include: Ryan Sepassi
Adversarial Training
Accepted Authors: Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein, Augustus Odena, Christopher Olah, Jonathon Shlens
Bayesian Deep Learning
Organizers include: Kevin P. Murphy
Accepted Authors include: Rif A. Saurous, Eugene Brevdo, Kevin Murphy, Eric Jang, Shixiang Gu, Ben Poole
Brains & Bits: Neuroscience Meets Machine Learning
Organizers include: Jascha Sohl-Dickstein
Connectomics II: Opportunities & Challenges for Machine Learning
Organizers include: Viren Jain
Constructive Machine Learning
Invited Speakers include: Douglas Eck
Continual Learning & Deep Networks
Invited Speakers include: Honglak Lee
Deep Learning for Action & Interaction
Organizers include: Sergey Levine
Invited Speakers include: Honglak Lee
Accepted Authors include: Pararth Shah, Dilek Hakkani-Tur, Larry Heck
End-to-end Learning for Speech and Audio Processing
Invited Speakers include: Tara Sainath
Accepted Authors include: Brian Patton, Yannis Agiomyrgiannakis, Michael Terry, Kevin Wilson, Rif A. Saurous, D. Sculley
Extreme Classification: Multi-class & Multi-label Learning in Extremely Large Label Spaces
Organizers include: Samy Bengio
Interpretable Machine Learning for Complex Systems
Invited Speaker: Honglak Lee
Accepted Authors include: Daniel Smilkov, Nikhil Thorat, Charles Nicholson, Emily Reif, Fernanda Viegas, Martin Wattenberg
Large Scale Computer Vision Systems
Organizers include: Gal Chechik
Machine Learning Systems
Invited Speakers include: Jeff Dean
Nonconvex Optimization for Machine Learning: Theory & Practice
Organizers include: Hossein Mobahi
Optimizing the Optimizers
Organizers include: Alex Davies
Reliable Machine Learning in the Wild
Accepted Authors: Andres Medina, Sergei Vassilvitskii
The Future of Gradient-Based Machine Learning Software
Invited Speakers: Jeff Dean, Matt Johnson
Time Series Workshop
Organizers include: Vitaly Kuznetsov
Invited Speakers include: Mehryar Mohri
Theory and Algorithms for Forecasting Non-Stationary Time Series
Tutorial Organizers: Vitaly Kuznetsov, Mehryar Mohri
Women in Machine Learning
Invited Speakers include: Maya Gupta

* Work done as part of the Google Brain team