Google Research Blog
The latest news from Research at Google
Improving Photo Search: A Step Across the Semantic Gap
Wednesday, June 12, 2013
Posted by Chuck Rosenberg, Image Search Team
Last month at Google I/O, we showed a major upgrade to the photos experience: you can now easily search your own photos without having to manually label each and every one of them. This is powered by computer vision and machine learning technology, which uses the visual content of an image to generate searchable tags for photos. These tags are combined with other sources, like text tags and EXIF metadata, to enable search across thousands of concepts like flower, food, car, jet ski, or turtle.
For many years Google has offered Image Search over web images; however, searching across photos represents a difficult new challenge. In Image Search there are many pieces of information that can be used to rank images, for example text from the web or the image filename. In the case of photos, however, there is typically little or no information beyond the pixels in the images themselves. This makes it harder for a computer to identify and categorize what is in a photo. There are some things a computer can do well, like recognizing rigid objects and handwritten digits. For other classes of objects this remains a daunting task, because the average toddler is better at understanding what is in a photo than the world's most powerful computers running state-of-the-art algorithms.
This past October, the state of the art moved a step closer to toddler performance: a system that used deep learning and convolutional neural networks easily beat more traditional approaches in the ImageNet computer vision competition, which is designed to test image understanding. The winning team came from Professor Geoffrey Hinton's group at the University of Toronto.
We built and trained models similar to those of the winning team using software infrastructure for training large-scale neural networks developed at Google in a group started by Jeff Dean and Andrew Ng. When we evaluated these models we were impressed: on our test set we saw double the average precision compared to other approaches we had tried. We knew we had found what we needed to make photo search easier for people using Google. We acquired the rights to the technology and went full speed ahead adapting it to run at large scale on Google's computers. We took cutting-edge research straight out of an academic lab and launched it in just a little over six months. You can try it out at photos.google.com.
Why the success now? What is new? Some things are unchanged: we still use convolutional neural networks, originally developed in the late 1990s by Professor Yann LeCun in the context of software for reading handwritten letters and digits. What is different is that both computers and algorithms have improved significantly. First, bigger and faster computers have made it feasible to train larger neural networks on much larger datasets. Ten years ago, running neural networks of this complexity would have been a momentous task even for a single image; now we are able to run them on billions of images. Second, new training techniques have made it possible to train the large, deep neural networks necessary for successful image recognition.
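To make the idea concrete, here is a minimal sketch (not Google's production system) of the two building blocks a convolutional network stacks many times: a learned convolution followed by a nonlinearity, and a final linear classification layer. The filter values and shapes below are illustrative assumptions.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution of a single-channel image with one learned filter."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)  # nonlinearity applied after each convolution

# Toy forward pass: one 5x5 filter over a 32x32 grayscale image,
# then a linear classifier over the pooled response.
image = np.random.rand(32, 32)
kernel = np.random.randn(5, 5) * 0.1          # in practice, learned from data
feature_map = relu(conv2d(image, kernel))
feature = feature_map.mean()                  # crude global pooling
score = 2.0 * feature - 0.5                   # w * x + b, the final linear layer
print("class score:", score)
```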
We thought the research community might find it interesting to hear about some of the unique aspects of the system we built and some qualitative observations we made while testing it.
The first is our label set and training set and how they compare to those used in the ImageNet Large Scale Visual Recognition competition. Since we were working on search across photos, we needed an appropriate label set. We came up with a set of about 2,000 visual classes based on the most popular labels on Google+ Photos that also seemed to have a visual component a human could recognize. In contrast, the ImageNet competition has 1,000 classes. As in ImageNet, the classes are not text strings but entities; in our case we use Freebase entities, which form the basis of the Knowledge Graph used in Google search. An entity is a way to uniquely identify something in a language-independent way. In English, when we encounter the word “jaguar”, it is hard to determine whether it refers to the animal or the car manufacturer. Entities assign a unique ID to each, removing that ambiguity: “/m/0449p” for the former and “/m/012x34” for the latter. In order to train better classifiers we used more training images per class than ImageNet, 5,000 versus 1,000. Since we wanted to provide only high-precision labels, we also refined the classes from our initial set of 2,000 down to the 1,100 most precise classes for our launch.
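A minimal illustration of why entity IDs help: mapping ambiguous surface strings to unique, language-independent IDs lets a classifier target exactly one concept per class. The dictionary below uses only the two MIDs quoted above; the multilingual label strings are illustrative assumptions.

```python
# Map ambiguous words to Freebase entity IDs (MIDs) so each class is unambiguous.
ENTITY_IDS = {
    ("jaguar", "animal"): "/m/0449p",       # the big cat
    ("jaguar", "car maker"): "/m/012x34",   # the manufacturer
}

# The same entity can carry labels in many languages, so search works
# regardless of the query language (example strings are assumptions).
LABELS_FOR_ENTITY = {
    "/m/0449p": {"en": "jaguar", "es": "jaguar", "de": "Jaguar (Tier)"},
}

print(ENTITY_IDS[("jaguar", "animal")])  # -> /m/0449p
```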
During our development process we made many more qualitative observations that we felt were worth mentioning:
1) Generalization performance. Even though there was a significant difference in visual appearance between the training and test sets, the network appeared to generalize quite well. To train the system, we used images mined from the web, which did not match the typical appearance of personal photos. Images on the web are often used to illustrate a single concept and are carefully composed, so an image of a flower might only be a close-up of a single flower. Personal photos, by contrast, are unstaged and impromptu: a photo of a flower might contain many other things and may not be very carefully composed. So our training set image distribution was not necessarily a good match for the distribution of images we wanted to run the system on, as the examples below illustrate. However, we found that our system trained on web images was able to generalize and perform well on photos.
A typical photo of a flower found on the web.
A typical photo of a flower found in an impromptu photo.
2) Handling of classes with multimodal appearance. The network seemed to handle classes with multimodal appearance quite well; for example, the “car” class contains both exterior and interior views of cars. This was surprising because the final layer is effectively a linear classifier, which creates a single dividing plane in a high-dimensional space. Since it is a single plane, this type of classifier is often not very good at representing multiple very different concepts.
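For readers less familiar with the terminology, here is a minimal sketch of what such a final layer computes: a single weight vector and bias define one dividing plane, and the sign of the score picks a side. The numbers are illustrative assumptions, not the launched model.

```python
import numpy as np

def linear_classifier_score(features, weights, bias):
    """Score for one class: a single hyperplane w.x + b in feature space."""
    return float(np.dot(weights, features) + bias)

rng = np.random.default_rng(0)
features = rng.normal(size=4096)        # e.g. activations from the last hidden layer
weights = rng.normal(size=4096) * 0.01  # learned weights for the "car" class
bias = -0.1

score = linear_classifier_score(features, weights, bias)
print("predict 'car'" if score > 0 else "predict 'not car'")
```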
3) Handling abstract and generic visual concepts. The system was able to do reasonably well on classes that one would think are somewhat abstract and generic. These include "dance", "kiss", and "meal", to name a few. This was interesting because for each of these classes it did not seem that there would be any simple visual clues in the image that would make it easy to recognize this class. It would be difficult to describe them in terms of simple basic visual features like color, texture, and shape.
Photos recognized as containing a meal.
4) Reasonable errors. Unlike other systems we experimented with, the errors we observed often seemed quite reasonable to people. The mistakes were the type a person might make: confusing things that look similar. Some people have already noticed this, for example mistaking a goat for a dog or a millipede for a snake. This is in contrast to other systems, which often make errors that seem nonsensical to people, like mistaking a tree for a dog.
Photo of a banana slug mistaken for a snake.
Photo of a donkey mistaken for a dog.
5) Handling very specific visual classes. Some of our classes are very specific, like particular types of flowers, for example “hibiscus” or “dahlia”. We were surprised that the system could do well on those. Recognizing specific subclasses often requires very fine detail to differentiate between them, so it was surprising that a system that could do well on a whole-image concept like “sunset” could also do well on very specific classes.
Photo recognized as containing a hibiscus flower.
Photo recognized as containing a dahlia flower.
Photo recognized as containing a polar bear.
Photo recognized as containing a grizzly bear.
The resulting computer vision system worked well enough to launch as a useful tool that improves personal photo search, which was a big step forward. So, is computer vision solved? Not by a long shot. Have we gotten computers to see the world as well as people do? Not yet; there is still a lot of work to do, but we are closer.
Video Stabilization on YouTube
Friday, May 04, 2012
Posted by Matthias Grundmann, Vivek Kwatra, and Irfan Essa, Research at Google
One thing we have been working on within Research at Google is developing methods for making casual videos look more professional, thereby providing users with a better viewing experience. Professional videos have several characteristics that differentiate them from casually shot videos. For example, in order to tell a story, cinematographers carefully control lighting and exposure and use specialized equipment to plan camera movement.
We have developed a technique that mimics professional camera moves and applies them to videos recorded on hand-held devices. Cinematographers use specialized equipment such as tripods and dollies to plan their camera paths and hold them steady. In contrast, think of a video you shot using a mobile phone camera: how steady was your hand, and were you able to anticipate an interesting moment and smoothly pan the camera to capture it? To bridge these differences, we propose an algorithm that automatically determines the best camera path and recasts the video as if it were filmed using stabilization equipment. Specifically, we divide the original, shaky camera path into a set of segments, each approximated by either a constant, linear, or parabolic motion of the camera. Our optimization finds the best of all possible partitions using a computationally efficient and stable algorithm. For details, check out our earlier blog post or read our paper, Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths, published in IEEE CVPR 2011.
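As a rough illustration of the optimization (not the exact production formulation), one can recover a smooth camera path by minimizing the L1 norms of its first, second, and third derivatives while keeping the new path close to the original; L1 penalties encourage derivatives that are exactly zero over whole segments, which yields the constant, linear, and parabolic pieces described above. This sketch uses the cvxpy modeling library and a 1-D path for simplicity; the penalty weights and crop bound are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Shaky 1-D camera path (e.g. horizontal translation per frame).
t = np.arange(200)
original = 0.5 * t + 10 * np.sin(t / 7.0) + np.random.randn(200) * 2.0

p = cp.Variable(200)          # smoothed path to solve for
d1 = cp.diff(p, 1)            # velocity
d2 = cp.diff(p, 2)            # acceleration
d3 = cp.diff(p, 3)            # jerk

# L1 penalties drive whole stretches of each derivative to exactly zero,
# producing piecewise constant / linear / parabolic motion.
objective = cp.Minimize(10 * cp.norm1(d1) + cp.norm1(d2) + 100 * cp.norm1(d3))
constraints = [cp.abs(p - original) <= 20]   # stay within a crop window of the original
cp.Problem(objective, constraints).solve()

smooth_path = p.value
```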
The next time you upload your videos to YouTube, try stabilizing them by going to the YouTube editor, or directly from the video manager by clicking on Edit->Enhancements. For even more convenience, YouTube will automatically detect if your video needs stabilization and offer to do it for you. Many videos on YouTube have already been enhanced using this technology.
More recently, we have been working on a related problem common in videos shot from mobile phones. The camera sensors in these phones contain what is known as an electronic rolling shutter. When taking a picture with a rolling shutter camera, the image is not captured instantaneously. Instead, the camera captures the image one row of pixels at a time, with a small delay when going from one row to the next. Consequently, if the camera moves during capture, it will cause image distortions ranging from shear in the case of low-frequency motions (for instance an image captured from a driving car) to wobbly distortions in the case of high-frequency perturbations (think of a person walking while recording video). These distortions are especially noticeable in videos where the camera shake is independent across frames. For example, take a look at the video below.
Original video with rolling shutter distortions
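To see why row-by-row capture distorts moving scenes, here is a minimal simulation (purely illustrative, not the paper's model): each image row samples the scene at a slightly later time, so camera motion during the exposure shears or wobbles the result. The motion functions and row delay below are assumptions.

```python
import numpy as np

def rolling_shutter_capture(scene, camera_x_at, row_delay=1e-4):
    """Simulate a rolling shutter: row r is read out at time r * row_delay,
    so each row sees the scene shifted by the camera position at that time."""
    height, width = scene.shape
    frame = np.zeros_like(scene)
    for r in range(height):
        shift = int(round(camera_x_at(r * row_delay)))
        frame[r] = np.roll(scene[r], -shift)   # horizontal shift of this row only
    return frame

scene = np.zeros((480, 640))
scene[:, 300:340] = 1.0                        # a vertical bar in the scene

# Low-frequency pan -> the bar comes out sheared; a high-frequency
# sinusoid produces the wobble seen in hand-held video.
sheared = rolling_shutter_capture(scene, lambda t: 2000.0 * t)
wobbly = rolling_shutter_capture(scene, lambda t: 15.0 * np.sin(2 * np.pi * 60 * t))
```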
In our recent paper, Calibration-Free Rolling Shutter Removal, which received the best paper award at IEEE ICCP 2012, we demonstrate a solution that corrects these rolling shutter distortions in videos. A significant feature of our approach is that it does not require any knowledge of the camera used to shoot the video. The time delay in capturing two consecutive rows mentioned above is in fact different for every camera and affects the extent of the distortions. Knowing this delay parameter can be useful, but it is difficult to obtain or estimate via calibration; imagine a video that is already uploaded to YouTube, where obtaining this parameter would be challenging. Instead, we show that the visual data in the video alone carries enough information to describe and compensate for the distortions caused by camera motion, even in the presence of a rolling shutter. For more information, see the narrated video description of our paper.
This technique is already integrated with the YouTube stabilizer. Starting today, if you stabilize a video from a mobile phone or another rolling shutter camera, we will also automatically compensate for rolling shutter distortions. To see our technique in action, check out the video below, obtained after applying rolling shutter compensation and stabilization to the one above.
After stabilization and rolling shutter removal
Excellent Papers for 2011
Thursday, March 22, 2012
Posted by Corinna Cortes and Alfred Spector, Google Research
UPDATE: Added Theo Vassilakis as an author of "Dremel: Interactive Analysis of Web-Scale Datasets".
Googlers across the company actively engage with the scientific community by publishing technical papers, contributing open-source packages, working on standards, introducing new APIs and tools, giving talks and presentations, participating in ongoing technical debates, and much more. Our publications offer technical and algorithmic advances, feature aspects we learn as we develop novel products and services, and shed light on some of the technical challenges we face at Google.
In an effort to highlight some of our work, we periodically select a number of publications to be featured on this blog. We first posted a set of papers on this blog in mid-2010 and subsequently discussed them in more detail in follow-up posts. In a second round, we highlighted noteworthy new papers from the latter half of 2010. This time we honor influential papers authored or co-authored by Googlers and published in 2011, covering roughly 10% of our total publications. It's tough choosing, so we may have left out some important papers; please see the full publications list to review the complete group.
In the coming weeks we will be offering a more in-depth look at these publications, but here are some summaries:
Audio processing
“Cascades of two-pole–two-zero asymmetric resonators are good models of peripheral auditory function”, Richard F. Lyon, Journal of the Acoustical Society of America, vol. 130 (2011), pp. 3893-3904.
Lyon's long title summarizes a result that he has been working toward over many years of modeling sound processing in the inner ear. This nonlinear cochlear model is shown to be "good" with respect to psychophysical data on masking, physiological data on mechanical and neural response, and computational efficiency. These properties derive from the close connection between wave propagation and filter cascades. This filter-cascade model of the ear is used as an efficient sound processor for several machine hearing projects at Google.
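As a rough illustration of the filter-cascade idea (not Lyon's actual pole-zero design), each cochlear channel can be modeled by running the signal through a chain of second-order filters and tapping the output after each stage. The coefficients below are placeholder biquads, assumptions made purely for demonstration.

```python
import numpy as np
from scipy.signal import lfilter

def filter_cascade(signal, stages):
    """Run a signal through a cascade of second-order (biquad) filters,
    returning the output tapped after each stage, one per 'channel'."""
    outputs = []
    x = signal
    for b, a in stages:              # (numerator, denominator) coefficients per stage
        x = lfilter(b, a, x)
        outputs.append(x)
    return outputs

# Placeholder stages: gentle low-pass biquads standing in for the
# asymmetric resonators of the real cochlear model.
stages = [(np.array([0.2, 0.2, 0.0]), np.array([1.0, -0.5, 0.1])) for _ in range(8)]
audio = np.random.randn(16000)       # one second of noise at 16 kHz
channels = filter_cascade(audio, stages)
```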
Electronic Commerce and Algorithms
“Online Vertex-Weighted Bipartite Matching and Single-bid Budgeted Allocations”, Gagan Aggarwal, Gagan Goel, Chinmay Karande, Aranyak Mehta, SODA 2011.
The authors introduce an elegant and powerful algorithmic technique to the area of online ad allocation and matching: a hybrid of random perturbations and greedy choice to make decisions on the fly. Their technique sheds new light on classic matching algorithms, and can be used, for example, to pick one among a set of relevant ads, without knowing in advance the demand for ad slots on future web page views.
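A minimal sketch of the perturbation-plus-greedy idea in an online vertex-weighted matching setting (a simplified rendering, not the paper's exact algorithm or analysis): each offline vertex draws a random perturbation once, and each arriving online vertex is greedily matched to the available neighbor with the highest perturbed weight. The exponential discount function is an assumption drawn from this line of work.

```python
import math
import random

def perturbed_greedy_matching(offline_weights, online_neighbors, seed=0):
    """offline_weights: {v: weight}; online_neighbors: list of neighbor sets,
    one per arriving online vertex. Returns {online_index: matched offline v}."""
    rng = random.Random(seed)
    y = {v: rng.random() for v in offline_weights}      # one perturbation per vertex
    discount = lambda v: 1.0 - math.exp(y[v] - 1.0)     # assumed discount function
    matched, result = set(), {}
    for i, neighbors in enumerate(online_neighbors):
        candidates = [v for v in neighbors if v not in matched]
        if not candidates:
            continue
        best = max(candidates, key=lambda v: offline_weights[v] * discount(v))
        matched.add(best)
        result[i] = best
    return result

print(perturbed_greedy_matching({"a": 3.0, "b": 1.0}, [{"a", "b"}, {"a"}]))
```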
“Milgram-routing in social networks”, Silvio Lattanzi, Alessandro Panconesi, D. Sivakumar, Proceedings of the 20th International Conference on World Wide Web (WWW 2011), pp. 725-734.
Milgram’s "six degrees of separation" experiment, and the fascinating small-world hypothesis that follows from it, have generated a lot of interesting research in recent years. In this landmark experiment, Milgram showed that people unknown to each other are often connected by surprisingly short chains of acquaintances. In this paper we show, theoretically and experimentally, how a recent model of social networks, "Affiliation Networks", offers an explanation of this phenomenon and inspires an interesting technique for local routing within social networks.
“Non-Price Equilibria in Markets of Discrete Goods”, Avinatan Hassidim, Haim Kaplan, Yishay Mansour, Noam Nisan, EC 2011.
We present a correspondence between markets of indivisible items and a family of auction-based n-player games. We show that a market has a price-based (Walrasian) equilibrium if and only if the corresponding game has a pure Nash equilibrium. We then turn to markets that do not have a Walrasian equilibrium (which is the interesting case) and study properties of the mixed Nash equilibria of the corresponding games.
HCI
“From Basecamp to Summit: Scaling Field Research Across 9 Locations”, Jens Riegelsberger, Audrey Yang, Konstantin Samoylov, Elizabeth Nunge, Molly Stevens, Patrick Larvie, CHI 2011 Extended Abstracts.
The paper reports on our experience with a basecamp research hub used to coordinate logistics and ongoing real-time analysis with research teams in the field. We also reflect on the implications for the meaning of research in a corporate context, where much of the value may lie less in a final report and more in the curated impressions and memories our colleagues take away from the research trip.
“User-Defined Motion Gestures for Mobile Interaction”, Jaime Ruiz, Yang Li, Edward Lank, CHI 2011: ACM Conference on Human Factors in Computing Systems, pp. 197-206.
Modern smartphones contain sophisticated sensors that can detect rich motion gestures: deliberate movements of the device by end users to invoke commands. However, little is known about best practices in motion gesture design for the mobile computing paradigm. We systematically studied the design space of motion gestures via a guessability study that elicits end-user motion gestures to invoke commands on a smartphone device. The study revealed consensus among our participants on parameters of movement and on mappings of motion gestures to commands, from which we developed a taxonomy for motion gestures and compiled an end-user-inspired motion gesture set. The work lays the foundation of motion gesture design, a new dimension for mobile interaction.
Information Retrieval
“Reputation Systems for Open Collaboration”, B.T. Adler, L. de Alfaro, A. Kulshreshtha, I. Pye, Communications of the ACM, vol. 54, no. 8 (2011), pp. 81-87.
This paper describes content-based reputation algorithms, which rely on automated content analysis to derive user and content reputation, and their applications to Wikipedia and Google Maps. The Wikipedia reputation system WikiTrust relies on a chronological analysis of user contributions to articles, metering positive or negative increments of reputation whenever new contributions are made. The Google Maps system Crowdsensus compares the information provided by users about map business listings and computes both a likely reconstruction of the correct listing and a reputation value for each user. Algorithm-based user incentives ensure the trustworthiness of evaluations of Wikipedia entries and Google Maps business information.
Machine Learning and Data Mining
“Domain adaptation in regression”, Corinna Cortes, Mehryar Mohri, Proceedings of the 22nd International Conference on Algorithmic Learning Theory (ALT 2011).
Domain adaptation is one of the most important and challenging problems in machine learning. This paper presents a series of theoretical guarantees for domain adaptation in regression, gives an adaptation algorithm based on that theory that can be cast as a semi-definite programming problem, derives an efficient solution for that problem by using results from smooth optimization, shows that the solution can scale to relatively large data sets, and reports extensive empirical results demonstrating the benefits of this new adaptation algorithm.
“On the necessity of irrelevant variables”, David P. Helmbold, Philip M. Long, ICML 2011.
Relevant variables sometimes do much more good than irrelevant variables do harm, so that it is possible to learn a very accurate classifier using predominantly irrelevant variables. We show that this holds given an assumption that formalizes the intuitive idea that the variables are non-redundant. For problems like this it can be advantageous to add many additional variables, even if only a small fraction of them are relevant.
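A tiny simulation of the intuition (my own illustration, not the paper's construction): the label is driven by a minority of weakly relevant ±1 features, the rest are independent noise, and a classifier that simply sums all features is still accurate because the noise contributions largely cancel while the relevant ones add up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_relevant, n_irrelevant = 5000, 200, 800

# Relevant features agree with the label only slightly more often than not;
# the remaining 80% of features are pure noise.
y = rng.choice([-1, 1], size=n_samples)
relevant = y[:, None] * rng.choice([-1, 1], p=[0.4, 0.6], size=(n_samples, n_relevant))
irrelevant = rng.choice([-1, 1], size=(n_samples, n_irrelevant))
X = np.hstack([relevant, irrelevant])

# "Vote over everything": predict the sign of the sum of all 1000 features.
predictions = np.sign(X.sum(axis=1))
print("accuracy:", (predictions == y).mean())   # well above chance despite mostly noise
```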
“Online Learning in the Manifold of Low-Rank Matrices”, Gal Chechik, Daphna Weinshall, Uri Shalit, Neural Information Processing Systems (NIPS 23), 2011, pp. 2128-2136.
Learning measures of similarity from examples of similar and dissimilar pairs is a problem that is hard to scale. LORETA uses retractions, an operator from matrix optimization, to learn low-rank similarity matrices efficiently. This makes it possible to learn similarities between objects like images or texts represented with many more features than was previously possible.
Machine Translation
“Training a Parser for Machine Translation Reordering”, Jason Katz-Brown, Slav Petrov, Ryan McDonald, Franz Och, David Talbot, Hiroshi Ichikawa, Masakazu Seno, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP '11).
Machine translation systems often need to understand the syntactic structure of a sentence to translate it correctly. Traditionally, syntactic parsers are evaluated as standalone systems against reference data created by linguists. Instead, we show how to train a parser to optimize reordering accuracy in a machine translation system, resulting in measurable improvements in translation quality over a more traditionally trained parser.
“Watermarking the Outputs of Structured Prediction with an Application in Statistical Machine Translation”, Ashish Venugopal, Jakob Uszkoreit, David Talbot, Franz Och, Juri Ganitkevitch, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
We propose a general method to watermark and probabilistically identify the structured results of machine learning algorithms with an application in statistical machine translation. Our approach does not rely on controlling or even knowing the inputs to the algorithm and provides probabilistic guarantees on the ability to identify collections of results from one’s own algorithm, while being robust to limited editing operations.
“Inducing Sentence Structure from Parallel Corpora for Reordering”, John DeNero, Jakob Uszkoreit, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Automatically discovering the full range of linguistic rules that govern the correct use of language is an appealing goal, but extremely challenging. Our paper describes a targeted method for discovering only those aspects of linguistic syntax necessary to explain how two different languages differ in their word ordering. By focusing on word order, we demonstrate an effective and practical application of unsupervised grammar induction that improves a Japanese to English machine translation system.
Multimedia and Computer Vision
“Kernelized Structural SVM Learning for Supervised Object Segmentation”, Luca Bertelli, Tianli Yu, Diem Vu, Burak Gokturk, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2011.
The paper proposes a principled way for computers to learn how to segment the foreground from the background of an image given a set of training examples. The technology is built upon a specially designed nonlinear segmentation kernel within the recently proposed structured SVM learning framework.
“Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths”, Matthias Grundmann, Vivek Kwatra, Irfan Essa, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011).
Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. On the other hand, most professionally shot videos consist of carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our stabilization technique automatically converts casual shaky footage into more pleasant and professional-looking videos by mimicking these cinematographic principles. The original, shaky camera path is divided into a set of segments, each approximated by either constant, linear, or parabolic motion, using an algorithm based on robust L1 optimization. The stabilizer has been part of the YouTube Editor (youtube.com/editor) since March 2011.
“The Power of Comparative Reasoning”, Jay Yagnik, Dennis Strelow, David Ross, Ruei-Sung Lin, International Conference on Computer Vision (2011).
The paper describes a theoretically derived vector space transform that converts vectors into sparse binary vectors such that Euclidean space operations on the sparse binary vectors imply rank space operations in the original vector space. The transform (a) does not need any data-driven supervised or unsupervised learning, (b) can be computed from polynomial expansions of the input space in time linear in the degree of the polynomial, and (c) can be implemented in about 10 lines of code. We show competitive results on similarity search and sparse coding (for classification) tasks.
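A minimal sketch of a rank-based transform in this spirit (my illustration of winner-take-all style hashing, not necessarily the authors' exact code): each hash applies a fixed random permutation, keeps the first K entries, and records which of them is largest. The resulting codes depend only on the ordering of values, so they are invariant to any monotonic rescaling of the input.

```python
import numpy as np

def wta_hash(x, permutations, k=4):
    """Winner-take-all style codes: for each fixed permutation, take the first k
    permuted entries and output the index of the maximum (a rank comparison)."""
    return np.array([int(np.argmax(x[perm[:k]])) for perm in permutations])

rng = np.random.default_rng(0)
dim, n_hashes = 128, 64
permutations = [rng.permutation(dim) for _ in range(n_hashes)]

a = rng.normal(size=dim)
b = 3.0 * a + 5.0            # monotonic transform of a: same ordering of values
codes_a, codes_b = wta_hash(a, permutations), wta_hash(b, permutations)
print((codes_a == codes_b).all())   # True: codes depend only on relative order
```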
NLP
“Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections”, Dipanjan Das, Slav Petrov, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL '11), 2011, Best Paper Award.
We would like to have natural language processing systems for all languages, but obtaining labeled data for all languages and tasks is unrealistic and expensive. We present an approach which leverages existing resources in one language (for example English) to induce part-of-speech taggers for languages without any labeled training data. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in a hidden Markov model trained with the Expectation Maximization algorithm.
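A minimal sketch of graph-based label propagation in the abstract (not the paper's bilingual graph construction or its exact update): nodes with known label distributions stay fixed while unlabeled nodes repeatedly take the weighted average of their neighbors' distributions until the values settle.

```python
import numpy as np

def propagate_labels(adjacency, labels, labeled_mask, iterations=100):
    """adjacency: (n, n) symmetric weights; labels: (n, k) distributions,
    meaningful only where labeled_mask is True. Returns propagated (n, k)."""
    q = labels.copy()
    row_sums = adjacency.sum(axis=1, keepdims=True) + 1e-12
    for _ in range(iterations):
        q = adjacency @ q / row_sums            # each node averages its neighbors
        q[labeled_mask] = labels[labeled_mask]  # clamp the seed (labeled) nodes
    return q

# Tiny chain graph: node 0 is labeled class A, node 3 class B, middle nodes unknown.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
labels = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], float)
mask = np.array([True, False, False, True])
print(propagate_labels(A, labels, mask).round(2))
```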
Networks
“TCP Fast Open”, Sivasankar Radhakrishnan, Yuchung Cheng, Jerry Chu, Arvind Jain, Barath Raghavan, Proceedings of the 7th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2011.
TCP Fast Open enables data exchange during TCP's initial handshake. It decreases application network latency by one full round-trip time, a significant speedup for today's short Web transfers. Our experiments on popular websites show that Fast Open reduces whole-page load time by over 10% on average, and in some cases by up to 40%.
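For readers who want to experiment, here is a hedged sketch of enabling Fast Open on a Linux host from Python. It assumes a kernel with TFO support and, because the socket module does not expose MSG_FASTOPEN on every build, defines that flag's Linux value by hand; treat the constants and the overall shape as assumptions, not a reference implementation.

```python
import socket

MSG_FASTOPEN = 0x20000000        # Linux sendto() flag; not always exposed by Python

# Server side: ask the kernel to accept TFO connections (queue length 16).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_TCP, getattr(socket, "TCP_FASTOPEN", 23), 16)
server.bind(("0.0.0.0", 8080))
server.listen()

# Client side: sendto() with MSG_FASTOPEN puts data in the SYN itself,
# so the request rides along with the handshake and saves a round trip.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.sendto(b"GET / HTTP/1.0\r\n\r\n", MSG_FASTOPEN, ("127.0.0.1", 8080))
```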
“Proportional Rate Reduction for TCP”, Nandita Dukkipati, Matt Mathis, Yuchung Cheng, Monia Ghobadi, Proceedings of the 11th ACM SIGCOMM Conference on Internet Measurement 2011, Berlin, Germany, November 2-4, 2011.
Packet losses increase latency of Web transfers and negatively impact user experience. Proportional rate reduction (PRR) is designed to recover from losses quickly, smoothly and accurately by pacing out retransmissions across received ACKs during TCP’s fast recovery. Experiments on Google Web and YouTube servers in U.S. and India demonstrate that PRR reduces the TCP latency of connections experiencing losses by 3-10% depending on response size.
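A simplified sketch of the proportional pacing idea (paraphrasing the mechanism later standardized in RFC 6937, not the paper's full algorithm): during recovery, the sender releases segments in proportion to the data the ACKs report as delivered, so the congestion window converges smoothly toward ssthresh instead of collapsing or stalling.

```python
import math

def prr_allowance(prr_delivered, prr_out, pipe, ssthresh, recover_fs):
    """How many segments may be sent on this ACK during fast recovery.
    prr_delivered: data delivered to the receiver since recovery started.
    prr_out: data we have sent during recovery. pipe: data still in flight.
    recover_fs: flight size when recovery began. (Simplified from RFC 6937.)"""
    if pipe > ssthresh:
        # Proportional phase: send in proportion to delivered data.
        sndcnt = math.ceil(prr_delivered * ssthresh / recover_fs) - prr_out
    else:
        # Reduction-bound phase: grow back toward ssthresh, but no faster
        # than the data that has actually left the network.
        sndcnt = min(ssthresh - pipe, max(prr_delivered - prr_out, 0) + 1)
    return max(sndcnt, 0)

# Example: mid-recovery, 20 segments delivered, 8 sent by us so far.
print(prr_allowance(prr_delivered=20, prr_out=8, pipe=30, ssthresh=25, recover_fs=50))
```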
Security and Privacy
“Automated Analysis of Security-Critical JavaScript APIs”, Ankur Taly, Úlfar Erlingsson, John C. Mitchell, Mark S. Miller, Jasvir Nagra, IEEE Symposium on Security & Privacy (SP), 2011.
As software is increasingly written in high-level, type-safe languages, attackers have fewer means to subvert system fundamentals, and attacks are more likely to exploit errors and vulnerabilities in application-level logic. This paper describes a generic, practical defense against such attacks, which can protect critical application resources even when those resources are partially exposed to attackers via software interfaces. In the context of carefully-crafted fragments of JavaScript, the paper applies formal methods and semantics to prove that these defenses can provide complete, non-circumventable mediation of resource access; the paper also shows how an implementation of the techniques can establish the properties of widely-used software, and find previously-unknown bugs.
“App Isolation: Get the Security of Multiple Browsers with Just One”, Eric Y. Chen, Jason Bau, Charles Reis, Adam Barth, Collin Jackson, 18th ACM Conference on Computer and Communications Security, 2011.
We find that anecdotal advice to use a separate web browser for sites like your bank is indeed effective at defeating most cross-origin web attacks. We also prove that a single web browser can provide the same key properties, for sites that fit within the compatibility constraints.
Speech
“Improving the speed of neural networks on CPUs”, Vincent Vanhoucke, Andrew Senior, Mark Z. Mao, Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.
As deep neural networks become state of the art in real-time machine learning applications such as speech recognition, computational complexity is fast becoming a limiting factor in their adoption. We show how best to leverage modern CPU architectures to significantly speed up their inference.
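One CPU-friendly trick in this vein is low-precision arithmetic. Here is a minimal sketch (my illustration of 8-bit weight quantization in general, not the paper's specific kernels): weights are stored as int8 with a per-layer scale, the dot product runs on small integers, and the result is rescaled at the end.

```python
import numpy as np

def quantize_weights(w):
    """Map float weights to int8 plus a scale so that w ~= scale * w_q."""
    scale = np.abs(w).max() / 127.0
    w_q = np.round(w / scale).astype(np.int8)
    return w_q, scale

def quantized_matvec(w_q, scale, x):
    """Integer-dominated matrix-vector product, rescaled back to floats."""
    return scale * (w_q.astype(np.int32) @ x.astype(np.float32))

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512)).astype(np.float32)
x = rng.normal(size=512).astype(np.float32)

w_q, s = quantize_weights(W)
exact, approx = W @ x, quantized_matvec(w_q, s, x)
print("max abs error:", np.abs(exact - approx).max())   # small relative to activations
```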
“Bayesian Language Model Interpolation for Mobile Speech Input”, Cyril Allauzen, Michael Riley, Interspeech 2011.
Voice recognition on the Android platform must contend with many possible target domains, e.g. search, maps, SMS. For each of these, a domain-specific language model was built by linearly interpolating several n-gram LMs trained on a common set of Google corpora. The current work finds a way to efficiently compute a single n-gram language model with accuracy very close to the domain-specific LMs but with considerably less complexity at recognition time.
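A minimal sketch of linear language-model interpolation (the starting point the paper builds on, not its Bayesian single-model construction): the probability of a word given its history is a weighted mixture of the component n-gram models, with per-domain mixture weights. The toy component models and weights below are assumptions.

```python
def interpolated_prob(word, history, component_lms, weights):
    """P(word | history) as a convex combination of component n-gram LMs.
    component_lms: list of functions (word, history) -> probability.
    weights: per-domain mixture weights summing to 1."""
    return sum(w * lm(word, history) for w, lm in zip(weights, component_lms))

# Toy unigram-style components standing in for real n-gram models.
web_lm = lambda w, h: {"pizza": 0.02, "directions": 0.01}.get(w, 1e-4)
maps_lm = lambda w, h: {"pizza": 0.01, "directions": 0.05}.get(w, 1e-4)

maps_domain_weights = [0.3, 0.7]   # assumed weights tuned for the maps domain
print(interpolated_prob("directions", (), [web_lm, maps_lm], maps_domain_weights))
```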
Statistics
“Large-Scale Parallel Statistical Forecasting Computations in R”, Murray Stokely, Farzan Rohani, Eric Tassone, JSM Proceedings, Section on Physical and Engineering Sciences, 2011.
This paper describes the implementation of a framework for using distributed computational infrastructure from within the R interactive statistical computing environment, with applications to time-series forecasting. The system is widely used by the statistical analyst community at Google for data analysis on very large data sets.
Structured Data
“Dremel: Interactive Analysis of Web-Scale Datasets”, Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis, Communications of the ACM, vol. 54 (2011), pp. 114-123.
Dremel is a scalable, interactive ad hoc query system. By combining multi-level execution trees and a columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. Besides continued growth inside Google, Dremel now also backs an increasing number of external services, including BigQuery and UIs such as the AdExchange front-end.
“Representative Skylines using Threshold-based Preference Distributions”, Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Richard J. Lipton, Jim Xu, International Conference on Data Engineering (ICDE), 2011.
The paper adopts a principled approach to representative skylines and formalizes the problem of displaying k tuples such that the probability that a random user clicks on one of them is maximized. This requires mathematically modeling (a) the likelihood with which a user is interested in a tuple and (b) how one handles the lack of knowledge of an explicit set of users. The work presents theoretical and experimental results showing that the suggested algorithm significantly outperforms previously suggested approaches.
“Hyper-local, directions-based ranking of places”, Petros Venetis, Hector Gonzalez, Alon Y. Halevy, Christian S. Jensen, PVLDB, vol. 4(5) (2011), pp. 290-30.
Click-through information is one of the strongest signals we have for ranking web pages. We propose an equivalent signal for ranking real-world places: the number of times people ask for precise directions to the address of the place. We show that this signal is competitive in quality with human reviews while being much cheaper to collect, and we show that it can be incorporated efficiently into a location search system.
Systems
“Power Management of Online Data-Intensive Services”, David Meisner, Christopher M. Sadler, Luiz André Barroso, Wolf-Dietrich Weber, Thomas F. Wenisch, Proceedings of the 38th ACM International Symposium on Computer Architecture, 2011.
Compute- and data-intensive Web services (such as Search) are a notoriously hard target for energy-saving techniques. This article characterizes the statistical hardware activity behavior of servers running Web search and discusses the potential of existing and proposed energy-saving techniques.
“The Impact of Memory Subsystem Resource Sharing on Datacenter Applications”, Lingjia Tang, Jason Mars, Neil Vachharajani, Robert Hundt, Mary-Lou Soffa, ISCA, 2011.
In this work, the authors expose key characteristics of an emerging class of Google-style workloads and show how to enhance system software to take advantage of these characteristics to improve efficiency in data centers. The authors find that across datacenter applications there is both a sizable benefit and a potential degradation from improperly sharing micro-architectural resources on a single machine (such as on-chip caches and bandwidth to memory). Co-locating threads from multiple applications with diverse memory behavior changes the optimal mapping of threads to cores for each application. By employing an adaptive thread-to-core mapper, the authors improved the performance of the datacenter applications by up to 22% over the status quo thread-to-core mapping, achieving performance within 3% of optimal.
“Language-Independent Sandboxing of Just-In-Time Compilation and Self-Modifying Code”, Jason Ansel, Petr Marchenko, Úlfar Erlingsson, Elijah Taylor, Brad Chen, Derek Schuff, David Sehr, Cliff L. Biffle, Bennet S. Yee, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2011.
Since its introduction in the early '90s, Software Fault Isolation (SFI) has been a static code technique, commonly perceived as incompatible with dynamic libraries, runtime code generation, and other dynamic code. This paper describes how to address this limitation and explains how the SFI techniques in Google Native Client were extended to support modern language implementations based on just-in-time code generation and runtime instrumentation. This work is already deployed in Google Chrome, benefiting millions of users, and was developed over a summer collaboration with three Ph.D. interns; it exemplifies how Research at Google is focused on rapidly bringing significant benefits to our users through groundbreaking technology and real-world products.
“Thialfi: A Client Notification Service for Internet-Scale Applications”, Atul Adya, Gregory Cooper, Daniel Myers, Michael Piatek, Proc. 23rd ACM Symposium on Operating Systems Principles (SOSP), 2011, pp. 129-142.
This paper describes a notification service that scales to hundreds of millions of users, provides sub-second latency in the common case, and guarantees delivery even in the presence of a wide variety of failures. The service has been deployed in several popular Google applications including Chrome, Google Plus, and Contacts.
Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths
Monday, June 20, 2011
Posted by Matthias Grundmann, Vivek Kwatra, and Irfan Essa, Research Team
Earlier this year, we announced the launch of new features on the YouTube Video Editor, including stabilization for shaky videos, with the ability to preview them in real time. The core technology behind this feature is detailed in this paper, which will be presented at the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011).
Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. On the other hand, most professionally shot videos consist of carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our goal was to devise a completely automatic method for converting casual, shaky footage into more pleasant and professional-looking videos.
Our technique mimics the cinematographic principles outlined above by automatically determining the best camera path using a robust optimization technique. The original, shaky camera path is divided into a set of segments, each approximated by either a constant, linear or parabolic motion. Our optimization finds the best of all possible partitions using a computationally efficient and stable algorithm.
To achieve real-time performance on the web, we distribute the computation across multiple machines in the cloud. This enables us to provide users with a real-time preview and interactive control of the stabilized result. Above we provide a video demonstration of how to use this feature on the YouTube Editor. We will also demo this live at Google's exhibition booth at CVPR 2011.
For further details, please read our paper.
Google at CVPR 2011
Thursday, June 16, 2011
Posted by Mei Han and Sergey Ioffe, Research Team
The computer vision community will get together in Colorado Springs during the week of June 20th for the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011). This year will see a record number of people attending the conference and its 27 co-located workshops and tutorials; registration was closed at 1,500 attendees even before the conference started.
Computer vision is at the core of many Google products, such as Image Search, YouTube, Street View, Picasa, and Goggles, and, as always, Google is involved in several ways with CVPR. Andrew Senior is serving as an area chair of CVPR 2011, and many Googlers are reviewers. Googlers also co-authored the following papers:
Where's Waldo: Matching People in Images of Crowds, by Rahul Garg, Deva Ramanan, Steve Seitz, Noah Snavely
Visual and Semantic Similarity in ImageNet, by Thomas Deselaers, Vittorio Ferrari
Multicore Bundle Adjustment, by Changchang Wu, Sameer Agarwal, Brian Curless, Steve Seitz
A Hierarchical Conditional Random Field Model for Labeling and Segmenting Images of Street Scenes, by Qixing Huang, Mei Han, Bo Wu, Sergey Ioffe
Kernelized Structural SVM Learning for Supervised Object Segmentation, by Luca Bertelli, Tianli Yu, Diem Vu, Salih Gokturk
Discriminative Tag Learning on YouTube Videos with Latent Sub-tags, by Weilong Yang, George Toderici
Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths, by Matthias Grundmann, Vivek Kwatra, Irfan Essa
Image Saliency: From Local to Global Context, by Meng Wang, Janusz Konrad, Prakash Ishwar, Yushi Jing, Henry Rowley
If you are attending the conference, stop by Google’s exhibition booth. In addition to talking with Google researchers, you will get to see examples of exciting computer vision research that has made it into Google products including, among others, the following:
Google Earth Facade Shadow Removal, by Mei Han, Vivek Kwatra, and Shengyang Dai
We will demonstrate our technique for removing shadows and other lighting/texture artifacts from building facades in Google Earth. We obtain cleaner, clearer, and more uniform textures which provide users with an improved visual experience.
Video Stabilization on YouTube Editor, by Matthias Grundmann, Vivek Kwatra, and Irfan Essa
Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. In contrast, professionally shot video usually employs stabilization equipment such as tripods or camera dollies and uses ease-in and ease-out for transitions. Our technique mimics these cinematographic principles by optimally dividing the original, shaky camera path into a set of segments and approximating each with either constant, linear, or parabolic motion, using a computationally efficient and stable algorithm. We will showcase a live version of our algorithm, featuring real-time performance and interactive control, which is publicly available at youtube.com/editor.
Tag Suggest for YouTube, by George Toderici and Mehmet Emre Sargin
YouTube offers millions of users the opportunity to upload videos and share them with their friends. Many users would love to have their videos discoverable but don't annotate them properly. One new feature on YouTube that seeks to address this problem is tag prediction, based on video content and, independently, on text metadata.
6/17/2011 UPDATE: "Posted by" was changed to include Sergey Ioffe.
Large Scale Image Annotation: Learning to Rank with Joint Word-Image Embeddings
Thursday, March 10, 2011
Posted by Jason Weston and Samy Bengio, Research Team
In our paper, we introduce a generic framework to find a joint representation of images and their labels, which can then be used for various tasks, including image ranking and image annotation.
We focus on the task of automatically assigning annotations (text labels) to images given only the pixel representation of the image, i.e., with no known metadata. This is achieved with a learning algorithm: the computer learns to predict annotations for new images from annotated training images. Such training datasets are becoming larger and larger, with tens of millions of images and tens of thousands of possible annotations. In this paper, we propose a strongly performing method that scales to such datasets by simultaneously learning to optimize precision at the top of the ranked list of annotations for a given image and learning a low-dimensional joint embedding vector space for both images and annotations. Our system learns an interpretable model, in which annotations with alternate wordings ("president obama" or "barack"), different languages ("tour eiffel" or "eiffel tower"), or similar concepts (such as "toad" or "frog") are close in the embedding space. Hence, even when our model does not predict the exact annotation given by a human labeler, it often predicts similar annotations.
Our system is trained on ~10 million images with ~100,000 possible annotation types and it annotates a single new image in ~0.17 seconds (not including feature processing) and consumes only 82MB of memory. Our method both outperforms all the methods we tested against and in comparison to them is faster and consumes less memory, making it possible to house such a system on a laptop or mobile device.
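A minimal sketch of how such a joint embedding is used at annotation time (an illustration of the general scheme, not the trained production model): images and annotations are mapped into the same low-dimensional space by learned matrices, and an image's annotations are ranked by their dot product with the embedded image. The dimensions and random matrices below are assumptions standing in for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_annotations, feat_dim, embed_dim = 100_000, 10_000, 100

W_image = rng.normal(size=(embed_dim, feat_dim)) * 0.01       # learned in training
W_label = rng.normal(size=(n_annotations, embed_dim)) * 0.01  # one row per annotation

def annotate(image_features, top_k=5):
    """Embed the image and rank all annotations by dot product in the joint space."""
    z = W_image @ image_features          # image -> 100-d embedding
    scores = W_label @ z                  # similarity to every annotation
    return np.argsort(-scores)[:top_k]    # indices of the best annotations

image_features = rng.normal(size=feat_dim)   # visual features in practice
print(annotate(image_features))
```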