Customer-obsessed science
- June 13, 2024: The fight against hallucination in retrieval-augmented-generation models starts with a method for accurately assessing it.
- June 13, 2024: As in other areas of AI, generative models and foundation models, such as vision-language models, are a hot topic.
- June 07, 2024: Although work involving large language models predominates, classical and more-general techniques remain well represented.
- July 14 - 18, 2024
- July 21 - 27, 2024
- August 11 - 16, 2024
- February 15, 2024: In addition to its practical implications, recent work on “meaning representations” could shed light on some old philosophical questions.
- April 16, 2024: First model to work across a wide range of products uses a second U-Net encoder to capture fine-grained product details.
- March 18, 2024: Tokenizing time series data and treating it like a language enables a model whose zero-shot performance matches or exceeds that of purpose-built models (an illustrative tokenization sketch follows the publications list below).
- February 20, 2024: Generative AI supports the creation, at scale, of complex, realistic driving scenarios that can be directed to specific locations and environments.
- January 17, 2024: Representing facts using knowledge triplets rather than natural language enables finer-grained judgments.
- ACL Findings 2024: We show that content on the web is often translated into many languages, and the low quality of these multi-way translations indicates they were likely created using machine translation (MT). Multi-way parallel, machine-generated content not only dominates the translations in lower-resource languages; it also constitutes a large fraction of the total web content in those languages. We also find evidence
- Transactions on Machine Learning Research 2024: Model miscalibration has been frequently identified in modern deep neural networks. Recent work aims to improve model calibration directly through a differentiable calibration proxy. However, the calibration produced is often biased due to the binning mechanism. In this work, we propose to learn better-calibrated models via meta-regularization, which has two components: (1) gamma network (γ-Net), a meta
- Pixel-level mask annotation costs are a major bottleneck in training deep neural networks for instance segmentation. Recent promptable foundation models like the Segment Anything Model (SAM) and GroundedDINO (GDino) have shown impressive zero-shot performance in segmentation and object detection benchmarks. While these models are not capable of performing inference without prompts, they are ideal for omnisupervised
- Interspeech 2024: End-to-end (E2E) automatic speech recognition (ASR) systems often exploit pre-trained hidden Markov model (HMM) systems for word timing estimation (WTE), since E2E models cannot predict word boundaries themselves. However, training an HMM is difficult for low-resource languages due to the lack of phonetic transcriptions, leading to a high demand for HMM-free WTE methods, particularly for multilingual ASR systems
- ACL Findings 2024: Speculative decoding has emerged as a powerful method to improve latency and throughput in hosting large language models (a minimal sketch of the basic technique follows this list). However, most existing implementations focus on generating a single sequence. Real-world generative AI applications often require multiple responses, and how to perform speculative decoding in a batched setting while preserving its latency benefits poses non-trivial challenges. This
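The March 18 item above describes tokenizing time series so that a language-model-style predictor can consume them. The sketch below is a minimal illustration of one common scheme, mean scaling followed by uniform binning into a fixed vocabulary; the function names, vocabulary size, and clipping range are our own assumptions, not the published model's exact tokenizer.

```python
import numpy as np

def tokenize_series(values, vocab_size=512, clip=5.0):
    """Map a real-valued series to integer tokens via mean scaling + uniform binning.

    Illustrative only: the published tokenizer may differ in scaling, binning,
    and special tokens.
    """
    values = np.asarray(values, dtype=float)
    scale = float(np.mean(np.abs(values))) or 1.0  # mean scaling; guard against all-zero input
    scaled = np.clip(values / scale, -clip, clip)
    # vocab_size - 1 bin edges over [-clip, clip] give vocab_size possible token ids.
    edges = np.linspace(-clip, clip, vocab_size - 1)
    return np.digitize(scaled, edges), scale

def detokenize(tokens, scale, vocab_size=512, clip=5.0):
    """Approximately invert tokenization by mapping each token id to a bin center."""
    centers = np.linspace(-clip, clip, vocab_size)
    return centers[np.asarray(tokens)] * scale

if __name__ == "__main__":
    series = 10 * np.sin(np.linspace(0, 6, 50))
    tokens, scale = tokenize_series(series)
    recon = detokenize(tokens, scale)
    print(tokens[:10], float(np.max(np.abs(series - recon))))
```

Once a series is a sequence of discrete tokens, ordinary next-token training and sampling machinery can be applied to it unchanged, which is what lets a single pretrained model be used zero-shot on new series.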
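The final publication teaser concerns speculative decoding; its batched method is not described in the excerpt, but the toy sketch below shows the single-sequence idea it extends: a cheap draft model proposes several tokens and the target model verifies them, keeping the longest agreeing prefix. The greedy verification, the `draft_next`/`target_next` callables, and the toy counting models are assumptions for illustration, not the paper's algorithm or any real library API.

```python
def speculative_decode(target_next, draft_next, prompt, max_new=32, k=4):
    """Greedy speculative decoding sketch.

    target_next(seq) and draft_next(seq) return a model's greedy next token for a
    token sequence. In a real system the target model would score all k proposed
    positions in one forward pass; this loop only illustrates the accept/reject logic.
    """
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # 1) The cheap draft model proposes k tokens autoregressively.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(seq + proposal))
        # 2) Verify position by position: keep draft tokens while they agree with
        #    the target; on the first disagreement, take the target's token and stop.
        accepted = []
        for i in range(k):
            token = target_next(seq + accepted)
            accepted.append(token)
            if token != proposal[i]:
                break
        seq.extend(accepted)
    return seq[: len(prompt) + max_new]

if __name__ == "__main__":
    # Toy "models": both predict the previous token plus one, so every draft is accepted.
    next_token = lambda seq: seq[-1] + 1
    print(speculative_decode(next_token, next_token, prompt=[0], max_new=8, k=4))
```

In a batch, different sequences generally accept different numbers of draft tokens per step, so their lengths diverge; that ragged acceptance is one reason the batched setting is non-trivial.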
Resources
- We look for talent from around the world: applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.
- We hire world-class academics to work on large-scale technical challenges while they continue to teach and conduct research at their universities. Learn more about each program and how to apply below.
- Supporting research at academic institutions and non-profit organizations in areas that align with our mission to advance customer-obsessed science.