

Foundation Models and GenAI Technologies for Physical Retail
AM Session, Friday, February 28th, 2025, WACV 2025
Overview and Topics
In an evolving world in which consumers have many shopping methods available to them, physical “brick-and-mortar” stores remain the preferred means of shopping around the world. From groceries to clothing, customers continue to show strong demand for in-person shopping.
Applications of vision-based Artificial Intelligence (AI) methods are increasingly present throughout society. Fueled by recent advances in Computer Vision, Deep Learning, web-scale training of vision and language models (“foundation models”), and edge compute, AI applications have expanded into a novel array of industries and products. In particular, the physical retail and grocery sectors have recently experienced an explosion of AI-enabled technologies that make shopping more efficient, effortless, and engaging for shoppers, help retailers reduce shrinkage, and provide insights for improving store efficiency, thereby reducing operational costs. Computer Vision applications are being deployed across numerous retail settings, including small convenience stores, large grocery stores, fashion stores, and shopping carts, to name but a few.
The focus of this workshop includes active areas of research and development in the physical retail space, including:
- Multi-modal modeling for shopping activity recognition, product detection/tracking/identification, and product quantity estimation
- Applications of Generative AI and Large Language Models (LLMs) to physical retail
- Zero-shot learning methods and approaches for activity recognition and understanding, appearance-based classification, and object detection
- Zero-shot data labeling with foundation models
- Appearance-based classification from a large gallery of product classes in the wild and open-set recognition (OSR)
- Store analytics and shopping cart localization within a store environment
- Generative models and systems for synthetic data generation of images and videos paired with ground truth labels
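The zero-shot labeling topic above can be illustrated with a minimal sketch: a CLIP-style foundation model embeds images and text prompts into a shared space, and an unlabeled image receives the label whose prompt embedding is most similar. The `embed`-style vectors, label names, and the `zero_shot_label` helper below are all hypothetical toy stand-ins, not part of any real model's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def zero_shot_label(image_embedding, label_embeddings):
    """Pick the label whose text embedding is closest to the image embedding."""
    return max(label_embeddings,
               key=lambda lbl: cosine(image_embedding, label_embeddings[lbl]))

# Toy stand-ins for embeddings a CLIP-style model would produce for prompts
# like "a photo of a cereal box" vs. "a photo of a milk carton".
label_embeddings = {
    "cereal box":  [0.9, 0.1, 0.2],
    "milk carton": [0.1, 0.8, 0.3],
}
image_embedding = [0.85, 0.15, 0.25]  # toy embedding of an unlabeled shelf crop

print(zero_shot_label(image_embedding, label_embeddings))  # → cereal box
```

In practice the toy vectors would be replaced by the image and text encoders of an actual vision-language foundation model, letting new product classes be labeled without any task-specific training data.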
Important Dates
| Event | Date |
|---|---|
| Call for Papers Release | 11/08/2024 |
| Paper Submission Deadline | 12/20/2024 |
| Notification of Acceptance | 01/03/2025 |
| Camera-Ready Paper Deadline (to be included in the proceedings) | 01/10/2025 |
Final Workshop Schedule
| Time | Talk | Name | Title/Institution |
|---|---|---|---|
| 08:30-08:45AM | Opening Remarks | Dr. Rocco Pietrini | Assistant Professor @VRAI Lab |
| 08:45-09:15AM | Invited Talk: Just Walk Out: Enabling Autonomous Checkout with Multi-modal AI | Dr. Xiaoqing Ge | Manager of Applied Science @Amazon Just Walk Out (JWO) |
| 09:15-09:45AM | Invited Talk: Amazon One: Technology behind the Experience | Dr. Carlos Castillo | Principal Applied Scientist @Amazon One |
| 09:45-10:05AM | Paper Presentation: What Matters when Building Vision Language Models for Product Image Analysis? | Dr. Maria Zontak | Sr. Applied Scientist @Amazon Catalog AI |
| 10:05-10:20AM | COFFEE BREAK | | |
| 10:20-10:50AM | Invited Talk: How I Learned to Stop Worrying and Love Foundation Models | Dr. Yosi Keller | Principal Applied Scientist @Amazon Prime Video; Prof. @Bar-Ilan University |
| 10:50-11:20AM | Invited Talk: On the Role of Extended Artificial Intelligence in the Future of Physical Retail | Dr. Lorenzo Stacchio | Research Fellow @University of Macerata |
| 11:20-11:40AM | Paper Presentation: WTPose: Waterfall Transformer for Multi-person Pose Estimation | Navin Ranjan | Student @Rochester Institute of Technology |
| 11:40-11:50AM | Closing Remarks | Dr. Rocco Pietrini | Assistant Professor @VRAI Lab |