Foundation Models and GenAI Technologies for Physical Retail
WACV, March 6-7, 2026
Overview and Topics
Even as consumers gain access to an ever-wider range of shopping channels, physical "brick-and-mortar" stores remain the preferred means of shopping around the world. From groceries to clothing, customers continue to show strong demand for in-person shopping.
Applications of vision-based Artificial Intelligence (AI) are increasingly present throughout society. Fueled by recent advances in Computer Vision, Deep Learning, web-scale training of vision-and-language models ("foundation models"), and edge compute, AI applications have expanded into a wide array of industries and products. The physical retail and grocery sectors in particular have recently seen an explosion of AI-enabled technologies that make shopping more efficient, effortless, and engaging for customers, reduce shrinkage for retailers, and provide insights that improve store operations and lower costs. Computer Vision applications are being deployed across numerous retail settings, including small convenience stores, large grocery stores, fashion stores, and smart shopping carts, to name but a few.
The focus of this workshop includes active areas of research and development in the physical retail space, including:
- Multi-modal modeling for shopping activity recognition, product detection/tracking/identification, and product quantity estimation
- Applications of Generative AI and Large Language Models (LLMs) in physical retail
- Zero-shot learning methods for activity recognition and understanding, appearance-based classification, and object detection
- Zero-shot data labeling with foundation models
- Appearance-based classification from a large gallery of product classes in the wild and open-set recognition (OSR)
- Store analytics and shopping cart localization within a store environment
- Generative models and systems for synthetic data generation of images and videos paired with ground truth labels
Important Dates
| Workshop Paper Track | Date |
|---|---|
| Call For Paper Release | 11/07/2025 |
| Paper Submission Deadline | 12/19/2025 |
| Notification of Paper Acceptance | 12/26/2025 |
| Camera-Ready Paper Deadline (to be included in the proceedings) | 01/09/2026 |

| Workshop Challenges Track | Date |
|---|---|
| Call For Challenge Submission Release | 11/07/2025 |
| Challenge Training Data Release | 12/12/2025 |
| Challenge Submission Deadline | 01/16/2026 |
| Challenge Results Release and Winner Notification | 01/30/2026 |
Tentative Workshop Schedule
| Time | Talk | Name | Title/Institution |
|---|---|---|---|
| 08:30-08:45AM | Opening Remarks | TBD | |
| 08:45-09:15AM | Invited Talk: TBD | Invited Speaker 1 | |
| 09:15-09:45AM | Invited Talk: TBD | Invited Speaker 2 | |
| 09:45-10:05AM | Paper Presentation: TBD | Paper Author 1 | |
| 10:05-10:20AM | COFFEE BREAK | | |
| 10:20-10:50AM | Invited Talk: TBD | Dr. Yosi Keller | Principal Applied Scientist @Amazon Prime Video; Prof. @Bar-Ilan University |
| 10:50-11:20AM | Challenge Winner Presentation 1: TBD | TBD | Challenge winner team |
| 11:20-11:40AM | Challenge Winner Presentation 2: TBD | TBD | Challenge winner team |
| 11:40-11:50AM | Closing Remarks | TBD | |