Role Description
The AI Computer Vision Lead will be responsible for developing, upgrading, and owning the entire AI Vision Stack of Tara, our AI-powered cooking companion. This includes designing model architectures for different stages of the cooking process and ensuring scalability, accuracy, and real-world performance.
The role demands a deep understanding of computer vision, particularly as applied to dynamic, real-world, edge-based environments. You’ll define model architectures, guide data annotation, and set the foundation for scalable datasets and pipelines, establishing annotation standards and feedback loops that ensure high-quality training data.
In short, you’ll be tackling one of the hardest problems in frontier AI: bringing human-like visual understanding to the kitchen and redefining how people cook.
What You’ll Do
“Long story short - you’ll own everything related to the core AI Vision Stack.”
- Design and develop scalable model architectures for vision tasks in real-world cooking environments.
- Build and experiment across architectures - from CNNs to Transformers and hybrid approaches.
- Develop multi-head architectures optimized for both edge and cloud inference.
- Define compute and hardware requirements for efficient real-time performance on edge devices.
- Continuously optimize models for accuracy, latency and scalability.
- Rapidly prototype and iterate to identify best-performing model structures for specific cooking tasks.
- Implement sub-task-based training for higher modularity and accuracy in complex vision pipelines.
- Lead work on temporal action prediction to understand and anticipate cooking actions over time.
- Define best practices for annotation, data quality and dataset scalability.
- Collaborate closely with product and hardware teams to ensure seamless edge integration.
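To give a flavour of the temporal action prediction work mentioned above, here is a minimal, purely illustrative sketch. It assumes nothing about Tara’s actual models: a real system would use a sequence model (e.g. a temporal Transformer or TCN), but even a sliding-window majority vote shows why temporal context stabilises noisy per-frame vision outputs.

```python
from collections import Counter

def smooth_predictions(frame_labels, window=5):
    """Sliding-window majority vote over per-frame action labels.

    Illustrative only: demonstrates how temporal context corrects
    isolated misclassifications in a stream of frame-level outputs.
    """
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo = max(0, i - half)
        hi = min(len(frame_labels), i + half + 1)
        counts = Counter(frame_labels[lo:hi])
        smoothed.append(counts.most_common(1)[0][0])
    return smoothed

# A single misclassified frame ("stir") inside a run of "chop"
# is corrected by its temporal neighbours.
raw = ["chop", "chop", "stir", "chop", "chop", "pour", "pour", "pour"]
print(smooth_predictions(raw, window=3))
# → ['chop', 'chop', 'chop', 'chop', 'chop', 'pour', 'pour', 'pour']
```

In practice, a learned sequence model would also *anticipate* upcoming actions rather than just smooth past ones, which is what the role’s temporal prediction work targets.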