Comment by davidst 5 days ago

22 replies

I left the following comment some months ago, duplicating it here:

[Disclaimer: Former Amazon employee and not involved with Go since 2016.]

I worked on the first iteration of Amazon Go in 2015/16 and can provide some context on the human oversight aspects.

The system incorporated human review in two primary capacities:

1. Low-confidence event resolution: A subset of customer interactions resulted in low-confidence classifications that were routed to human reviewers for verification. These events typically involved edge cases that were challenging for the automated systems to resolve definitively. The proportion of these events was expected to decrease over time as the models improved. This was my experience during my time with Go.

2. Training data generation: Human annotators played a significant role in labeling interactions for model training, particularly when introducing new store fixtures or customer behaviors. For instance, when new equipment like coffee machines was added, the system would initially flag all related interactions for human annotation to build training datasets for those specific use cases. Of course, that results in a surge in the number of annotators needed while the data is collected.
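
As a rough illustration of the two review paths described above, here is a minimal Python sketch. The threshold value, the `Event` fields, and the `route` function are all hypothetical names chosen for this example; nothing here reflects Amazon's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the comment above does not state any real value.
REVIEW_THRESHOLD = 0.90


@dataclass
class Event:
    event_id: str
    label: str         # the model's best guess, e.g. "took_item"
    confidence: float  # model confidence in [0, 1]
    fixture: str       # e.g. "shelf", "coffee_machine"


def route(event, new_fixtures=frozenset()):
    """Return 'annotate', 'review', or 'auto' for a single event."""
    # Path 2: every interaction with a newly introduced fixture is
    # sent for annotation so a training set can be built for it.
    if event.fixture in new_fixtures:
        return "annotate"
    # Path 1: low-confidence classifications go to a human reviewer.
    if event.confidence < REVIEW_THRESHOLD:
        return "review"
    # Everything else is resolved automatically.
    return "auto"


if __name__ == "__main__":
    new = {"coffee_machine"}
    print(route(Event("e1", "took_item", 0.99, "shelf"), new))
    print(route(Event("e2", "took_item", 0.55, "shelf"), new))
    print(route(Event("e3", "poured_coffee", 0.99, "coffee_machine"), new))
```

Under these assumptions, adding a fixture to `new_fixtures` sends all of its events to annotators regardless of confidence, which is the surge in annotation work mentioned above.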

Scaling from smaller grab-and-go formats to larger retail environments (Fresh, Whole Foods) would require expanded annotation efforts due to the increased complexity and variety of customer interactions in those settings.

This approach represents a fairly standard machine learning deployment pattern where human oversight serves both quality assurance and continuous improvement.

The news story is entertaining, but it implies there was no working tech behind Amazon Go, which just isn't true.

grogenaut 4 days ago

The Go tech is amazing in two places: airport and stadium beverage tunnels. There's a premium price and high volume in those areas. The Go tech has basically revolutionized the speed of getting a beer and a dog at the stadium here in Seattle. I can be back in my seat in 4 minutes, including the bathroom, which for NFL games means I can sometimes literally be back within a commercial break.

No idea how much they make on it, but it's a game changer in that small area.

  • trollbridge 4 days ago

    One wonders just how much technology is needed to dispense a beer and a hot dog.

  • afavour 4 days ago

    Couldn't you just use vending/automat machines in these scenarios? Beers in particular are... not complicated. I believe the Go tech makes the existing situation better, but if you were to reimagine it from the ground up, I can't help but imagine you could do better.

    • rvnx 4 days ago

      Huge queues

      • mrguyorama 4 days ago

        Automats don't have to have huge queues.

        One employee as a stocker/chef can support higher throughput in automat-style service than in counter-style fast food service, because the task is much more focused (put food in empty cubby, repeat) than the normal process of "Take order, take money, get order, give customer, deal with mistakes."

        They can have an entire wall full of panels for the same item, so that purchases are heavily parallelized. There's usually only a single digit number of items available.

        Automats seemingly died because inflation made it hard to accept payment, but that has been a solved problem in vending machines since then.

        Japan and some other places still do a lot of vending machine food, but the specific "wall of items" Automat format enables great logistics that you don't get from vending machines. Weirder still, I have seen places in Asia that have an Automat-wall-style setup but cook food to order, so you end up waiting!

        You can't use an Automat for beer though, without some sort of external system to only allow use by "adults". But surely that's true of a vision system?

        • Breza 2 days ago

          I've been FASCINATED by Automats ever since I watched Bugs Bunny as a child. The idea that you could just walk up, look at what looked good, and buy it seemed indescribably awesome.

LPisGood 4 days ago

What’s still not clear to me about this story is whether there was ever live human monitoring of shoppers. Did the low-confidence resolution occur in real time, at some point between the customer grabbing the item and getting their bill?

  • davidst 4 days ago

    It wasn't real-time. Recorded events were entered into a queue and latency would vary depending on the size of the queue and the number of annotators.
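
To make the dependence on queue size and annotator count concrete, here is a back-of-the-envelope Python sketch. It assumes a simple first-in-first-out queue and a steady per-annotator review rate; the function name and the example numbers are hypothetical and are not taken from the thread.

```python
def estimated_wait_minutes(queue_length, annotators, events_per_annotator_per_minute):
    """Rough time until an event entering the back of the queue is reviewed.

    Assumes a FIFO queue and that every annotator works at the same
    steady rate, which is a simplification.
    """
    if annotators <= 0 or events_per_annotator_per_minute <= 0:
        raise ValueError("need at least one annotator with a positive review rate")
    return queue_length / (annotators * events_per_annotator_per_minute)


if __name__ == "__main__":
    # Made-up numbers: 600 queued events, 10 annotators, 2 reviews per minute each.
    print(estimated_wait_minutes(600, 10, 2.0))  # 30.0
```

Under this simplified model, doubling the backlog or halving the number of annotators doubles the wait.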

BoredPositron 5 days ago

I get being proud of the work done, but if they scrapped the project after 10 years because of feasibility, I don't think the tech rolled out at the start was "working" as intended.

  • davidst 5 days ago

    The first iteration of the tech reached the accuracy needed to support just-walk-out for a small-format store. It did achieve that goal. I left the project before it went further.

    I imagined, at the time, future goals would be to scale store size and product variety while reducing the cost of the technology, but I have no insight into how that progressed. I am sorry to learn it's been shut down.

    • BoredPositron 4 days ago

      But they started to have more clerks at the stores 2-3 weeks after launch and they were still present when I last visited one.

      • burningChrome 4 days ago

        This was one of the issues that killed it. They continually missed goals of reducing human involvement.

        Training is part of any AI project, but it sounds like Amazon wasn’t making much progress, even after years of working on the project. “As of mid-2022, Just Walk Out required about 700 human reviews per 1,000 sales, far above an internal target of reducing the number of reviews to between 20 and 50 per 1,000 sales,” the report said.

        The report said Amazon’s team “repeatedly missed goals” to cut down on human reviews, and “the reliance on backup humans explains in part why it can take hours for customers to receive receipts.”

        https://arstechnica.com/gadgets/2024/04/amazon-ends-ai-power...

      • davidst 4 days ago

        I don't know how the store clerk staffing changed over time, but clerks were not directly involved with the underlying tech (that is, they did not annotate data). Stores had to comply with state laws for certain kinds of items (e.g., a live person must verify ID and age for alcohol), so the store automation had the ability to summon a clerk when needed. And there were the usual things all stores must do: restocking, cleaning, safety, and customer relations. I expected the customer relations workload to decrease over time as people became accustomed to the just-walk-out shopping experience.

throwaway_15612 4 days ago

Could it be improved by requiring the customers to use a "smart" shopping basket that can read RFID tags on the product packaging? In combination with vision tech it should give higher accuracy.

If so, is cost the reason it isn't used?

scoot 4 days ago

Obligatory /disclaimer/disclosure/. (Don't worry, most HNrs get this wrong for some reason. I will be downvoted for pointing this out, but whatever. It's a meaningful difference to those that understand.)

  • Terretta 4 days ago

    Arguably they first disclose (employee) then disclaim (but not for a while now)...

  • davidst 4 days ago

    I have been making this mistake for decades. I am upvoting your comment to show thanks!

londons_explore 5 days ago

As soon as you get to ~99% accuracy, you probably don't need to go further.

If the customer is accidentally billed for an orange instead of a tangerine 1% of the time, the consumer probably won't notice or care, and as long as the errors aren't biased in favour of the shop, regulators and the taxman probably won't care either.

With that in mind, I suspect Amazon Go wasn't profitable due to poor execution, not because the idea was inherently bad.

  • Slartie 5 days ago

    Actually, discount grocers operate on razor-thin margins of 2-4%. If your inaccuracy is geared to the benefit of your customer (because otherwise you'll be out of business due to the regulatory bodies) and thus removes just one percentage point of that margin, you suddenly lose a quarter to half of your earnings! And that goes ON TOP of the additional cost incurred with all that computer vision tech.
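
The arithmetic behind "a quarter to half" can be checked with a short Python sketch. It assumes the billing errors cost the store exactly one percentage point of revenue and ignores the extra cost of the vision tech; the function name is hypothetical.

```python
def earnings_lost_fraction(margin, revenue_leak):
    """Share of earnings wiped out when `revenue_leak` (a fraction of
    revenue) is lost from a business earning `margin` on revenue."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    return revenue_leak / margin


if __name__ == "__main__":
    for margin in (0.04, 0.02):
        lost = earnings_lost_fraction(margin, 0.01)
        print(f"margin {margin:.0%}: {lost:.0%} of earnings lost")
```

At a 4% margin, a one-point leak removes a quarter of earnings; at a 2% margin it removes half.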

    In addition to that, you'll have the problem of inventory differences, which is often cited as being an even bigger problem with store theft than the loss of valued product. If the inventory numbers on your books differ too much from the inventory actually on the shelves, all your replenishment processes will suffer, eventually causing out-of-stock situations and thus loss of revenue. You may be able to eventually counter that by estimating losses to billing inaccuracies, but that's another complexity that's not going to be free to tackle, so the 1% inaccuracy is going to cost you money on the inventory difference front, no matter what.

    • SilverBirch 4 days ago

      And to add to that, it's not a neutral environment. If 1% of scenarios are billed incorrectly, people will figure out they haven't been billed for something, figure out why, and then tell their friends. Before you know it every teenager is walking into Amazon Fresh standing on one foot, taking a bag of Doritos, hopping over to the Coca Cola stand, putting the Doritos down, spinning 3 times, picking it up again and walking out of the store, safe in the knowledge that the AI system has annotated the entire event as a seagull getting into the shop.

  • davidst 5 days ago

    I don't have insight into what ultimately transpired at Amazon Go so take the following as speculation on my part.

    It is unlikely the tech would be frozen when an acceptable accuracy threshold is reached:

    1. There is a strong incentive to reduce operational costs by simplifying the hardware infrastructure and improving the underlying vision tech to maintain acceptable accuracy. You can save money if you can reduce the number and quality of cameras, eliminate additional signal assistance from other inputs (e.g., shelves with load cells), and generally simplify overall system complexity.

    2. There is business pressure to add product types and fixtures which almost always result in new customer behaviors. I mentioned coffee in my prior post. Consider what it would mean to add support for open-top produce bins and the challenge of complex customer rummaging. It would take a lot of high-quality annotated data and probably some entirely new algorithms, as well.

    Both of those require maintaining a well-staffed annotation team working continuously for an extended time. And those were just the first two things that come to mind. There are likely more reasons that aren't immediately apparent.