Comment by mjr00

Comment by mjr00 5 days ago

> On April 4, 2024, it was revealed that Amazon's "Just Walk Out" technology was supported by approximately 1,000 Indian workers who manually reviewed transactions. Despite claims of being fully automated through computer vision, a significant portion of transactions required this manual verification. ( https://en.wikipedia.org/wiki/Amazon_Go )

Wonder how much of this is due to economics since computer vision tech never reached the expected performance + outsourced workers got (relatively) much more expensive after COVID.

davidst 5 days ago

I left the following comment some months ago, duplicating it here:

[Disclaimer: Former Amazon employee and not involved with Go since 2016.]

I worked on the first iteration of Amazon Go in 2015/16 and can provide some context on the human oversight aspects.

The system incorporated human review in two primary capacities:

1. Low-confidence event resolution: A subset of customer interactions resulted in low-confidence classifications that were routed to human reviewers for verification. These events typically involved edge cases that were challenging for the automated systems to resolve definitively. The proportion of these events was expected to decrease over time as the models improved. This was my experience during my time with Go.

2. Training data generation: Human annotators played a significant role in labeling interactions for model training, particularly when introducing new store fixtures or customer behaviors. For instance, when new equipment like coffee machines was added, the system would initially flag all related interactions for human annotation to build training datasets for those specific use cases. Of course, that results in a surge of humans needed for annotation while the data is collected.

Scaling from smaller grab-and-go formats to larger retail environments (Fresh, Whole Foods) would require expanded annotation efforts due to the increased complexity and variety of customer interactions in those settings.

This approach represents a fairly standard machine learning deployment pattern where human oversight serves both quality assurance and continuous improvement.
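
As an illustration of the first pattern above, low-confidence routing can be sketched in a few lines. This is a minimal sketch, not Amazon's implementation: the `Event` type, the `route` function, and the 0.9 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the real system's threshold is not given in the thread.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class Event:
    event_id: str
    predicted_item: str
    confidence: float  # model's confidence in predicted_item, 0.0 to 1.0


def route(events, threshold=CONFIDENCE_THRESHOLD):
    """Split events into auto-accepted ones and ones queued for human review."""
    auto_accepted, needs_review = [], []
    for event in events:
        if event.confidence >= threshold:
            auto_accepted.append(event)
        else:
            needs_review.append(event)
    return auto_accepted, needs_review


events = [
    Event("e1", "sandwich", 0.98),
    Event("e2", "yogurt", 0.55),
    Event("e3", "coffee", 0.91),
]
accepted, review_queue = route(events)
print([e.event_id for e in accepted])      # ['e1', 'e3']
print([e.event_id for e in review_queue])  # ['e2']
```

In a setup like this, the labels produced for the review queue would double as training data, which is the second capacity described above.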

The news story is entertaining but it implies there was no working tech behind Amazon Go which just isn't true.


grogenaut 4 days ago

The Go tech is amazing in 2 places: airport and stadium beverage tunnels. There's a premium price and high volume in those areas. The Go tech has basically revolutionized the speed of getting a beer and a dog at the stadium here in Seattle. I can be back in my seat in 4 minutes including the bathroom now, which for NFL means I can literally be back within a commercial break sometimes.
No idea how much they make on it, but it's a game changer in that small area.

- trollbridge 4 days ago
  
  One wonders just how much technology is needed to dispense a beer and a hot dog.
  
- afavour 4 days ago
  
  Couldn't you just use vending/automat machines in these scenarios? Beers in particular are... not complicated. I believe the go tech makes the existing situation better but if you were to reimagine it from ground up I can't help but imagine you could do better.
  
  
  rvnx 4 days ago
  
  Huge queues
  
LPisGood 4 days ago

What’s still not clear to me about this story is if there was ever live human monitoring of shoppers. Did the low confidence resolution occur in real time, at some point between the customer grabbing the item and getting their bill?

- davidst 4 days ago
  
  It wasn't real-time. Recorded events were entered into a queue and latency would vary depending on the size of the queue and the number of annotators.
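
A back-of-the-envelope version of that relationship, assuming identical annotators working in parallel at a steady pace and first-in, first-out handling. The function name and the sample numbers (600 queued events, 30 seconds per review) are made up for illustration and do not come from Amazon.

```python
def estimated_wait_seconds(queue_length, annotators, seconds_per_review):
    """Rough time until a newly queued event is picked up for review.

    Assumes every annotator works in parallel at the same steady pace
    and events are handled in arrival order.
    """
    if annotators <= 0:
        raise ValueError("need at least one annotator")
    return queue_length * seconds_per_review / annotators


# Same backlog, different staffing levels (hypothetical numbers).
print(estimated_wait_seconds(600, 10, 30))  # 1800.0
print(estimated_wait_seconds(600, 40, 30))  # 450.0
```

Under these assumptions the wait grows linearly with the backlog and shrinks in proportion to the number of annotators.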
  
BoredPositron 5 days ago

I get being proud of the work done but if they scrapped the project after 10 years because of feasibility I don't think the tech rolled out at the start was "working" as intended.

- davidst 5 days ago
  
  The first iteration of the tech reached the accuracy needed to support just-walk-out for a small-format store. It did achieve that goal. I left the project before it went further.
  I imagined, at the time, future goals would be to scale store size and product variety while reducing the cost of the technology, but I have no insight into how that progressed. I am sorry to learn it's been shut down.
  
  
  BoredPositron 4 days ago
  
  But they started to have more clerks at the stores 2-3 weeks after launch and they were still present when I last visited one.
  
throwaway_15612 4 days ago

Could it be improved by requiring the customers to use a "smart" shopping basket that can read RFID tags on the product packaging? In combination with vision tech it should give relatively higher accuracy.
If so, is the reason why it is not used related to cost?

scoot 4 days ago

Obligatory /disclaimer/disclosure/. (Don't worry, most HNrs get this wrong for some reason. I will be downvoted for pointing this out, but whatever. It's a meaningful difference to those that understand.)

- Terretta 4 days ago
  
  Arguably they first disclose (employee) then disclaim (but not for a while now)...
  
- davidst 4 days ago
  
  I have been making this mistake for decades. I am upvoting your comment to show thanks!
  
londons_explore 5 days ago

As soon as you get to ~99% accuracy, you probably don't need to go further.
If the customer is accidentally billed for an orange instead of a tangerine 1% of the time, the consumer probably won't notice or care, and as long as the errors aren't biased in favour of the shop, regulators and the taxman probably won't care either.
With that in mind, I suspect Amazon Go wasn't profitable due to poor execution, not because it was an inherently bad idea.

- Slartie 5 days ago
  
  Actually, discount grocers operate on razor-thin margins of 2-4%. If your inaccuracy is geared to the benefit of your customer (because otherwise you'll be out of business due to the regulatory bodies) and thus removes just one percent of that, you suddenly lose a quarter to half of your earnings! And that goes ON TOP of the additional cost incurred with all that computer vision tech.
  In addition to that, you'll have the problem of inventory differences, which is often cited as being an even bigger problem with store theft than the loss of valued product. If the inventory numbers on your books differ too much from the inventory actually on the shelves, all your replenishment processes will suffer, eventually causing out of stock situations and thus loss of revenue. You may be able to eventually counter that by estimating losses to billing inaccuracies, but that's another complexity that's not going to be free to tackle, so the 1% inaccuracy is going to cost you money on the inventory difference front, no matter what.
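
The margin arithmetic above can be checked with a short sketch. It assumes, as the comment does, that the unbilled 1% of revenue comes straight out of profit because the goods still have to be paid for; the function name is hypothetical.

```python
def share_of_profit_lost(net_margin, revenue_lost_to_errors):
    """Fraction of profit wiped out when a share of revenue goes unbilled.

    Both arguments are fractions of revenue, e.g. 0.02 for 2%.
    """
    return revenue_lost_to_errors / net_margin


for margin in (0.02, 0.04):
    lost = share_of_profit_lost(margin, 0.01)
    print(f"margin {margin:.0%}: {lost:.0%} of profit lost")
```

At a 2% margin the loss is half of the profit, and at 4% it is a quarter, which matches the "quarter to half" range in the comment.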
  
  
  SilverBirch 4 days ago
  
  And to add to that, it's not a neutral environment. If there's 1% of scenarios that are incorrect, people will figure out they haven't been billed for something, figure out why, and then tell their friends. Before you know it every teenager is walking into Amazon Fresh standing on one foot, taking a bag of Doritos, hopping over to the Coca-Cola stand, putting the Doritos down, spinning 3 times, picking it up again and walking out of the store, safe in the knowledge that the AI system has annotated the entire event as a seagull getting into the shop.
  
- davidst 5 days ago
  
  I don't have insight into what ultimately transpired at Amazon Go so take the following as speculation on my part.
  It is unlikely the tech would be frozen when an acceptable accuracy threshold is reached:
  1. There is a strong incentive to reduce operational costs by simplifying the hardware infrastructure and improving the underlying vision tech to maintain acceptable accuracy. You can save money if you can reduce the number and quality of cameras, eliminate additional signal assistance from other inputs (e.g., shelves with load cells), and generally simplify overall system complexity.
  2. There is business pressure to add product types and fixtures which almost always result in new customer behaviors. I mentioned coffee in my prior post. Consider what it would mean to add support for open-top produce bins and the challenge of complex customer rummaging. It would take a lot of high-quality annotated data and probably some entirely new algorithms, as well.
  Both of those require maintaining a well-staffed annotation team working continuously for an extended time. And those were just the first two things that come to mind. There are likely more reasons that aren't immediately apparent.
  

Cornbilly 5 days ago

It's great that they faced essentially no consequences for this. A sure sign that we have a functional and sane market.


colinplamondon 5 days ago

Why would they face consequences? Every store has video surveillance that can be reviewed.
They trusted their tech enough to accept the false-positive rate, then worked to determine / validate their false positive rate with manual review, and iterate their models with the data.
From a consumer perspective the point is that you can "just walk out". They delivered that.

- acdha 5 days ago
  
  If the stock price goes down, I won’t be surprised if there’s a shareholder lawsuit claiming that they misrepresented their level of AI achievement and that this led to the write-off by keeping operating costs and error rates high. The whole business model really assumed that they could undercut competitors through lower staffing.
  
- Cornbilly 5 days ago
  
  Their initial advertising claimed near full automation by their "AI" system when, in reality, they had people manually handling around 70% of the transactions.
  I get that this is a message board for YC, so lying about your company's tech is considered almost a virtue but that is an unreasonably big lie to tell without getting your hand-slapped by some regulatory body or investor backlash.
  
  
  neilc 5 days ago
  
  I don’t remember Amazon claiming “near-full automation” by AI. They said that you can checkout automatically and that AI/computer vision is somehow involved.
  
  
  EdiX 5 days ago
  
  If they didn't say it they heavily implied it to the point that journalists were fooled. For example you can read about it in this very quaint 2018 article that went with a woke "it's your fault I'm dysfunctionally paranoid" angle: https://www.cnet.com/culture/amazon-go-avoid-discrimination-....
  
  > No one cared what I was doing. Is this what it feels like to shop when you're not black?
  
  Turns out people did care what she was doing.
  
  
  thegrim000 5 days ago
  
  Well that's because, again, it was indeed algorithms doing the work, and the people were only used to verify / train the system, after the fact. People keep, intentionally, conflating the two things, doing everything in their power to say (or strongly imply), that the people involved were managing the orders in real time, which is a lie. You are the one pushing misinformation here.
  
  
  freejazz 4 days ago
  
  Automated checkout cashier except that you need a human to verify the work of the automated checkout cashier. Brilliant.
  
  
  CamperBob2 5 days ago
  
  Who cares how they monitor and validate transactions? That's Amazon's problem, not mine.
  Indians, AI, whatever, meh.
  
  
  colinplamondon 5 days ago
  
  I think investors like Amazon taking shots like this? It was never a broad roll-out, 43 stores is micro-scale for Amazon.
  Still, would love to see a breakdown of why it didn't improve. Regardless of the accuracy at launch, I'd think that advances in AI would have been massively to their advantage. I wonder if security degradation hit them hard.
  The entire system depends on a level of social trust that doesn't exist in American cities today. Similarly, the "Dash Cart" seems like a cheaper and easier way to accomplish the same thing.
  At the end of the day, there's also a mismatch in the use case. If I'm going to a smaller format store, like they had, I'm not buying a ton of stuff. Self checkout is great, and minimal friction.
  I'd think that improving the UX of self-checkout gets 80% of the way there with way less fraud, way less theft, and way less technology.
  Still, I think it's wicked cool they took a big shot.
  I know someone that worked on the project in the early days. It was always incredibly difficult technology, they were always behind on their accuracy targets, and the solutions were increasingly kludgy as they layered more and more complex systems on top. An honorable failure.
  A lot of smart people really tried to make it work.
  
  
  Cornbilly 5 days ago
  
  That's great but they could have been honest up-front and said "The plan is that this is eventually fully-automated but we estimate that it needs supervised training for X amount of time in order to handle Y% of transactions automatically".
  But this is tech and you just lie because hardly anyone in the investor class knows enough to call you out on it or they are just going with the lie to make a buck off of other rubes.
  Privacy concerns aside, I thought it was a cool project. I agree that “convenience store” was probably not the best target but I think it was an effective enough proof of concept (creating a decent sized chain of them probably wasn’t the best idea). I’ve seen the system used more effectively in smaller situations like stadium concessions, where the duration of the transactions needs to be very short to facilitate throughput.
  
swiftcoder 5 days ago

It's also pretty par for the course from Amazon automation initiatives. Like Glacier being marketed as robotic tape drive loaders, where in reality it is mostly just regular old S3 running on the outdated server clusters.

madeofpalk 5 days ago

Isn't the consequence that they're shutting the stores down?

dyauspitr 5 days ago

It’s autonomous 80% of the time. That’s significant. Put another way, they only had to hire 1000 people instead of 5000.

- mrguyorama 4 days ago
  
  It only takes 1 employee to staff 20 self checkouts for comparison.
  For a full fat grocery store. With zero change or adjustment to the rest of the grocery store. And customers weirdly like self checkouts even when they are a dramatically worse outcome (compared to the highish bar of well trained cashiers)
  
  
  octoberfranklin 3 days ago
  
  We like self-checkout because there's hardly ever a line.
  An idle self-checkout machine costs the store almost nothing. An idle cashier costs the store wages. So the stores will always skimp on cashiers, leading to lines, wasting my time.
  
  
  dyauspitr 4 days ago
  
  What exactly is the point of a well trained cashier? What service do they provide? I guess I appreciate what the bagger does, and the cashier knows the codes for the loose vegetables, but those are minor benefits in my opinion.
  
jandrese 5 days ago

What's the crime? If lying about AI capabilities is a crime we have some billionaires in big trouble.

- kube-system 5 days ago
  
  If it's a publicly traded company, everything is securities fraud.
  
  
  jandrese 5 days ago
  
  Which hardly anybody ever gets prosecuted for.
  
- Cornbilly 5 days ago
  
  AI is not unique in this regard. We just saw the same thing with the crypto/blockchain nonsense.
  Regulation lags so far behind that you can get away with bad behavior long enough that, by the time regulation catches up, you can buy your way out of consequences.
  

ed_mercer 5 days ago

This was proven to be false on the WAN show. Only 20% of transactions were low confidence and handled by mechanical turk.

https://m.youtube.com/watch?v=433kipkEERY&t=8479s


larrik 5 days ago

20% seems like a "significant portion" to me

- Breza 2 days ago
  
  For sure! Twenty percent moves you from "Game changing tech" to "Slightly improved self checkout."
  
mjr00 5 days ago

20% is an incredibly high number though, if a store has 400 people/hour that means you're manually reviewing 80 transactions per hour, over one transaction per minute. That's multiple human employees.
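
The arithmetic above, as a small sketch. The 400 customers per hour and 20% review share come from the comment; the function names and the per-review durations (5 and 90 seconds) are assumptions, since the thread gives no real figure for how long a review takes.

```python
def reviews_per_minute(customers_per_hour, review_fraction):
    """Average number of transactions sent to manual review each minute."""
    return customers_per_hour * review_fraction / 60


def reviewers_kept_busy(customers_per_hour, review_fraction, seconds_per_review):
    """Average number of reviewers fully occupied, ignoring breaks and bursts."""
    review_seconds_per_hour = customers_per_hour * review_fraction * seconds_per_review
    return review_seconds_per_hour / 3600


print(round(reviews_per_minute(400, 0.20), 2))  # 1.33
for seconds in (5, 90):  # hypothetical review durations
    print(seconds, round(reviewers_kept_busy(400, 0.20, seconds), 2))
```

Whether 20% means several full-time reviewers per store or a fraction of one depends almost entirely on how long each review takes.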

- iLoveOncall 5 days ago
  
  One transaction per minute is nothing at all when the transaction can be as simple as "did the person put that back on the shelf" with a 5-second clip.
  
  
  freejazz 4 days ago
  
  If it was clear from just a 5 second clip it probably wouldn't have needed to be reviewed
  
pessimizer 5 days ago

Proven "false." I've noticed that if one admits the truth with a dismissive or offended tone, you can just continue to claim the lie and through sheer force of will people will still go with it.
I think people just think that they must be misunderstanding something; that nobody could claim one thing while offering evidence of its opposite. 1/5 of purchases lose their significance.

EdiX 5 days ago

Nothing has been "proven". The original story was from The Information (paywalled article), reshared by Business Insider [1], and claimed that 70% of the transactions were reviewed by workers in India. It cited an anonymous source.
Business Insider also reached out to Amazon at the time and a spokesperson denied that the workers actually reviewed any transactions.
This "proven false" thing is just another anonymous source claiming that actually it was only 20%.
So you actually have no proof of anything, you just have three people claiming three different things (0%, 20% and 70%).
[1] https://www.businessinsider.com/amazons-just-walk-out-actual...

whateveracct 5 days ago

Transactions or grabs? Cuz I grab >5 things every time..so it stands to reason Indians always reviewed me.


gorgoiler 4 days ago

I’m skeptical of this scoop.

It’s reasonable to expect a system like Amazon’s to use human feedback in training, and to quote the article linked on Wikipedia:

> Amazon said that the India-based team only assisted in training the model [and validating] a small minority of shopping visits.


hereonout2 4 days ago

I went to Lidl UK's first walk-out shop a few weeks ago. You get the bill and receipts about 40 minutes after you've left.
It certainly felt like it could have been sent off to a lower paid country for a human to tot up.
Also consider you're in the store for what, 10 mins - that's a lot of video processing presumably using state of the art CV models. It's quite possibly cheaper to pay a human than rent the H100 to do it.


theanonymousone 5 days ago

Why did "outsourced workers [get] (relatively) much more expensive after COVID"?


foxyv 5 days ago

Essentially the thinking went: if everyone is remote, why not hire remote workers from countries that are a lot cheaper? Suddenly you had a hard time finding contractors and FTEs from those countries because everyone was hiring them. At the same time it got really hard for entry level developers in the USA to find work.
The supply/demand curve shifted and now those workers are becoming more expensive while domestic workers are becoming cheaper.

giraffe_lady 5 days ago

India specifically is in the middle of a massive years-long labor movement that is changing the terms of work there and I believe shifting the degree of alignment with western corporate outsourcing though I'm not very informed about the details.
Scale is beyond comprehension though, there were 250 million people on strike one day last summer. This is not ever really covered in western media or mentioned on HN for reasons that are surely not interesting or worth pondering at all.

- givemeethekeys 5 days ago
  
  Americans can’t afford to strike like that.
  
  
  dragonwriter 5 days ago
  
  No one (at a national scale) can afford to strike like that, except people who understand that they can afford even less not to strike like that.
  
  
  linkregister 5 days ago
  
  You're most likely correct; I originally started writing this comment to refute your statement, but found that my assumptions appear to be wrong.
  Americans have nearly the highest nominal and PPP income among OECD countries as of 2024, behind only Luxembourg, Iceland, and Switzerland [1].
  India experiences substantially higher shelter and food insecurity and poverty rates than the United States.
  However, tech workers in Bangalore are paid an order of magnitude more than prevailing local wages in other sectors, at around ₹2M (₹20 lakh) [2]. Median annual rents for 2BHK (2 bedroom) apartments appear to be around 15% of that figure, at ₹3 lakh in desirable neighborhoods [3].
  It appears to be reasonable for a technology worker to be able to perform a sustained strike. I have never personally traveled to Bangalore, though I have lived in places where cost of living is under a tenth of median American income.
  I invite correction by people with first hand knowledge about cost of living in Bangalore.
  1. https://www.oecd.org/en/data/indicators/average-annual-wages...
  2. https://timesofindia.indiatimes.com/city/bengaluru/median-te...
  3. https://www.birlaevara.org.in/best-areas-in-bangalore-for-re...
  
  
  netsharc 5 days ago
  
  And Indians can?
  When India "shut down" for Covid, day labourers suddenly had no income, and no government support - they had to walk all the way to their home province (can't remember if the trains were even running).
  But oh well, Uberizing employment means the run-of-the-mill American worker can also live like that in the future... progress!
  
  
  giraffe_lady 5 days ago
  
  Americans have chosen to learn exactly how good they have had it. You get to watch!
  
  
  esseph 5 days ago
  
  Can't afford not to.
  
mjr00 5 days ago

Great question. I'm not an economist so I have no idea why. The outsourcing rates I've seen have all gotten way higher in the past ~10 years though.

- Insanity 5 days ago
  
  Beyond just the usual inflation?
  I'm not an economist either, but I also assume that as the country attracts more local talent for local companies, the competition for outsourcing becomes harder (i.e., you now have to pay more than the local companies).
  All just speculation on my part though, I really have no clue either.
  
  
  PaulHoule 5 days ago
  
  People from Bangalore were telling me it was getting crazy expensive to live there (by Indian standards) circa 2013.
  

thinkingtoilet 5 days ago

Another case where AI = "actually Indians". It's funny how often this has happened.


Dylan16807 5 days ago

Maybe. I'd really want to know what percent of items (not transactions) needed review. 1,000 people to oversee how much revenue?
Theoretically if it was 99% computer and 1% human, that's enough to mess up the economics but it's not a bait and switch like some companies have done.

kkkqkqkqkqlqlql 5 days ago

I remember this case as the one that put "Actually Indians" in my mind. What other instances do you know?
(Not to refute your point, of course, I am just curious)

- Sateeshm 5 days ago
  
  Builder.ai
  

andoando 4 days ago

I wonder if they were doing the same thing for palm recognition


adamsb6 5 days ago

People don’t know what the H is in RLHF.
