Comment by pona-a
Not quite. The paper uses its own N-gram rules with positive/negative/invariant weights as a rudimentary attention mechanism, and these rules are distilled from the model itself.
This, as I found out from this repo [0] linked in the Twitter thread in the documentation (which for some reason they didn't just link to directly), appears to be a plain Markov chain over the context, if it even builds a stochastic matrix rather than simply returning the tokens after the first exact match. See the algorithm below.
Current prompt
"Article: (CNN)French striker Bafetimbi Gomis, who has a history of [...]
Summary: French stri"
Prompt lookup algorithm
1. Get the last few tokens from the prompt: "French stri"
2. Search for "French stri" in the prompt
3. Match found: return the next k tokens after the match as the candidate completion: "ker Bafetimbi Gomis, who has"
Candidate tokens
"ker Bafetimbi Gomis, who has"
[0] https://github.com/apoorvumang/prompt-lookup-decoding