Comment by pona-a
Not quite. The paper uses its own N-gram rules with positive/negative/invariant weights as a rudimentary attention mechanism, and these rules are distilled from the model itself.
This, as I found out from this repo [0] linked in the Twitter thread in the documentation (which for some reason they didn't just link to directly), appears to be a plain Markov chain over the context, if it even builds a stochastic matrix rather than simply returning the tokens after the first exact match. See the algorithm below.
Current prompt
"Article: (CNN)French striker Bafetimbi Gomis, who has a history of [...]
Summary: French stri"
Prompt lookup algorithm
1. Get the last few tokens from the prompt: "French stri"
2. Search for "French stri" in the prompt
3. Match found: return the next k tokens after the match as the candidate completion: "ker Bafetimbi Gomis, who has"
Candidate tokens
"ker Bafetimbi Gomis, who has"
[0] https://github.com/apoorvumang/prompt-lookup-decoding