Comment by BoredPositron 6 months ago
You can disable CLIP L on Flux without a loss in quality. You are also making a mountain out of a molehill. CLIP is used everywhere.
The truth is that the CLIP conditioning in Flux works well for Dreambooth-style fine-tuning, where the tokenization bugs can be acute, but those bugs are not severe enough to explain the low impact of CLIP on their dev model. It is likely more impactful on their pro / max models, but only BFL could say.
Okay, well, there are a few things that are known to be true:

1. CLIP's tokenizer is buggy in diffusers, in the reference source in BFL's repo, and in OpenAI's repo.
2. Many CLIP prompts are observed to have a low impact in the Flux dev and schnell models.

And a few things that are very likely true:

1. The tokenizer in BFL's reference source and OpenAI's repo does not match the tokenizer used to train OpenAI's CLIP, or the text conditioning for any of the Flux checkpoints.
2. The guidance and timestep distillation play a role in weakening CLIP's influence.
3. It is practical to fine-tune CLIP on more image-caption pairs.

If you care about fine-tuning, the tokenization bugs matter. Everything else is hard to prove.
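
For anyone who wants to see tokenizer discrepancies for themselves, one quick check is to run the same prompts through more than one CLIP tokenizer implementation and diff the token ids. This is only a sketch: it compares the slow and fast Hugging Face tokenizers for the usual CLIP-L checkpoint (not BFL's reference code or OpenAI's original repo), the prompts are arbitrary, and whether any given prompt diverges depends on the versions installed.

```python
# Minimal sketch: diff two CLIP tokenizer implementations on the same prompts.
# Assumes only the Hugging Face "openai/clip-vit-large-patch14" checkpoint
# (the CLIP-L that Flux and the SD-family models load).
from transformers import CLIPTokenizer, CLIPTokenizerFast

slow = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
fast = CLIPTokenizerFast.from_pretrained("openai/clip-vit-large-patch14")

prompts = [
    "a photo of a cat",
    "a  photo\twith   odd whitespace",
    "emoji 🤖 and accents: café, naïve",
]

for p in prompts:
    ids_slow = slow(p, truncation=True, max_length=77).input_ids
    ids_fast = fast(p, truncation=True, max_length=77).input_ids
    match = "OK" if ids_slow == ids_fast else "MISMATCH"
    print(f"{match}: {p!r}")
    if ids_slow != ids_fast:
        # Decode back to BPE tokens so any divergence is visible token by token.
        print("  slow:", slow.convert_ids_to_tokens(ids_slow))
        print("  fast:", fast.convert_ids_to_tokens(ids_fast))
```

The same loop works against open_clip or the BFL reference tokenizer if you swap one side of the comparison.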
Consider another interpretation: CLIP L in Flux can be disabled without a loss in quality because the way it is used is buggy!
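
If someone wants to test the "disable CLIP L" claim rather than argue about it, a rough ablation is to encode the prompt normally and then zero out the CLIP contribution before sampling. The sketch below assumes the diffusers FluxPipeline API, in which CLIP-L only supplies the pooled embedding (`pooled_prompt_embeds`) and T5 supplies the per-token embeddings (`prompt_embeds`); parameter names come from diffusers, not BFL's reference code, and the prompt and step count are arbitrary.

```python
# Rough ablation sketch, assuming the diffusers FluxPipeline API (not BFL's
# reference code). CLIP-L feeds the pooled embedding; T5 feeds the sequence.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

prompt = "a corgi wearing a tiny wizard hat, studio lighting"

# Encode once with both text encoders, then zero the CLIP pooled vector.
prompt_embeds, pooled_prompt_embeds, _ = pipe.encode_prompt(
    prompt=prompt, prompt_2=prompt
)
zeroed_pooled = torch.zeros_like(pooled_prompt_embeds)

generator = torch.Generator("cuda").manual_seed(0)
with_clip = pipe(
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    num_inference_steps=28,
    generator=generator,
).images[0]

generator = torch.Generator("cuda").manual_seed(0)
without_clip = pipe(
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=zeroed_pooled,
    num_inference_steps=28,
    generator=generator,
).images[0]

with_clip.save("flux_with_clip.png")
without_clip.save("flux_without_clip.png")
```

Zeroing the pooled vector is only one possible ablation (an empty CLIP prompt is another); whether the two outputs differ meaningfully is exactly what's being argued here.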