Comment by scotty79
Images are not that big by comparison. Each text token is itself a high-dimensional vector.
There have been recent observations that rendering text as an image and ingesting the image might actually be more efficient than using text embeddings.
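
To put rough numbers on the size comparison, here is a back-of-envelope sketch. Everything in it is an assumption made for illustration: the function names, the 4096-dimensional fp16 embedding, the 1000-token text, and the 1024x1024 grayscale page are hypothetical and not taken from any particular model.

```python
# Back-of-envelope size comparison. All values are illustrative assumptions.

def embedding_bytes(num_tokens, dim, bytes_per_value=2):
    """Memory taken by num_tokens embedding vectors of width dim (fp16 by default)."""
    return num_tokens * dim * bytes_per_value


def raw_image_bytes(width, height, channels=1, bytes_per_value=1):
    """Memory taken by an uncompressed image (8-bit grayscale by default)."""
    return width * height * channels * bytes_per_value


if __name__ == "__main__":
    # Hypothetical: 1000 text tokens, each a 4096-dim fp16 vector.
    text_side = embedding_bytes(num_tokens=1000, dim=4096)
    # Hypothetical: the same text rendered onto a 1024x1024 grayscale page.
    image_side = raw_image_bytes(width=1024, height=1024)
    print(f"token embeddings: {text_side} bytes")
    print(f"raw image pixels: {image_side} bytes")
```

This only compares raw sizes. Whether the image route actually comes out ahead depends on how many vectors the image encoder emits for that page, since each of those is a high-dimensional vector too.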