Comment by rurban
We know that the libc strstr performs horribly, but musl is fast and state of the art.
Now we just need a name, and we can add it to the smart shootout. I wonder how it compares to the best SIMD algos.
Would be good to have more SIMD algorithms in smart. Maybe I'll have a play with it if I get time.
Looking at them, would need a little work to make them safe (not reading past needle or haystack). Probably not too much effort, may need a different search for data mod block size at the end.
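To make the "different search at the end" idea concrete, here is a rough sketch of what I have in mind. It is not taken from smart or any of the linked code: `BLOCK`, `block_search` and `scalar_search` are hypothetical names, and the inner loop is plain C standing in for a single vector compare. The main loop only runs while a whole block (plus the needle's last byte) is in bounds, and the leftover `len mod block size` bytes go to a scalar search.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 16 /* hypothetical vector width in bytes */

/* Scalar fallback: never reads outside hay[0..hay_len) or needle[0..n_len). */
static const char *scalar_search(const char *hay, size_t hay_len,
                                 const char *needle, size_t n_len)
{
    if (n_len == 0) return hay;
    if (hay_len < n_len) return NULL;
    for (size_t i = 0; i + n_len <= hay_len; i++)
        if (hay[i] == needle[0] && memcmp(hay + i, needle, n_len) == 0)
            return hay + i;
    return NULL;
}

/* Block-wise search using a first/last byte filter, with a safe tail. */
const char *block_search(const char *hay, size_t hay_len,
                         const char *needle, size_t n_len)
{
    if (n_len == 0) return hay;
    if (hay_len < n_len) return NULL;

    const char first = needle[0];
    const char last = needle[n_len - 1];
    size_t i = 0;

    /* A block at offset i touches hay[i .. i+BLOCK) and
       hay[i+n_len-1 .. i+n_len-1+BLOCK). Only run it when both fit. */
    while (i + n_len - 1 + BLOCK <= hay_len) {
        /* Stand-in for one vector compare over BLOCK candidate positions. */
        for (size_t j = 0; j < BLOCK; j++) {
            if (hay[i + j] == first && hay[i + j + n_len - 1] == last &&
                memcmp(hay + i + j, needle, n_len) == 0)
                return hay + i + j;
        }
        i += BLOCK;
    }

    /* Tail: fewer than a full block of candidates left, finish byte by byte. */
    return scalar_search(hay + i, hay_len - i, needle, n_len);
}
```

A real SIMD version would replace the inner `for` with two vector loads and a mask, but the bounds condition on the `while` and the scalar tail would look much the same.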
If you're interested in a rough benchmark, I compared musl's “two way” with a variation on this algorithm for my SIMD-optimized libc (linked in a sister comment).
The benchmark involves finding this line in that file: https://github.com/ncruces/go-sqlite3/blob/v0.26.1/sqlite3/l...
The improvements over musl are in this table: https://github.com/ncruces/go-sqlite3/tree/v0.26.1/sqlite3/l...
I'd say it's a decent improvement, if not spectacular. musl does particularly well with known length strings (memmem), whereas I'm allowed to cheat for unknown length strings (strstr) because the Wasm environment allows me to skirt some UB.
The NUL-terminated string curveball wrecks many good algorithms.
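For anyone wondering why the terminator hurts so much, a minimal sketch, assuming you build strstr on top of a length-based search. `find_known_len` and `strstr_two_pass` are hypothetical names for illustration, not musl's or my libc's code. The point is that the `strlen` call has to walk the whole haystack before the search even starts, even when the match is in the first few bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-based search, standing in for memmem. */
static const char *find_known_len(const char *hay, size_t hay_len,
                                  const char *needle, size_t n_len)
{
    if (n_len == 0) return hay;
    while (hay_len >= n_len) {
        /* Only positions where the full needle still fits can start a match. */
        const char *p = memchr(hay, needle[0], hay_len - n_len + 1);
        if (p == NULL) return NULL;
        if (memcmp(p, needle, n_len) == 0) return p;
        hay_len -= (size_t)(p - hay) + 1;
        hay = p + 1;
    }
    return NULL;
}

/* strstr built on the length-based search: correct, but the strlen on the
   haystack is a full extra pass that a known-length caller never pays. */
const char *strstr_two_pass(const char *hay, const char *needle)
{
    return find_known_len(hay, strlen(hay), needle, strlen(needle));
}
```

Avoiding that extra pass means looking for the terminator and the needle at the same time, which is where wide loads start running into the "don't read past the end" problem.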