Comment by el_don_almighty 7 hours ago

I have been looking for something that would ingest a decade of old Word and PowerPoint documents and convert them into a standardized format where the individual elements could be repurposed for other formats. This seems like a critical building block for a system that would accomplish this task.

Now I need a catalog, archive, or historian function that stores the elements and makes them easy to pull back out. Amazing work!

pxc 3 hours ago

Can't you just start with unoconv or pandoc, then maybe use an LLM to clean up after converting to plain text?
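
For the pandoc half of that suggestion, a rough sketch of the batch step might look like the following. It assumes pandoc is installed and on the PATH, and it only covers .docx files; PowerPoint files would need a separate route such as unoconv, which is not shown here. The function names `pandoc_command` and `convert_tree` are made up for illustration, and the LLM cleanup pass is left out.

```python
import subprocess
from pathlib import Path


def pandoc_command(src: Path, out_dir: Path) -> list[str]:
    """Build the pandoc invocation that converts one .docx file to plain text."""
    # Hypothetical naming scheme: report.docx -> <out_dir>/report.txt
    dst = out_dir / (src.stem + ".txt")
    return ["pandoc", str(src), "-f", "docx", "-t", "plain", "-o", str(dst)]


def convert_tree(in_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every .docx under in_dir and return the source files processed.

    Assumption: file stems are unique across subfolders. Two files with the
    same name in different folders would overwrite each other's output.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(in_dir.rglob("*.docx")):
        # check=True raises CalledProcessError if pandoc fails on a file.
        subprocess.run(pandoc_command(src, out_dir), check=True)
        converted.append(src)
    return converted
```

Something like `convert_tree(Path("old_docs"), Path("plain_text"))` would then leave one .txt file per document, which is the plain-text input an LLM cleanup step would start from.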