jasim 4 days ago

Good question. LLMs are surprisingly less fragile than hand-coded parsers for unstructured data like the ones in a bank statement.

And to be clear - I'm not sending the entire statement to Claude; instead, only the account name/narration of those transactions for which I already don't have a mapping. Claude then returns a well-formatted JSON that maps "amzn0026765260@apl" to "expenses:amazon", and "Veena Fuels" to "expenses:vehicle" and so on.

I can also pass in general instructions saying that "Restaurants and food-related accounts are categorized under 'expenses:food'", and it does a good job of mapping most of my dining out expenses to the correct account head.

The actual generation of journal entries are done by a simple Python script. The mapping used to be the hardest part, and what used to need custom classification models is just a simple prompt with LLM.