Comment by fwip 16 hours ago

It might be the stupidest, but stupid in the sense of "the simplest thing that could possibly work."

When FASTA was invented, Sanger sequencing reads would be around a thousand bases in length. Even back then, disk space wasn't so precious that you couldn't spend several kilobytes on the results of your experiment. Plus, being able to view your results with `more` is a useful feature when you're working with data of that size.

And, despite its simplicity, it has worked for forty years.

michaelhoffman 15 hours ago

When FASTA was invented in 1985, sequencing reads were generally about half that length.

The simplicity of FASTA seems like a dream compared to the GenBank flat file format used before then. And around the year 2000, less computationally-inclined scientists were storing sequence in Microsoft Word binary .doc files.

A lot of file formats (including bioinformatics formats!) have come and gone in that time period. I don't think many would design it this way today, but it has a lot of nice features like the ones you point out that led to its longevity.

attractivechaos 15 hours ago

FASTA was invented in the late 1980s. At that time, Unix tools often limited line length. Even in the early 2000s, some Unix tools (on AIX, as I remember) still had this limit.
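
To make the line-length point concrete, here is a rough Python sketch of writing a record with the sequence wrapped to a fixed width and reading it back. The function names and the 60-column default are my own choices for illustration, not part of any specification, and the parser keeps the whole header line as the record name.

```python
def format_fasta(name, seq, width=60):
    """Return one FASTA record with the sequence wrapped to `width` columns."""
    # Slice the sequence into fixed-size chunks so no sequence line exceeds `width`.
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([">" + name] + chunks) + "\n"


def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line -> joined sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]  # everything after '>' is treated as the name
            records[name] = []
        elif name is not None:
            records[name].append(line)  # wrapped lines are simply concatenated
    return {key: "".join(parts) for key, parts in records.items()}


if __name__ == "__main__":
    record = format_fasta("read1", "ACGT" * 40)  # a 160-base toy sequence
    print(record, end="")
    assert parse_fasta(record) == {"read1": "ACGT" * 40}
```

Because wrapped lines are just concatenated on read, the wrap width is a property of the writer only, and a reader does not need to know it.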