Show HN: GlyphLang – An AI-first programming language
34 points by goose0004 a day ago
While working on a proof-of-concept project, I kept hitting Claude's token limit 30-60 minutes into its 5-hour sessions. The accumulating context from the codebase was eating through tokens fast, so I built a language designed to be generated by AI rather than written by humans.
GlyphLang
GlyphLang replaces verbose keywords with symbols that tokenize more efficiently:
# Python
@app.route('/users/<id>')
def get_user(id):
    user = db.query("SELECT * FROM users WHERE id = ?", id)
    return jsonify(user)
# GlyphLang
@ GET /users/:id {
    $ user = db.query("SELECT * FROM users WHERE id = ?", id)
    > user
}
@ = route, $ = variable, > = return. Initial benchmarks show ~45% fewer tokens than Python, ~63% fewer than Java.
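If you want to sanity-check the token counts on your own snippets, here's a rough sketch. It assumes tiktoken and the cl100k_base encoding; the post doesn't say which tokenizer the benchmarks used, so treat the numbers as illustrative.

import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
python_src = '''@app.route('/users/<id>')
def get_user(id):
    user = db.query("SELECT * FROM users WHERE id = ?", id)
    return jsonify(user)'''
glyph_src = '''@ GET /users/:id {
    $ user = db.query("SELECT * FROM users WHERE id = ?", id)
    > user
}'''
# Count tokens for each snippet under the chosen encoding.
for name, src in [("Python", python_src), ("GlyphLang", glyph_src)]:
    print(name, len(enc.encode(src)), "tokens")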
In practice, that means more logic fits in context, and sessions stretch longer before hitting limits. The AI maintains a broader view of your codebase throughout.

Before anyone asks: no, this isn't APL with extra steps. APL, Perl, and Forth are symbol-heavy but optimized for mathematical notation, human terseness, or machine efficiency. GlyphLang is specifically optimized for how modern LLMs tokenize. It's designed to be generated by AI and reviewed by humans, not the other way around. That said, it's still readable enough to be written or tweaked if the occasion requires.
It's still a work in progress, but it's a usable language with a bytecode compiler, a JIT, an LSP server, a VS Code extension, PostgreSQL support, WebSockets, async/await, and generics.
I've found that short symbols collide with other tokens in the LLM's vocabulary. It's generally much better to give everything in a language long, descriptive names rather than short ones.
An example that shocked me was using an XML translation of C for better vector search. The lack of curly braces made the model return much more relevant code than anything else I tried, including enriching the database with ctags.
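For the curious, you can look at the splits directly. A minimal sketch, assuming tiktoken with the cl100k_base encoding (whatever tokenizer your target model actually ships is the one that matters):

import tiktoken
enc = tiktoken.get_encoding("cl100k_base")
# Compare how short symbols and descriptive names get split into tokens.
for s in ["$ user", "> user", "user = get_user(id)", "return user"]:
    ids = enc.encode(s)
    print(repr(s), "->", [enc.decode([i]) for i in ids])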
> GlyphLang is specifically optimized for how modern LLMs tokenize.
This is extremely dubious. The token vocabulary isn't even stable within a model family, let alone across entirely different kinds of models. The only thing they're all good at is tokenizing English.
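To make that concrete, here's a minimal sketch assuming tiktoken, comparing two OpenAI vocabularies (other model families ship entirely different tokenizers):

import tiktoken
snippet = "@ GET /users/:id {"
# The same line splits differently depending on which vocabulary you ask.
for name in ["gpt2", "cl100k_base"]:
    enc = tiktoken.get_encoding(name)
    ids = enc.encode(snippet)
    print(name, len(ids), [enc.decode([i]) for i in ids])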