RiteshKhanna
I like building things, testing ideas, and seeing what holds up.
Apr 8, 2026
I Tested Meta Muse Spark Against 4 Frontier Models
Three tests, five models, zero retries. Meta crushed vision and analysis but face-planted on code. ChatGPT wrote the worst code and produced the best output. Meta wins overall — barely.

Recent posts
I Tried to Build Wordle with Gemma 4
Google’s latest open-weights model versus Claude Code, head to head on a simple coding task. One took 2 hours. The other took 2 minutes.
19 Years of Hacker News, Visualized
I queried 47.5 million items from the complete Hacker News archive to find out what the community really cares about, when to post, and who the power users are.
Conway's Game of Life
An interactive cellular automaton — draw a pattern, press play, and watch complexity emerge from four simple rules.
Fun with Strings
A physics playground disguised as a page of prose. Every letter is a rigid body — click one and watch the words come apart.
I Built a Browser-Based AI Music Generator
A working AI music platform — type a prompt, get a full song with vocals, lyrics, and cover art. One person, open-source models.
Same Size, 3 Years of Progress: Retraining My Image Prompt Generator
In 2023 I fine-tuned GPT-2 355M to write image prompts. Now I've retrained the concept with Qwen3-0.6B on Claude-generated data. Same size class, dramatically different results.
AI Music Is Good But Not Great. Can We Fix That?
I generated 230 AI songs across 10 genres, built a model to rate them, and used the data to 4x the hit rate of producing good tracks. All on a MacBook.
Front End Competition
I gave Claude Code and Codex the exact same prompt and deployed whatever came out. No edits, no re-rolls.
I Tried to One-Shot an Entire Web App with Claude Code
20,000 lines of code in 12 hours. It worked. It was also terrible. Here’s what I learned about the gap between “functional” and “shippable.”