
The Small Web Is Actually Massive

A Hacker News frontpage discussion reveals that the 'small web' of personal sites and indie projects is far bigger than anyone assumed.

There was a discussion on the Hacker News frontpage recently about the "small web" - personal blogs, indie projects, small community forums, single-person websites - and the most interesting finding was that it's not small at all. It's massive. We just can't see it because our tools for measuring the web are calibrated for big platforms.

The numbers shared in the discussion were striking. Millions of personal blogs. Tens of thousands of active small forums. Hundreds of thousands of indie project sites. Newsletter archives. Wikis. Digital gardens. Photo blogs. Recipe sites. Hobby pages. The long tail of the web is longer than most people imagine.

I think we've been suffering from a measurement bias. When we talk about "the web," we mean the platforms: Google, YouTube, Reddit, Twitter, Facebook, Wikipedia. These are the sites that show up in traffic rankings because traffic rankings measure centralized popularity. The small web is, by definition, decentralized. No single small site has meaningful traffic, so none of them ever appears in a ranking, but in aggregate they represent a huge portion of the web's actual content.
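To make the aggregate argument concrete, here is a rough back-of-the-envelope sketch. It assumes that site popularity follows a Zipf-like curve, where the site at rank r gets a share proportional to 1/r. That distribution, the count of one million sites, and the function names are my own illustrative assumptions, not figures from the discussion.

```python
def harmonic(n: int) -> float:
    """Sum of 1/rank for ranks 1..n (the n-th harmonic number)."""
    return sum(1.0 / rank for rank in range(1, n + 1))


def head_share(top_k: int, total_sites: int) -> float:
    """Fraction of all attention captured by the top_k sites,
    ASSUMING popularity at rank r is proportional to 1/r (Zipf-like)."""
    return harmonic(top_k) / harmonic(total_sites)


if __name__ == "__main__":
    # Hypothetical web of one million sites; only the top ten make the rankings.
    share = head_share(top_k=10, total_sites=1_000_000)
    print(f"top 10 sites: {share:.1%}, everyone else: {1 - share:.1%}")
```

Under these assumptions the ten biggest sites take roughly a fifth of the total, and the sites that never show up in any ranking hold the rest. Real traffic is surely distributed differently, but the shape of the argument survives: a ranking of the head tells you very little about the size of the tail.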

This matters for several reasons.

First, the small web is where the good content lives. I know that sounds like romanticizing, but think about it. When someone maintains a personal blog about, say, woodworking, they're writing because they love woodworking. There's no SEO. No engagement farming. No algorithmic pressure to produce daily content regardless of quality. The result is thoughtful, detailed, honest writing. Compare that to the fifteenth "10 Best Woodworking Tools of 2026" article from a content mill, and the quality difference is obvious.

Second, the small web is harder for AI to access and learn from. Large language models are trained primarily on the big, crawlable web. Small sites often restrict crawlers through robots.txt, are structured in non-standard ways, or simply aren't linked to by enough big sites to get crawled regularly. This means AI models may be systematically underweighting some of the highest-quality human knowledge.
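For readers who haven't looked at one, a robots.txt restriction is just a short text file that well-behaved crawlers check before fetching a page. The sketch below uses Python's standard `urllib.robotparser` to show how such a check might play out. The file contents, the crawler names `ExampleAIBot` and `SomeSearchBot`, and the URLs are all hypothetical, and real crawlers may interpret the rules differently or ignore them altogether.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a personal blog. The crawler name and the
# paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /drafts/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeSearchBot"):
        for path in ("/posts/dovetail-joints", "/drafts/unfinished"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(ROBOTS_TXT, agent, url))
```

In this made-up file, the AI crawler is shut out of the whole site, while every other crawler can read everything except the drafts folder. A site owner who writes rules like these is invisible to any training pipeline that respects them.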

Third, the small web is resilient in ways the big web isn't. When Twitter changes its API or Reddit locks down its data or a platform dies, entire communities lose their content. Personal sites on personal domains with personal hosting don't have that dependency. Your blog doesn't go away because a VC-funded company ran out of runway.

The HN discussion surfaced some practical observations too. People noted that discovering small web content is genuinely hard. Google's search has become increasingly platform-biased, surfacing Reddit results and big-name publications while burying personal sites. Alternative search engines like Marginalia (which specifically indexes the small web) and curated directories are filling this gap, but they're still niche.

There's an interesting connection to AI here. As AI-generated content floods the big web, the small web's value increases. A personal blog with a human voice, genuine expertise, and years of authentic content becomes more valuable precisely because it's harder to fake. You can generate a thousand SEO articles with AI, but you can't generate a decade of thoughtful personal writing about a niche topic.

I've been thinking about this in the context of my own writing and web presence. Having a personal site - a real one, not just a social media profile - is an investment in the open web. Every person who publishes on their own domain instead of exclusively on platforms is casting a vote for the small web's future.

The platforms want you to believe they are the web. They're not. They're landlords sitting on top of the web. The actual web - the one Tim Berners-Lee built - is made of personal pages linking to other personal pages. And apparently, that web is doing just fine. Better than fine, actually. It's massive, it's growing, and it's producing the best content on the internet.

We just need better tools to find it.