What if PageRank Was a Mistake?

A provocation on our algorithmic present and a curational (curative?) alt-history.

Assumed audience: People who care about the ethics and implications of technology—and fans of alternative histories.

Epistemic status: A provocation and a thought experiment.

In the early days of the web, PageRank seemed like magic: a clean win. What were the downsides? Open up Google, type in a query, and just like that you would see an automatically crowdsourced[1] set of results, ranked by how much other people linked to them. As more than a few people have noted recently, though,[2] results (on Google in particular, but also on search engines in general) have degraded. Gaming of results started early and has only accelerated in the last decade. What if (and I know this is crazy, but stay with me) lists of links built by hand were actually better? Not in spite of, but precisely because of, their inability to scale, and therefore to be gamed at scale?

I am left imagining a world where Larry and Sergey never had their idea, or never made it popular, and the algorithmic search engine never took off; a world where curation ruled instead: hand-tuned lists of the best places to find answers to the best questions, in an actual web of links, as Tim (and Ted and arguably even Vannevar) intended.


A web built first of all on curation would not have magically solved all our problems. If anything, it would have been that much easier to buy your way onto the “best of” lists and stay there. The whole point of PageRank was that it was a fair and impartial algorithm, not something which could be biased by conscious human choices. Until, of course, we realized that it was biased from the outset by the choices of its creators, and that those choices could in turn be exploited.

(No technology is immune to exploitation. This is why, when thinking about technologies we create, we should always ask not whether they will be abused, but rather: by whom, and how.)

But the distortions would have been different. They would have been, in some sense, more traditional: the normal human foibles, expressed in <a> tags, magnified perhaps a little over the old idea of a physical directory (because space is no longer a limitation) — but not much. Not to the hubristic scale of Google’s current dream:

Our mission is to organize the world’s information and make it universally accessible and useful.

The world’s information! All of it! And if that were not enough: to make it not only universally accessible (a debatable goal to say the least: should everything be accessible everywhere to everyone?) but useful: as if the utility of information were something to be determined, defined, and demarcated by a single megacorporate entity. Maybe hand-curated links would have been less stupid than that. Or at least: more distributedly-stupid, less powerfully-stupid, and therefore actually less bad.

I wonder.


Update, evening of 2021/08/27: a couple interesting bits from some colleagues after I shared this on LinkedIn:

  • Caitlin O’Connor pointed out that PageRank itself is fully general mathematically (it’s just eigenvectors after all![3]), so it’s the Google implementation (and decades-long elaboration) that’s particularly interesting here.

  • Adam Hobson reminded me of Mahalo, which was an actual attempt at re-instantiating this alt-history back in the late 2000s, and also shared Benedict Evans’ 2016 dictum (from Lists are the New Search):

    All curation grows until it requires search. All search grows until it requires curation.
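
To make Caitlin’s “it’s just eigenvectors” point concrete, here is a minimal sketch of the textbook idea, not Google’s actual implementation: power iteration over a tiny link graph, which converges toward the dominant eigenvector of the damped link matrix. The page names, the `pagerank` function, the 0.85 damping factor (a commonly cited value), and the 20-page link farm are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    All names and parameters here are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A hypothetical four-page web: "c" is the page everyone links to.
web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(web)

# The same web after someone spins up 20 junk pages that all link to "d".
gamed_web = dict(web)
for i in range(20):
    gamed_web[f"spam{i}"] = ["d"]
gamed_ranks = pagerank(gamed_web)

print(max(ranks, key=ranks.get))        # the most linked-to page wins
print(gamed_ranks["d"] > ranks["d"])    # the link farm lifts "d"
```

Nothing in the math knows or cares whether a link was made in good faith, which is the whole opening for gaming at scale: in this toy web, a page nobody linked to climbs the rankings as soon as enough junk pages point at it.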


Update, evening of 2021/08/28: Robin Sloan suggested to me a connection with librarians, whose dispositions and discipline might have made something very different of a curated web. Not for nothing are there so many sci-fi stories of libraries. (I still think public libraries are an under-appreciated, nearly-miraculous experiment!)

On which note: for the current-day, real-world intersection of librarians at their best and the internet, see Dan Cohen and his Humane Ingenuity, which is one of my favorite newsletters: not least for its own careful work of curation!


Notes

  1. This usage is, of course, something of an anachronism: the internet-age neologism “crowdsourcing” was coined in 2005.

  2. See, for example, this tweet thread (with a ThreadReader unroll here, though who knows for how long). Another provocation for another day: imagine if we, uhh, blogged?

  3. …which, like me, you might not remember from your linear algebra or physics. In an amusing bit of happenstance, I just re-taught myself what eigenvectors and eigenvalues were two nights ago, though!