Feeds are Not Fit for Gardening

—at least, in their current instantiations with RSS, Atom, JSON Feed, etc.

This item is a work in progress. It is subject to change radically at any time—up to and including being deleted entirely if I change my mind about it! Because it is a draft, it is not included in the site’s feeds; if you want to watch its progress, you will need to do that very old thing and bookmark this page and return to it over time until it is published.

Given its preliminary, work-in-progress status, I would appreciate it very much if you do not share this to social media, Hacker News, etc.—call it part of the contract for “working with the garage door open”. Thanks!

Assumed audience: Tinkerers, spec-writers, protocol-builders, gardeners: people willing to walk down new avenues in 2023—people, that is, who are up for revisiting some of the assumptions which have governed the web for the past few decades.

A bit of context: Reading Maggie Appleton’s essay A Brief History & Ethos of the Digital Garden and Mike Caulfield’s talk–essay The Garden and the Stream: A Technopastoral.

TODO: write an introduction!

A thought I’ve had bouncing around for a while, most of all since starting to think about Obsidian Publish and similar:

All existing feed systems broadly assume “streams”, and are difficult if not impossible to use with “gardens”.

Here’s what I mean by that: RSS, Atom, and JSON Feed all notionally include the idea of being able to mark items as having been updated, but in practice that ability is little-used, deeply hobbled, and therefore largely irrelevant. That in turn means that the existing feed specifications serve reasonably well to publish new items but very poorly to notify subscribers about changes to existing items. They are therefore very poorly suited for “gardens” in the sense described by Appleton, Caulfield, and others.

Feeds today

There are a host of basic, ecosystem-wide specification/protocol-level reasons why this is so.

First, not all feed readers make active use of that information even when it is available. A real-world example: Feedbin — long my preferred feed reader service — does correctly handle updates, even having a dedicated section in its sidebar for it; but NetNewsWire, my current go-to app for reading the feeds I subscribe to in Feedbin, does not. While NetNewsWire will update the text it displays for a given item, it provides no signal at all that the item has changed, and therefore that you might want to revisit it. Notably, NetNewsWire is normal here; Feedbin is the outlier.

This has a significant downside even for stream-type content. News publishers and blog authors alike regularly make meaningful edits to their content. Many a news story bears the stamp of changes — sometimes changes critical to key points in the story! — after publishing:

A previous version of this article said… It has been corrected to say… We apologize for the error.

Unless you happen to come back to the article, though, it is unlikely you will see that — and all the more so if you read it via a feed, because such changes are often not surfaced at all, still less highlighted.

Second, many publishing systems do not use updates meaningfully anyway. Well-behaved feed generators can update old items, including an “updated at” time stamp, but they do not have to, and not all do. (This is another reason that you are unlikely to see corrections appear in your news reader.) That concomitantly decreases the value of implementing support for handling updates, which in turn reinforces the tendency for feed reader applications to ignore them.

Third, the existing feed specifications all handle updates differently. RSS has a single pubDate value and Atom a single updated value, the idea in both cases apparently being that the distinction between when an item was first published and when it was most recently updated doesn’t matter. (The specs don’t say why, though, so that’s just my hypothesizing.) JSON Feed sensibly supplies both date_published and date_modified.

That means that a feed reader service or application like Feedbin or NetNewsWire needs to be a lot smarter, though. It cannot rely on publishers updating pubDate in RSS or updated in Atom; nor on having JSON Feed items, since they are still by far the minority; nor on those items correctly using date_modified, since both date_published and date_modified are optional. Net, the reader service has to keep track of previous versions itself using some kind of caching mechanism.

This further increases the cost of implementing handling for updates for feed readers, which again decreases the likelihood they will do so.
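To make that divergence concrete, here is a minimal sketch (in TypeScript) of the normalization a reader service would have to do. The FeedItem shape is a hypothetical amalgam of the three specs’ fields, not any real library’s API:

```typescript
// Hypothetical merged shape of an item's date fields across the three specs.
interface FeedItem {
  pubDate?: string;        // RSS 2.0
  updated?: string;        // Atom
  date_published?: string; // JSON Feed
  date_modified?: string;  // JSON Feed
}

// Best-effort "last modified" resolution: prefer JSON Feed's explicit
// modification date, then Atom's updated, then the publication dates.
function lastModified(item: FeedItem): Date | null {
  const candidates = [
    item.date_modified,
    item.updated,
    item.date_published,
    item.pubDate,
  ];
  for (const value of candidates) {
    if (value) {
      const parsed = new Date(value);
      if (!Number.isNaN(parsed.getTime())) return parsed;
    }
  }
  // Every field is optional in practice, so the fallback is exactly the
  // caching-and-comparison work described above.
  return null;
}
```

Even this much only tells a reader *when* something changed, not *whether* the change matters — which is the next problem.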

Fourth, the updates pushed out to a feed item are often trivial. While some changes are important, many are incidental: correcting a typo, updating the phrasing in a particular paragraph, etc. — the kinds of things which may help a bit with polish but don’t fundamentally affect the argument of a given post/essay/etc.

This is so notwithstanding the specifications’ own allowances for just this issue! The W3C description for the Atom <updated> tag makes this explicit:

This value need not change after a typo is fixed, only after a substantial modification.

However, tools have to support the distinction between “a typo” and “a substantial modification”. Most do not. Indeed, no publishing system I am aware of has any native mechanics for this distinction — no doubt in part because it is irrelevant to nearly all consumers, given reader apps’ general lack of support for the feature.

Fifth, there are no mechanics for specifying exactly what changed in a given update. As with the divergence between the specs, this means that feed readers have to do extra work to be able to provide their users with info about what actually changed. They need to cache the previous version of the story and then do some kind of reasonably useful word-by-word diff between the various revisions.
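That diffing work is itself non-trivial. As a deliberately naive sketch — nothing like the LCS-based algorithm a real `diff` uses — a reader might do something like this with a cached previous revision:

```typescript
// Naive word-level comparison of a cached revision against the current one.
// Reports words that appear in only one version; it ignores ordering and
// repetition, so it is illustrative only.
function wordDiff(
  previous: string,
  current: string
): { added: string[]; removed: string[] } {
  const before = new Set(previous.split(/\s+/).filter(Boolean));
  const after = new Set(current.split(/\s+/).filter(Boolean));
  return {
    added: [...after].filter((w) => !before.has(w)),
    removed: [...before].filter((w) => !after.has(w)),
  };
}
```

Every reader service that wants to surface changes has to build (and pay for) some version of this machinery independently, because the specs give them nothing.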

Finally, many consumers of feeds — a feed reader, a podcast app, IndieWeb tools built on feeds, and so on — implicitly or explicitly reject feeds which are too large. This is generally a simple practical consideration: If you run a service which consumes many feeds, the sheer bandwidth involved becomes prohibitively expensive. (The original RSS 1.0 spec also explicitly recommended a maximum of 15 items per feed, and its pre-1.0 versions mandated that. Although those decades-old specs are hardly binding today, norms like that change only slowly, if at all.) This website’s untruncated feed is over 2MB and it has only a few hundred entries.1 A more active or larger website would blow past those limits in a matter of months, if not weeks. Even if I wanted to use the limited ecosystem support for updates, in practice I cannot.

In sum, then, the current feed ecosystem — specifications and implementations alike — has only the barest support for “updates”. They feel bolted on, an afterthought. There is poor support for generating them, reading them, and even — perhaps most fundamentally — for transmitting them.

This systemic gap falls out — mostly implicitly — from the design of the specifications. RSS and JSON Feed both make their update fields optional, while Atom requires each <entry> to have an <updated> value; but all of them implicitly assume a temporal stream, and reading their specs makes this obvious.

We can start with RSS, the progenitor of all later feed specs. The name gives it away from the start: RSS is “really simple syndication” — syndication’s long history being about sharing news items to multiple publications. Thus, the spec describes its <item> tag like this:

A channel may contain any number of <item>s. An item may represent a “story” — much like a story in a newspaper or magazine; if so its description is a synopsis of the story, and the link points to the full story.

This does not mandate a stream, but it deeply implies one. We see the same in the Atom spec: it is likewise defined as The Atom Syndication Format: syndication again. As with RSS, the spec explicitly states its stream-oriented purpose:

The primary use case that Atom addresses is the syndication of Web content such as weblogs and news headlines to Web sites as well as directly to user agents.

No surprise, the JSON Feed spec has a similar blurb:

Think of a blog or microblog, Twitter or Facebook timeline, set of commits to a repository, or even a server log. These are all lists, and each could be described by a feed.

Notice that these are all — more or less explicitly — designed for temporal streams. This focus is perfectly reasonable; it was not a failing of the specifications’ authors but rather a success. Feeds do the job they were designed for, and do it well. But that job was syndication of streams, not invitations to come see how a garden has changed and grown.

This is one reason that even many blogs whose authors explicitly think of them as gardens are effectively write-only. Each entry is atomic — not (only) in the Zettelkasten sense that they represent just a single discrete idea, but also in the sense that they represent only a single point in time. That temporal atomicity can make a stream-style site useful for tracing the development of an author’s thought, if one is so inclined. (More on that below.) It necessarily means, though, that individual atoms are not sprouts, growing into more fully-formed versions of the thought themselves.

Hacks

Even assuming that the vision I outlined is appealing, it will take time for these ideas to percolate, time for specs to be written and implemented, time for readers to add better support. What might we do in the meantime? How can we hack better support in now, using the existing infrastructure? Some ideas:

Publish items which are just a collection of links to recently updated items in the garden. This approach has a number of things going for it. First, and most important, it is easy to “bolt onto” the existing feed ecosystem. Feed readers do not need to change anything. Publishing tools only need to add the ability to identify changed items and generate a list. For a traditional CMS with dynamic content, this is just a matter of noticing that an item already exists in the database and flagging it as a change accordingly. Notes-publishing tools like Obsidian Publish could integrate the same capability along similar lines.

The problem is slightly more complex for tooling built on static site generators (Jekyll, Hugo, Zola, Pelican, 11ty, etc.), in that they tend to be single-shot build systems — at most with a build cache — and require no particular versioning or deployment strategy. In practice, however, static site generators are very often used with version control systems. Scripting the generation of new “updates” sections is therefore possible, if not necessarily straightforward; the fact that it is a bit more fiddly is simply par for the course for static site generators.

Some degree of interactivity here would be helpful. Authors should be able to opt individual posts in or out of that list of updates, to avoid the “it was just a typo fix” updates. They should also be able to summarize the changes, or to customize how much of the content surrounding the change is included — even if there are good defaults — so that the published updates can be presented in a way that is most helpful to readers.
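The whole hack — changed items in, a single links-list entry out — can be sketched in a few lines. The GardenItem shape, the opt-out flag, and the field names on the result are my own hypothetical choices; the output is simply a plain feed item in the spirit of JSON Feed:

```typescript
// Hypothetical shape a publishing tool might track per changed garden item.
interface GardenItem {
  title: string;
  url: string;
  changeSummary: string;    // author-supplied summary of the update
  publishAsUpdate: boolean; // opt-out for typo-level changes
}

// Build a single "collection of links to recently updated items" entry,
// or null if nothing rose to the level of a publishable update.
function updatesEntry(items: GardenItem[], date: string) {
  const updated = items.filter((i) => i.publishAsUpdate);
  if (updated.length === 0) return null;
  const listItems = updated
    .map((i) => `<li><a href="${i.url}">${i.title}</a>: ${i.changeSummary}</li>`)
    .join("");
  return {
    id: `updates-${date}`, // must be unique so readers treat it as new
    title: `Updates: ${date}`,
    date_published: date,
    content_html: `<ul>${listItems}</ul>`,
  };
}
```

Because the result is an ordinary feed item, every existing reader displays it correctly with zero changes on the consuming side — which is the entire appeal of the hack.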

Split the feeds

Another useful move might be to split up a site’s feeds between “garden” and “stream” content. Stream entries might continue to be a simple queue of items, with some relatively low limit on the number of entries in the feed and the expectation that readers will be unlikely to be notified about changes after the fact. They could continue to be full text — or not, as makes sense for any particular publication. They should continue to use existing feed specifications!

Garden entries might be shaped quite differently, and there are any number of different approaches one could take. Here are two:

Embrace the summary

A garden’s feed could include only a summary of, and any recent changes to, each item. A garden feed would thus primarily serve as an up-to-date list of ideas “growing” in the garden. (Here, the incentive to either click through or actively ask for the full text would be a necessary workaround for the limitations of existing feed reader apps.) Embracing this split should allow the garden feed to include all the entries, for most kinds of sites: no matter how large any individual part of the garden grew, the reference to it in the feed would be a few kilobytes at most, allowing for a feed with hundreds of items in it while still fitting under the common 1MB limit.

Making this split would, for good and for ill, intensify the existing tendency to treat stream entries as one-and-done.

This approach could also work in tandem with the “Links to updates” approach suggested above. Indeed, it would likely be necessary: We will have to take it as a given, at least initially, that most readers will not make use of the information about any given item’s having been updated. Accordingly, it might make sense to keep all garden items in the list, for the sake of the rare reader apps which do support rendering updates, and also to publish update entries.

In this model, having many recent updates in the feed may not make sense. Instead, the feed could — potentially — publish only a single “recent updates” entry at a time (always with a unique ID), replacing it whenever updates are published. This would also help with keeping the summary small.

Supporting this split would also require new publishing infrastructure and tools. The challenge is not in splitting out garden content vs. stream content into different feeds: many existing CMSs already handle this correctly, and could be extended to support different rendering patterns for different kinds of feeds. (I could build this for this site’s feeds in 15 minutes or so, for example.) No, the work to be done is in enabling authors and publishers to describe their updates — easily.

Here is one potential flow:

I describe this flow in terms of a traditional database-backed CMS rather than a static site generator because doing this with a static site generator, while possible, is substantially more difficult!

On first publishing a new item to the garden, show a user interface asking for a summary. This could be distinct from the summary used for SEO purposes, or it could be used for both.

Under the hood, update the schema (database or otherwise) to keep track of a few additional pieces of information:

  • When the last “update” item was published. (For a newly-“planted” garden this would be “never”.)
  • For each item in the garden, track three facts about its most recent update, if any: the summary of the changes, the time the update was made, and whether the update has been published.
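The bookkeeping just described might look like the following as TypeScript types, with a small helper showing the typo-versus-real-update distinction this flow depends on. All names here are hypothetical; a real CMS would adapt them to its own storage:

```typescript
// Hypothetical schema additions for tracking garden updates.
interface ItemUpdateState {
  changeSummary: string | null; // author's summary of the most recent update
  updatedAt: string | null;     // when that update was made
  updatePublished: boolean;     // whether it has gone out in an "updates" entry
}

interface GardenSchema {
  // When the last "update" item was published; null for a new garden.
  lastUpdateEntryPublishedAt: string | null;
  items: Record<string, ItemUpdateState>;
}

// Record a change to an item. A null summary means the author marked this
// as a typo-level change: note the modification time for bookkeeping, but
// do not queue it for the next "updates" entry.
function recordUpdate(
  schema: GardenSchema,
  itemId: string,
  summary: string | null,
  at: string
): void {
  const prev = schema.items[itemId] ??
    { changeSummary: null, updatedAt: null, updatePublished: false };
  schema.items[itemId] = summary === null
    ? { ...prev, updatedAt: at }
    : { changeSummary: summary, updatedAt: at, updatePublished: false };
}
```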

On updating an existing item from the garden, present the changes and a prompt to summarize them. The presentation could be as minimal as the result of running diff on the two text sources, or it could be an elegant presentation of how it looks when rendered on the site, as makes sense for the software in question. (WordPress and Pelican should probably do different things here!) The prompt should also allow the author to decide that this change does not need to be treated as an “update” at all: correcting “nto” to “not” is very different from adding an entirely new section to an essay.

In the case where the author indicates that there is no update, the item itself might be marked as “last modified” for bookkeeping purposes, and a CMS might even use that for the modified field in an Atom feed, but the next “update” entry should not include it.

When there are updates which have not yet been published, and the author publishes other items, prompt to see if they want to publish the updates as well. Alternatively and/or complementarily, maintain a view of “unpublished updates” which the user can choose to include in a new update post. Once an item update has been published, mark it as having been published so that it is no longer included in future updates.

Here’s a hypothetical JSON Feed entry for an update like this (here I’m skipping the overall feed data; this would live in the items list):

{
  "id": "",
  "author": {
    "name": "Chris Krycho",
    "url": "https://v5.chriskrycho.com/"
  },
  "title": "Updates: March 29, 2023",
  "date_published": "2023-03-29T09:39:00.000-06:00",
  "content_html": "<ul><li><a href=\"https://v5.chriskrycho.com/essays/feeds-are-not-fit-for-gardening/\">Feeds are Not Fit for Gardening</a>: Finished drafting the section <a href=\"https://v5.chriskrycho.com/essays/feeds-are-not-fit-for-gardening/#embrace-the-summary\">Hacks: Split the Feed: Embrace the Summary</a>, with a proposal for how a <abbr>CMS</abbr> might handle one possible flow for updates while using the “hacks” mode proposed for retrofitting garden-type changes into a traditional feed.</li></ul>"
}

You could of course format this any number of ways, and include other useful information in it, like the previous time it had been updated. But at a minimum this makes it possible for a classic feed format to present garden-style updates with a minimum of additional infrastructure.

Embrace the updates

Garden feeds might instead choose to publish update summaries and the full content of each item, including changes made to it over time. Instead of just embracing the current ecosystem limitations, this approach pushes on them with the

  • TODO: distinguishing whether items should be updated (typos vs. not again) → does it update the item or not?
  • generating update lists

Batches of updates

Depending on how often a user publishes updates to a garden, they might want to be able to batch up their changes to publish a set of them all together. TODO

A New Kind of Feed

As useful as these kinds of hacks might be, though, the fact that we have to hack them in this way in the first place is suggestive. What would a protocol for updates which treats “gardens” as a top priority look like?

TODO: actually sketch out these changes!

Note that these changes are not the same as the set of changes which would make updates more useful for stream-like content, but they have some overlap. TODO: what overlap?

Should we even call this new thing a “feed”? Perhaps not. Right now, I am going with “a Garden”, and for the sake of nominal uniqueness (and maybe a bit of nostalgia), the implementations will be grdn.

TODO: keep going!

There will nonetheless also be a lot of commonalities with traditional feeds. After all, even with a garden, the point here is to provide a mechanism for readers to be notified of changes.

You can find here the current state of the protocol, as I am thinking of it so far (with basically zero formality).

At this point, I think this should be entirely transport-mechanism agnostic. I expect it mostly to be supplied as JSON and/or XML, the same as the existing feed protocols, at least initially, but as a matter of convenience rather than as a mandate. If you wanted to publish it with Protobuf or BSON instead, I suspect (many!) fewer clients would consume it, but I don’t see any reason this protocol should define anything other than the structure of the data.

Some of the biggest open questions to me at the moment:

  • How to avoid duplicating “changes” content in the garden while still providing both a useful top-level summary and lower-level detail?

  • Whether it makes any sense to distinguish link and id? Currently thinking “no” since [cool URIs don’t change][uri], but then one way you might use a “garden” would include changing your mind about where something ought to live. Maybe that’s fine, and the spec just needs a concept of a “redirected-to” in it.

  • The serialization structure this should take for different formats, since there are different affordances in (e.g.) JSON vs. XML — JSON requires you to go with full objects a lot more often; for example, to do the equivalent of XML’s <link rel='...' href='...' />, JSON requires link: { rel: "...", href: "..." } — and whether and how to encode this in the spec itself?

  • Stylistic considerations? Self-closing items with attributes vs. enclosed content in the XML output? Content types — HTML? Markdown? How are they indicated?

As of January 21, 2023, there are very basic Rust and TypeScript implementations of this available on npm and crates.io.3 The packages don’t do much yet: just parse and provide type definitions for this rough first draft. It’s a start, though!

TODO: elaborate, finish, etc.


Notes

  1. A few years ago, I had to start truncating the feeds from my own sites just to make them work with micro.blog, which has an unofficial 1MB limit on the size of the feeds it consumes. Notably, this is only for the current (v5) version of the site, which launched in November 2019. You can imagine how much larger the feed would be if it included the earlier iterations of the site. ↩︎

  2. Thanks to Stephen Carradini for this example. ↩︎

  3. I picked those two languages because they are my go-to languages at this point: I work all day every day in TS, and Rust remains my favorite. ↩︎