brent timothy saner on 3 Dec 2018 09:52:29 -0800



Re: [PLUG] OT: MontCo Dispatch system web & RSS


On 12/3/18 12:18 PM, JP Vossen wrote:
> Yes, I should have mentioned that too.  The RSS XML has TTL = "2" which
> I assume is 2 hours, but it does go very fast.  My RSS reader is now
> checking every 30 mins and I've got 175 records since 9:05 last night.
> 

the "ttl" element is in minutes, not hours, and it actually indicates how
long the *channel* should be cached for:

"ttl stands for time to live. It's a number of minutes that indicates
how long a channel can be cached before refreshing from the source. More
info here."[0]
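
a minimal sketch of how a client might honor that, assuming a plain RSS 2.0
document. the function name cache_expiry, the default_minutes fallback and
the sample feed are all made up for illustration; they aren't from the
MontCo feed or from any particular reader.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

def cache_expiry(feed_xml, fetched_at, default_minutes=60):
    """Return when a cached copy of the channel goes stale.

    <ttl> is a number of MINUTES. default_minutes is a hypothetical
    fallback for feeds that carry no usable <ttl> at all.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    ttl_text = channel.findtext("ttl") if channel is not None else None
    if ttl_text and ttl_text.strip().isdigit():
        minutes = int(ttl_text.strip())
    else:
        minutes = default_minutes
    return fetched_at + timedelta(minutes=minutes)

# made-up sample channel with the same ttl value mentioned above
SAMPLE = """<rss version="2.0"><channel>
<title>example</title><ttl>2</ttl>
</channel></rss>"""

fetched = datetime(2018, 12, 3, 9, 0, 0)
expiry = cache_expiry(SAMPLE, fetched)
```

with ttl = 2, a copy fetched at 09:00 is considered stale at 09:02, so a
reader polling every 30 minutes is well past the cache window each time.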

i think they're either:

1.) manually culling old entries from the feed, or (more likely)

2.) dynamically serving (or periodically generating) the feed, and
selecting either a range of time (where entries older than "foo"
wouldn't be collated into the generated feed) or a number (where no more
than X previous entries will be included).

which makes sense; if i were running a feed that had 5-20(?) new entries
a day, i'd probably be using method #2 above.
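
a rough sketch of the two selection strategies in method #2, purely as a
guess at how a generator could do it (i have no idea what the dispatch
system actually runs). entries here are plain dicts with hypothetical
"title" and "pub_date" keys.

```python
from datetime import datetime, timedelta

def window_by_age(entries, now, max_age):
    """Time-range cutoff: keep entries no older than max_age."""
    return [e for e in entries if now - e["pub_date"] <= max_age]

def window_by_count(entries, limit):
    """Count cutoff: keep only the `limit` newest entries, newest first."""
    newest_first = sorted(entries, key=lambda e: e["pub_date"], reverse=True)
    return newest_first[:limit]
```

either way, old items silently fall off the end of the feed, which is why
a client that wants history has to keep its own copy.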

> Note any persistent local storage with one of those scripts would need
> to deal with duplicates when you run the script every N period of time.
> It looks like the "description" field could be used for a unique key to
> handle that.
> 
> Later,
> JP

so you'd actually have to hash against a combination of the title and
pubDate, or yeah- the description (but that makes it hard to sort).
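
something like this is what i mean by hashing title + pubDate. it's only a
sketch: item_key, add_new and the dict-based store are hypothetical names,
sha256 is just one reasonable choice, and items are assumed to already be
parsed into dicts with "title" and "pubDate" strings.

```python
import hashlib

def item_key(title, pub_date):
    """Build a dedup key from title + pubDate."""
    # NUL separator so ("ab", "c") and ("a", "bc") hash differently
    raw = f"{title}\x00{pub_date}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()

def add_new(store, items):
    """Add unseen items to store (a dict keyed by hash); return the new ones."""
    added = []
    for item in items:
        key = item_key(item.get("title", ""), item.get("pubDate", ""))
        if key not in store:
            store[key] = item
            added.append(item)
    return added
```

running add_new() again with the same items adds nothing, so the script can
be re-run every N minutes without piling up duplicates. the original
pubDate stays in the stored item, so sorting is still possible.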

but here's the kicker. according to the RSS spec[1], there's NOTHING that
says two entries MUST be unique, because RSS <item>s are simply serialized
sequentially (or, technically, once the logic is applied by
parsers/clients, reverse-sequentially). you don't even need to have a
<title> in an item. every subelement of an item is optional; the only
requirement is that at least one of <title> or <description> is present.
IDEALLY, they'd specify a <guid> subelement for each entry, which would
solve this entire issue.
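
a sketch of what a defensive client could do about that: use <guid> when
the feed provides one, and only fall back to hashing other fields when it
doesn't. item_id and unique_items are hypothetical helpers, and the
fallback field list is my own assumption, not anything the spec mandates.

```python
import hashlib
import xml.etree.ElementTree as ET

def item_id(item):
    """Prefer <guid>; otherwise hash whichever optional fields exist."""
    guid = item.findtext("guid")
    if guid and guid.strip():
        return "guid:" + guid.strip()
    # every one of these may be missing, so default each to ""
    parts = [item.findtext(tag) or "" for tag in ("title", "pubDate", "description")]
    digest = hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()
    return "hash:" + digest

def unique_items(feed_xml):
    """Return <item> elements in feed order, dropping repeated ids."""
    seen = set()
    unique = []
    for item in ET.fromstring(feed_xml).iter("item"):
        key = item_id(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

note the fallback can't tell apart two genuinely distinct items that
happen to share title, pubDate and description; that's exactly the gap a
<guid> would close.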

thankfully, the (Apple) podcast[2] and (Google) podcast/"play"cast[3]
specs put MUCH stronger emphasis on the <guid> subelement being present
and unique for EACH <item>, and it's become a de facto "best practice" as
most clients rely on this subelement. (for Sysadministrivia, i use a
literal checksum hash[4] of the file referenced).
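
for the curious, deriving a guid from a file checksum can be as simple as
the sketch below. file_guid is a hypothetical helper and sha256 is an
assumption made for the example; it isn't necessarily the algorithm or the
code behind the feed in [4].

```python
import hashlib

def file_guid(path, chunk_size=65536):
    """Hex checksum of a file, read in chunks so big media files are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

the nice property is that the guid only changes if the referenced file
itself changes, regardless of title edits or URL moves.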


[0] https://cyber.harvard.edu/rss/rss.html
    https://cyber.harvard.edu/rss/rss.html#ltttlgtSubelementOfLtchannelgt

[1] https://cyber.harvard.edu/rss/rss.html#hrelementsOfLtitemgt

[2] https://help.apple.com/itc/podcasts_connect/#/itcb54353390

[3] https://support.google.com/googleplay/podcasts/answer/6260341#series

[4] view-source:https://sysadministrivia.com/podcast

NOTE: alternate link for above rss spec is:
      https://validator.w3.org/feed/docs/rss2.html


___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug