JP Vossen on 2 Dec 2018 17:53:19 -0800



Re: [PLUG] OT: MontCo Dispatch system web & RSS


On 11/27/18 9:59 PM, brent timothy saner wrote:
because you can never have too much python fun, i wrote a parser (see
attached). it currently just converts it to JSON, but if you know a
little python yourself i'm sure you can extend it into some fun. it's
more important that everyone learn how awesome XML is. ;)

it's obj-oriented - the parsed RSS XML is converted to a python dict in
a class attribute, so you can import it and instance it all you want, etc.

Brent, I finally got around to playing with that--very cool! In the same spirit of "fun", I grabbed some code from "6.3 Parsing Simple XML Data" on page 183 of _Python Cookbook 3_ and got this much simpler but less featureful version, which dumps the two fields that have the useful content.

----
#!/usr/bin/env python3
# montco_incidents1.py
# 2018-12-02: JP from "6.3 Parsing Simple XML Data" from page 183 of _Python Cookbook 3_

from urllib.request import urlopen
from xml.etree.ElementTree import parse

def main():
    # Download the RSS feed and parse it
    # (the with block closes the HTTP connection once parsing is done)
    with urlopen('https://webapp02.montcopa.org/eoc/cadinfo/livecadrss.asp') as u:
        doc = parse(u)

    # Extract and print tags of interest
    for item in doc.iterfind('channel/item'):
        title       = item.findtext('title')
        description = item.findtext('description')
        #link       = item.findtext('link')    # Always: http://www.montcopa.org/webcad
        #date       = item.findtext('pubDate') # Better date in desc.

        print(title, description)

if __name__ == '__main__':
    main()
----

As usual my MUA is mangling the lines a bit.
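
If you'd rather do the town filtering in Python than in a shell pipeline, something like the following might work. It is only a sketch: incidents_in() is a name I made up, the sample feed text is invented so the example runs offline, and it assumes the town name appears somewhere in the description text (a plain substring match, nothing smarter).

```python
from xml.etree.ElementTree import fromstring

# Invented sample data so this runs offline; the real feed's
# description layout may well differ.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>EMS: ABDOMINAL PAINS</title>
<description>EXAMPLE ST; COLLEGEVILLE; 2018-12-02 @ 17:40:21</description></item>
<item><title>Traffic: VEHICLE ACCIDENT -</title>
<description>EXAMPLE RD; WHITPAIN; 2018-12-02 @ 03:11:39</description></item>
</channel></rss>"""

def incidents_in(root, towns):
    """Yield (title, description) for items whose description mentions a town.

    `root` is the parsed <rss> element; `towns` is an iterable of
    upper-case names. This is a hypothetical helper, not part of any library.
    """
    for item in root.iterfind('channel/item'):
        title = item.findtext('title', default='')
        description = item.findtext('description', default='')
        # Simple substring test against the upper-cased description
        if any(town in description.upper() for town in towns):
            yield title, description

if __name__ == '__main__':
    root = fromstring(SAMPLE_RSS)
    for title, description in incidents_in(root, ['COLLEGEVILLE', 'TRAPPE']):
        print(title, description)
```

To use it against the live feed, pass doc.getroot() from the script above instead of the sample string.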

Then I really went nuts, because I've got the RSS feed in Liferea, which uses SQLite, so...

----
$ sqlite3 /path/to/liferea.db

### Get the key (node_id) for the subscription
sqlite> select * from node where title like 'Montco%';
node_id     parent_id   title       type        expanded    view_mode   sort_column  sort_reversed
----------  ----------  ----------  ----------  ----------  ----------  -----------  -------------
ctdhnix     fmmnrqj     MontCo PA   rss         0           3           0            1
pusppbl     gbhbemq     MontCo Inc  rss         0           3           0            1


### Get local stuff in my RSS DB:
$ sqlite3 /ssd/home/jp/liferea/liferea.db "select title,description from items where node_id = 'pusppbl'" | perl -pe 's!\|<div xmlns="http://www.w3.org/1999/xhtml"><p>!\t!; s!;</p></div>!!; s/&amp;/&/g;' | egrep 'COLLEGEVILLE|ROYERSFORD|TRAPPE|UPPER PROVIDENCE' | cat -n
...

### List the types in my DB
$ sqlite3 /ssd/home/jp/liferea/liferea.db "select title from items where node_id = 'pusppbl'" | sort -u | cat -n
     1	EMS: ABDOMINAL PAINS
     2	EMS: ALLERGIC REACTION
     3	EMS: ALTERED MENTAL STATUS
...
    40	Traffic: ROAD OBSTRUCTION -
    41	Traffic: VEHICLE ACCIDENT -
    42	Traffic: VEHICLE FIRE -


### List the cities in my DB
$ sqlite3 /ssd/home/jp/liferea/liferea.db "select description from items where node_id = 'pusppbl'" | cut -d';' -f3 | perl -pe 's/^\s+| +$//g;' | sort -u | cat -n
     1	2018-12-02 @ 03:11:39
     2	2018-12-02 @ 17:40:21-Station:STA8
     3	ABINGTON
     4	AMBLER
     5	BRIDGEPORT
     6	BUCKS COUNTY
...
    58	WHITEMARSH
    59	WHITPAIN
    60	WORCESTER
----
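
The same two listings can probably be done from Python with the standard sqlite3 module. This is a rough sketch, not a tested Liferea tool: the function names are mine, the in-memory table only has the three columns used in the queries above (the real items table has more), and the sample rows are invented. It assumes, like the cut command above, that the town is the third semicolon-separated field of the description.

```python
import sqlite3

def make_demo_db():
    """Build a throwaway in-memory DB with a cut-down, hypothetical items table."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE items (node_id TEXT, title TEXT, description TEXT)')
    conn.executemany('INSERT INTO items VALUES (?, ?, ?)', [
        # Invented rows; only the column names come from the queries above
        ('pusppbl', 'EMS: ABDOMINAL PAINS',
         '1 EXAMPLE ST; EXAMPLE AVE; AMBLER; 2018-12-02 @ 17:40:21'),
        ('pusppbl', 'EMS: ABDOMINAL PAINS',
         '2 EXAMPLE ST; EXAMPLE AVE; WORCESTER; 2018-12-02 @ 17:41:00'),
        ('pusppbl', 'Traffic: VEHICLE FIRE -',
         '3 EXAMPLE ST; EXAMPLE AVE; AMBLER; 2018-12-02 @ 17:42:00'),
        ('ctdhnix', 'Some other feed', 'a; b; ELSEWHERE; c'),
    ])
    return conn

def distinct_titles(conn, node_id):
    """Roughly: select title ... | sort -u"""
    rows = conn.execute(
        'SELECT DISTINCT title FROM items WHERE node_id = ? ORDER BY title',
        (node_id,))
    return [row[0] for row in rows]

def third_fields(conn, node_id):
    """Roughly: select description ... | cut -d';' -f3 | strip | sort -u"""
    fields = set()
    rows = conn.execute(
        'SELECT description FROM items WHERE node_id = ?', (node_id,))
    for (description,) in rows:
        parts = description.split(';')
        if len(parts) >= 3:  # skip rows with fewer than three fields
            fields.add(parts[2].strip())
    return sorted(fields)

if __name__ == '__main__':
    conn = make_demo_db()
    print(distinct_titles(conn, 'pusppbl'))
    print(third_fields(conn, 'pusppbl'))
```

To point it at a real database, replace make_demo_db() with sqlite3.connect('/path/to/liferea.db'). Using ? placeholders instead of pasting the node_id into the SQL string keeps the quoting simple.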

Pretty cool stuff. It might be a fun weekend project to wrap Flask or web2py around Brent's code to make a little searchable web GUI.

Later,
JP
--  -------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/