[08:16:50] <chromas> #tell TheMightyBuzzard it's the ~= isn't it? in D, that's append instead of regex or whatever it is in perl
[08:22:34] <chromas> The entry point is acquire_article, which loads the page with curl and parses it into a tree with my little toy parser (in a different module). It then deletes a few nodes, searches for the longest contiguous text chunk, and goes up the tree to find its <div> or <article> (grand)parent and calls that the article
[08:24:21] <chromas> Some of the other article extractors like boilerpipe and newscat (and even the Reader mode in Firefox (which is just a chunk of javascript)) measure some statistics to pick out the paragraphs
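The heuristic chromas describes above (drop a few nodes, find the longest contiguous text chunk, then climb to the nearest <div> or <article> ancestor) can be sketched roughly as follows. This is only an illustration under stated assumptions, not chromas's actual code, which is not shown in the log. It uses Python's standard html.parser in place of the toy parser, takes the HTML as a string instead of fetching it with curl, treats a "chunk" as a single text node, and guesses which tags count as noise. The names Node, TreeBuilder, SKIP, extract_container and text_of are all hypothetical.

```python
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "meta", "link", "input"}   # tags with no closing tag
SKIP = {"script", "style", "nav", "footer"}           # assumed "noise" tags to ignore


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # mixed list of Node and str, in document order


class TreeBuilder(HTMLParser):
    """Builds a very small DOM-like tree; tolerant of sloppy markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.cur)
        self.cur.children.append(node)
        if tag not in VOID:
            self.cur = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag name, if any.
        n = self.cur
        while n is not self.root and n.tag != tag:
            n = n.parent
        if n is not self.root:
            self.cur = n.parent

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text:
            self.cur.children.append(text)


def longest_text(node, best=None):
    """Return (owner_node, text) for the longest text node outside SKIP tags."""
    if node.tag in SKIP:
        return best
    for child in node.children:
        if isinstance(child, str):
            if best is None or len(child) > len(best[1]):
                best = (node, child)
        else:
            best = longest_text(child, best)
    return best


def extract_container(html):
    """Find the <div>/<article> ancestor of the longest text chunk."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    best = longest_text(builder.root)
    if best is None:
        return None
    node = best[0]
    # Falls back to the root if no <div> or <article> ancestor exists.
    while node.parent is not None and node.tag not in ("div", "article"):
        node = node.parent
    return node


def text_of(node):
    """Flatten a subtree to plain text, skipping the noise tags."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        elif child.tag not in SKIP:
            parts.append(text_of(child))
    return " ".join(p for p in parts if p)
```

Calling extract_container(page_html) returns the chosen container node (or None for a page with no text), and text_of(node) flattens it to plain text. Picking by raw length alone is the simplest possible rule; the statistics-based extractors mentioned above use more than that to pick out paragraphs.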
[08:51:42] <chromas> =submit https://www.reuters.com +https://www.theguardian.com/science/2018/oct/31/stephen-hawking-phd-thesis-and-wheelchair-to-sell-in-online-auction-christies +http://go.theregister.com/feed/www.theregister.co.uk/2018/10/24/stephen_hawking_auction/
[08:59:55] <chromas> =submit Enable JS for maximum security: https://9to5google.com +https://venturebeat.com/2018/10/31/google-beefs-up-account-security-with-new-step-by-step-checkup-notifications-and-javascript-requirement/ +https://www.zdnet.com/article/google-wont-let-you-sign-in-if-you-disabled-javascript-in-your-browser/
[09:00:20] <chromas> Damn title branding
[10:27:13] <TheMightyBuzzard> chromas, s'not that. i just didn't have enough coffee in me to even read my own code when i said something.
[15:27:24] <chromas> =g'day exec
[15:27:30] <chromas> ~g'day upstart
[15:28:16] <chromas> ~g'day >adverb socialistically
[15:28:31] <chromas> ~g'day >adverb communistically
[15:32:23] <chromas> #smakeadd IBM System/D
[16:47:19] <chromas> =submit https://phys.org
[16:58:02] <chromas> =submit http://theconversation.com
