#dev | Logs for 2017-12-29
[00:29:57] <Bytram> TheMightyBuzzard: ping
[00:30:28] * Bytram is trying to figure out how you implemented the fix for whitespace-only comment / commentsubject
[00:32:16] * Bytram is thinking it is in Access.pm ??
[00:33:03] <Bytram> hrmmm: $content_slice =~ s/[\h\p{Cc}\p{Cf}/$nbsp_space/g;
[00:33:38] <TheMightyBuzzard> pong
[00:33:51] <Bytram> lol
[00:34:05] <TheMightyBuzzard> is that copy pasta?
[00:34:12] <Bytram> reading Perl regexps is like reading line noise
[00:34:13] <Bytram> yup
[00:34:16] <TheMightyBuzzard> if it is, there's a missing ]
[00:34:39] <Bytram> Access.pm line 618
[00:34:39] <TheMightyBuzzard> but since dev isn't croaking, i assume you're looking at an old copy
[00:34:50] * Bytram is looking at github
[00:34:56] <Bytram> https://github.com
[00:34:57] <upstart> ^ 03Nerf whitespace jiggery pokery via unicode · TheMightyBuzzard/rehash@7580d5b · GitHub
[00:35:30] <TheMightyBuzzard> yeah, you're looking at an old copy
[00:36:00] <Bytram> umm, no? that is in the green text, marked with '+'
[00:36:00] <TheMightyBuzzard> it's $content_slice =~ s/[\h\p{Cc}\p{Cf}]/$nbsp_space/g; in the current pr
[00:36:15] <Bytram> linky please?
[00:36:44] <TheMightyBuzzard> https://github.com
[00:36:46] <upstart> ^ 03Wobblywilly by TheMightyBuzzard · Pull Request #420 · SoylentNews/rehash · GitHub ( https://github.com )
[00:36:59] <Bytram> mare sea! clicky...
[00:37:53] <TheMightyBuzzard> \p{Cc} matches all control codes, \p{Cf} matches all formatting marks.
[00:38:22] <Bytram> was kinda guessing it was something like that from context and what I found on FileInfo
[00:38:23] <TheMightyBuzzard> \h matches all horizontal whitespace
[00:38:38] <Bytram> space, tab, and the like
[00:39:10] <TheMightyBuzzard> and nbsp, etc...
[00:39:23] <Bytram> okay
[00:40:17] * Bytram would love to see an enumeration of all chars that each of those match: \p{Cc} and \p{Cf}
[00:40:27] <TheMightyBuzzard> feel free to get chromas to help you test that. he's downright creative about being a nuisance.
[00:41:23] <TheMightyBuzzard> Bytram, have to read the perl source for that. i don't know except that they're supposed to be complete sets.
[00:41:23] * Bytram ain't no slouch either, but would prefer to not have the fox check the henhouse
[00:41:58] <TheMightyBuzzard> hey, he volunteered to be an ed. slap some feathers on him and make him lay eggs.
[00:42:29] <Bytram> yeah... I was thinking I just need something that creates each char from 0x0 through 0x1ffff as a var, and then I can compare if that char matches a pattern
[00:44:15] <Bytram> for (i=0; i<=0x1fffff; i++) c=sprintf("%c",i); if (c ~ /\p{Cc}/) print c + ' matches \p{Cc}'; end
[00:44:25] <Bytram> ^^^ something like that
[00:44:38] * Bytram yanks upper bound out of thin air
[00:45:13] <chromas> Don't forget to encode the number into utf-8
[00:45:23] <Bytram> chromas++
[00:45:23] <Bender> karma - chromas: 6
[00:46:53] <TheMightyBuzzard> Bytram, easier in perl.
[00:47:09] <TheMightyBuzzard> perl -E 'use utf8; foreach(0x0..0x1ffff){ say chr($_); }'
[00:47:16] <Bytram> agreed! Was using that as pseudocode to give the gist (Jist?) of what I was looking for.
[00:47:49] <Bytram> ohhhh....
[00:47:52] * Bytram smiles
[00:48:12] <TheMightyBuzzard> thas not complete but it's easy to finish
[00:48:52] * Bytram last programmed in perl, ummm, last century or thereabouts
[00:49:14] <TheMightyBuzzard> blarg. fine. i'll do it. it's gonna take a lot longer than normal though.
[00:50:02] <Bytram> I'm thinking if I had a file with one char per line...
[00:50:44] <chromas> Didn't you make something like that a while back?
[00:50:50] <Bytram> yes
[00:51:01] <chromas> Use a regex to find it :)
[00:51:44] <chromas> I think those character map tools would have a similar file too
[00:51:53] <chromas> But with a description for each char
[00:52:02] <Bytram> that's kinda what I am thinking, but I gots a windows box
[00:52:18] * Bytram has WAY too many windows open
[00:52:56] * chromas remembers the days of Hace Task Tamer
[00:53:15] <Bytram> Hace?
[00:53:22] <Bytram> #g Hace Task Tamer
[00:53:23] <MrPlow> https://www.facebook.com - "Task Tamers - Professional Organising Service, Brisbane, Queensland, Australia. 128 likes. Welcome to Task Tamers Professional Organising Service where..."
[00:53:37] * Bytram is still confused
[00:53:44] <chromas> the branding of the guy who made some Windows tools. I think the name was changed
[00:53:51] <TheMightyBuzzard> perl -E 'use utf8; foreach(0x0..0x1ffff){ my $chr = chr($_); if($chr =~ /[\h\p{Cc}\p{Cf}]/) { printf ("0x%x\n", $_); } }'
[00:53:55] <Bytram> oh. k. tx!
[00:54:43] * Bytram just closed a dozen cmd windows
[00:55:02] <chromas> It let you group stuff on your taskbar, before Windows had it.
[00:55:41] <TheMightyBuzzard> https://tmb.dedyn.io if you just want the output.
[00:56:14] <Bytram> charon: got it!
[00:56:19] <Bytram> TheMightyBuzzard: muchos cracias!
[00:56:36] <TheMightyBuzzard> see, sombrero
[00:56:43] <Bytram> ROFLMAO!
[00:57:04] <Bytram> so, those are the characters that matched... \p{Cc} ???
[00:57:38] <TheMightyBuzzard> matched one of \h \p{Cc} or \p{Cf}
[00:58:06] <TheMightyBuzzard> need em separated?
[00:58:32] <chromas> Make it output xml
[00:58:44] <Bytram> #smake chromas
[00:58:44] * MrPlow smakes chromas upside the head with a pick'n'flick
[00:58:58] <Bytram> hmmm,
[00:59:21] <Bytram> you said horizontal whitespace... what about vertical whitespace chars?
[00:59:22] <TheMightyBuzzard> https://tmb.dedyn.io https://tmb.dedyn.io https://tmb.dedyn.io
[00:59:39] <Bytram> brilliant!
[01:00:18] <TheMightyBuzzard> they're not counted in that particular algorithm because reasons i can't remember right now
[01:00:27] <Bytram> nod nod
[01:00:40] <Bytram> prolly hard to paste a vertical tab into a text-field
[01:01:02] <TheMightyBuzzard> they ARE later down on 2148 though
[01:02:31] <TheMightyBuzzard> cmn32480, you should make chromas do stuff. he foolishly admitted he might be willing to correct typos, fix links, and 2nd stories.
[01:03:14] <TheMightyBuzzard> oh, that's why. we're checking line by line in that part of the algorithm.
[01:03:30] <TheMightyBuzzard> wait, no, slice by slice
[01:04:05] <Bytram> also linefeed char 0xa would also be considered vertical whitespace as well as the vertical tab 0xb
[01:04:29] * Bytram would like a slice of chocolate cake
[01:04:39] <TheMightyBuzzard> if we need to add it, we'll add it. it's just a \v
[01:05:10] <chromas> Bytram.eat(cake["chocolate"][]);
[01:05:18] <Bytram> both of em are matched in \p{Cc}
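A rough cross-check of the vertical-whitespace point, sketched in Python rather than the Perl used in the channel. It assumes the seven code points that Perl documents for \v and looks up each one's general category with Python's unicodedata module, whose Unicode version may differ from the one Perl ships. The names VERTICAL_WHITESPACE and not_control are made up for this illustration.

```python
import unicodedata

# Assumption: the set Perl documents for \v (LF, VT, FF, CR, NEL, LS, PS).
VERTICAL_WHITESPACE = [0x0A, 0x0B, 0x0C, 0x0D, 0x85, 0x2028, 0x2029]


def not_control(code_points):
    """Return the code points whose general category is not Cc."""
    return [cp for cp in code_points
            if unicodedata.category(chr(cp)) != "Cc"]


# Show the general category of every vertical-whitespace character.
for cp in VERTICAL_WHITESPACE:
    print(f"0x{cp:x}\t{unicodedata.category(chr(cp))}")

# Whatever is left here is what adding \v to the class would newly catch.
extra = not_control(VERTICAL_WHITESPACE)
print("not covered by Cc:", [f"0x{cp:x}" for cp in extra])
```

If that holds, 0xa and 0xb are indeed already covered by \p{Cc}, and adding \v to the character class would only pick up the line separator (0x2028) and paragraph separator (0x2029).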
[01:05:22] * TheMightyBuzzard is annoyed that 0x7 doesn't still make things go bing
[01:05:33] <chromas> They go google now
[01:05:39] * Bytram notes it does still work in a windows CMD.exe window
[01:05:46] <TheMightyBuzzard> #yt the machine that goes bing
[01:05:46] <MrPlow> https://www.youtube.com
[01:07:54] <chromas> You just need to hack your terminal. Mine pops a notification and optionally plays a noise for the bel char
[01:08:58] <Bytram> TheMightyBuzzard: noice! You gots the ‎ and ‏ chars, too!
[01:09:22] <TheMightyBuzzard> oh? i must look into this.
[01:10:07] <Bytram> that came about when I was first testing UTF-8 implementation...
[01:10:34] * Bytram had forgotten he tested that.
[01:11:09] <Bytram> well, if those pattern-matches you displayed are what the code actually checks, then I think you covered all the bases... BUT...
[01:11:33] <Bytram> need to check a few of 'em on the site itself, just to confirm, yanno?
[01:17:24] <TheMightyBuzzard> yar. in theory, theory and practice are the same. in practice, they're not.
[01:17:49] <Bytram> yuppers on that one!
[01:18:05] * Bytram needs to make some dinner... back in a bit
[01:18:32] * Bytram has already downloaded the pattern-matched files to his PC
[02:03:29] <Bytram> wrt pattern-matching in perl and regexp's that match various Control and other patterns: http://www.regular-expressions.info
[02:03:30] <exec> └─ 13Regex Tutorial - Unicode Characters and Properties
[12:19:41] <Bytram> TheMightyBuzzard: yo! Any chance you could post your perl scripts (for enumerating \p{Cf} \p{Cc} etc.) here for posterity? (and later regression testing)
[12:19:48] <Bytram> oh
[12:19:50] <Bytram> ~gday TheMightyBuzzard
[12:19:52] * exec allegedly pesters an overflowing treasure chest of feminists with TheMightyBuzzard
[12:20:06] <TheMightyBuzzard> not a script, just a one-liner
[12:20:29] <TheMightyBuzzard> perl -E 'use utf8; foreach(0x0..0x1ffff){ my $chr = chr($_); if($chr =~ /[\h\p{Cc}\p{Cf}]/) { printf ("0x%x\n", $_); } }'
[12:20:46] <Bytram> muchos graciass!
[12:20:49] <TheMightyBuzzard> remove the ones not desired from within the []
[12:20:55] <TheMightyBuzzard> or add \v
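For the regression-testing idea, a minimal sketch of the same enumeration split per class, written in Python as an independent cross-check rather than as a copy of the Perl one-liner. It relies on Python's unicodedata general categories instead of Perl's regex engine, so results for Cf can vary with the Unicode version bundled in each. It also scans up to 0x10FFFF, the last valid code point, because some format characters (the tag characters around 0xE0001) sit above the 0x1ffff bound used in the one-liner. The function name code_points_in_category is hypothetical.

```python
import unicodedata

MAX_CODE_POINT = 0x10FFFF  # last valid Unicode code point


def code_points_in_category(category, upper=MAX_CODE_POINT):
    """Return every code point up to `upper` with the given general category."""
    return [cp for cp in range(upper + 1)
            if unicodedata.category(chr(cp)) == category]


control = code_points_in_category("Cc")  # control codes
fmt = code_points_in_category("Cf")      # formatting marks

print(f"Cc: {len(control)} code points")
print(f"Cf: {len(fmt)} code points")

# One line per format character, in the same 0x... style as the one-liner,
# with the character name added where the database has one.
for cp in fmt:
    print(f"0x{cp:x}\t{unicodedata.name(chr(cp), '<unnamed>')}")
```

Saving the output to a file and diffing it against the Perl output for the matching class would flag any disagreement between the two. Horizontal whitespace (\h) is left out here because it is not a single general category.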
[12:21:07] <Bytram> nod nod
[12:21:37] <TheMightyBuzzard> needa smoke, back in 10
[12:21:38] <Bytram> feeling any better?
[12:23:12] * Bytram certainly hopes so, as he fetches another tissue and blows his runny nose while giving 'thanks' to a co-worker who came to work sick.
[12:33:37] <TheMightyBuzzard> naw. only on day three since the first tickle in my nose.
[12:33:46] * Bytram notes she is a single mom and has used up all her vacation time for the year.
[12:33:57] <Bytram> nod nod
[12:34:02] <Bytram> hope it passes quickly!
[12:34:11] <TheMightyBuzzard> shit, i got distracted and forgot to go smoke.
[12:34:21] * Bytram plans to pick up two quarts of extra-spicy hot-n-sour soup on his way home from work tonight.
[12:42:54] <TheMightyBuzzard> i cheated and picked up three frozen packs of general tso's chicken from the store since i knew spicy was going to be an extra happy thing for a bit. also so i don't have to think much about dinner for a while.
[12:49:11] <Bytram> I would, but the Chinese food place I frequent is only a couple hundred yards from where I work and has authentic Chinese food... not just the pupu platter, steak teriyaki, americanized stuff.
[12:58:39] <Bytram> TheMightyBuzzard: how hard would it be to count the number of hits on stories for this year? (Since 2017-01-01)??
[12:59:51] <Bytram> well, I can see kinda how to do it for stories...
[13:00:31] <Bytram> select sum(hits) from stories where time >= "2017-01-01" ;
[13:00:42] <Bytram> ==> 9189228
[13:01:38] <Bytram> iow nvm
[13:02:45] <TheMightyBuzzard> unpossible really. caching makes that highly inaccurate.