#dev | Logs for 2018-12-03
[02:30:47] <takyon> my extension can make changes to the story the moment you click edit and it loads the textarea containing the story contents. mostly spacing changes. abbreviations could be handled at the same time. although we might want to click a button or have a setting that disables it from happening automatically
[02:31:13] <takyon> I want to clean up my extension first before we decide to mash these two together though
[02:52:11] <Deucalion> I'm beginning to think takyon is an Ethanol-Fueled alter ego... always bringing their extension into the conversation.... /joke off
[03:24:06] <TheMightyBuzzard> woah, #dev got talked in today!
[03:25:50] <TheMightyBuzzard> #smake chromas
[03:25:50] * MrPlow smakes chromas upside the head with a pick'n'flick
[03:25:55] <TheMightyBuzzard> no javascript!
[03:27:19] <TheMightyBuzzard> fyngyrz, probably not all that difficult but it'd be lighter code if we just beat the editors until they started using abbr tags.
[03:28:42] <TheMightyBuzzard> we run plenty of filters already on output. this would be a heavy one since, done the way you'd expect, it would have to go through the story hundreds of times.
[03:29:18] <TheMightyBuzzard> i might find a quicker way to handle it though and stories aren't all that big that a few hundred passes would be much of a problem.
[05:33:30] <fyngyrz> TheMightyBuzzard, it would be best if the abbr work was done when the front page was displayed. The data's already been fetched, and all it takes is a quick scan of the text against a dictionary. The code is simple and very fast, and the dictionary would be loaded once per entire multi-TFS page display. We're talking fractions of a second here.
[05:33:50] <fyngyrz> the reason it is best is because the dictionary can be changed post-story
[05:34:01] <fyngyrz> that adds abbr's, and can update existing ones
[05:34:51] <fyngyrz> and the point of it is, it's much easier on the editors when they put the story together if they know these terms are going to be explained. It should show that on the edit preview as well, of course
[05:37:59] <fyngyrz> basically, the page loads the dictionary from file; then the story body is scanned for caps-or-caps/numbers sequences, skipping ahead through the inside of any HTML tags; they are captured, used as keys to a dictionary, and then the replacement is done. That's just about all there is to it. In Python, without being fancy, it's 35 lines of clean code.
[05:38:40] <fyngyrz> I do the replacement into a new string so that it's built from the original and the new tags are appended instead of the original content. Very quick.
[05:39:26] <fyngyrz> Quite likely it'd be even faster in Perl, too :)
[05:40:23] <fyngyrz> it's also just one pass
[05:40:28] <fyngyrz> not hundreds of them
[05:41:16] <fyngyrz> take a look at my code, here: https://github.com
[05:41:33] <fyngyrz> presently at line 245, the method makeacros(text)
[05:43:28] <fyngyrz> BTW, in Python, u'' means a unicode string
[05:43:40] <fyngyrz> this is fed unicode, and returns unicode
[05:45:09] <fyngyrz> TTYT
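Editor's sketch of the single pass fyngyrz describes above: scan once, copy HTML tags through untouched, capture runs of capitals and digits, look each one up in a dictionary, and build the result as a new string. This is not the makeacros code referred to in the log (that code is not shown here). The function name `make_acros`, the `defs` argument, and the sample definitions are assumptions made for illustration only.

```python
import html


def make_acros(text, defs):
    """Wrap known all-caps terms in <abbr> tags in one left-to-right pass.

    `defs` is a hypothetical dict mapping a term such as "ATM" to its
    expansion text. Unknown terms are copied through unchanged.
    """
    out = []  # pieces of the new string, joined once at the end
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == '<':
            # Copy the whole tag untouched so attribute values are never rewritten.
            end = text.find('>', i)
            end = n if end == -1 else end + 1
            out.append(text[i:end])
            i = end
        elif ch.isupper():
            # Capture a run of capitals and digits that starts with a capital.
            j = i
            while j < n and (text[j].isupper() or text[j].isdigit()):
                j += 1
            term = text[i:j]
            meaning = defs.get(term)  # one hashed lookup per candidate
            if meaning is None:
                out.append(term)
            else:
                title = html.escape(meaning, quote=True)
                out.append('<abbr title="%s">%s</abbr>' % (title, term))
            i = j
        else:
            out.append(ch)
            i += 1
    return ''.join(out)


if __name__ == '__main__':
    sample_defs = {'NSA': 'National Security Agency'}
    print(make_acros('The NSA said so.', sample_defs))
```

The sketch does not check whether a term already sits inside an existing abbr element, and it would also match the "ATM" in "ATMs"; both would need handling in real use.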
[12:24:25] <TheMightyBuzzard> fyngyrz, passing TFA once checking every word against 400 acronyms or passing it 400 times checking against one makes no difference in speed. optimizing to only check upper-case (with the possibility of numbers as well) words should help but i couldn't say how much.
[12:25:46] <TheMightyBuzzard> as for individual speed, you have to consider that we do something similar dozens of times already for various other reasons. it adds up, so you try to keep each instance as tight as possible.
[12:27:27] <TheMightyBuzzard> you'd also need to check it wasn't already between some abbr tags.
[12:31:17] <TheMightyBuzzard> that could be done in the same regex as matching the acronym but it would slow down the matching a little bit
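Editor's sketch of the "same regex" idea: one alternation that first consumes any existing abbr element or other tag, and only then tries to match a candidate acronym, so hand-written abbr tags are left alone. It is written in Python rather than the site's Perl, and the names `PATTERN`, `expand`, and `defs` are assumptions, not anything from the site's code.

```python
import html
import re

PATTERN = re.compile(
    r'(<abbr\b[^>]*>[\s\S]*?</abbr\s*>)'  # group 1: existing abbr element, kept as-is
    r'|(<[^>]*>)'                          # group 2: any other tag, kept as-is
    r'|\b([A-Z][A-Z0-9]+)\b'               # group 3: candidate acronym (2+ characters)
)


def expand(text, defs):
    """Wrap known acronyms in <abbr>, skipping tags and existing abbr elements."""
    def repl(match):
        term = match.group(3)
        if term is None or term not in defs:
            return match.group(0)  # tag, existing abbr, or unknown term
        title = html.escape(defs[term], quote=True)
        return '<abbr title="%s">%s</abbr>' % (title, term)
    return PATTERN.sub(repl, text)


if __name__ == '__main__':
    story = '<abbr title="At The Moment">ATM</abbr> I use the ATM.'
    print(expand(story, {'ATM': 'Automated Teller Machine'}))
```

Because elements it has already inserted are matched by the first alternative, running `expand` twice on the same text gives the same result as running it once.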
[12:56:36] <Bytram> fyngyrz: I now have a better idea of what you are proposing, now that I've seen it at work in a recent story, and used that technique myself in the SpaceX two launches in ~24h story.
[12:57:57] <Bytram> As tempting as it is to do this after-the-fact, I am mindful that TLA --> expansion is a one-to-many mapping (example: does TLA == "Three Letter Acronym"? or "Three Letter Agency"?)
[13:00:01] <Bytram> So, as part of the initial construction of a story, providing a means to make it easier to replace a given abbreviation with the <abbr title="text">ABBR</abbr> would be preferable. At least in my eyes.
[13:00:59] <Bytram> It's still early in the day for me and I'm a bit under-caffeinated, so there is likely to be more, later. But those are my current thoughts on the idea. Definitely sounds like it has potential in some form or another!
[15:40:23] <fyngyrz> Bytram, The way I handle multiple defs is TLA,,Three Letter Agency; Three Letter Acronym.
[15:40:36] <fyngyrz> So the defs remain current and appropriate.
[15:42:20] <fyngyrz> TheMightyBuzzard, you don't check each word against 400 acronyms. You do one access to a dictionary. In Python, at least, this is fast, because dictionaries are hash-mapped, not searched entry by entry.
[15:43:16] <fyngyrz> Bytram, also, if done after the fact, defs can not only be expanded, but improved, edited, etc.
[15:44:41] <fyngyrz> TheMightyBuzzard, to use this, you'd never use abbr tags when editing. That shouldn't be a problem, considering that previous to this, they were rarely, if ever, used.
[15:46:41] <fyngyrz> Again, my code should serve as an example: one linear pass through TFA, dictionary based access when cap sequences are hit. It's very fast and very efficient. I leave it to you, of course, just know that it's easy, quick and efficient.
[15:54:40] <fyngyrz> Considering the really bad Db performance, I would definitely not put this in the Db until/unless the Db organization can be fixed. Just an editable file.
[16:15:09] <Bytram> fyngyrz: so, umm, if I have a story that contains "KLOC" and, after the fact, the story is loaded, what would the abbr tag look like? cf: https://www.abbreviations.com
[16:16:05] <fyngyrz> You'd start with the one that is referenced in the story. So if it's "Thousands of lines of code", then the def is:
[16:16:17] <fyngyrz> KLOC,,Thousands of Lines Of Code
[16:16:51] <fyngyrz> Odds are excellent that's the only def relevant to SN; however...
[16:16:59] <Bytram> https://www.abbreviations.com
[16:17:25] <fyngyrz> say you had a story later that had the kids thing; then the listing is updated to:
[16:17:49] <fyngyrz> KLOC,,Thousands of Lines Of Code; Kids For Landcare Classroom
[16:18:37] <fyngyrz> Same answer. Def is what you need, no more than that, and added to as what you need changes
[16:19:40] <fyngyrz> current TLA is:
[16:19:40] <fyngyrz> TLA,,Three Letter Agency (FBI, CIA, NSA, etc.)
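Editor's sketch of reading the defs format shown above ("TERM,,meaning; meaning") into a dictionary, and of appending a further meaning later, as in the KLOC example. Only the line format comes from the log; the function names `parse_defs` and `add_meaning` are assumptions, and nothing here is taken from the actual defs file or its loader.

```python
def parse_defs(lines):
    """Build {term: meanings} from lines shaped like 'TLA,,Three Letter Agency'."""
    defs = {}
    for raw in lines:
        line = raw.strip()
        if not line or ',,' not in line:
            continue  # skip blank or malformed lines
        # Split on the first ',,' only, so commas inside a meaning survive.
        term, meanings = line.split(',,', 1)
        defs[term.strip()] = meanings.strip()
    return defs


def add_meaning(defs, term, meaning):
    """Append a meaning to a term's '; '-separated list unless already present."""
    existing = [m.strip() for m in defs.get(term, '').split(';') if m.strip()]
    if meaning not in existing:
        existing.append(meaning)
    defs[term] = '; '.join(existing)


if __name__ == '__main__':
    defs = parse_defs([
        'KLOC,,Thousands of Lines Of Code',
        'TLA,,Three Letter Agency (FBI, CIA, NSA, etc.)',
    ])
    add_meaning(defs, 'KLOC', 'Kids For Landcare Classroom')
    print(defs['KLOC'])
```

Keeping all meanings in one string matches the way they end up in a single title attribute, at the cost of not being able to choose one meaning per occurrence.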
[16:20:27] <Bytram> don't get me wrong, I *like* the idea of abbreviation expansion in a story. My concern lies with how to make the *correct* decision for which expansion to instantiate for each abbreviation *after* a story has gone live.
[16:20:41] <Bytram> here's a better example...
[16:20:48] <Bytram> https://www.abbreviations.com
[16:20:57] <Bytram> At The Moment
[16:21:05] <fyngyrz> The thing to consider here is that what we would actually need is only things that are relevant to what we have stories about. So 99% of the defs on places like abbreviations.com are irrelevant and shouldn't be there
[16:21:12] <Bytram> Asynchronous Transfer Mode
[16:21:18] <Bytram> Automated Teller Machine
[16:21:39] <fyngyrz> Yep. Well, same answer. We do what is relevant to what we write, ignore the rest
[16:22:00] <fyngyrz> if we never talk about asynch transfer mode, there's no reason for it to be in the def
[16:22:11] * Bytram curses his connection that keeps going up and down.
[16:22:32] <fyngyrz> I built the current file out of acronyms people were actually using on the site (and slashdot, and reddit)
[16:22:49] <Bytram> good sources, yes.
[16:22:56] <fyngyrz> it's very reasonable so far, and I've no reason to think it would not continue to be that way
[16:23:13] <Bytram> BTW, that example using ATM came from a comment that I recall seeing on /.
[16:23:26] <fyngyrz> I did some testing with large defs, and both Chrome and Safari seem to handle them very gracefully
[16:23:52] <Bytram> instantiate it in the story and have it remain static in the DB as the story text... I would have absolutely no issues with that.
[16:24:56] <fyngyrz> well, you and the coders would work that out. My opinion on it is on record pretty clearly, and that's the only oar I have to put in the water.
[16:25:20] <Bytram> doing something after the story has been saved in the DB, and instantiating it on the fly... it will *probably* work *most* of the time, but I've seen way too many collisions on abbreviations over my years... gives me a MAJOR case of the heebie-jeebies.
[16:26:14] <fyngyrz> this does NOT collide. This provides multiple defs. Like a dictionary does. Words many times have multiple meanings. It's not an actual problem (in fact, sometimes it's funny)
[16:26:48] <Bytram> I think I understand where you are coming from. Realize I come from a QA background so I'm naturally gonna find the places where things are not gonna hold up.
[16:26:59] <fyngyrz> except you haven't. :)
[16:27:37] <Bytram> so, please show me how you would expand my prior example: ATM I am debugging the ATM comms for this ATM.
[16:28:01] <fyngyrz> TLA on day one because of story A: "Automated Teller Machine"
[16:28:17] <fyngyrz> TLA on day two because of story B: "Automated Teller Machine; At The Moment"
[16:28:27] <fyngyrz> that's all there is to it
[16:28:32] <fyngyrz> no collision
[16:28:36] <fyngyrz> just multiple def
[16:29:04] <fyngyrz> add as many as you like
[16:29:13] <Bytram> please show me what the HTML would look like for expanding that sentence
[16:29:37] <fyngyrz> give me a sec to add the multiples
[16:29:43] <Bytram> k
[16:29:44] <fyngyrz> to the defs file
[16:29:53] <Bytram> Time's up! =)
[16:30:01] <Bytram> j/k
[16:31:00] <fyngyrz> <abbr title="Automated Teller Machine; At The Moment; Asynchronous Transfer Mode">ATM</abbr> I am debugging the <abbr title="Automated Teller Machine; At The Moment; Asynchronous Transfer Mode">ATM</abbr> comms for this <abbr title="Automated Teller Machine; At The Moment; Asynchronous Transfer Mode">ATM</abbr>.
[16:31:16] <Bytram> after this, I'm going to have to restart my computer... I'm getting connection dropouts every 30 seconds or so, so I see IRC disconnect and reconnect repeatedly.
[16:31:21] <fyngyrz> That's a pretty severe corner case, of course
[16:31:39] <fyngyrz> but it's perfectly reasonable
[16:31:59] <fyngyrz> and can you imagine having to hand code it? How annoying!
[16:32:07] <fyngyrz> even this:
[16:32:45] <Bytram> <fyngyrz> <abbr title="Automated Teller Machine; At The Moment; Asynchronous Transfer Mode">ATM</abbr> I am debugging the <abbr title="Automated Teller Machine; At The Moment; Asynchronous Transfer Mode">ATM</abbr> comms for this <abbr title="Automated Teller Machine; At The Moment; Asynchronous Transfer Mode">ATM</abbr>.
[16:32:53] <Bytram> hold on..
[16:32:56] <fyngyrz> <abbr title="At The Moment">ATM</abbr> I am debugging the <abbr title="Asynchronous Transfer Mode">ATM</abbr> comms for this <abbr title="Automated Teller Machine;">ATM</abbr>.
[16:33:08] <Bytram> <abbr title="At The Moment">ATM</abbr> I am debugging the <abbr title="Asynchronous Transfer Mode">ATM</abbr> comms for this <abbr title="Automated Teller Machine">ATM</abbr>.
[16:33:13] <Bytram> that's better
[16:33:22] <fyngyrz> it's a hella lot of editing work
[16:33:30] <fyngyrz> the whole point is to avoid that
[16:34:07] <fyngyrz> there's no way to automate what you're thinking of - it has to be manual, and in that case, this isn't a step forward
[16:34:20] <fyngyrz> also, again, it's a severe corner case
[16:35:08] <fyngyrz> with Javascript, it could be handled contextually, but this place is so anti javascript I didn't even bother to suggest it
[16:36:06] <fyngyrz> also... if it's done per-TFS...
[16:36:52] <fyngyrz> then you have to re-enter each locally different def, every time you post a TFS.
[16:37:21] <fyngyrz> this makes more editor work. My suggestion was aimed at reducing editor work, because it's difficult enough as it is
[16:37:47] <fyngyrz> it also really appeals to me that defs can be improved over time.
[16:38:04] <Bytram> I appreciate that. Really, I do! I don't like doing any more work than I have to!
[16:38:05] <fyngyrz> everything from typos to clearer or more concise defs
[16:38:25] <fyngyrz> if they're canned, then you go to improve one, and that's the only one that gets improved
[16:38:36] <fyngyrz> the same amount of work for a lot smaller fix
[16:38:50] <fyngyrz> well, I think you can see where I'm coming from
[16:39:08] <Bytram> okay got dropped again.
[16:39:11] <fyngyrz> no matter what, defs can be edited. The difference is, fix 'em all, or just fix one
[16:39:27] * fyngyrz curses your ISP for you
[16:39:28] <Bytram> and again. can't take this anymore
[16:39:38] <Bytram> rebooting... merci!
[16:39:48] * Bytram wanders off to reboot stuff and see if it helps
[16:40:02] <Bytram> maybe quite a while
[16:40:06] <Bytram> many thanks for the conversation
[16:40:12] <Bytram> ciao for now
[16:40:15] <fyngyrz> aye
[20:02:23] <fyngyrz> A QA engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 99999 beers. Orders a lizard. Orders -1 beers. Orders a skdfhiuh. First real customer walks into the bar, asks where the bathroom is. Bar bursts into flames.
[20:02:41] <fyngyrz> @bytram
[20:02:46] <fyngyrz> :)