Re: [ba-ohs-talk] bootstrap list message content & purple numbers
On Mon, 10 Dec 2001, Peter Jones wrote: (01)
> I've just hacked a desperate perl script (yep, I need the practice) that
> accesses the HTML archives for ba-unrev-talk, in the hopes of being able to
> add some interesting metadata to the backlink db... eventually. (02)
Now _this_ is the kind of message I like to see! :-) (03)
Let me save you some trouble, Peter (and anybody else who wants to hack on
this). The code I wrote to do the purple numbers and backlink extraction
is a filter for MHonArc (http://www.mhonarc.org/). I've been meaning to
release the code; I've just been lazy. (04)
If you want to do this kind of hacking, it's better to start with the
MHonArc filter. It'll give you nice, programmatic access to the e-mail
metadata; no need to parse it back out of ugly, serialized HTML. I'll be
happy to step you through the code. MHonArc is nice and powerful, but its
internals leave something to be desired. (05)
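
To illustrate the difference between reading metadata from the message itself and scraping it back out of archived HTML, here is a minimal sketch. It is in Python using the standard `email` module, not Perl, and it does not use or imitate MHonArc's filter interface, which is not shown in this thread. The sample message, its addresses, and the `extract_metadata` helper are all made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical raw message; names, addresses and IDs are invented.
RAW = """\
From: Alice Example <alice@example.org>
Subject: Re: purple numbers
Message-ID: <msg2@example.org>
In-Reply-To: <msg1@example.org>
References: <msg0@example.org> <msg1@example.org>

Body text here.
"""

def extract_metadata(raw):
    """Return the header fields a backlink database would likely want."""
    msg = message_from_string(raw)
    name, addr = parseaddr(msg.get("From", ""))
    return {
        "author": name,
        "address": addr,
        "subject": msg.get("Subject", ""),
        "message_id": msg.get("Message-ID", ""),
        "in_reply_to": msg.get("In-Reply-To", ""),
        # References is a whitespace-separated list of message IDs.
        "references": msg.get("References", "").split(),
    }

meta = extract_metadata(RAW)
```

The thread links (`In-Reply-To`, `References`) come straight from the headers here, which is the kind of information that is awkward to recover once the message has been rendered to HTML.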
> And then I noticed something incidentally potentially irksome about purple
> numbering in this message
>
> http://www.bootstrap.org/lists/ba-unrev-talk/0111/msg00014.html
>
> Lots of sentences and paragraphs, but only 1 purple number because the '>'s
> cloud the issue. (06)
That is correct. A consequence of my least-effort algorithm. :-( (07)
> Would it be better to replace >s with indents in the HTML prior to adding
> purple numbering? (08)
Not sure I understand the suggestion. Are you suggesting not purple
numbering these quotes at all? (09)
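
One possible reading of the suggestion can be sketched as follows: strip the leading '>' markers, group the remaining lines into paragraphs, show quote depth as indentation, and only then attach a purple number to each paragraph. This is an assumption about what was meant, written in Python rather than as part of the actual filter; the function names and the sample text are hypothetical, and the "(0N)" label format simply mirrors the numbers visible in this archive.

```python
import re

# Leading run of '>' markers (each optionally followed by a space), then the text.
QUOTE_RE = re.compile(r"^((?:>\s?)*)(.*)$")

def split_quote(line):
    """Return (depth, text): the number of leading '>' markers and the rest."""
    prefix, text = QUOTE_RE.match(line).groups()
    return prefix.count(">"), text.strip()

def paragraphs(body):
    """Group lines into (depth, text) paragraphs.

    A blank line (after stripping markers) or a change in quote depth
    starts a new paragraph.
    """
    result, current, current_depth = [], [], None
    for line in body.splitlines():
        depth, text = split_quote(line)
        if not text or depth != current_depth:
            if current:
                result.append((current_depth, " ".join(current)))
            current = []
        if text:
            current.append(text)
            current_depth = depth
    if current:
        result.append((current_depth, " ".join(current)))
    return result

def number(body, indent="    "):
    """Render paragraphs indented by quote depth, each with its own number."""
    out = []
    for n, (depth, text) in enumerate(paragraphs(body), start=1):
        out.append(f"{indent * depth}{text} (0{n})")
    return "\n\n".join(out)

# Hypothetical sample: two quoted paragraphs followed by a reply.
SAMPLE = """\
> First quoted paragraph,
> continued on a second line.
>
> Second quoted paragraph.

An unquoted reply.
"""

numbered = number(SAMPLE)
```

With this approach the two quoted paragraphs in the sample each get their own number instead of sharing one. A line of ordinary text that happens to begin with '>' would be misread as a quote, so a real filter would need a stricter test.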
-Eugene (010)
--
+=== Eugene Eric Kim ===== eekim@eekim.com ===== http://www.eekim.com/ ===+
| "Writer's block is a fancy term made up by whiners so they |
+===== can have an excuse to drink alcohol." --Steve Martin ===========+ (011)