Re: [unrev-II] Re: HtmlDOM -- XML -- Xmail

From: Eric Armstrong
Date: Tue Mar 14 2000 - 16:41:38 PST

    Jeff Miller wrote:

    > While I agree that html does not meet our needs, I don't think that a
    > conversion tool is out of the question, as most people I know who hand
    > code html documents do so without the <p><h3>wwww</h3> weirdness that
    > you were talking about, and composer-style editors generate known
    > styles. (Of course, it's only as good as the user at the keyboard.)
    > Let's pick a number: 80% of the html out there could be converted with
    > a conversion tool and then cleaned up, leaving the misinterpreted and
    > plain ugly for the lucky human.

    I understand that most people don't. I don't, which is why those cases
    occurred to me. But the program you write has to be prepared for a
    variety of eventualities. Basically, you can't depend on finding an
    </h3> as a terminator. So you have to figure out what to do if you see
    <h2>, <h1>, <h4>, <h5>, <table>, or one of the other possible
    terminators. (I'm not really certain that h3 doesn't require a
    terminator, but I know that a <dd> entry, for example, can be
    terminated by </dd>, <dt>, <dd>, or </dl>.) As the number of
    combinations goes up, program complexity rises dramatically -- a
    situation that is no doubt responsible for xml's insistence that every
    element be terminated.
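
    To make the bookkeeping concrete, here is a minimal sketch in Python of
    a converter that writes out every implied terminator. The table
    IMPLIED_CLOSE and the names ExplicitCloser and make_explicit are
    hypothetical, made up for illustration; the table is deliberately
    incomplete and is not taken from the HTML specification. The sketch
    also drops attributes and gives empty elements such as <br> an end tag.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical and deliberately incomplete: for each element whose end tag
# may be omitted, the start tags assumed to close it implicitly.
IMPLIED_CLOSE = {
    "dt": {"dt", "dd"},
    "dd": {"dt", "dd"},
    "li": {"li"},
    "p": {"p", "h1", "h2", "h3", "h4", "h5", "h6", "table", "dl"},
}


class ExplicitCloser(HTMLParser):
    """Re-emit markup with an end tag for every element (attributes dropped)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # names of the elements that are currently open
        self.out = []    # pieces of the rewritten document

    def _close_top(self):
        self.out.append(f"</{self.stack.pop()}>")

    def handle_starttag(self, tag, attrs):
        # A start tag may silently end the innermost open element.
        while self.stack and tag in IMPLIED_CLOSE.get(self.stack[-1], ()):
            self._close_top()
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: this sketch simply drops it
        # An end tag also ends everything still open inside that element.
        while self.stack[-1] != tag:
            self._close_top()
        self._close_top()

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def make_explicit(html):
    parser = ExplicitCloser()
    parser.feed(html)
    parser.close()
    while parser.stack:  # end of input closes whatever is left open
        parser._close_top()
    return "".join(parser.out)
```

    With this table, make_explicit("<dl><dt>a<dd>b<dd>c</dl>") returns
    "<dl><dt>a</dt><dd>b</dd><dd>c</dd></dl>". Every element that may drop
    its terminator needs its own row in the table, which is where the
    complexity piles up.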

    When you see <p>...<h3>, for example, you could simply assume that the
    <h3> starts a new header. Right? But what do you do when </p> *is*
    present, as in <p>...<h3>...</h3>...</p>? Do you simply ignore the
    </p>? Do you throw an error that says the document is not well
    constructed? Or do you convert the h3 to a font tag? Ignoring the </p>
    seems like the best course, but then where do you put the text, and how
    does it relate to what comes after?
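
    The three choices can be put side by side in a small Python sketch.
    Everything here is an assumption made for illustration: the policy
    names "ignore", "error", and "font", the class HeaderInParagraph, and
    the function convert are all hypothetical, and only the <p>/<h3> case
    is handled.

```python
from html import escape
from html.parser import HTMLParser

POLICIES = ("ignore", "error", "font")  # hypothetical policy names


class StrayCloseError(ValueError):
    """Raised under the "error" policy when an end tag has no open element."""


class HeaderInParagraph(HTMLParser):
    def __init__(self, policy):
        super().__init__()
        if policy not in POLICIES:
            raise ValueError(f"unknown policy: {policy!r}")
        self.policy = policy
        self.stack = []  # names of the elements that are currently open
        self.out = []    # pieces of the rewritten document

    def _open(self, tag):
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def _close(self):
        self.out.append(f"</{self.stack.pop()}>")

    def handle_starttag(self, tag, attrs):
        if tag == "h3" and self.stack and self.stack[-1] == "p":
            if self.policy == "font":
                # Keep the paragraph open and demote the header to styling.
                self._open("font")
                return
            self._close()  # otherwise the header ends the paragraph
        self._open(tag)

    def handle_endtag(self, tag):
        top = self.stack[-1] if self.stack else None
        if self.policy == "font" and tag == "h3" and top == "font":
            self._close()
        elif top == tag:
            self._close()
        elif self.policy == "error":
            raise StrayCloseError(f"unexpected </{tag}>")
        # Under "ignore" and "font", a stray end tag is silently dropped.

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def convert(html, policy):
    parser = HeaderInParagraph(policy)
    parser.feed(html)
    parser.close()
    while parser.stack:  # end of input closes whatever is left open
        parser._close()
    return "".join(parser.out)
```

    For the input "<p>a<h3>b</h3>c</p>", the "ignore" policy yields
    "<p>a</p><h3>b</h3>c", which leaves the trailing text outside any
    element. The "font" policy yields "<p>a<font>b</font>c</p>", and the
    "error" policy raises StrayCloseError at the </p>.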

    The problems are solvable -- all problems are. As a practical matter,
    though, does it make sense to pour a lot of time and energy into solving
    them, or does it make more sense to skip forward one generation and take
    advantage of the structuring that XML provides?

    If XML does *not* become the lingua franca of the Web, it would seem
    that solving the HTML problems would be worthwhile. But if it *does*
    become the standard, then solving HTML's problems is so much wasted
    effort.

    My crystal ball is dusty. How's yours?

    This archive was generated by hypermail 2b29 : Tue Mar 14 2000 - 16:48:36 PST