ebxml-regrep message

Subject: Re: Bizcodes - was (Re: SubmissionPackage DTD)
From: Jon Bosak <bosak@boethius.eng.sun.com>
To: ebxml-regrep@lists.oasis-open.org
Date: Fri, 2 Jun 2000 21:29:44 -0700 (PDT)

I realize now that some people have seen quoted fragments of my
end of a discussion posted to another list but haven't seen the
entirety of what I said, because I wasn't subscribed to this list.
(Such are the evils of cross-posting.)

Here is a message I posted in reply to Duane on 22 May, another
message in reply to David on 31 May, and then a more recent
comment by David and my reply to it.

Jon

==================================================================

 Date: Mon, 22 May 2000 14:54:03 -0700 (PDT)
 From: Jon Bosak <bosak@boethius.eng.sun.com>
 To: regrep@lists.oasis-open.org
 CC: ebxml-regrep@lists.oasis-open.org
 Subject: Re: SubmissionPackage DTD

 [duane@xmlglobal.com:]

 | It has been suggested that we use a unique ID which is
 | semantically meaningful to at least one person on the planet.  The
 | logic there is while no one can define a key which is universally
 | acceptable but at least some will be able to make use of it.

 I am among those holding the opinion that labels meaningful to
 someone are better than labels meaningful to no one.  The
 utilitarian calculus on this is pretty simple; if U is the utility
 to an individual of being able to use a mnemonically significant
 set of terms and N is the number of people for whom a set of terms
 is mnemonically significant, then

    U * N > U * 0

 for any N greater than zero.

 | <IMHO> this is a *really* bad idea.  First off, the political
 | upheaval of EDI vs. other standards competing for defining this
 | semantically meaningful component will stall all future work.

 The assumption here is that the use of *any* meaningful labels
 would entail "the political upheaval of EDI vs. other standards".
 I'm not seeing any reasoning here to establish that the one thing
 would follow from the other.

 | Secondly, the unique key is for a Machine to find only, not a
 | human.

 I see no support for this assertion.  On general principles I find
 highly doubtful the proposition that no human being will find
 mnemonic labels useful.  At the very least, we humans engaged in
 defining the set of labels will find mnemonic qualities useful.
 This is particularly true in the case of names for low-level data
 elements.

 | Therefore, the machine does not care about semantics on the query
 | side, only that it can find the item and the item is unique.

 From this it follows that the machine does not care whether labels
 are mnemonic.  Computationally, mnemonics are free.
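
 A minimal sketch of that point, assuming the machine's side of a
 query is nothing more than an exact-match table lookup.  The two
 tables and the lookup() helper are hypothetical; the labels are
 borrowed from the comparison table later in this thread.

```python
# Hypothetical registry tables: same meanings, different label styles.
MNEMONIC = {"ebxml:pretium": "price", "ebxml:dies": "date"}
OPAQUE = {"ebxml:109384": "price", "ebxml:799421": "date"}


def lookup(table, key):
    """Exact-match lookup; the key is only ever compared, never interpreted."""
    return table[key]


# Both label styles go through the same code path and yield the same item.
same_item = lookup(MNEMONIC, "ebxml:pretium") == lookup(OPAQUE, "ebxml:109384")
```

 Nothing in lookup() inspects the spelling of the key, so the
 mnemonic form costs the machine nothing extra.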

 | Once it has a pointer to the item, the semantics for the calling
 | app can be derived by performing a "get" type request on the item
 | and examining what it is equivalent to.  A human should be able to
 | search on the semantic terms it wants.

 Yes, but developers (in particular developers of standards) will
 have to work with these codes all the time.

 If it doesn't make any difference to the end users and it doesn't
 make any difference to the machine, then in my opinion the
 position that we use terms that are meaningful to no one when we
 could just as easily use terms that are meaningful to many of the
 developers just doesn't make sense.

 Another argument that some people have advanced for using
 meaningless codes is that they are language-neutral.  To me this
 is identical to an argument that programming languages should use
 arbitrary strings for command keywords.  I don't buy that, either.

 Having said that, I would like to venture a modest proposal that
 may help to allay the concerns of both those who fear a
 terminology struggle between EDI factions and those who
 seek political correctness in the choice of language.  I suggest
 that wherever we need a list of terms in circumstances where the
 choice of language is computationally insignificant, we use Latin.
 The character set is even smaller than US ASCII; it's as
 politically neutral a choice as can be achieved in terms that are
 mnemonic to anyone; and if the terms are intelligently selected,
 they will be mnemonic to almost everyone actually involved in this
 work.

 I originally put this proposal forward as a joke, but compared to
 the absurd conclusion that the optimum solution is the one that
 inconveniences the greatest number of people, it's starting to
 look genuinely attractive.

 Jon

==================================================================

 Date: Wed, 31 May 2000 23:19:54 -0700 (PDT)
 From: Jon Bosak <bosak@boethius.eng.sun.com>
 To: Gnosis_@compuserve.com
 CC: ebxml-regrep@lists.oasis-open.org
 Subject: Re: Bizcodes - was (Re: SubmissionPackage DTD)

 I've removed the copy to the OASIS regrep TC; I don't think that's
 the right place for this discussion.

 | Jon, you just spotted the Golden Goose!  Just like with UPC and EAN
 | for barcodes, you need to have a central registration authority.
 | 
 | That is why I was whittering on about the DOI.org work - they are a
 | registration service for HTML labels; we need similar for XML.

 But why can't we just register "persona" or something like that?
 What is it about a nonsense string that makes it easier to
 register?

 Consider:

    price           ebxml:pretium         ebxml:109384
    date            ebxml:dies            ebxml:799421
    size            ebxml:mensura         ebxml:796593
    weight          ebxml:pondus          ebxml:996324
    address         ebxml:locus           ebxml:582010

 What makes the codes in the third column easier to register than
 the ones in the second column?
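
 The question can be made concrete with a toy registry whose only
 rule is uniqueness.  This is a sketch under that assumption, not a
 description of any actual ebXML or DOI registration service; the
 Registry class and its methods are hypothetical.

```python
class Registry:
    """Toy central registry: each identifier may be claimed exactly once."""

    def __init__(self):
        self._entries = {}

    def register(self, identifier, meaning):
        # Uniqueness is the only check; the identifier's spelling is never inspected.
        if identifier in self._entries:
            raise ValueError(f"{identifier} is already registered")
        self._entries[identifier] = meaning

    def resolve(self, identifier):
        return self._entries[identifier]


registry = Registry()
# Rows taken from the table above: (mnemonic code, arbitrary code, meaning).
for mnemonic, arbitrary, meaning in [
    ("ebxml:pretium", "ebxml:109384", "price"),
    ("ebxml:dies", "ebxml:799421", "date"),
    ("ebxml:mensura", "ebxml:796593", "size"),
]:
    registry.register(mnemonic, meaning)   # second-column style
    registry.register(arbitrary, meaning)  # third-column style
```

 Both columns pass through the same register() call, and a duplicate
 in either style is rejected in the same way.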

 | Therefore a request to the central registry for a 'WAM#####' code
 | would automatically resolve to the Wal-Mart registry server, etc.
 | 
 | Where have you seen this before?  Yeah - DNS servers and mirrors.
 | No surprises here.

 You appear to be reinventing URNs.  But I'm seeing nothing here
 that requires unique identifiers to be arbitrary.
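
 A rough sketch of DNS-style delegation by prefix, to show that the
 routing step reads only the prefix.  The colon-separated form
 follows the "ebxml:pretium" examples above rather than the
 'WAM#####' form in the quoted message, and the host names, the
 DELEGATIONS table, and resolve_server() are all hypothetical.

```python
# Hypothetical delegation table: prefix -> responsible registry server.
# The ".example" host names are placeholders, not real services.
DELEGATIONS = {
    "WAM": "registry.walmart.example",
    "ebxml": "registry.ebxml.example",
}


def resolve_server(code):
    """Return the server responsible for a code of the form PREFIX:LOCAL."""
    prefix, sep, local = code.partition(":")
    if not sep or not local:
        raise ValueError(f"expected PREFIX:LOCAL, got {code!r}")
    # Only the prefix decides the route; the local part is passed through untouched.
    return DELEGATIONS[prefix]
```

 Under these assumptions resolve_server("WAM:pretium") and
 resolve_server("WAM:10938") route to the same server, so the
 delegation scheme works whether or not the local part is mnemonic.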

 Jon

==================================================================

[Gnosis_@compuserve.com:]

| Now - if you call something 'price' it invokes all kinds of
| knee-jerk stuff from people - is it tax paid, invoice, net,
| dealer, and so on!?

But I wasn't suggesting we call it "price."  I was suggesting that
we use the Latin equivalent "pretium," which is not a word in any
modern language that I know of but would be mnemonic to over 80
percent of the current online linguistic population.  Even
projecting steep increases in Asian participation by 2005, we
would still have over 60 percent of the online world speaking
languages in which "pretium" is mnemonic.

I haven't seen a single persuasive argument yet why we should
choose to use codes that aren't mnemonic when we could just as
easily use ones that are.

Jon


References:
- Re: Bizcodes - was (Re: SubmissionPackage DTD)
  - From: David RR Webber <Gnosis_@compuserve.com>