Subject: RE: Units of Measure
Hi All,

It's nice to see such a lively discussion on these lists. But I fear a point I was trying to make in my posting has been overlooked. I was not trying to recommend a syntax for representing codes; I was trying to point out that some codes provide a different semantic function than other codes do.

Personally, I would not have coded Hispanic as a data element value, as that makes it harder to hang a semantic ID on the value. But I might have coded the value as <Hispanic/> (where Hispanic was in some default namespace, as I do feel that all element names saved in a repository should be contained within some namespace). And I could live with someone who coded Hispanic as the value of an element, provided the element definition gave access to a process that knew how to use the value to yield its semantics. Syntax must lead in some manner to understandable semantics. In X12, the trail leads through paper documents. In ebXML, the trail must lead through XML-processable snippets.

I fully expect that a given semantic entity will often find itself represented in more than one manner across the available XML syntax alternatives. I do not expect 'code list' values always to be represented by their 'code', whatever the syntax. I believe a code list like Unit of Measurement will be torn apart, and the units it represents will be related to the individual entities they modify, so that the syntax prevents things like '<length>15<FeetPerSecond/></length>'. I suspect it will often prove convenient to specify code list values as attribute values, which means the parent element will need to provide a process to identify, from the code list value, the semantic entity the code represents. On the other hand, when some other representation proves more (useful|concise|readable|whatever) in a given environment, that's fine - so long as the semantic entity represented by the code is identifiable.
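The idea of a parent element providing a process that resolves a code-list value to its semantics might be sketched roughly as follows. This is only an illustration: the element shape, the `resolve_length` function, and the registry contents are assumptions made for the example (FOT and MTR are drawn from UN/ECE Recommendation 20, where they designate foot and metre).

```python
import xml.etree.ElementTree as ET

# Hypothetical registry mapping unit-of-measure code values to their
# semantic definitions; real entries would come from the code list's
# publisher (e.g., UN/ECE Recommendation 20).
UOM_REGISTRY = {
    "FOT": {"name": "foot", "quantity": "length"},
    "MTR": {"name": "metre", "quantity": "length"},
}

def resolve_length(xml_text):
    """Parse <length unitCode='...'>value</length> and resolve the
    unit code through the registry, rejecting unknown codes rather
    than passing an opaque value downstream."""
    elem = ET.fromstring(xml_text)
    code = elem.get("unitCode")
    if code not in UOM_REGISTRY:
        raise ValueError(f"unknown unit code: {code}")
    return float(elem.text), UOM_REGISTRY[code]

value, unit = resolve_length("<length unitCode='FOT'>15</length>")
print(value, unit["name"])  # 15.0 foot
```

The point is not this particular shape, but that the recipient has a defined path from the code value to its semantic entity, instead of an opaque string.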
What worries me is the thought that someone will use a code list (e.g., an X12 code list) as an attribute value on an XML element, not provide a means to get at the semantics the code represents (the XML representation of those semantics, of course), and think that the job is done. Yet I frequently see code lists represented without thought given to how the recipient processor is going to identify the semantics associated with the code value.

I forgot to mention it, but code list values also (conceptually) reside in (rather restricted) namespaces. That will likely keep to a minimum the set of semantic code values an implementation might choose to represent outside an attribute list, lest they risk name collisions in whatever namespace they are defining their messages in.

Cheers,
Bob

-----Original Message-----
From: William J. Kammerer [mailto:wkammerer@foresightcorp.com]
Sent: Friday, July 07, 2000 3:07 PM
To: ebxml-core@lists.oasis-open.org
Subject: Re: Units of Measure

Bob Miller brought up interesting aspects of codification, especially those derived from X12, and means of representing aspects of coded semantics in XML. Bob says some code lists perform a 'text alias' service - e.g., X12 DE 1109 Race or Ethnicity Code has values 7 (Not Provided), C (Caucasian), H (Hispanic), etc. He says the same code could be represented in XML as <Ethnicity>Hispanic</Ethnicity>.

I think there are a couple of problems with this technique of spelling out the value (of what used to be a simple code). Though I'll admit it's easier for a human reader to discern the ethnicity in Bob's example, it will be harder for programs to process, especially considering all the attendant problems of capitalization (is 'Hispanic' the same as 'hispanic'?) and misspellings. Data categorized by D.E. 1109 will often be mechanically sorted and collated by ethnic and racial categories - small discrepancies in the element value will make this difficult.
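The fragility described above is easy to reproduce: spelled-out values drift apart under casing and spelling variation, while a fixed code compares exactly. A small illustration (the sample records are invented for this sketch; 'H' stands in for the DE 1109 code value):

```python
from collections import Counter

# Invented sample data: one category spelled out four ways (with
# realistic casing drift and one typo), versus the same records
# carried as the code value 'H'.
text_alias_records = ["Hispanic", "hispanic", "HISPANIC", "Hispanc"]
coded_records = ["H", "H", "H", "H"]

def tally(values):
    """Collate records by exact value, as a sorting program would."""
    return Counter(values)

print(tally(text_alias_records))  # four apparent 'categories' for one population
print(tally(coded_records))       # Counter({'H': 4})
```

Exact-match collation sees four distinct categories in the spelled-out data; the coded form stays a single category, which is the whole point of using the code.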
This is unlike misspellings and the like in street addresses, which are generally read by humans only (it's usually only the 9-digit ZIP which has to be exact for data processing needs). It would have been easier if the OMB had devised a definitive code list for racial and ethnic classifications in its directives, which could then be used unchanged for data processing purposes. Instead, the OMB just rambles on about various types of ethnicity and racial classifications, leaving it up to X12 to come up with a code list and to deal with the various ambiguities (maybe this isn't a problem in EDIFACT, since other countries aren't as obsessed with "classifying" their subjects).

Besides the problems with capitalization and misspellings, you would also have to deal with the complete redesignation of categories, depending on political whim and correctness. What was a "Caucasian" yesterday may be a "White" today, and "Anglo" tomorrow (regardless of how absurd this sounds, since almost all English-speaking Euro-Americans aren't of English descent at all, and it's probably offensive to Irish and German Americans). How would a program deal with these renamings - keep a table of all the text synonyms? No, I think classification by codes, effective in EDI, is needed just as much in XML-based core components.

Then there's the translatability problem with the so-called "textual" codes. Certainly country and currency codes would fall into Bob's category of 'text alias'. Maybe <Country>Germany</Country> might be more understandable to the casual reader, but <Country>DE</Country> using the ISO country code is easier for programs to process, and is far more standardized. And <Country>Deutschland</Country> is just as readable by English speakers. And is it "United States" or "United States of America," or is it "Bundesrepublik Deutschland" instead of "Germany"? Are we going to expect our programs to do on-the-fly translations, or to maintain complicated synonym tables?
- remember, I have to know the country so I can prepare the proper customs papers, calculate shipping, etc., etc. Codes were invented to remove ambiguity - they're just as necessary for unambiguous interpretation of XML data as they are for EDI.

And for those codes which perform a 'reference' service, such as the unit of measure in Bob's example <Weight Code='Pounds'>10</Weight>, I don't know why there would be any doubt that we would use the standard UN/ECE Recommendation 20 for UOM. What kind of "Pounds" is Bob talking about? I'm sure there are different types, like dry pounds. And "Pounds" is plural - what if I have only 1 pound? Does my program have to account for the 's' somehow? Are we trying to make "readable" XML for idiots, or are we trying to find a better way to perform automated B2B interoperability? Let's just use the standardized codes which were invented for practical reasons pre-dating even traditional EDI.

William J. Kammerer
FORESIGHT Corp.
4950 Blazer Memorial Pkwy.
Dublin, OH USA 43017-3305
+1 614 791-1600

Visit FORESIGHT Corp. at http://www.foresightcorp.com/
"Commerce for a New World"