Subject: Re: Datatype section
Phil Goatly reminds us that "any XML message only contains text data," which "is why, for example, EDIFACT distinguishes only between alphanumeric (fixed & variable length) and Numeric (fixed & variable length). It makes no assumptions as to how this data will be 'converted' on a host machine, or how the developer of a piece of end user software will implement these 'data types'." See http://lists.ebxml.org/archives/ebxml-bp/200101/msg00045.html.

Certainly data types will be represented in various ways on different machine architectures, but we can make assumptions about their behavior. If integers, floating-point numbers and character strings did not behave as we expect them to, no data interchange would be possible between two vendors' environments.

It's true that EDIFACT relies on two simple data types, alphanumeric and numeric (and the rare alphabetic). But business semantics dictate that more complicated structure lies within, based on context. For example, EDIFACT numerics cover any of the numeric formats described by ISO 6093:1985, Representation of numerical values in character strings for information interchange. These include the familiar Fortran formats for integer, decimal and floating point (with the "E" exponent). See EDIFACT Syntax Version 4, in ISO 9735-1: Syntax rules common to all parts, together with syntax service directories for each of the parts, at http://www.gefeg.com/jswg/s4/data/9735-1.pdf.

XML Schema Part 2: Datatypes, whence Betty Harvey derived her Datatypes document, provides a more hardware-looking form of datatypes, and one built on a 32-bit word at that; otherwise how could the maxInclusive of the "long" type end up being an oddball 9223372036854775807? I'm assuming that this big number is what fits in a 64-bit doubleword with one bit left over for the sign, that is, 2**63 - 1.
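
A quick sanity check on that arithmetic, as a minimal Python sketch. The helper name signed_bounds and the table of widths are my own illustration, not anything defined by the Schema spec; the widths are simply the ones implied by the published minInclusive and maxInclusive values, assuming two's-complement storage.

```python
# Hypothetical helper for illustration; not part of any schema library.
def signed_bounds(bits):
    """Return (min, max) of a two's-complement signed integer `bits` wide."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Bit widths implied by the bounds of the XML Schema signed integer datatypes
# (an assumption about the underlying representation, not a quote from the spec).
XSD_SIGNED_WIDTHS = {"byte": 8, "short": 16, "int": 32, "long": 64}

for name, bits in XSD_SIGNED_WIDTHS.items():
    lo, hi = signed_bounds(bits)
    print(f"{name:5s} minInclusive={lo} maxInclusive={hi}")
```

The "long" row comes out with maxInclusive=9223372036854775807, which is 2**63 - 1 rather than 2**63 itself.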
It gets even more hardware-ish when talking about the 'float' and 'double' datatypes, by referring to the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic, which, unlike ISO 6093, has nothing to do with representing floating-point numbers in text.

But now that we have Schema as a base for ebXML work, we can make explicit what was implicit in EDIFACT. There are lots of things in business documents that we think of only in whole-number terms: Line Item numbers, Number of originals or copies of a document required, Consignment load sequence number, Number of stages, No. of Significant digits, Number of packages, Total number of items, and Number of stops, inter alia. These can be represented with the unsignedLong, unsignedInt, unsignedShort and unsignedByte datatypes, depending on reasonable expectations; for example, Line numbers may get quite large, and unsignedLong will probably be useful. Number of originals or copies of a document required, on the other hand, would probably fit in an unsignedByte (which can hold values up to 255).

The decimal datatype would be suitable where simple decimal numbers suffice (e.g., currency conversion rates, price per unit, etc.). Prescribing floats or doubles in a core component would probably be done only rarely, say for radioactivity measurements in Becquerels.

EDIFACT had to lump all sorts of date and time formats into an alphanumeric, with the formatting context described by an associated qualifier. The various timeInstant, date, time, and timePeriod schema datatypes can now be used to restrict the allowable values, with violations detectable at the syntax analysis (parsing) stage.

Schema pattern components can use regular expressions to enforce syntax checking, for example in product codes and invoice numbers; e.g., a product code might always be "AB" followed by two digits, a dash, and three digits.

William J. Kammerer
FORESIGHT Corp.
4950 Blazer Memorial Pkwy.
Dublin, OH USA 43017-3305
+1 614 791-1600
Visit FORESIGHT Corp. at http://www.foresightcorp.com/
"Commerce for a New World"
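
P.S. To make the product-code pattern and the unsignedByte range mentioned above concrete, here is a minimal Python sketch of the checks a schema-aware parser would perform. The function names and sample codes are hypothetical, and Python's re.fullmatch stands in for the implicit whole-value anchoring of a Schema pattern; in a schema the pattern facet itself would be something like AB[0-9]{2}-[0-9]{3}.

```python
import re

# "AB", two digits, a dash, three digits. Compiled once for reuse.
PRODUCT_CODE = re.compile(r"AB[0-9]{2}-[0-9]{3}")


def is_valid_product_code(text):
    """True if the whole string matches the product-code pattern."""
    return PRODUCT_CODE.fullmatch(text) is not None


def fits_unsigned_byte(text):
    """True if the text is a plain decimal whole number in 0..255."""
    return re.fullmatch(r"[0-9]+", text) is not None and int(text) <= 255


# Sample values are made up for illustration.
for code in ["AB12-345", "AB1-2345", "XY12-345"]:
    print(code, is_valid_product_code(code))

for copies in ["3", "255", "256"]:
    print(copies, fits_unsigned_byte(copies))
```

Only "AB12-345" passes the pattern check, and "256" is rejected as a number of copies because unsignedByte tops out at 255.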