Subject: Re: Problems with Classifications
Strong analysis of the weaknesses of representing a classification scheme
with nested containment!

Recalling the requirement not to "reinvent the wheel", has the WG
considered representing classification schemata with ISO 13250 for "Topic
Maps" or its (forthcoming) allied specification, XTM (topic maps for XML)?

Topic maps meet the requirements for classification schemes expressed
below.

Some pointers:

http://www.oasis-open.org/cover/topicMaps.html
http://www.topicmaps.com/

Sam Hunting
EComXML XML Evangelist

----- Original Message -----
From: "Len Gallagher" <LGallagher@nist.gov>
To: "Nieman, Scott" <Scott.Nieman@NorstanConsulting.com>
Cc: <ebxml-regrep@lists.ebxml.org>
Sent: Friday, September 29, 2000 10:31 AM
Subject: Problems with Classifications

>
> ebXML RegRep group,
>
> At our teleconference yesterday we agreed to go with version 0.2 of the
> Information Model specification for the Proof of Concept (poc)
> demonstration in Tokyo. That seems OK since a demo can always work around
> problems. However, I think the representation of classification schemes in
> version 0.2 is broken and implementors should be aware of that as they
> prepare their products for the demo.
>
> Consider the following simple classification scheme for wood (W) that could
> be used to classify all wood as either hardwood (HW) or softwood (SW).
>
>          W
>         / \
>        /   \
>      HW     SW
>
> This would be represented in the version 0.2 information model as 3
> ClassificationNodes, where each node is a ManagedObject with one
> additional attribute, namely a pointer to its parent. So the three nodes
> would be represented as:
>
>    (W, -)
>    (HW,W)
>    (SW,W)
>
> These three nodes are stand-alone objects; the model does not capture the
> fact that the definer thought of them as a group of three nodes that
> determine a classification scheme.
>
> Later a different definer wants to create a classification scheme for
> materials (M) and wants to re-use the existing scheme for wood.
> The intent is that all materials will be classified as plastic (P),
> wood (W) and its subclassifications, or other (O), and the intended
> classification scheme is:
>
>            M
>          / | \
>         /  |  \
>        P   W   O
>           / \
>          /   \
>        HW     SW
>
> The new definer creates the new nodes for M, P, and O, but wants to
> reference W for wood. There is no way the second definer can have W point
> to M as parent because W already has a null pointer for its parent
> attribute, and that node is owned by a different definer. So the definer
> will have to create a new node W' that points to M as its parent and
> somehow has a Ref to W as that portion of the hierarchy. We'll get the
> following representation:
>
>    (M, -)
>    (P, M)
>    (W',M) with Ref to W
>    (O, M)
>
> The only way to capture this reference to W in the existing model is for
> there to be a strong 1:1 association from W' to W captured in the metadata
> for W'. This is possible, but the association would have to be uniquely
> typed.
>
> Now suppose a product is classified as HW, and a user asks the registry to
> provide the path to HW from the root node. The representation is broken
> because there is no unique path to a root; instead, there are two choices:
> W-->HW derived from the first classification scheme or M-->W'-->HW
> derived from the second classification scheme.
>
> There is no way for the system to determine which path is intended because
> the system didn't retain the information that there were two separate
> classification schemes involved in the definitions! And the user has no way
> to communicate to the system which classification scheme is intended.
>
> SOLUTION
>
> It was a mistake to try to simplify the notion of classification scheme by
> deleting the notion of a classification scheme being a separate object with
> a fixed set of nodes as its content.
> We need to go back to the concepts as discussed in version 0.1 and think
> of a classification scheme as a distinct object, with a fixed number of
> nodes, and a partial ordering over those nodes to determine a fixed
> hierarchy.
>
> All that is needed to achieve the desired result is an identifier for each
> separate classification scheme, say A for the original scheme and B for the
> derived one. Then the classification HW specified by the user could be
> qualified by the intended classification scheme, A or B, to determine which
> path is the correct root-to-node path for that node.
>
> As a follow-on, we could relax the requirement that each node be a
> ManagedObject and instead only require that each classification scheme be a
> Managed <space> Object with metadata captured in a corresponding
> ManagedObject. Then one could easily create new classification schemes
> using existing classification schemes for their various nodes, with no
> ambiguity when questions involving predecessors, descendants, or levels are
> posed. The UML diagram I distributed earlier this week defines the
> necessary subtype relationships and associations among
> ClassificationScheme, ClassificationItem (or Node), and
> ClassificationLevel. The diagram is missing the notion that a
> ClassificationItem could reference another ClassificationScheme, but that
> is an upward compatible extension.
>
> -- Len
>
> p.s. May I again respectfully submit that Managed <space> Object be called
> a RegisteredObject, or possibly a ManagedObject if that fits better with
> ebXML terminology in other working groups, and that what's currently called
> ManagedObject be re-named a RegistryEntry. This would relieve untold
> confusion! Too many people think of the managed object being the Profile
> or BusinessProcess that is registered instead of the metadata for that
> object.
>
> At 06:18 PM 9/28/00 , Nieman, Scott wrote:
> >The dialup information is:
> >USA: 800.892.0357
> >Sorry no toll-free for International callers: usa 612.352.7899
> >25 call-in sites are reserved
> >
> >Meeting ID: 0942
> >
> >Agenda:
> >1) registry service v0.8 review
> >2) repository information model
> >
> >Scott
>
> **************************************************************
> Len Gallagher                         LGallagher@nist.gov
> NIST                                  Work: 301-975-3251
> Bldg 820 Room 562                     Home: 301-424-1928
> Gaithersburg, MD 20899-8970 USA       Fax:  301-948-6213
> **************************************************************
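
The ambiguity described in the quoted message, and the proposed fix of qualifying a node by a scheme identifier, can be illustrated with a minimal Python sketch. It is not taken from the ebXML Information Model: the class layout, the field names `parent_of` and `includes`, and the method `path_to` are assumptions made for illustration only. Scheme A is the wood scheme, scheme B is the materials scheme, and W' in B is modeled as an item that references scheme A.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClassificationScheme:
    """Hypothetical model: a scheme is a distinct object with its own nodes."""
    scheme_id: str
    # Maps each node owned by this scheme to its parent (None for the root).
    parent_of: Dict[str, Optional[str]]
    # Maps a local node to another scheme whose root it stands in for.
    includes: Dict[str, "ClassificationScheme"] = field(default_factory=dict)

    def path_to(self, node: str) -> List[str]:
        """Return the root-to-node path as seen from this scheme."""
        if node in self.parent_of:
            path = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = self.parent_of[current]
            return path[::-1]
        # Otherwise look inside schemes referenced by local nodes.
        for local, sub in self.includes.items():
            try:
                sub_path = sub.path_to(node)
            except KeyError:
                continue
            # The local node replaces the root of the referenced scheme.
            return self.path_to(local) + sub_path[1:]
        raise KeyError(f"{node!r} is not reachable in scheme {self.scheme_id}")


# Scheme A: wood, as defined by the first definer.
scheme_a = ClassificationScheme("A", {"W": None, "HW": "W", "SW": "W"})

# Scheme B: materials, where W' references scheme A.
scheme_b = ClassificationScheme(
    "B",
    {"M": None, "P": "M", "W'": "M", "O": "M"},
    includes={"W'": scheme_a},
)

schemes = {s.scheme_id: s for s in (scheme_a, scheme_b)}

# Without a scheme identifier, HW has two candidate root paths.
candidates = [s.path_to("HW") for s in schemes.values()]

# Qualifying HW by scheme identifier selects exactly one path.
path_in_a = schemes["A"].path_to("HW")
path_in_b = schemes["B"].path_to("HW")
```

In this sketch `candidates` holds both paths, which is the ambiguity, while `path_in_a` and `path_in_b` are each unique once the scheme is named.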