Subject: Topic 1: Terminology alignment
Background
--------------

As I had indicated, 4 major areas were identified in the excellent review meetings we have had on the 2 draft specs. We need team discussion before we can reach agreement on each of them. To recap, these 4 areas were:

1. Terminology alignment
2. Managed Object Model
3. Association Model
4. Classification Model

The intent is for us to start separate threads of discussion on each of these major areas, with the goal of resolving the issues in each area. If an issue happens to overlap threads, then let's put it in the more specific area. This email starts the discussion on the first topic.

Rationale And Philosophy
------------------------------

First, I would like to give the detailed rationale behind the thought process that led me to the terminology choices made in the Repository Information Model (RIS) document. Much of this thought process was developed in earlier drafts of the RS spec (v0.1 to v0.4) that were reviewed by the team.

I used all the relevant work that I knew about (ISO 11179, OASIS, UDDI, UML, BP, CC, TP working groups, past experience) as input, with the understanding that I would pick the gems from each, improve as needed, and avoid the mistakes I perceived they had made. I tried to be consistent in terminology where it made sense and improved it where it made sense.

I found the OASIS document to be better than the ISO document, so I borrowed more from OASIS. Some examples of exact terminology taken from OASIS are: description, globalName, mimeType, Organization, Classification, ClassificationItem, ClassificationScheme, ClassificationLevel, Query, Request, Association, SubmittingOrganization. As I said earlier, OASIS should be quite pleased at the amount of their work that is already leveraged in our current specs.

One example where we could obviously avoid a past mistake is regStatus => registryStatus, since the longer name is more obvious. Is regStatus registry status or registration status?
Well, if it is actually registration status, then I have made my point, because I misunderstood what it meant. And besides, what is the cost of a few characters in a class definition?

Some basic conventions I followed were those that transcend RegRep and are established standards today. For example, I used what is called the camel case naming convention, where class names begin with upper case (upper camel case) and attribute names begin with lower case (lower camel case). It is called camel case because of the humps: in multi-word names, every word after the first begins with upper case (e.g. 'registryStatus' for an attribute and 'RegistryStatus' for a class name). So, for example, description was written with lower case rather than with upper case as Len suggested it should be. I am a little surprised by that level of expected consistency ('d' Vs. 'D') with OASIS or anything else. I hope that the team does not expect us to use OASIS verbatim.

Team share your thoughts about the correctness of the above philosophy.

Attribute name choices
---------------------------

> Excerpt from Len's suggestions
> 16) I'm OK with the attributes for RegistryItem in Section 3.2 as a
> starting point, but think ebXML will want to add more registry-specific
> attributes similar to what OASIS does. In particular, the following maps
> the proposed ebXML attribute names to OASIS attribute names:

A general issue I feel strongly about is that attributes start with lower case and follow camel case. This is established convention, and it makes it easy to distinguish attributes from classes.

> id --> RAitemId

RAitemId is not obvious. I assume it means Registry Authority Item Id. But what it really is, I assume, is the id for the RegisteredItem in OASIS. If so, it should be called Registered Item Item Id. But then we have two items. My rationale was simply that the ManagedObject class has an id, so let's call it 'id'.
In fact, I have since changed it for v0.2 to 'guid', based on feedback from the Tech Architecture (TA) team that ebXML TA wants to institutionalize Global Unique IDs across the board. GUIDs are also an established standard. I liked that idea a lot. So I recommend we change id to guid, and we will have alignment with TA for a reason that I can support in good conscience.

> uri --> ObjectLocation

uri is more obvious in this age of the internet. It gives you an idea of the data type and the underlying concept without even saying it. Team share thoughts.

> type --> DefnSource, PrimaryClass, SubClass

I actually would recommend getting rid of type altogether. The class already knows its type. That is why we have a class hierarchy and not just ManagedObject. Also, we had established in earlier meetings that the OASIS PrimaryClass, SubClass was inadequate. My approach is to identify ty[...] Team share thoughts.

> name --> CommonName

I will next recommend we get rid of globalName. If that is done, there is only one name associated with the object in its metadata, and calling it name should suffice. After all, what does 'Common' add or convey? Team share thoughts.

> globalName --> AssignedURN

In the review I pointed out that OASIS mixes 2 distinct concepts and needs here:

-The need for unique identification for reference purposes (a la database primary key)
-The need for a user friendly name associated with that object

I pointed out that the user friendly name does not need to be unique. Only the guid needs to be. ebXML TA has an established position on GUIDs, as I said earlier, and one that I strongly endorse. I will go out on a long limb here and say we get rid of globalName altogether. Team share thoughts.

> description --> Description
> mimetype --> MimeType

Camel case is the clear choice. This sort of critique makes me uneasy. OASIS should be happy we are consistent to the degree that this document already is.
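To make the globalName point above concrete, here is a minimal sketch in Python of keeping the two needs separate: a guid used only for reference, and a name that is allowed to repeat. The Registry class, its method names, and the use of a random UUID as the guid are assumptions made purely for illustration; nothing here is taken from the draft specs.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ManagedObject:
    """Hypothetical metadata record: 'guid' identifies, 'name' only describes."""
    name: str  # user friendly label, deliberately not required to be unique
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))  # reference key


class Registry:
    """Toy in-memory store keyed by guid (an assumption, not the spec)."""

    def __init__(self):
        self._by_guid = {}

    def add(self, obj):
        # Uniqueness is enforced on the guid only, never on the name.
        if obj.guid in self._by_guid:
            raise ValueError("duplicate guid: " + obj.guid)
        self._by_guid[obj.guid] = obj
        return obj.guid

    def get(self, guid):
        return self._by_guid[guid]

    def find_by_name(self, name):
        # A name lookup can return several objects, since names may repeat.
        return [o for o in self._by_guid.values() if o.name == name]


registry = Registry()
first = registry.add(ManagedObject(name="Purchase Order"))
second = registry.add(ManagedObject(name="Purchase Order"))
```

In this sketch two objects share the name "Purchase Order" without conflict, and only a lookup by guid is guaranteed to return exactly one object, which is the reason a separate globalName would seem to add nothing.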
> majorVersion --> partially to Version
> minorVersion --> partially to Version

This is not a naming issue but a modeling one, but.... I chose not to have a separate Version object in the model encapsulating the two attributes, because it seemed to be unnecessary overhead. That would have led to a reference from a ManagedObject to a Version object. In a relational DB implementation it would be an extra join, for no good reason that I could think of. So unless we identify a good reason, I vote we opt for simplicity (fewer classes in the model, fewer joins, less complicated queries in implementations). Team share thoughts.

> registryStatus --> RegStatus

I have already made the point on obviousness. Is RegStatus registry status or registration status? Whichever it is, why don't we spell it out? What do we save by abbreviating? 4 bytes in a repository that is likely to hold terabytes? FWIW, I originally called it just plain status. I borrowed from OASIS the idea that it is better to say it is a registryStatus, so we should use the better name from OASIS and only get rid of the abbreviation. This is a good example of borrowing a good thing but improving upon it if needed.

ExternalData Vs. RelatedData
------------------------------------

> 9) With the above clarifications, I'm OK with the content of Section 2.3.2,
> although I'd prefer a term for these objects as something other than
> "External Object". How about "Related Data". Note that there's no
> requirement that these things even have a persistent object identifier,
> just a Name and a URL is sufficient.

The concept is exactly that of OASIS RelatedData. I thought it was a good idea, and I borrowed it. It is a named hyperlink with none of the overhead of the managed object. It was a good idea, but I felt it needed improvement. My problem with RelatedData was that it was ambiguous. The key concept is that the object is external and that its life cycle is not managed by RS.
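As a rough illustration of that key concept, the following Python sketch contrasts a named hyperlink with a managed object. The class layout, the 'status' attribute, its "submitted" and "withdrawn" values, the withdraw method, and the example URI are all hypothetical and chosen only to show the distinction; they are not a proposal for the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalObject:
    """A named hyperlink; the repository does not manage what it points to."""
    name: str
    uri: str


@dataclass
class ManagedObject:
    """Metadata for submitted content whose life cycle the repository manages."""
    guid: str
    name: str
    status: str = "submitted"  # hypothetical life-cycle state

    def withdraw(self):
        # Life-cycle transitions exist only on managed objects.
        self.status = "withdrawn"


spec_link = ExternalObject(name="Some external spec",
                           uri="http://example.org/some-spec")
dtd = ManagedObject(guid="example-guid-1", name="Order DTD")
dtd.withdraw()
```

The external object carries only a name and a URI and has no life-cycle operations at all, which is the property a name like 'Related' does not convey.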
Related is not the key concept, because any two objects that are associated could be considered related. A TPA that uses a Process could be considered Related to the innocent bystander. External was my first stab. I am open to other suggestions. However, Related is ambiguous, and ambiguity is something I would like us to avoid.

The 'Data' part did not feel right because I felt that everything in the model is an Object (more on this in topic 2 tomorrow). Having a common base class for everything in the model has been a very valuable thing in Smalltalk and Java compared to C++. It gives you a place to put functionality that you want available consistently in the entire model.

Some alternatives that I had thought of were UnmanagedObject, HyperLink, and LinkedObject. In retrospect, I think I actually prefer LinkedObject to ExternalObject. Team share your thoughts.

Term Object
---------------

I believe it was Scott or Len who said in the meeting that we should get rid of the use of 'Object' everywhere. In initial drafts I used the term Document instead of Object, so there was a DocumentManager instead of an ObjectManager. It was pointed out in previous reviews that, based on decisions made in the Brussels meeting, the term Object was better, since it generalized documents and allowed for content that was not a document (e.g. a java jar file with code implementing some process). I immediately loved the excellent idea and changed all references from Document to Object. IMHO, nothing represents an instance of 'stuff' better than 'Object'. It is more obvious than 'item' or 'component' or 'entry'. It is intuitively obvious to most programmers.

BTW, we need to make sure that we have a process in place that does not revisit old decisions without a solid reason. It will slow us down otherwise. Scott, please take note of this process issue.

'Managed Object' Vs.
'Registered Object'
-------------------------------------------------

> 4) In Section 3, I'm having some difficulty with the treatment of nearly
> everything as a Managed Object. Clearly, everything the information model
> talks about is an object managed by the Registry/Repository, but not
> everything has the attributes that are specified for Managed Objects.
> Figure 4 says that things like Associations, Classifications,
> ClassificationLevels, etc., are all subclasses of ManagedObject, but
> clearly not every instance of one of these items has all of the attributes
> specified for Managed Objects in Section 3.2. I think we need to scrap the
> notion of Managed Object as being too broad. A better notion is Registered
> Object. Then we could say that every instance of a class in the UML model
> is a managed object, but not every managed object is a registered object.
>
> 5) I'd prefer to make a further distinction between Registered Object and
> an item in the Registry. Normally, every item in the registry would point
> to a registered object - but there are exceptions! A registered object may
> be withdrawn, but there still is an item in the registry explaining what it
> was. Other applications may be pointing to a registry item and we want
> that pointer to make sense, even if the registered object itself is
> withdrawn. The registry item would still have metadata, giving the status
> of the registered object as "withdrawn" and the effective date of the
> withdrawal. Even after an item is replaced, deprecated, or withdrawn, a
> user could ask "does my registered DTD point to any registered objects that
> have since been modified, enhanced, or withdrawn?".
>
> 6) I think the distinction between registered object and registry item can
> be clarified by doing something analogous to what OASIS does. I.e.
> a registry item is an instance of the RegistryItem class defined in the UML
> information model, and a registered object is an instance of some virtual
> Repository class defined elsewhere. The only access to the Repository class
> is through Registry Services. In my subsequent comments, I'll use the terms
> "registered object" and "registry item" with this meaning. I'll also assume
> that there's been a global replacement in the specification of "Registry
> Item" for "Managed Object", of "registry item" for "managed object", and in
> Section 2.3 of "registry item" for "object". I believe all of this is
> consistent with the basic definition in Section 2.3 of managed object as an
> instance of metadata, not the content of a registered object.

These detailed comments overlap with topics 2, 3 and 4. However, I will try to address the terminology aspects here, with details on the later topics tomorrow.

Here is my thought process behind why ManagedObject. I view two types of content in the repository: content whose life cycle is managed by the repository (i.e. submitted objects) and content whose life cycle is not managed by the repository. The first I refer to as a managed object. The second I refer to as an external object (which we have already discussed earlier). ManagedObject is also consistent with the Object Management Service in RS. It is a service that manages the life cycle of managed objects.

I think the real issue is that there is confusion in the use of the term 'managed object', which may refer to:

-The actual content (e.g. a DTD, a TPA), which in many cases is an XML document and not an instance of a class in the model
-The class in the model that provides meta-data for the actual content, which is in the model and may be implemented as a relational table, a java class, or a class in an OODB in an implementation of RR

I actually tried to avoid the confusion in lines 155-158 of RIF v0.1. I guess I failed. I also tried to use 'ManagedObject' Vs.
'managed object', where the former was the metadata while the latter was the actual content.

FWIW, earlier versions of the RS spec referred to ManagedObject as ObjectMetaData. Then I changed it for the following reasons:

-It is really ManagedObjectMetadata
-Once we have sub-classes of ManagedObject (e.g. Process, TPA), do we then have ProcessMetaData, TPAMetaData?

It seemed that if we could use the same name, the relationship would be more obvious.

I see a simple solution to the above confusion. We make sure that we state the 'ManagedObject' Vs. 'managed object' convention in the General Conventions section and make sure all use is consistent with the convention. Team share your thoughts.

Closing Remarks
--------------------

Len (and others) has made a ton of excellent suggestions in his very detailed comments (e.g. Organization has roles such as RA, SO, etc. and is not a base class for those concepts). Many of those comments fall under the future topics for discussion. Those will be thought provoking discussions and will not be obvious or easy to resolve.

However, he has also made quite a few comments in this terminology topic which I frankly feel were overly prescriptive. I was thinking that the OASIS team would be very supportive of v0.1 because of the obvious salute to the quality of their work in the number of concepts, terminologies and ideas borrowed. I was actually quite surprised by the level of prescriptivity and the expectation of conformance down to the level of 'd' Vs. 'D'.

If the expectation is that we should adopt OASIS, then that is not just an RR issue. It has huge implications for almost all WGs, because they would then have to align their meta-model work to be consistent. Decisions they have made, such as TA's guid decision, would have to be scrapped. IMHO, we will do ebXML a disservice with an approach like that. It will unravel much of the progress we have made so far.
The philosophy I followed, of beg/borrow whatever makes sense and improve as needed, is one that the other WGs are also following.

Finally, I implore the team to focus on the real major issues of getting the modeling issues resolved (next few topics), many of which are in the rest of Len's email. They are complex issues that need our collective energies. We need to bring our life experiences to help get this right, but we also need to bring an open mind to do things better than we have done in our past projects.

So let us resolve to quickly settle the terminology issues and begin the hard work ahead.

--
Humbly submitted,
Farrukh