ebxml-regrep message

Subject: FW: re "What is a Repository?"
From: "Nieman, Scott" <Scott.Nieman@norstanconsulting.com>
To: ebxml-regrep@lists.oasis-open.org
Date: Thu, 30 Dec 1999 11:02:02 -0600
<Response>
<Intro>Yes, everyone, I have been on vacation, and will be on vacation
tomorrow 12/31 as well.  I will TRY to respond to some of the messages today.

Terry, here are some responses to your Notes.  And YES, this MAY seem like
technical mumbo jumbo, but I believe we need to get down to this level to
deliver a specification/system that provides SIMPLE interfaces.  The
SIMPLER the interface, quite often the more difficult the implementation -
"information hiding" at its best.  
</Intro>
<Subject>Mapping to the M3 level
<Terry>"unnecessary work to do the decomposition"</Terry>
<Scott>I do not agree that it is unnecessary to map to M3.  If a goal is to
support the software development life cycle (SDLC [ISO 11179]), it seems
mandatory to me unless you have an infinite budget.  As Giuseppe Facchetti
from IBM pointed out in his XMI presentation, there are significant time,
labor, and cost savings in mapping from M2 to M3.  Mathematically this is
combinatorics:  there are n!/[r!*(n-r)!] direct M2 to M2 mappings, where n
is the number of M2 metamodels to map, taken r at a time with r=2 (that is,
n(n-1)/2 pairs).  Therefore the M2 to M2 approach ultimately results in
unmanageable spaghetti.

By mapping each M2 metamodel to M3 instead, the total number of maps is 2n
(one into M3 and one out of M3 per metamodel).  In other words, if I wanted
to generate Java code from XML Schema, I only need to map XML Schema to the
M3 level ONCE and let the M3 to Java mapping accomplish the code generation.
Therefore, once I have developed a map from the XML Schema metamodel to the
M3, I am done with my work and I can use a transformation tool to persist an
XML Schema document into the repository.  Otherwise I would be busy mapping
XML Schema to Java syntax one day (more like months, which I see some XML
vendors attempting now -- oops!), C++ next, Smalltalk, Perl, Python, and so
forth.  This approach suggests that it also becomes the responsibility of
the Java community to manage the M3 to Java mapping (which they already have
signed up for from a MOF perspective).  Same for the C++ community etc.;
everyone manages their own destiny.
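
As a rough illustration of the counting argument above, here is a minimal
Python sketch.  The function names are my own and purely illustrative; it
assumes one map per unordered pair of metamodels for the direct approach and
two maps per metamodel (into and out of M3) for the hub approach.

```python
from math import comb

def pairwise_maps(n: int) -> int:
    """Direct M2-to-M2 maps: one per unordered pair of metamodels, C(n, 2)."""
    return comb(n, 2)

def hub_maps(n: int) -> int:
    """Maps through M3: one into M3 and one out of M3 for each metamodel."""
    return 2 * n

# Compare the two approaches as the number of M2 metamodels grows.
for n in (2, 5, 10, 20):
    print(f"n={n:2d}  pairwise={pairwise_maps(n):3d}  via M3={hub_maps(n):3d}")
```

Under these assumptions the two counts are equal at n=5, and the pairwise
count grows quadratically beyond that while the M3 count stays linear.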

DTDs and Schemas could easily be served up in the M3 form, but they would
likely be parsed and reconstructed on demand, and if performance becomes
an issue, parsed AND stored in their original form as a blob or file.

Finally, I believe, even though this is open for debate, that a truly
bidirectional isomorphic repository cannot exist at the M2 level, since one
M2 metamodel may produce more or fewer artifacts than another M2 metamodel
[ref: Object Oriented Metamethods, Brian Henderson-Sellers 1997].  That
means the M2 repository has a specific, limited focus, instead of focusing
on the SDLC.  It is also not extensible.  Therefore an M2 level based
repository may not be able to generate a mapping to another M2 metamodel,
since the M2 repository may not be able to completely hold specific
information required by the other M2 metamodel.  An M3 level repository
allows one to "fill in the blanks" and store the required information for
the targeted M2 physically in the repository.  Another M2 metamodel may
never need that stored information, so it would be ignored in its mapping
from the M3.
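
A toy Python sketch of the "fill in the blanks" idea may help.  It assumes
the neutral (M3 style) store can be pictured as a dictionary of properties;
every property name and both target views are hypothetical and are not taken
from MOF or from any real metamodel.

```python
# Neutral (M3-style) record for one element; all keys are illustrative only.
neutral_element = {
    "name": "firstName",
    "datatype": "string",
    "max_length": 40,                            # used by a hypothetical schema target
    "documentation": "Given name of a person",   # used by a hypothetical modeling target
}

def export_for_target(element: dict, needed: set) -> dict:
    """Project the neutral record onto the properties one target metamodel uses."""
    return {key: value for key, value in element.items() if key in needed}

# Each target takes what it needs and ignores the rest.
schema_view = export_for_target(neutral_element, {"name", "datatype", "max_length"})
model_view = export_for_target(neutral_element, {"name", "datatype", "documentation"})
```

The neutral record carries the union of what the targets require, so neither
target has to be able to hold the other's extra information.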

I will try to put a page up on the site after the new year trying to clarify
this.
</Scott>
</Subject>
<Subject>Scope of ebXML
<Terry>"For ebxml purposes, which I take to be the storage of schemas for
defined e-commerce documents"</Terry>
<Scott>I would like to relate this to the eCo Framework if possible.  In
that context, I believe our domain is the Services, Interactions, Documents,
and Information Items, not just documents.   Certainly, TMWG has been
focused on the Services and Interaction layers through the promotion of UML
business process modeling, as well as generating the Documents and
Information Items from these models. (Personally I view the eCo Framework as
the closest 'physical' thing to the ISO 14662 Open-edi Reference Model, if
minor tweaks are made to it.)  In the formation of the ebXML project, there
was much discussion surrounding reuse of the OASIS repository effort, to see
if it could also fit the needs of TMWG for storing UML business process models.


One could argue that we may need to be responsible for all the eCo "type
registries" as well, if the Architecture group wants to adopt eCo.
</Scott>
</Subject>
<Subject>Why
<Terry> "I am not sure that "why" is required, although I can see the point
of it."</Terry>
<Scott>Well, to be honest, this uncertainty is one of the very reasons why
traditional EDI has failed, because the EDI abuser-base needs to know in what
context data should be used and what problem it solves (the why). We learned
that the best way to put data into context was through process modeling, and
that business process improvements can be made through analysis of these
models and evaluation of alternatives.  This type of analysis is why larger,
established organizations have been able to eliminate the transmission of
purchase orders, invoices, etc.

I relate the "why" to the "Services" and "Interaction" layers of eCo.

I want to be able to post the question: "I want to buy a DSL modem, using my
corporate purchase card."  Parsing this, the "DSL modem" relates to the
"services layer", and the "buy" and "corporate purchase card" relate to the
"interaction layer".  "Why" comes into play because additional information
may be required since the corporate purchase card is not exactly a VISA card
and we may need to know who the corporation is and qualify them (another
interaction potentially).
</Scott>
</Subject>
<Subject>How
<Terry>"Even "how" isn't obviously required."</Terry>
<Scott>The "how" allows traceability throughout the SDLC to the "tools" that
were used in refining a particular model.  For example:  Someone had an old
SGML DTD (M1 level), and loaded it into an M3 repository. I now realize
through searching that some of that DTD is useful to extend my UML model
since it covers a specialization that is new to me.  I add its contribution
to the model and I can accurately track my UML model's history through
versioning.  It could also distinguish an XML Schema that was manually
created using an editor from one that was autogenerated from a UML model.
</Scott>
</Subject>
<Subject>Component Diagram / Locator Service
<Terry>"I can't see any need for the M2 if the objects in the M3 have
identifiers.  It's true if the M1 uses, for example, URNs, it needs a URN
resolution service, but I don't see why that's an M2."</Terry>
<Scott>I knew this would throw people off.  Associated with the URN is a
fragment of the metadata, right? (e.g. a tagname like "firstName" or a DTD
itself).  ANYTIME you are storing M1 level metadata instances in a database,
you need a DB schema that describes "data about metadata".  This DB schema
is at the M2 level, which is defined as a "metamodel" (X3.285 is accurately
titled).  The URN is a pointer/identifier that ties that metadata instance
to the specific repository instance.  Therefore the resolution/locator
service has certain metadata search capabilities and links to repository
instances if more information is needed.  The locator service contains
enough of a subset of the metadata to enable this search AND contains the
URN to point at the repository.  The registry is purely administrative,
storing ONLY information like "Joe Young from Norstan Consulting registered
XYZ DTD on December 28, 1999" etc.  Its database is at the M1 level since
its instances are actual company, people, date/time object instances.

What we are trying to show through the component diagram is that, since
there are different types of information to be stored (M0 administrative
object instances and M1 metadata object instances), it is best to partition
them into separate software components, since their functional scope is
unique and their software interfaces are going to be different (which will
be clearer as the modeling continues).  They MAY physically be deployed on
the same server, which may be considered the "registry", but that is not
necessarily the case since different technologies may be used in their
implementation.
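
To make the partitioning concrete, here is a minimal Python sketch of the
three components as separate data structures.  All class names, field names,
and the example URN are assumptions of mine for illustration only; nothing
here is drawn from the component diagram or from any specification.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RegistryEntry:
    """Administrative facts only: who registered what, and when."""
    registrant: str
    organization: str
    item_name: str
    registered_on: str  # ISO 8601 date string

@dataclass(frozen=True)
class LocatorRecord:
    """A searchable subset of the metadata plus the URN of the full item."""
    urn: str
    keywords: frozenset
    repository_id: str

@dataclass
class Repository:
    """Holds the complete metadata instances, keyed by URN."""
    repository_id: str
    items: dict = field(default_factory=dict)

def search(locator_records, keyword):
    """Return the locator records whose keyword subset contains the term."""
    return [record for record in locator_records if keyword in record.keywords]

def resolve(record, repositories):
    """Follow a locator record's URN into the repository that holds the item."""
    return repositories[record.repository_id].items[record.urn]

# Hypothetical data, loosely following the registration example in the text.
entry = RegistryEntry("Joe Young", "Norstan Consulting", "XYZ DTD", "1999-12-28")
repo = Repository("repo-1", {"urn:example:xyz-dtd": "<!ELEMENT firstName (#PCDATA)>"})
record = LocatorRecord("urn:example:xyz-dtd", frozenset({"firstName", "dtd"}), "repo-1")

hits = search([record], "firstName")
full_item = resolve(hits[0], {repo.repository_id: repo})
```

In this sketch the locator answers the search without touching the
repository, and the repository is consulted only when the full item is
needed.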
</Scott>
</Subject>
</Response>
-----Original Message-----
From: Terry Allen [mailto:tallen@sonic.net]
Sent: Thursday, December 23, 1999 12:57 PM
To: ebxml-regrep@lists.oasis-open.org
Subject: re "What is a Repository?"


Notes on "What is a Repository Anyhow" (metalevels.html)
Terry Allen

This makes sense to me; I take note that the goal is "to store
the model so that a development tool could use the information",
which is more specific than what OASIS is doing, and requires,
as the OASIS spec does not, that the contents of data element
dictionaries be decomposed into a common format - or, in
the words of this document, "mapped to the M3 meta-metamodel
level."

OASIS isn't requiring this because one of our goals is to enable
DTDs and schemas to be served for parsing of instances, and
it seems like unnecessary work to do the decomposition.  For
ebxml purposes, which I take to be the storage of schemas
for defined e-commerce documents, the decomposition makes sense
(although for e-commerce in general there will be some documents
that are treated as attachments and the schemas for which needn't
be stored in this repository).

"The question is whether these documents should be parsed into their
subatomic artifacts and stored into indexed tables in a relational
database, or do we need to wait for query technologies such as
XQL to enable high performance search capabilities?"

Not my area of expertise, though we can ask for opinions from
others of the OASIS Regrep TC.  It isn't a new question, and
I believe the SGML world has learned to live with relational
databases just fine.  I do know at least one person who's not
using a database at all (and it's not me using my file system
and grep ...).  The issue is probably one of what tech is available
for use at our target date (whenever that is).

"The Registry - The main aspect is that a registry implies "to register"
meaning: what am I registering, who am I, when did I register it (also
implies what version), how was it created, why did I create it
(what problem domain) and where is this information located.
The "who", "when" and "how" are administrative functions.
The complications occur with the "what" and the "where". If the
"what" really resides in a repository "somewhere", there must
be sufficient information about the "what" to point to the "where"
WITHOUT a complete replication of the repository or multiple
repositories for that matter (try stating that 100 times fast).
Specifically, some metamodel representations such as UML can generate
great VOLUMES of rich, semantic information. What is the answer
to limiting the amount of information in the registry, but ensuring
enough to find the appropriate information within the repository?"

I am not sure that "why" is required, although I can see the point
of it.  Even "how" isn't obviously required.  For OASIS we made this
the 11179 administrative metadata plus a couple classifications
(see some of the .ent files at the OASIS site for these).  I suppose
the answer here relates to "what do you want to see in the interface
to the registry?"

X3.285 and metamodels.  I agree it's confusing, and in fact I'm
waiting for the next Open Forum to get caught up on X3.285 again.
Intellectually I'm uncomfortable with the OMG requirement for a M3
level, but not enough to worry me.  Especially as others seem to
like it!  The diagram with M1, M2, and M3 doesn't not make sense
to me; I can't see any need for the M2 if the objects in the M3 have
identifiers.  It's true if the M1 uses, for example, URNs, it needs a
URN resolution service, but I don't see why that's an M2.

Reading the description below the diagram, I have to say I think
the "locator service" is just part of the registry (the M1).  Why
isn't it?

As for the conclusions:

1. the UML Use Case modeling continue to serve as the basis of the
ebXML Registry and Repository effort without jumping into an
implementation,

2. X3.285 information utilized as much as possible with potential
convergence of the repository functionality to a pure meta-metamodel
such as the Meta Object Facility, and

3. the revised work effort be submitted to SC32 and X3L8 for review
and incorporation into ISO 11179.

are all fine by me.  I believe that what's in the OASIS spec is a bottom-up
approach to the problem that ebxml is attacking from the top down, and
that as we're in agreement on the core metadata, we'll be able to hook
up in the middle, perhaps with some adjustments on either end.  I'm
still concerned about XMI, but I'm sure I'll learn more in Santa Fe, if
not sooner.

regards, Terry