ebxml-regrep message

Subject: RE: Review of thoughts on Ad Hoc Queries
From: David RR Webber <Gnosis_@compuserve.com>
To: "Nieman, Scott" <Scott.Nieman@NorstanConsulting.com>
Date: Tue, 09 Jan 2001 10:11:56 -0500
Message text written by "Nieman, Scott"
>
Should we have a straw vote on the OASIS 12/20 query approach vs. XPATH vs.
OQL?  If so, it needs to be discussed, detailed with mappings to the RIM,
then voted upon based on analysis.  These mappings could make the RS
document more complicated.  

e.g.,

Syntax            Mapping to RIM
OQL               Direct
XPATH             Requires creating virtual document views of RIM
OASIS             ????

It took us one full day to do this for XPATH, will it take another f2f for
the OASIS approach????

Regards,

Scott<

>>>>>>>>>>>>>>>>>>>

Scott,

Thanks for the assessment here.

I think we've so far been missing something vital here - what
Farrukh calls a 'breakthrough' or two.   It's tough getting everything
down on email - or even articulating this stuff clearly on calls
or in meetings sometimes.  Sometimes it takes a while for
everyone to understand all the nuances and facets.

For instance, I see the OASIS/NIST approach as basically a 
container in XML.  Nothing more.  It serves to deliver a 
simple set of locator terms.   It does not mandate any
particular query syntax.

In following this theme in the proposal I posted, I allowed
for four different locator terms, a) to d), plus e) a way to
say what comes back.

a) Simple XML tag + operator + value

b) XPath locator (path/node expression only) + operator + value

c) Specific focused queries on GUID, UID, URN, etc.

d) ANY = HTML style - I don't care about context - show me all
     hits.

e) ability to specify returned content mode.
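
As a rough illustration only, the five items above could be modelled as plain data along the following lines. This is a sketch under assumptions, not the syntax of the posted proposal: the class names, the field names, the two return modes and the example GUID value are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class LocatorKind(Enum):
    TAG = "tag"      # a) simple XML tag + operator + value
    XPATH = "xpath"  # b) XPath path/node expression + operator + value
    ID = "id"        # c) focused query on GUID, UID, URN, etc.
    ANY = "any"      # d) context-free match, show all hits


class ContentMode(Enum):
    # e) hypothetical modes; the list above only says a mode can be specified
    REFERENCE = "reference"
    CONTENT = "content"


@dataclass(frozen=True)
class Locator:
    kind: LocatorKind
    target: Optional[str]  # tag name, path, or identifier scheme; None for ANY
    operator: str
    value: str


@dataclass(frozen=True)
class Query:
    locators: Tuple[Locator, ...]
    mode: ContentMode = ContentMode.CONTENT


# A container holding three locator terms and a return mode.
query = Query(
    locators=(
        Locator(LocatorKind.TAG, "industry", "=", "plumbing"),
        Locator(LocatorKind.ID, "GUID", "=", "example-guid-0001"),
        Locator(LocatorKind.ANY, None, "=", "Cleveland"),
    ),
    mode=ContentMode.REFERENCE,
)
```

The point of the sketch is that the container carries only typed terms and a return mode; it does not commit to any particular query engine underneath.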

The difference is that this avoids the complex XPath extensions
in favour of just the nodes and paths piece.

Why would we do all this?   What I see is that d) is the de facto
standard for the Web today.  By doing a) and b) we are giving
people a better pair of tweezers to pick the right piece with.
However, we are avoiding the complexity of %like% (can I say
I do not see anything to like about %like% for our business
use?)

Then c) is the obvious focused queries that we can clearly
see are needed.  What c) overcomes is the need to write
complex joins and other cascading queries - because we 
can go straight to the right pieces.   As Farrukh noted this
works because registry is not a collection of random 
content - but content with classification with GUID and UID
associations - and exploiting this directly allows us to 
significantly lower the bar in terms of complex generalized
searching algorithms.  Simply put - we do not need OQL 
at this time to get us a functional registry.  Most especially
when - by Farrukh's own criteria - the need is to be able
to implement a registry with even simple XML flat files,
an XML parser, and sub-directories at the bottommost level.
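
To make the flat-file case concrete, here is a minimal sketch of a focused GUID lookup that needs only sub-directories and an XML parser, with no join and no query engine. The directory layout (one sub-directory per two-character GUID prefix), the function names and the sample GUIDs are assumptions made for illustration; nothing here comes from the RIM or the RS document.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def path_for(root_dir, guid):
    # Hypothetical layout: <root>/<first two characters of GUID>/<GUID>.xml
    return Path(root_dir) / guid[:2] / f"{guid}.xml"


def store(root_dir, guid, xml_text):
    target = path_for(root_dir, guid)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(xml_text, encoding="utf-8")
    return target


def lookup(root_dir, guid):
    # Go straight to the file named by the GUID; None if it is not registered.
    target = path_for(root_dir, guid)
    if not target.exists():
        return None
    return ET.parse(str(target)).getroot()


with tempfile.TemporaryDirectory() as registry:
    store(registry, "ab12-0001", "<company><name>Acme Pipes</name></company>")
    found_name = lookup(registry, "ab12-0001").findtext("name")
    missing = lookup(registry, "zz99-0000")
```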

Then e) is a vital piece. OQL was never designed to work
with returning pieces of XML content.   We saw this 
shortcoming from the PoC in Tokyo - returning URLs to
the actual content instead of the content itself.  And then
there are slices into the content within nodes - one of the
key points of tree hierarchies and XML itself.  Providing
this is a mandatory need - and I believe part of what
CC is asking for in their analysis to be able to group
XML content returned.
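
A small sketch of what a returned content mode might look like in practice: the same hit can come back as a URL, as the whole document, or as a slice of one node. The mode names, the `fetch` function, the sample document and the URL are all hypothetical and are here only to show the three shapes of answer.

```python
import xml.etree.ElementTree as ET

DOC_URL = "http://registry.example/content/uid-001.xml"  # hypothetical location
DOC = (
    "<company uid='uid-001'><name>Acme Pipes</name>"
    "<address><city>Cleveland</city><state>OH</state></address></company>"
)


def fetch(xml_text, url, mode, path=None):
    if mode == "url":
        # A pointer to the content, as in the PoC behaviour described above.
        return url
    root = ET.fromstring(xml_text)
    if mode == "content":
        # The whole document, serialized back to text.
        return ET.tostring(root, encoding="unicode")
    if mode == "slice":
        # Only the node found at `path`, relative to the document root.
        node = root.find(path)
        return None if node is None else ET.tostring(node, encoding="unicode")
    raise ValueError(f"unknown mode: {mode}")


as_url = fetch(DOC, DOC_URL, "url")
as_slice = fetch(DOC, DOC_URL, "slice", "address")
```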

OK - that said - do we need to take two days here so that
some of us can work on a short draft of a few pages that spells
this out in more detail, including a set of access models
based on the RIM that we feel we need (classification -> content
locations, content <-> content associations, et al.)?
I'm still not clear on that part especially.  And this seems
to be the heart of the matter.   What is the level of complexity
we need and why?

Maybe we could break this down.  We could do that bit
first, and then look at the three approaches, OQL, XPath,
and containered focused querying (CFQ).  
Notice that CFQ extends the simple OASIS model to
fit the ebXML RIM and requirements.  My sense is we
are much closer than we were five days ago, and that
a few more days are worth spending at this point.

Also five days ago we were struggling for clear examples
in XML of content.   I'm not sure we have that solved yet.
Have we taken, say, the CPP as an example, added
some classification structure for it, added
GUID and UID references to some sample company
registration XML content, and then run some business
case queries - \industry = 'plumbing'  AND \city='Cleveland' -
through it to sanity check it all?
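
As a starting point for that kind of sanity check, here is a minimal sketch that runs the plumbing/Cleveland query as two simple tag = value terms ANDed together. The sample registrations, the element names and the `uid` attribute are invented for illustration; they are not taken from the CPP or any real registry content.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content, one <company> entry per registration.
SAMPLE = """
<registrations>
  <company uid="uid-001">
    <name>Acme Pipes</name>
    <industry>plumbing</industry>
    <city>Cleveland</city>
  </company>
  <company uid="uid-002">
    <name>Lake Erie Plumbing</name>
    <industry>plumbing</industry>
    <city>Toledo</city>
  </company>
  <company uid="uid-003">
    <name>Forest City Bakery</name>
    <industry>bakery</industry>
    <city>Cleveland</city>
  </company>
</registrations>
"""


def matches_all(entry, terms):
    # AND together simple "tag = value" terms against one entry.
    return all(entry.findtext(tag) == value for tag, value in terms)


def run_query(xml_text, terms):
    root = ET.fromstring(xml_text)
    return [c.get("uid") for c in root.findall("company") if matches_all(c, terms)]


hits = run_query(SAMPLE, [("industry", "plumbing"), ("city", "Cleveland")])
```

With this sample, only the entry that is both in plumbing and in Cleveland is returned; the other two each satisfy just one of the terms.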

Let's clearly understand the alternatives here - I'm not
sure, having read all that's been presented thus far, that I
yet do.

Thanks, DW.