Re: Tags (searching)



On Jun 12,  9:41pm, Joe Cooper wrote:
> Subject: Re: Tags
> > [...]
> > Again, I don't think we have anything to define subsets, except
> > to remove those "depreciated" tags.  Until we have some search
> > engines that take advantage of the DocBook markup, there's no
> > reason to define any more than "ok to use" and "don't use"
>
> Yes and no.  If the search engine is designed to our task, then wouldn't
> it be nice (and easier for the search engine) if we all agreed to mark
> out our figures in the same way?  And our code examples?  And our
> command line examples?  Most of these things are obvious from the
> DocBook specification really...but maybe some of it needs to be
> codified.  I don't know.
>
> > > Yes. How about "Subsets". We need three -
> > > Required, permitted, and searchable.
> >
> > Hmm, that would make four, the way that I count.  However, they would
> > definitely have some overlap.  Required would be the ones that you
> > MUST have, in order to have a valid HOWTO document.  Permitted would be
> > ones that are allowed in HOWTOs, but not required.  Searchable would
> > be some from both sets, although not necessarily all of either set.
> > These would be the ones that our search engine/viewer understands.
> > The last set would be restricted tags, which would basically be any
> > tags that we don't want people to use.
>
> Ok.  Four sets it is.  And work on making the 'searchable' set match the
> superset containing both required and permitted?  No reason not to keep
> improving the search engine until it can provide complete indexing of
> the entire LDP.
>
> > I've put some minor thought into doing this, but it's a big enough
> > project that I need to get back up to speed with programming first.
>
> I know the feeling.  I can whip out a perl script to do little
> things...but a database caliber query tool is a very daunting task.  I
> keep hoping a perl master will step up to the plate.  My $50USD offer
> still stands for an intelligent context sensitive DocBook search script
> that works well for the LDP.  Others are welcome to add to that pot.  I
> imagine if the pot gets big enough someone will take the week or two it
> will take to make something happen.

Once AGAIN I want to reiterate...

1) There is a tag/field/structured search capability that is near
   completion -- the OMF metadata framework.

   Please see http://metalab.unc.edu/osrt/omf/

2) One goal of the LDP *should be* to utilize this technology as a
   primary tag/field/structured search capability, in addition
   to the "shotgun" full-text search.

   There is no sense to re-inventing the wheel if this provides the
   necessary capabilities and can be easily hooked into the LDP.

3) I initially helped to populate the database with a set of HOWTOs
   and guides which I filtered to XML "records" from their native
   linuxdoc. This filter needs some work, and needs to be expanded to
   use DocBook, but the concept is in place and will work!

   I intend to continue with this work/effort, unless I hear otherwise
   from the LDP.
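
To make point 3 a bit more concrete, here is a rough sketch of the
kind of filter I mean: pull a few fields out of a document and emit a
flat "record". This is NOT the actual filter and NOT the OMF record
format; the function name, the field names, and the sample document
are all made up for illustration, and it assumes the DocBook source is
well-formed XML rather than SGML.

```python
import xml.etree.ElementTree as ET


def docbook_to_record(xml_text):
    """Extract a few metadata fields from a DocBook-style XML document.

    Hypothetical helper: the field names below are illustrative only
    and do not follow any real metadata schema.
    """
    root = ET.fromstring(xml_text)

    def text_of(path):
        # Return the whitespace-normalized text of the first match, or None.
        node = root.find(path)
        if node is None:
            return None
        return " ".join("".join(node.itertext()).split())

    return {
        "type": root.tag,                 # top-level container, e.g. "article"
        "title": text_of(".//title"),     # first <title> in document order
        "author": text_of(".//author"),
        "abstract": text_of(".//abstract"),
    }


# Made-up sample document, for illustration only.
SAMPLE = """<article>
  <artheader>
    <title>Example HOWTO</title>
    <author><firstname>Jane</firstname> <surname>Doe</surname></author>
    <abstract><para>A made-up abstract.</para></abstract>
  </artheader>
  <sect1><title>Intro</title><para>Body text.</para></sect1>
</article>"""

record = docbook_to_record(SAMPLE)
print(record)
```

The point is only that a consistently marked-up header gives a search
tool fields to work with; a real filter would have to map these onto
whatever the metadata framework actually expects.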

I also tend to believe that setting up "tag sets" is far too limiting
and takes the inherent power of SGML away from the author. The DTD
is already designed/constructed to allow for "tag sets", dependent upon
the top-level container tag chosen - <book> or <article>. Let's
concentrate on the development of solid, robust templates and a brief
set of authoring rules/guidelines (such as those Poet mentioned -
no minimization of tags, no use of deprecated tags, etc). Getting
into specific tag usage and/or customized tags leads us astray from the
standard...not a good thing to do.

imo, of course.

regards,


-- 
Greg Ferguson     - s/w engr / mtlhd         | gferg@sgi.com
SGI Tech Pubs     - http://techpubs.sgi.com  | 
Linux Doc Project - http://www.linuxdoc.org  |

