Subject: Re: Element search with fuzzy categories
Newsgroups: lugnet.admin.database
Date: Mon, 11 Jan 1999 01:08:32 GMT

"Tim McSweeney" <tim##NO_SPAM##@ams.co.nz> writes:
> > Food for thought: If there were a category called desert, you'd want Egypt
> > to come up under desert, but not mummies (right?). How can "stops" be put
> > in to halt expansions like that?
>
> Why would you not want mummies in the desert? OK, I can sort of see why, but
> they are related (by about 80% :)
Well, I guess it comes from the fact that I'd expect desert to go to Egypt and
maybe Outback, but I'd probably be surprised if mummies came up in a query on
desert. Maybe if it told me that mummies came up indirectly through Egypt...
> The percentage cutoff would be useful here. If the cutoff is set to x%
> then once a subgroup falls below x% relative to the _Original_ query (with
> all the modifiers between the two multiplied in) then the search should stop
> exploring that branch.
> Maybe the percentages need to be tweaked a bit.
I'm thinkin' there probably are many cases where the percentages might
naturally be as high as 90% or 100% -- so there'd still need to be stops
somehow to avoid artificial/contrived contortions in the percentages. The
desert->Egypt->mummies example I gave is a poor one, so I'll try to think up a
better one.
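
To make the concern concrete, here is a minimal sketch of the multiplied-in cutoff described in the quoted text. The link percentages, the 50% cutoff, and the helper name `path_weight` are all made-up illustrative assumptions, not data or code from the actual element database.

```python
from math import prod

# Hypothetical fuzzy links; the percentages are illustrative only.
LINKS = {
    ("desert", "Egypt"): 0.90,
    ("Egypt", "mummies"): 1.00,
}

def path_weight(path):
    """Multiply the link percentages along a path of category names."""
    return prod(LINKS[(a, b)] for a, b in zip(path, path[1:]))

CUTOFF = 0.50  # assumed cutoff, relative to the original query

weight = path_weight(["desert", "Egypt", "mummies"])
still_expanded = weight >= CUTOFF
print(weight, still_expanded)  # prints: 0.9 True
```

With links this strong, mummies still arrives at 90% on a desert query, so a percentage cutoff alone would not stop the expansion unless the percentages were lowered artificially.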
> Questions.
>
> How does the search currently know when to stop?
You mean in the /dbtoys/elementsearch/ example? There, it just expands
everything on downward until it hits zero -- no percentage cutoff there.
> How does it deal with circular references (e.g. Tyre=100% Tire, Tire=100% Tyre)?
It uses a "visited" table/checklist/hash so that it doesn't go down sub-graphs
more than once. However, it -will- re-explore and re-expand a sub-graph if
ever a fuzzy-percentage on something is greater on subsequent encounters than
it was on previous encounters. It uses a queue to track all pending sub-graph
exploration needs, and orders these by current fuzzy-weight to minimize the
amount of sub-graph re-exploration.
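
The description above can be sketched roughly as follows. This is a Dijkstra-style approximation written for illustration, not the code behind /dbtoys/elementsearch/; the function name `expand`, the graph layout, and the sample percentages are assumptions.

```python
import heapq

def expand(graph, start):
    """Return the best fuzzy weight found for each category reachable from start.

    graph maps a category to a dict {neighbor: percentage between 0 and 1}.
    """
    best = {start: 1.0}      # the "visited" table: highest weight seen so far
    queue = [(-1.0, start)]  # heapq is a min-heap, so weights are negated
    while queue:
        neg_weight, node = heapq.heappop(queue)
        weight = -neg_weight
        if weight < best.get(node, 0.0):
            continue         # stale entry: node was already reached at a higher weight
        for neighbor, pct in graph.get(node, {}).items():
            new_weight = weight * pct  # percentages multiply along a path
            if new_weight > best.get(neighbor, 0.0):
                best[neighbor] = new_weight  # new or higher encounter: (re)queue it
                heapq.heappush(queue, (-new_weight, neighbor))
    return best

# Hypothetical data, including the circular Tyre/Tire reference.
graph = {
    "tyre": {"tire": 1.0},
    "tire": {"tyre": 1.0, "wheel": 0.8},
    "wheel": {"tire": 0.5},
}
result = expand(graph, "tyre")
print(result)
```

The circular reference terminates because coming back to a category at an equal or lower weight never updates the table, and a link at 0% never produces a weight above the default of zero, so the expansion stops there.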
> If a category is reachable by two different paths through the concept space
> then which does it follow? (The one with the highest % would be my guess)
Yes, first the highest, and it will only re-explore it if it re-encounters it
at a higher percentage. If it re-encounters it at a lower percentage, it
ignores it because the percentages are multiplicative rather than additive.
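
One way to spell out why a lower-percentage re-encounter can safely be ignored, assuming every link percentage lies between 0% and 100%: let w_1 be the weight already recorded for a category, w_2 the weight of the later encounter, and p the product of the percentages along any path continuing onward from that category (these symbols are introduced here for illustration).

```latex
\[
  w_1 \ge w_2 \quad\text{and}\quad 0 \le p \le 1
  \;\Longrightarrow\;
  w_1 \, p \;\ge\; w_2 \, p
\]
```

For example, a category first reached at 90% and later at 60%, followed by an 80% link, gives 0.9 x 0.8 = 0.72 on the first path and 0.6 x 0.8 = 0.48 on the second, so the second path cannot raise anything downstream.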
> How are you going to get all the data and relationships entered? It seems
> rather tedious.
See <http://www.lugnet.com/plan/> in the "Phase II" section.
> Congratulations on a cool search engine!
Well, thanks. :) I don't know that it's anything new, though -- the algorithm
probably was invented and published 20 or 30 years ago by someone... I haven't
scoured the journals to see...
--Todd