用例
当用户访问我的网站时,他们会遇到一个类似于 SO 的搜索框.他们可以使用计划文本搜索结果..net 问题"、封闭式问题"、.net 和 java"等.搜索的功能与 SO 略有不同,因为它将尽可能多地尝试数据库的架构,而不是直接全文搜索.所以.net 问题"只会搜索 .net 问题而不是 .net 答案(可能不适用于 SO 案例,这里只是一个例子),封闭问题"将返回封闭的问题,.net 和 java"问题将返回与 .net 和 java 相关的问题,仅此而已.
When a user goes to my website, they will be confronted with a search box much like SO. They can search for results using plan text. ".net questions", "closed questions", ".net and java", etc.. The search will function a bit different that SO, in that it will try to as much as possible of the schema of the database rather than a straight fulltext search. So ".net questions" will only search for .net questions as opposed to .net answers (probably not applicable to SO case, just an example here), "closed questions" will return questions that are closed, ".net and java" questions will return questions that relate to .net and java and nothing else.
问题
我对这些词不太熟悉,但我基本上想为 SQL 驱动的搜索做一个关键字.我知道数据库的架构,我也可以对数据库进行数据挖掘.我想知道在我尝试实施之前已经存在的任何当前方法.我想这个问题是针对所述问题的良好设计.
I'm not too familiar with the words but I basically want to do a keyword to SQL driven search. I know the schema of the database and I also can datamine the database. I want to know any current approaches there that existing out already before I try to implement this. I guess this question is for what is a good design for the stated problem.
建议
到目前为止我提出的解决方案看起来像这样
My proposed solution so far looks something like this
想法/建议/链接?
我经营一家数字音乐商店,采用单一搜索",根据关键字的出现次数和产品出现的模式对关键字进行加权,例如.具有不同的列,如艺术家"、标题"或出版商".
I run a digital music store with a "single search" that weights keywords based on their occurrences and the schema in which Products appear, eg. with different columns like "Artist", "Title" or "Publisher".
产品也与专辑和播放列表有关,但为了更简单的说明,我只详细说明产品关键字的索引和查询.
Products are also related to albums and playlists, but for simpler explanation, I will only elaborate on the indexing and querying of Products' Keywords.
Keywords
表 - 每个可能被搜索的词的加权表(因此,它在某处被引用),每个记录的数据如下:
Keywords
table - a weighted table for every word that could possibly be searched for (hence, it is referenced somewhere) with the following data for each record:
ProductKeywords
表 - 任何产品字段(或列)引用的每个关键字的加权表,每条记录包含以下数据:
ProductKeywords
table - a weighted table for every keyword referenced by any of a product's fields (or columns) with the following data for each record:
权重值表示单词出现的频率.匹配权重较低的关键字更独特",更有可能成为搜索对象.这样,经常出现的词会自动降低权重",例如.the"、a"或I".但是,最好在索引之前去除那些常见词的原子出现.
The weighting value is an indication of how often the words occurs. Matching keywords with a lower weight are "more unique" and are more likely to be what is being searched for. In this way, words occurring often are automatically "down-weighted", eg. "the", "a" or "I". However, it is best to strip out atomic occurrences of those common words before indexing.
我使用整数进行加权,但使用十进制值会提供更多功能,排序可能会稍微慢一些.
I used integers for weighting, but using a decimal value will offer more versatility, possibly with slightly slower sorting.
每当更新任何产品字段时,例如.艺术家或标题(这种情况不经常发生),数据库触发器在交易中重新索引产品的关键字:
Whenever any product field is updated, eg. Artist or Title (which does not happen that often), a database trigger re-indexes the product's keywords like so inside a transaction:
ProductKeywords
表中的产品相关以进行直接匹配.ProductKeyword
权重调整中减去 1 或 2.ProductKeywords
table for a direct match.ProductKeyword
weight adjustment.这篇关于SQL 搜索的关键字的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!