Elasticsearch index vs search time analyzer
I'm running into a problem which makes me think I don't fully understand index vs search time analysis in Elasticsearch 5.5.

Let's say I have a basic index for a person with just a name and a state. For simplicity I have set al => alabama as the only state synonym.

PUT people
{
  "mappings": {
    "person": {
      "properties": {
        "name": { "type": "text" },
        "state": { "type": "text", "analyzer": "us_state" }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "state_synonyms": {
          "type": "synonym",
          "synonyms": "al => alabama"
        }
      },
      "analyzer": {
        "us_state": {
          "filter": [ "standard", "lowercase", "state_synonyms" ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  }
}

My understanding is that when I index a document, the state field data will be indexed in the expanded synonym form. This can be tested by running:

GET people/_analyze
{ "text": "al", "field": "state" }

which returns

{
  "tokens": [
    { "token": "alabama", "start_offset": 0, "end_offset": 2, "type": "SYNONYM", "position": 0 }
  ]
}

Looks good, let's index a document:

POST people/person
{ "name": "dave", "state": "al" }

And perform a search:

GET people/person/_search
{ "query": { "bool": { "should": [ { "term": { "state": "al" } } ] } } }

which returns nothing:

{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 0, "max_score": null, "hits": [] }
}

I would expect the al in my search to be run through the same us_state analyzer and match my document. However, the search does work if I change my query to:

"term": { "state": "alabama" }
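The behavior above is consistent with how Elasticsearch documents the term query: the search value is not analyzed, so it is compared as-is against the tokens stored in the index, whereas a full-text query such as match runs the query text through the field's analyzer first. The following is only a toy in-memory sketch of that difference, not Elasticsearch code. The names us_state_analyze, build_index, term_query and match_query are hypothetical helpers invented for illustration, and the regex tokenizer is a rough stand-in for the standard tokenizer.

```python
import re

# Hypothetical stand-in for the us_state analyzer from the question:
# split into word tokens, lowercase, then apply the rule "al => alabama".
SYNONYMS = {"al": "alabama"}


def us_state_analyze(text):
    tokens = re.findall(r"\w+", text)
    tokens = [t.lower() for t in tokens]
    return [SYNONYMS.get(t, t) for t in tokens]


def build_index(docs):
    """Map each analyzed token of the state field to the ids of docs containing it."""
    index = {}
    for doc_id, doc in docs.items():
        for token in us_state_analyze(doc["state"]):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    # Term-level lookup: the value is compared to the indexed tokens unchanged.
    return index.get(value, set())


def match_query(index, text):
    # Full-text lookup: the query text goes through the analyzer first.
    hits = set()
    for token in us_state_analyze(text):
        hits |= index.get(token, set())
    return hits


docs = {1: {"name": "dave", "state": "al"}}
index = build_index(docs)

print(term_query(index, "al"))       # set()  (only "alabama" was indexed)
print(term_query(index, "alabama"))  # {1}
print(match_query(index, "al"))      # {1}    ("al" is analyzed to "alabama")
```

Under that reading, swapping the term clause for a match clause on state would be the equivalent change in the real query, since match applies the field's analyzer at search time.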
Original: https://stackoverflow.com/questions/50614230
Best answer
The most important thing in your naming convention is consistency. You can figure out pretty much any naming convention as long as it is sane and consistent.

That being said, I would probably be more verbose in my names in this case. Paths might be good enough, but I would rather see UserRoutes.js, UserModel.js and maybe even UserLib.js based on your examples.

In some of my node.js projects, I have even taken to not using a .js extension. My routes, for instance, would be user.routes. It is easy enough to change the syntax highlighting in editors based on different extensions.