ElasticSearch index vs search time analyzer

I'm running into a problem that makes me think I don't fully understand index-time vs search-time analysis in ElasticSearch 5.5.
Let's say I have a basic index of people with just a name and a state. For simplicity, I have set al => alabama as the only state synonym.
PUT people
{
  "mappings": {
    "person": {
      "properties": {
        "name": {
          "type": "text"
        },
        "state": {
          "type": "text",
          "analyzer": "us_state"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "state_synonyms": {
          "type": "synonym",
          "synonyms": "al => alabama"
        }
      },
      "analyzer": {
        "us_state": {
          "filter": [
            "standard",
            "lowercase",
            "state_synonyms"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  }
}
 
My understanding is that when I index a document, the state field data will be indexed in its expanded synonym form. This can be tested by running:
GET people/_analyze
{
  "text": "al",
  "field": "state"
}
 
which returns 
{
  "tokens": [
    {
      "token": "alabama",
      "start_offset": 0,
      "end_offset": 2,
      "type": "SYNONYM",
      "position": 0
    }
  ]
}
 
Looks good, let's index a document: 
POST people/person
{
  "name": "dave",
  "state": "al"
}
 
And perform a search: 
GET people/person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "state": "al"
          }
        }
      ]
    }
  }
}
 
which returns nothing: 
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
 
I would expect the al in my search to be run through the same us_state analyzer and match my document. However, the search does work if I change my query to: 
"term": { "state": "alabama" } 
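
A likely explanation, offered as a sketch rather than a confirmed diagnosis: a term query is not analyzed, so the literal term al is looked up in the index, where only alabama was stored. A match query, by contrast, runs the query text through the field's search analyzer, which defaults to the field's analyzer (us_state here). Assuming the people index and mapping defined above, a query along these lines should find the document:

```
GET people/person/_search
{
  "query": {
    "match": {
      "state": "al"
    }
  }
}
```

If an exact term lookup is needed instead, the term has to be given in its indexed form (alabama), which is what the working term query above already does.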

Source: https://stackoverflow.com/questions/50614230

Updated: 2022-08-25 08:08

Best answer

The most important thing in your naming convention is consistency. You can figure out pretty much any naming convention as long as it is sane and consistent.
That being said, I would probably be more verbose in my names in this case. Paths might be good enough, but I would rather see UserRoutes.js, UserModel.js and maybe even UserLib.js based on your examples.
In some of my Node.js projects, I have even taken to not using a .js extension. My routes file, for instance, would be user.routes. It is easy enough to change the syntax highlighting in editors based on different extensions.
