ElasticSearch - bulk indexing for a completion suggester in python

I am trying to add a completion suggester to enable search-as-you-type for a search field in my Django app (using Elasticsearch 5.2.x and elasticsearch-dsl). After trying to figure this out for a long time, I still cannot work out how to bulk index the suggester. Here's my code:
from elasticsearch_dsl import DocType, Text, Keyword, Completion

class SchoolIndex(DocType):
    name = Text()
    school_type = Keyword()
    name_suggest = Completion()
 
Bulk indexing is done as follows:
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

def bulk_indexing():
    SchoolIndex.init(index="school_index")
    es = Elasticsearch()
    bulk(client=es, actions=(a.indexing() for a in models.School.objects.all().iterator()))
 
And I have defined an indexing method on the model in models.py:
def indexing(self):
    obj = SchoolIndex(
        meta={'id': self.pk},
        name=self.name,
        school_type=self.school_type,
        name_suggest={'input': self.name},  # <--- what goes in here?
    )
    obj.save(index="school_index")
    return obj.to_dict(include_meta=True)
 
As per the ES docs, suggestions are indexed like any other field. So I could just put a few terms in the name_suggest = statement in my code above, and those terms would match the corresponding document when searched. But my question is how to do that with a ton of records. I was guessing there would be a standard way for ES to automatically come up with a few terms that could be used as suggestions, for example by using each word in the phrase as a term. I could come up with something like that on my own (by breaking each phrase into words), but it seems counter-intuitive to do so, since I'd guess there would already be a default way that the user could further tweak if needed. However, I couldn't find anything like that on SO, blogs, the ES docs, or the elasticsearch-dsl docs after searching for quite some time. (This post by Adam Wattis was very helpful in getting me started, though.) I will appreciate any pointers.
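
For illustration only, the "break each phrase into words" idea mentioned above might look like the sketch below. It is not taken from the ES or elasticsearch-dsl docs. The helper names build_suggest_inputs and make_actions, the (pk, name, school_type) tuple shape, and the default doc_type value are all hypothetical. The sketch relies only on the completion field accepting a list of strings under input, and on the bulk helper accepting plain action dicts.

```python
def build_suggest_inputs(name):
    """Return the full name plus every word-boundary suffix of it.

    Hypothetical helper: a prefix typed by the user can then match
    the start of any word in the name, not only the first word.
    """
    words = name.split()
    return [" ".join(words[i:]) for i in range(len(words))]


def make_actions(schools, index="school_index", doc_type="school_index"):
    """Yield one bulk action dict per (pk, name, school_type) tuple.

    The doc_type default is an assumption; use whatever type name
    the mapping in your index actually defines.
    """
    for pk, name, school_type in schools:
        yield {
            "_index": index,
            "_type": doc_type,
            "_id": pk,
            "_source": {
                "name": name,
                "school_type": school_type,
                "name_suggest": {"input": build_suggest_inputs(name)},
            },
        }


if __name__ == "__main__":
    rows = [(1, "Springfield High School", "public")]
    for action in make_actions(rows):
        print(action["_source"]["name_suggest"])
```

The generator could then be passed as bulk(client=es, actions=make_actions(rows)), which avoids calling save() once per document before the bulk request. Suffixes are used rather than single words because, as far as I understand, the completion suggester matches on prefixes of each input, so "High School" as an input lets a user who types "hi" find the school.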

Source: https://stackoverflow.com/questions/43861059

Updated: 2023-05-08 21:05

Best answer

Summing up my comments here:
1) The Spring IOC container manages beans from creation to destruction. This means the beans sit ready in a sort of bucket, which is a ready-to-use application. It is therefore necessary to define the contents of the bucket at compile time rather than at runtime. This does not cover hot-swapping of beans. I hope this is what you were looking for.
2) You can create as many routes as you want, and all of those beans will be put in the container. As far as I understand, you cannot just change your source code and synchronize it with the already running application; you have to do at least a graceful restart. The bottom line is that Spring must check that all beans are properly autowired and that there are no circular dependencies, and it does not expect source code to change at runtime. Sure, you can get your beans via RMI, but that doesn't count, since you have already declared them. So yes, compile time it is.
