Elasticsearch - bulk indexing for a completion suggester in Python
I am trying to add a completion suggester to enable search-as-you-type for a search field in my Django app (using Elasticsearch 5.2.x and elasticsearch-dsl). After trying for a long time, I still cannot figure out how to bulk index the suggester. Here's my code:
    from elasticsearch_dsl import DocType, Text, Keyword, Completion

    class SchoolIndex(DocType):
        name = Text()
        school_type = Keyword()
        name_suggest = Completion()
Bulk indexing as follows:
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import bulk

    def bulk_indexing():
        SchoolIndex.init(index="school_index")
        es = Elasticsearch()
        bulk(client=es, actions=(a.indexing() for a in models.School.objects.all().iterator()))
And I have defined an indexing method in models.py:

    def indexing(self):
        obj = SchoolIndex(
            meta={'id': self.pk},
            name=self.name,
            school_type=self.school_type,
            name_suggest={'input': self.name},  # <--- what goes in here?
        )
        obj.save(index="school_index")
        return obj.to_dict(include_meta=True)
As per the ES docs, suggestions are indexed like any other field. So I could just put a few terms in the name_suggest = statement in my code above, and those terms would match the corresponding field when searched. But my question is how to do that with a ton of records. I was guessing there would be a standard way for ES to automatically come up with a few terms that could be used as suggestions, for example by using each word in the phrase as a term. I could come up with something like that on my own (by breaking each phrase into words), but it seems counter-intuitive to do so, since I'd guess there would already be a default way that the user could further tweak if needed. But I couldn't find anything like that on SO, blogs, the ES docs, or the elasticsearch-dsl docs after searching for quite some time. (This post by Adam Wattis was very helpful in getting me started, though.) I will appreciate any pointers.
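For illustration, the "break each phrase into words" idea mentioned above could be sketched roughly like this. It is only a sketch, not a built-in Elasticsearch feature. The completion suggester matches on prefixes of its inputs, so the sketch emits one input starting at each word of the name. The helper names `suggest_inputs` and `to_bulk_action`, and the `doc_type` default, are hypothetical. The `_type` value in particular is an assumption and would have to match the type the mapping was actually created for.

```python
def suggest_inputs(name):
    """Return one suggestion input per word position.

    Each input starts at a different word, so a prefix typed by the user
    can match the name from any word boundary, not only from its start.
    """
    words = name.split()
    return [" ".join(words[i:]) for i in range(len(words))]


def to_bulk_action(pk, name, school_type,
                   index="school_index", doc_type="school_index"):
    """Build a plain dict action of the kind elasticsearch.helpers.bulk consumes.

    doc_type is a placeholder (assumption): it must equal the document type
    that the index mapping was created for.
    """
    return {
        "_index": index,
        "_type": doc_type,
        "_id": pk,
        "_source": {
            "name": name,
            "school_type": school_type,
            # A completion field accepts a list of strings under "input".
            "name_suggest": {"input": suggest_inputs(name)},
        },
    }


if __name__ == "__main__":
    print(suggest_inputs("Saint Mary High School"))
    print(to_bulk_action(1, "Oak Hill Academy", "public"))
```

A generator of such dicts, one per model row, could then be passed as `actions` to `bulk`. Building plain dicts this way would also avoid calling `save()` on every object inside the generator, which indexes each document a second time.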
Source: https://stackoverflow.com/questions/43861059
Best answer
Summing up my comments here:
1) The Spring IoC container manages beans from their creation until their destruction. This means the beans sit ready in a sort of bucket, which is a ready-to-use application. Thus, it is necessary to create the contents of the bucket at compile time rather than at runtime. This does not include hot-swapping of beans. I hope this is what you were looking for.
2) You can create as many routes as you want; all of those beans will be put in the container. As far as I understand, you cannot just change your source code and synchronize it with the already running application; you have to do at least a graceful restart. The bottom line is that Spring must check that all beans are properly autowired and that there are no circular dependencies, and it does not expect source code at run time. Sure, you can get your beans via RMI, but that doesn't count, as you have already declared them. So yes, compile time it is.