Solr configuration for autocompletion implementation with PHP

How do I have to index my data and configure Solr and my search options so that an autocompletion (like Google's) with the following requirements is possible?

Products:
- We have products with titles, descriptions and IDs, e.g. the title: toshiba tecra s1: centrino 1.5 ghz/xp pro/15.0" tft/40 gb/256 mb+256mb/cd-rw-dvd-rom/lan/wi-fi
- The products, or fields of the products, have to be indexed in such a way that the following is possible, regardless of the case in which the user types the search term (e.g. TOSHIBA or tOSHiba):
- If a user enters the first three characters "tos", at most 20 results should appear in the autocomplete box, each showing the complete title (phrase), e.g. "toshiba tecra s1: centrino 1.5 ghz/xp pro/15.0" tft/40 gb/256 mb+256mb/cd-rw-dvd-rom/lan/wi-fi".
- If a user enters two terms, e.g. "toshiba tecra", the search result must be more precise: only documents that contain the (coherent) terms "toshiba tecra" should be shown.

It would be great to get any hints on what kind of tokenizer, search component etc. to use.

I'm using Solr version 3.5.

Thank you for your thoughts, Ramo


Source: https://stackoverflow.com/questions/8459570
Updated: 2022-10-30 07:10

Best answer

You can use -1 to always get the last part rather than the second part.

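import pandas as pd

# Sample frame reconstructed from the printed output below
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['ciao', 'hotel', "l'hotel"]})

# Keep the text after the last apostrophe (the whole string if there is none)
df['c'] = df['b'].apply(lambda x: x.split("'")[-1])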

print(df)

#    a        b      c
# 0  1     ciao   ciao
# 1  2    hotel  hotel
# 2  3  l'hotel  hotel 

However, keep in mind that this will break if you have strings with two or more apostrophes, because only the text after the last apostrophe is kept (but your requirement doesn't specify what to do in these cases anyway).
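
The caveat can be checked on plain strings. The following is a minimal sketch, not part of the original answer: the helper names `last_part` and `after_first` are hypothetical, the third sample string is made up for illustration, and splitting at the first apostrophe only is just one possible way to treat such strings, since the requirement does not say what should happen.

```python
def last_part(text):
    """Return the text after the last apostrophe (same as split("'")[-1])."""
    return text.split("'")[-1]


def after_first(text):
    """Return everything after the first apostrophe (hypothetical alternative)."""
    # maxsplit=1 stops after the first apostrophe, so later ones are preserved.
    return text.split("'", 1)[-1]


# The last sample is invented to show a string with two apostrophes.
samples = ["ciao", "l'hotel", "l'eau d'or"]

for s in samples:
    print(repr(s), "->", repr(last_part(s)), "|", repr(after_first(s)))
```

Either helper can be passed to df['b'].apply(...) in place of the lambda, depending on which behaviour is wanted.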
