Using Collection Well to Optimize SolrJ Indexing

2019-03-27 01:27 | Source: Internet

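Today a friend came to me about an indexing test he was running:

One collection, two shards, 30-odd fields, and only about 40,000 records indexed per hour.

That is possible if every record is very large, but I asked him to post his code first. His indexing code: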

public static void addIndex(String record) throws SolrServerException, IOException {
    // Fields in a record are separated by the literal delimiter "|+|".
    String[] records = record.split("\\|\\+\\|", -1);
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", uuid(), 1.0f);
    doc.addField("DATA0", records[0], 1.0f);
    // ... (fields between DATA0 and DATA31 omitted)
    doc.addField("DATA31", records[31], 1.0f);
    Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
    docs.add(doc);
    // Sends and commits a single document on every call.
    server.add(docs);
    server.commit();
}

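Committing one record at a time like this is obviously slow. Unless he needs strong transactional guarantees or real-time visibility, I suggested he index in batches, something like this: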

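Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
// i is assumed to be an outer index defined by the surrounding code.
for (int j = 0; j < 100000; j++) {
    SolrInputDocument doc = new SolrInputDocument();
    // Parentheses keep (i + 1) and (j + 1) as arithmetic rather than string concatenation.
    doc.setField("id", "i_" + (i + 1) + "_j_" + (j + 1) + "_" + ((i + 1) * (j + 1)));
    doc.setField("type", j % 7);
    doc.setField("name", "name" + System.currentTimeMillis());
    docs.add(doc);
}
// One add request and one commit for the whole batch.
server.add(docs);
server.commit();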
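
Collecting all 100000 documents into one list works, but it holds everything in memory before anything is sent. A minimal sketch of a middle ground, flushing in fixed-size batches, might look like the following. BatchBuffer, batchSize, and sink are hypothetical names and not part of SolrJ; the sink stands in for a call such as server.add(docs), and the sketch uses only the Java standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers items and hands them to a sink in fixed-size batches (hypothetical helper). */
final class BatchBuffer<T> {
    private final int batchSize;
    // Stand-in for the real send step, for example a wrapper around server.add(docs).
    private final Consumer<List<T>> sink;
    private final List<T> buffer;
    private int flushCount = 0;

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Adds one item and flushes automatically once the buffer is full. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; call once more after the last add. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Pass a copy so the sink may keep the list after the buffer is cleared.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushCount++;
    }

    /** Number of batches handed to the sink so far. */
    int getFlushCount() {
        return flushCount;
    }
}
```

In the setting of this post, the sink would wrap server.add(docs) (catching its checked exceptions, since Consumer cannot throw them), and a single server.commit() would follow the final flush().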

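Another point: if the input rate is under control, autocommit is worth considering, configured for example through the autoCommit settings in solrconfig.xml.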
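
The configuration fragment itself is missing from this copy of the post. A reconstruction consistent with the values explained below might look like this. It is an assumption, not the author's original text, and openSearcher is set to true only because the explanation says committed records become searchable immediately. In solrconfig.xml the autoCommit element sits inside updateHandler.

```xml
<!-- Reconstructed from the values described in the text; not the original snippet. -->
<autoCommit>
  <maxTime>15000</maxTime>
  <maxDocs>10000</maxDocs>
  <openSearcher>true</openSearcher>
</autoCommit>
```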

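The settings mean:

maxTime: uncommitted updates are committed automatically within 15000 ms

maxDocs: once the number of uncommitted document updates reaches 10000, a commit runs automatically

openSearcher: when true, a new searcher is opened after the commit, so the committed records become searchable right away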

Reposted from: http://www.solr.cc/blog/?p=96
