Nutch not deleting duplicates from Solr

When Nutch finishes its crawl, it recognises that there are duplicates to delete, reports "deleting xxx duplicates", and completes with no problems. The only problem is that it hasn't actually deleted the duplicates, although it says it has.
I've also tried running the dedup command on its own, and the result is the same.
I have Solr and Nutch set up as shown on my blog, with each stage in a separate post, if you wish to delve a little deeper:
http://amac4.blogspot.co.uk/2013/07/setting-up-solr-with-apache-tomcat-be.html
http://amac4.blogspot.co.uk/2013/07/setting-up-nutch-to-crawl-filesystem.html

Source: https://stackoverflow.com/questions/17901592

Updated: 2022-09-04 15:09

Best answer

The XML method is a workaround for a limitation in SQL Server. There is no reason to attempt it in any other database.
One method uses arrays:
select s.id, array_agg(s.term)
from search s
group by s.id;
 
Because the database supports arrays, you should learn to use them. If you need a string instead, you can convert the array to one:
select s.id, array_join(array_agg(s.term), ',') as terms
from search s
group by s.id;
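
To make the result of the two queries above concrete, here is a minimal Python sketch of the same grouping. The sample rows, the function names `aggregate_terms` and `join_terms`, and the term values are all hypothetical and only stand in for the `search` table; this is an illustration of what the queries compute, not how the database executes them.

```python
from collections import defaultdict

# Hypothetical rows standing in for the `search` table: (id, term).
rows = [(1, "nutch"), (1, "solr"), (2, "dedup")]


def aggregate_terms(rows):
    """Collect terms per id, mirroring array_agg(s.term) ... group by s.id."""
    grouped = defaultdict(list)
    for row_id, term in rows:
        grouped[row_id].append(term)
    return dict(grouped)


def join_terms(grouped, sep=","):
    """Join each id's list into one string, mirroring array_join(..., ',')."""
    return {row_id: sep.join(terms) for row_id, terms in grouped.items()}


arrays = aggregate_terms(rows)
strings = join_terms(arrays)
print(arrays)
print(strings)
```

Note that the Python version keeps terms in input order, whereas the SQL queries as written do not specify an order for the aggregated elements.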
