Use Solr as an ordered aggregate system

I need to develop a system that displays different kinds of "content blocks". To start, I have three systems delivering blocks of data (one is a blog, another is a listing of events, and the third is something custom). This might be extended later in the project.
The idea is to have one long list, ordered first by publication date (so chronologically). However, some types might get a bit more priority, so for example events are shown relatively higher in the list than blog posts, by a factor x.
Then I need some additional sorting: blog articles might be "featured", so they are also given a bit more priority, say by a factor y. Finally, content blocks might be marked "sticky" so that they stay at the top of the list at all times.
Because of the advanced technology behind Solr and all its query options, I was thinking of using Solr as the engine for this listing of blocks. Is something like that possible, or is another technology more suitable for this?
My preference for Solr is also because it would be relatively easy to inject new content into the index. But how to define a query that returns the complete (paginated) index sorted by date is something I have to figure out as well.
As a last point: the application will be a PHP (Zend Framework 2) project, so preferably it must work directly with PHP or through a client written for PHP (e.g. SolrClient for Solr). To formulate it as a question: what would be a good technology for the requirements described above, and is Solr a good one for this?
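
To make the requested ordering concrete, here is a small sketch of one possible reading of it, written in plain Python rather than as a Solr query. Everything in it is an assumption for illustration: the `Block` fields, the weights in `TYPE_FACTOR` and `FEATURED_FACTOR` (standing in for the unspecified factors x and y), and the recency formula `1 / (1 + age_days)`.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical weights standing in for "factor x" and "factor y" in the question.
TYPE_FACTOR = {"event": 1.5, "blog": 1.0, "custom": 1.0}
FEATURED_FACTOR = 1.2


@dataclass
class Block:
    title: str
    kind: str
    published: datetime
    featured: bool = False
    sticky: bool = False


def score(block, now):
    """Recency score scaled by the type and featured factors (higher ranks first)."""
    age_days = max((now - block.published).total_seconds() / 86400.0, 0.0)
    recency = 1.0 / (1.0 + age_days)  # newer blocks score closer to 1
    factor = TYPE_FACTOR.get(block.kind, 1.0)
    if block.featured:
        factor *= FEATURED_FACTOR
    return recency * factor


def order_blocks(blocks, now):
    """Sticky blocks first, then by descending boosted recency score."""
    return sorted(blocks, key=lambda b: (not b.sticky, -score(b, now)))
```

With these assumed weights, an event, a featured blog post, and a plain blog post published on the same day come out in that order, and a sticky block precedes all of them regardless of its age.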

Source: https://stackoverflow.com/questions/13634433

Updated: 2023-10-25 13:10

Best answer

Nothing. If you call cache on an RDD that is already cached, nothing happens; the RDD will be cached (once). Caching, like many other transformations, is lazy:
 
 When you call cache, the RDD's storageLevel is set to MEMORY_ONLY.
 When you call cache again, it is set to the same value (no change).
 Upon evaluation, when the underlying RDD is materialized, Spark checks the RDD's storageLevel and, if it requires caching, caches it.
 
So you're safe.
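
The steps above can be illustrated with a toy model in plain Python. This is not Spark's API or implementation: the `LazyDataset` class, its `collect` method, and the `compute_count` counter are hypothetical names, used only to show that a repeated `cache` call just sets the same storage level and that the data is built once, on the first evaluation.

```python
class LazyDataset:
    """Toy stand-in for an RDD; not Spark's actual implementation."""

    def __init__(self, compute):
        self._compute = compute        # function that produces the data
        self.storage_level = "NONE"    # no caching requested yet
        self._cached = None
        self.compute_count = 0         # how many times the data was built

    def cache(self):
        # Only records the requested storage level; nothing is computed here.
        self.storage_level = "MEMORY_ONLY"
        return self

    def collect(self):
        # An "action": materialize the data, consulting the storage level.
        if self._cached is not None:
            return self._cached
        self.compute_count += 1
        data = list(self._compute())
        if self.storage_level != "NONE":
            self._cached = data
        return data


ds = LazyDataset(lambda: (x * x for x in range(5)))
ds.cache()
ds.cache()                 # second call sets the same level again
print(ds.compute_count)    # 0: nothing has been evaluated yet
ds.collect()
ds.collect()
print(ds.compute_count)    # 1: computed once, then served from the cache
```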
