Elasticsearch slow terms aggregation

I have a problem with an ES query and aggregation. The index holds about 0.1 billion records in total. Before adding a terms aggregation, my query is fast: it takes only 75 ms and returns 105 hits. But after adding an aggregation like this one:
{
    "query": {
        ...
    },
    "aggs": {
        "index": {
            "terms": {
                "field": "index"
            }
        }
    }
}
 
This query takes 20 seconds! My question is: my query matches only 105 documents, so why is the aggregation so slow? Thanks for any reply!
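For readers hitting the same symptom, here is a minimal sketch of one thing that may be worth trying. It is not from the original post, and the cause of the slowdown is not confirmed there. The sketch assumes `index` is a keyword field: by default the terms aggregation on such fields works through global ordinals, while the documented `execution_hint` option with the value `map` aggregates directly on the values of the matching documents and is intended for queries that match very few documents. The helper name `build_search_body` and the `match_all` placeholder query are hypothetical; the real query was elided in the post.

```python
import json


def build_search_body(query, field, execution_hint=None):
    """Build a search body with a terms aggregation named after `field`."""
    terms = {"field": field}
    if execution_hint is not None:
        # "map" and "global_ordinals" are the documented values.
        terms["execution_hint"] = execution_hint
    return {"query": query, "aggs": {field: {"terms": terms}}}


# Placeholder only: the original query was elided ("...") in the post.
placeholder_query = {"match_all": {}}

body = build_search_body(placeholder_query, "index", execution_hint="map")
print(json.dumps(body, indent=4))
```

The printed JSON can be sent as the body of a `_search` request in place of the one above. Whether it helps depends on the mapping and data, so it is worth timing both variants.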

Source: https://stackoverflow.com/questions/50893653

Updated: 2023-10-28 14:10

Best answer

For any NLP task such as this, you should use word embeddings, e.g. Word2Vec. Each word is represented as a vector. Your input is a matrix of these vectors in the original, incorrect order; your output is a matrix of the same vectors in the correct order. The Fast.ai lesson linked below discusses word embeddings further.
https://course.fast.ai/lessons/lesson6.html
*Note that, based on the problem formulation, I'm assuming your RNN can handle input/output sentence pairs. If that's not the case, or you run into problems along those lines, leave a comment and I can give you some more ideas.
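The input/output pairing described in this answer can be sketched roughly as follows. This is only an illustration under stated assumptions: the vocabulary, the embedding size, and the helper `sentence_to_matrix` are hypothetical, and random vectors stand in for trained Word2Vec embeddings.

```python
import random

# Hypothetical toy vocabulary; a real setup would load trained Word2Vec vectors.
rng = random.Random(0)
vocab = ["the", "cat", "sat", "down"]
embedding_dim = 5
embeddings = {
    word: [rng.gauss(0.0, 1.0) for _ in range(embedding_dim)] for word in vocab
}


def sentence_to_matrix(words):
    """Return one embedding row per word, in the order the words are given."""
    return [embeddings[w] for w in words]


jumbled = ["sat", "the", "down", "cat"]  # incorrect order: the model input
correct = ["the", "cat", "sat", "down"]  # correct order: the training target

x = sentence_to_matrix(jumbled)
y = sentence_to_matrix(correct)

# Both matrices have shape (num_words, embedding_dim) and hold the same rows,
# only in a different order.
print((len(x), len(x[0])), (len(y), len(y[0])))
```

Pairs like `(x, y)` are what a sequence model would be trained on; how they are batched and fed depends on the RNN setup, which the answer leaves open.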
