Mean over 2d numpy array with varying slices
I need to calculate the mean over the columns of a 2D numpy array where the slice per column varies.
For example, I have an array
arr = np.arange(20).reshape(4, 5)
with the (exclusive) end index of each column's slice defined as
bot_ix = np.array([3, 2, 2, 1, 2])
The mean of the first column would then be
arr[0:bot_ix[0], 0].mean()
What's the appropriate (i.e. Pythonic + efficient) way to do this? My array sizes are ~(50, 50K).
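To make the requested computation concrete, here is a minimal sketch of one possible vectorized approach: build a boolean mask by broadcasting row indices against bot_ix, then divide the masked column sums by the number of selected rows. The function name sliced_column_means is hypothetical, and the sketch assumes every entry of bot_ix is at least 1 (a zero would select no rows and yield NaN). It is an illustration of the problem statement, not an answer taken from the original thread.

```python
import numpy as np


def sliced_column_means(arr, bot_ix):
    """Mean of arr[0:bot_ix[j], j] for every column j (hypothetical helper).

    Assumes bot_ix[j] >= 1 for all j, so every column has at least one row.
    """
    # Column vector of row indices, shape (n_rows, 1).
    rows = np.arange(arr.shape[0])[:, None]
    # mask[i, j] is True when row i lies inside the slice 0:bot_ix[j].
    mask = rows < bot_ix[None, :]
    # Sum only the selected entries, then divide by how many were selected.
    sums = np.where(mask, arr, 0).sum(axis=0)
    counts = mask.sum(axis=0)
    return sums / counts


arr = np.arange(20).reshape(4, 5)
bot_ix = np.array([3, 2, 2, 1, 2])

means = sliced_column_means(arr, bot_ix)

# Straightforward per-column loop, used here only as a reference.
loop_means = np.array(
    [arr[0:bot_ix[j], j].mean() for j in range(arr.shape[1])]
)
```

For a (50, 50K) array the mask is the same shape as the data, so the extra memory is modest; the per-column loop gives the same values and serves as a check.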
Source: https://stackoverflow.com/questions/37822880
Best answer
There is a similar question here:
How to add ROW INDEX as a column to SQL SELECT query?
Extending from that question you want something like:
SET @row_num = 0;
SELECT T.id, T.name, T.status,
       IFNULL(T.image, 'no-image.png') AS DP,
       (SELECT COUNT(*) FROM badminton_matches MT
         WHERE (MT.team_one = T.id OR MT.team_two = T.id)) AS played,
       (SELECT COUNT(*) FROM badminton_match_results R
         WHERE R.winner_id = T.id) AS won,
       (SELECT COUNT(*) FROM badminton_matches MT
          JOIN badminton_match_results MR ON (MR.match_id = MT.id)
         WHERE (MT.team_one = T.id OR MT.team_two = T.id)
           AND MR.winner_id != T.id) AS lost,
       (
         ((SELECT COUNT(*) FROM badminton_match_results R
            WHERE R.winner_id = T.id) * 2)
         + ((SELECT COUNT(*) FROM badminton_match_results R
               JOIN badminton_matches M
                 ON (M.id = R.match_id AND M.match_type = 'quarter')
              WHERE R.winner_id = T.id))
       ) AS Points,
       /* here is the magic */
       (@row_num := @row_num + 1) < 4 AS row_index
  FROM badminton_teams T
 ORDER BY (Points) DESC
This will add an extra column called row_index, where 1 means in the top 3 and 0 means not in the top 3. Remember that you must call the SET before each SELECT and within the same session.