Seaborn: using boxplot causes running out of memory

I would like to plot three boxplots, one for each value of weight_cat (1, 2 and 3 are the only distinct values it has). These boxplots should show how height depends on the weight category (weight_cat).
I have a DataFrame like this:
print(data.head(5))

        Height    Weight  weight_cat
Index                                
1      65.78331  112.9925           1
2      71.51521  136.4873           2
3      69.39874  153.0269           3
4      68.21660  142.3354           2
5      67.78781  144.2971           2
 
The code below ends up eating all of my RAM, which I believe is not normal:
Seaborn.boxplot(x="Height", y="weight_cat", data=data)
 
What is wrong here? This is the link to the manual. The shape of the DataFrame is (25000, 4). This is the link to the CSV file.
This is how you can get the same data:
import pandas as pd

data = pd.read_csv('weights_heights.csv', index_col='Index')

def weight_category(weight):
    # Map a weight to a category: 1 (< 120), 2 (120 to < 150), 3 (>= 150)
    if weight < 120:
        return 1
    if weight >= 150:
        return 3
    return 2

data['weight_cat'] = data['Weight'].apply(weight_category)
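
As a side note, the same three categories can probably be computed without a Python-level apply. The following is only a minimal sketch, assuming NumPy is available; the weights array is made-up sample data (its first five values are the rows printed above, the last two are added to show the boundaries), not the CSV itself:

```python
import numpy as np

# Hypothetical sample weights; the first five match the rows shown above,
# 120.0 and 150.0 are added to illustrate the boundary behaviour.
weights = np.array([112.9925, 136.4873, 153.0269, 142.3354, 144.2971, 120.0, 150.0])

# The bin edges are the thresholds used in weight_category.
# np.digitize returns 0 for w < 120, 1 for 120 <= w < 150, 2 for w >= 150,
# so adding 1 yields the categories 1, 2 and 3.
weight_cat = np.digitize(weights, bins=[120, 150]) + 1

print(weight_cat.tolist())  # [1, 2, 3, 2, 2, 2, 3]
```

On the real DataFrame, the equivalent would presumably be data['weight_cat'] = np.digitize(data['Weight'], [120, 150]) + 1.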

Source: https://stackoverflow.com/questions/36666562

Updated: 2023-12-21 08:12

Best answer

The problem is that you are using np.matrix. Use np.array instead and simply iterate over the rows without indexing:
import numpy as np

result = np.array([[11, 12, 13],
                   [21, 22, 23],
                   [31, 32, 33]])

for p in result:
    print(p)

[11 12 13]
[21 22 23]
[31 32 33]
 
Explanation
What you are seeing is the effect of numpy.matrix, which keeps every row two-dimensional: each row you iterate over is itself a 1 x N matrix rather than a 1-D array. This is unnecessary and an anti-pattern in NumPy.
There is a history behind numpy.matrix. It was initially used for the convenience of a matrix multiplication operator. This is no longer an issue, since the @ operator (Python 3.5+) can be used instead of nested dot calls. Therefore, use numpy.array by default.
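
To make the difference concrete, here is a small sketch that compares the two types on the same values. The variable names are illustrative only, and it assumes a NumPy version that still ships np.matrix:

```python
import numpy as np

values = [[11, 12, 13],
          [21, 22, 23],
          [31, 32, 33]]

as_matrix = np.matrix(values)
as_array = np.array(values)

# Iterating over a matrix yields 1 x 3 matrices, not 1-D rows.
matrix_row_shapes = [row.shape for row in as_matrix]

# Iterating over an array yields plain 1-D rows of length 3.
array_row_shapes = [row.shape for row in as_array]

# Indexing a matrix row again does not drop the extra dimension.
still_2d_shape = as_matrix[0][0].shape

# With arrays, @ performs matrix multiplication, so np.matrix is not
# needed for readable products.
product = as_array @ as_array
same_as_dot = np.array_equal(product, as_array.dot(as_array))

print(matrix_row_shapes)  # [(1, 3), (1, 3), (1, 3)]
print(array_row_shapes)   # [(3,), (3,), (3,)]
print(still_2d_shape)     # (1, 3)
print(same_as_dot)        # True
```

The shapes show why indexing into a row of a matrix keeps returning the whole row, while the array version behaves like an ordinary nested sequence.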
