GraphLab: replacing values in an SFrame and filtering
So I have this very stupid problem that I have been stuck on for hours. I'm practicing on Kaggle's Titanic ML exercise, using GraphLab Create.
Now I want to replace some values in the table. For example, I want to set (as a test) the age to 38 for Pclass==1, 30 for Pclass==2, and 26 for Pclass==3.
I have tried so many ways of doing this that I am lost.
All I have now is:
import graphlab as gl
df = gl.SFrame(data)
df[(df["Pclass"]==1)]  # prints the rows of the table where Pclass == 1
df["Age"][df["Pclass"]==1]  # displays an array containing only the "Age" column for Pclass == 1
Now I am trying to use SFrame.apply properly but I'm confused.
I have tried
df["Age"][df["Pclass"]==1].apply(lambda x: 38)
That returns an array with the correct values, but I was not able to apply it back to the SFrame. For example, I have tried:
df = df["Age"][df["Pclass"]==1].apply(lambda x: 38)
But now my SFrame has turned into an array ... (obviously)
I have also tried:
df["Age"] = df["Age"][df["Pclass"]==1].apply(lambda x: 38)
But I get the following error: "RuntimeError: Runtime Exception. Column "__PassengerId-Survived-Pclass-Sex-Age-Fare" has different size than current columns!"
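The size complaint is consistent with the filtered column having fewer rows than the table it is being assigned into. A minimal plain-Python sketch of that mismatch, using made-up lists as stand-ins for the "Pclass" and "Age" columns (the values and variable names are assumptions for illustration, not Titanic data):

```python
# Toy stand-ins for two columns of the table (values are made up).
pclass = [1, 3, 2, 1]
age = [22.0, 35.0, 54.0, 27.0]

# Filtering first keeps only the matching rows, so the result is shorter
# than the table and cannot serve as a full replacement column.
filtered = [38 for a, p in zip(age, pclass) if p == 1]

# Computing a value for every row keeps the length equal to the table;
# rows that do not match simply keep their old value.
new_age = [38 if p == 1 else a for a, p in zip(age, pclass)]

assert len(filtered) < len(age)
assert len(new_age) == len(age)
```

The second form is the shape a replacement column needs: one value per row, whether or not the condition holds for that row.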
I'm sure the solution is pretty simple but I am too confused to find it by myself.
Ultimately I would like something like df["Age"] = something.apply(lambda x: 38 if Pclass==1 else 30 if Pclass==2 else 26 if Pclass==3)
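One way to get there might be to write the mapping as a small function of the class value and apply it to the whole "Pclass" column rather than to a filtered slice. The sketch below is plain Python so it runs without GraphLab; `age_for_class` and the sample lists are hypothetical names introduced for illustration, and the SFrame line in the final comment is an assumed usage that was not run here.

```python
def age_for_class(pclass):
    """Return the test age for a passenger class (38, 30, or 26)."""
    if pclass == 1:
        return 38
    if pclass == 2:
        return 30
    return 26  # Pclass == 3; assumes the column only holds 1, 2, or 3


# Plain-Python check of the mapping on a few made-up class values.
sample_pclass = [1, 2, 3, 1]
sample_age = [age_for_class(p) for p in sample_pclass]

# Assumed SFrame usage (needs GraphLab installed, so it is left as a comment):
# df["Age"] = df["Pclass"].apply(age_for_class)
```

Because the function is applied to every row of "Pclass", the result should have the same number of rows as the table, which is what column assignment expects.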
Thanks.