INSERT INTO vs SELECT INTO
What is the difference between using
SELECT ... INTO MyTable FROM...
and
INSERT INTO MyTable (...) SELECT ... FROM ....
?
From BOL [ INSERT, SELECT...INTO ], I know that using SELECT...INTO will create the insertion table on the default file group if it doesn't already exist, and that the logging for this statement depends on the recovery model of the database.
- Which statement is preferable?
- Are there other performance implications?
- What is a good use case for SELECT...INTO over INSERT INTO ...?
Edit: I already stated that I know that SELECT INTO... creates the table if it doesn't exist. What I want to know is this: SQL includes this statement for a reason, so what is it? Is it doing something different behind the scenes when inserting rows, or is it just syntactic sugar on top of a CREATE TABLE and INSERT INTO?
Source: https://stackoverflow.com/questions/6947983
Best answer
You need to divide with div, using groupby by the level day_of_week together with transform, which returns a new Series with the same index as the original df:

    print (X.groupby(level='day_of_week')['count'].transform('sum'))
    day_of_week  cat
    0            0      145
                 1      145
    1            0       87
                 1       87
    2            0       82
                 1       82
    3            0      170
                 1      170
    4            0      150
                 1      150
    5            0      112
                 1      112
    6            0       25
                 1       25
    Name: count, dtype: int32

    X['ratio'] = X['count'].div(X.groupby(level='day_of_week')['count'].transform('sum'))
    print (X)
                     count     ratio
    day_of_week cat
    0           0       52  0.358621
                1       93  0.641379
    1           0       15  0.172414
                1       72  0.827586
    2           0       61  0.743902
                1       21  0.256098
    3           0       83  0.488235
                1       87  0.511765
    4           0       75  0.500000
                1       75  0.500000
    5           0       88  0.785714
                1       24  0.214286
    6           0        3  0.120000
                1       22  0.880000

In recent pandas versions it is possible to omit level:

    X['ratio'] = X['count'].div(X.groupby('day_of_week')['count'].transform('sum'))
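For anyone who wants to run the answer end to end, here is a minimal, self-contained sketch. The answer never shows how X was created, so the frame construction below is an assumption: it rebuilds X from the counts printed in the output, with a two-level index named day_of_week and cat. The groupby, transform and div calls are the ones used in the answer.

```python
import pandas as pd

# Assumption: X is rebuilt from the counts printed in the answer;
# the original post does not show how X was created.
index = pd.MultiIndex.from_product(
    [range(7), [0, 1]], names=["day_of_week", "cat"]
)
counts = [52, 93, 15, 72, 61, 21, 83, 87, 75, 75, 88, 24, 3, 22]
X = pd.DataFrame({"count": counts}, index=index)

# transform('sum') returns the group total repeated for every row of the
# group, so the result has the same index as X and aligns row by row.
totals = X.groupby(level="day_of_week")["count"].transform("sum")

# Each row's share of its day_of_week total.
X["ratio"] = X["count"].div(totals)
print(X)
```

Because transform keeps the original index, no merge or reindex step is needed before dividing, and the ratios within each day_of_week group add up to 1.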