MySQL: Get Counts and Averages [duplicate]

select 
COUNT(pd.property_id) AS `Beginning Total File Count`,
COUNT(pd.recv_dt) as `average days in inventory`,
AVG(pd.status = 'P') as `average days in pre-marketing`,
AVG(pd.status NOT IN('I','C')) as `average days onMarket`,
AVG(pd.status ='U') as `average days UnderContract`,
SUM(pd.status = 'O') as `Total FilesOccupied Status`,
SUM(pd.status = 'O') / COUNT(pd.property_id) as `percentage of Occupied / total file count`
from resnet.property_Details pd

I'm trying to get

  1. Beginning total file count
  2. Average days in inventory
  3. Average days in Pre-Marketing
  4. Average days on market
  5. Average days under contract
  6. Total files in occupied status
  7. Percentage of Occupied / total file count

Not sure if my query is written properly, please help :)
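
One point worth checking in the query above: a comparison such as pd.status = 'P' evaluates to 1 or 0, so AVG over it returns the share of rows with that status, not a number of days. The sketch below illustrates the difference with conditional aggregation. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the days_in_status column and the sample rows are hypothetical, since the question does not show how days are stored.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE property_details ("
    "property_id INTEGER, status TEXT, days_in_status INTEGER)"  # days_in_status is hypothetical
)
conn.executemany(
    "INSERT INTO property_details VALUES (?, ?, ?)",
    [(1, "P", 10), (2, "P", 30), (3, "O", 5), (4, "U", 40)],  # made-up sample rows
)

row = conn.execute(
    """
    SELECT
        COUNT(property_id),                                   -- total file count
        AVG(status = 'P'),                                    -- share of rows with status P
        AVG(CASE WHEN status = 'P' THEN days_in_status END),  -- average days over P rows only
        SUM(status = 'O'),                                    -- number of rows with status O
        100.0 * SUM(status = 'O') / COUNT(property_id)        -- percentage of rows with status O
    FROM property_details
    """
).fetchone()

total, share_p, avg_days_p, occupied, pct_occupied = row
print(total, share_p, avg_days_p, occupied, pct_occupied)
```

The CASE expression yields NULL for non-matching rows, and AVG skips NULLs, so only rows with the wanted status contribute to the average.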

Original: https://stackoverflow.com/questions/43689973
Updated: 2023-06-19 14:06

Best answer

I think you need to filter first, using between with boolean indexing, and then groupby and aggregate with size.

The two outputs are then concatenated, and reindex adds the missing rows, filled with 0:

print (df)
         Date ID
0  01/01/2016  a
1  05/01/2016  a
2  10/05/2017  a
3  05/05/2018  b
4  07/09/2014  b
5  07/09/2014  c
6  12/08/2018  b

# convert to datetime (if the first number is the day, add the parameter dayfirst=True)
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
# pd.datetime is no longer available in current pandas; use pd.Timestamp instead
now = pd.Timestamp.today()
print (now)

one_year_before_now = now - pd.offsets.DateOffset(years=1)
one_year_after_now = now + pd.offsets.DateOffset(years=1)

# filter first, then count the rows per ID
a = df[df['Date'].between(one_year_before_now, now)].groupby('ID').size()
b = df[df['Date'].between(now, one_year_after_now)].groupby('ID').size()
print (a)
ID
a    1
dtype: int64

print (b)
ID
b    2
dtype: int64

df1 = pd.concat([a,b],axis=1).fillna(0).astype(int).reindex(df['ID'].unique(),fill_value=0)
print (df1)
   0  1
a  1  0
b  0  2
c  0  0
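
The counts above depend on the day the code is run, because the windows are built around today's date. For a reproducible check, here is a minimal self-contained sketch of the same filter, groupby and reindex steps with a fixed reference date. The date 2017-12-01 and the column labels last_year and next_year are assumptions chosen for illustration; they are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["01/01/2016", "05/01/2016", "10/05/2017", "05/05/2018",
             "07/09/2014", "07/09/2014", "12/08/2018"],
    "ID": ["a", "a", "a", "b", "b", "c", "b"],
})
# The first number is the day, so parse with dayfirst=True.
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

now = pd.Timestamp("2017-12-01")  # assumed fixed reference date instead of today
year = pd.DateOffset(years=1)

# Filter each one-year window with a boolean mask, then count rows per ID.
last_year = df[df["Date"].between(now - year, now)].groupby("ID").size()
next_year = df[df["Date"].between(now, now + year)].groupby("ID").size()

# Combine both counts and add IDs that fell outside both windows as zero rows.
counts = (
    pd.concat([last_year, next_year], axis=1, keys=["last_year", "next_year"])
    .fillna(0)
    .astype(int)
    .reindex(df["ID"].unique(), fill_value=0)
)
print(counts)
```

Passing keys to concat gives the two columns readable names in place of the default 0 and 1.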

EDIT:

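If you need to compare each date against a reference date in its own group (the code below uses the last row, x.iat[-1]), plus or minus a one-year offset, you need a custom function that builds the conditions and sums the True values: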

offs = pd.offsets.DateOffset(years=1)

f = lambda x: pd.Series([(x > x.iat[-1] - offs).sum(), \
                        (x < x.iat[-1] + offs).sum()], index=['last','next'])
df = df.groupby('ID')['Date'].apply(f).unstack(fill_value=0).reset_index()
print (df)
  ID  last  next
0  a     1     3
1  b     3     2
2  c     1     1