首页 \ 问答 \ SQL:在每个成员的多个开始和结束日期范围内识别不同的处理块(SQL: Identify distinct blocks of treatment over multiple start and end date ranges for each member)

SQL:在每个成员的多个开始和结束日期范围内识别不同的处理块(SQL: Identify distinct blocks of treatment over multiple start and end date ranges for each member)

目标:确定表中每个成员的连续治疗的不同事件。 每个成员都有诊断和服务日期,并且剧集被定义为每个连续服务之间的时间小于某个数字的所有服务(例如,对于该示例,假设为90天)。 查询将需要遍历每一行并计算日期之间的差异,并返回与每集相关的第一个和最后一个日期。 目标是按成员和剧集开始/结束日期对结果进行分组。

之前已经提出了一个非常类似的问题 ,并且有些帮助。 问题是在自定义代码时,返回的表将排除第一个和最后一个记录。 我不知道该怎么办。

我的数据目前看起来像这样:

MemberCode       Diagnosis              ServiceDate         
1001   -----        ABC      -----       2010-02-04           
1001   -----        ABC      -----       2010-03-20          
1001   -----        ABC      -----       2010-04-18          
1001   -----        ABC      -----       2010-05-22         
1001   -----        ABC      -----       2010-09-26          
1001   -----        ABC      -----       2010-10-11  
1001   -----        ABC      -----       2010-10-19
2002   -----        XYZ      -----       2010-07-10          
2002   -----        XYZ      -----       2010-07-21
2002   -----        XYZ      -----       2010-11-08
2002   -----        ABC      -----       2010-06-03           
2002   -----        ABC      -----       2010-08-13         

在上述数据中,会员1001的第一个记录是2010-02-04 ,并且在2010-09-26 (新剧集开始的日期)之前连续服务之间的差异不超过90天。 因此,会员1001有两个不同的剧集:(1)诊断ABC,从2010-02-042010-05-22 ,以及(2)诊断ABC,从2010-09-262010-10-19

同样,2002年会员有三个不同的剧集:(1)诊断XYZ,从2010-07-102010-07-21 ,(2)诊断XYZ,在2010-11-08开始和结束,和(3) )诊断ABC,从2010-06-032010-08-13

期望的输出:

MemberCode         Diagnosis       EpisodeStartDate          EpisodeEndDate
1001   -----          ABC   -----     2010-02-04   -----       2010-05-22
1001   -----          ABC   -----     2010-09-26   -----       2010-10-19
2002   -----          XYZ   -----     2010-07-10   -----       2010-07-21
2002   -----          XYZ   -----     2010-11-08   -----       2010-11-08
2002   -----          ABC   -----     2010-06-03   -----       2010-08-13

我一直在研究这个查询太久了,仍然无法得到我需要的东西。 任何帮助,将不胜感激。 提前致谢!


Objective: Identify distinct episodes of continuous treatment for each member in a table. Each member has a diagnosis and a service date, and an episode is defined as all services where the time between each consecutive service is less than some number (let's say 90 days for this example). The query will need to loop through each row and calculate the difference between dates, and return the first and last date associated with each episode. The goal is to group results by member and episode start/end date.

A very similar question has been asked before, and was somewhat helpful. The problem is that in customizing the code, the returned tables are excluding first and last records. I'm not sure how to proceed.

My data currently looks like this:

MemberCode       Diagnosis              ServiceDate         
1001   -----        ABC      -----       2010-02-04           
1001   -----        ABC      -----       2010-03-20          
1001   -----        ABC      -----       2010-04-18          
1001   -----        ABC      -----       2010-05-22         
1001   -----        ABC      -----       2010-09-26          
1001   -----        ABC      -----       2010-10-11  
1001   -----        ABC      -----       2010-10-19
2002   -----        XYZ      -----       2010-07-10          
2002   -----        XYZ      -----       2010-07-21
2002   -----        XYZ      -----       2010-11-08
2002   -----        ABC      -----       2010-06-03           
2002   -----        ABC      -----       2010-08-13         

In the above data, the first record for Member 1001 is 2010-02-04, and there is not a difference of more than 90 days between consecutive services until 2010-09-26 (the date at which a new episode starts). So Member 1001 has two distinct episodes: (1) Diagnosis ABC, which goes from 2010-02-04 to 2010-05-22, and (2) Diagnosis ABC, which goes from 2010-09-26 to 2010-10-19.

Similarly, Member 2002 has three distinct episodes: (1) Diagnosis XYZ, which goes from 2010-07-10 to 2010-07-21, (2) Diagnosis XYZ, which begins and ends on 2010-11-08, and (3) Diagnosis ABC, which goes from 2010-06-03 to 2010-08-13.

Desired output:

MemberCode         Diagnosis       EpisodeStartDate          EpisodeEndDate
1001   -----          ABC   -----     2010-02-04   -----       2010-05-22
1001   -----          ABC   -----     2010-09-26   -----       2010-10-19
2002   -----          XYZ   -----     2010-07-10   -----       2010-07-21
2002   -----          XYZ   -----     2010-11-08   -----       2010-11-08
2002   -----          ABC   -----     2010-06-03   -----       2010-08-13

I've been working on this query for too long, and still can't get exactly what I need. Any help would be appreciated. Thanks in advance!


原文:
更新时间:2021-10-16 08:10

最满意答案

您的rbindlist(lapply(...))可以使用版本1.9.7的data.table替换为非equi连接

specialty.dt[ provider.dt, on = .(p1 <= prob, p2 > prob)]

这将specialty.dt直接连接到provider.dt ,使用p1 <= probprob < p2


参考

这是一个类似问题的列表

这是阿伦的一次演讲


Your rbindlist(lapply(...)) can be replaced with a non-equi join using version 1.9.7 of data.table

specialty.dt[ provider.dt, on = .(p1 <= prob, p2 > prob)]

This joins specialty.dt onto provider.dt directly, using the condition that p1 <= prob, and prob < p2.


References

Here's a list of similar questions

And here's a talk by Arun

相关问答

更多
  • 在Rails 4.x中(见http://edgeguides.rubyonrails.org/active_record_querying.html#not-conditions ) GroupUser.where.not(user_id: me) 在Rails 3.x GroupUser.where(GroupUser.arel_table[:user_id].not_eq(me)) 要缩短长度,您可以将GroupUser.arel_table存储在变量中,或者如果在模型GroupUser本身内部使用 ...
  • ANSI SQL具有IS DISTINCT FROM但这在SQL Server中未实现。 它可以以不依赖于魔术常量的方式进行模拟 ,如果在列上使用则可以进行模拟 IF EXISTS (SELECT @A EXCEPT SELECT @B) PRINT 'Different' Here are the results of a performance test that checks for inequality a million times such that 90% of the time ...
  • 是的,这将永远为真,没有数字解释的位模式,运算符==将不会返回完全相反的!=。 对于不同的位模式(负零点0x80 ...等于零0x00 ...)进行比较的位模式有异常,或者与您已经提到的相同的位模式比较为不等(NaN),但是!=总是==的反转。 Yes, this will be always true, there are no number interpretations of bit patterns for which the operator == will not return the exac ...
  • 我真的很喜欢 if ( a != b ) 只是因为你应该少阅读,而且你更快地理解程序员想要传递的信息。 记住程序员花更多的时间阅读代码,而不是编写代码,所以你可以做的越多,代码就越容易理解,越好。 I really prefer if ( a != b ) simply because you are supposed to read less and you understand more quickly the message the programmer wanted to transmit. R ...
  • 你需要使用notEqualTo $where->notEqualTo('field2_id', $field2_id); http://framework.zend.com/apidoc/2.1/classes/Zend.Db.Sql.Predicate.Predicate.html#notEqualTo You need to use notEqualTo $where->notEqualTo('field2_id', $field2_id); http://framework.zend.com/ap ...
  • input函数返回一个string ,你应该把它转换为int : qe1 = int(input("What does RAM stand for?: ")) input function return a string, you should cast it to int: qe1 = int(input("What does RAM stand for?: "))
  • 这是一个包含在lingprog公式中的下限行的包装器。 请注意,需要更多的错误捕获(例如,每个A矩阵的列数必须相等),这并不意味着是一个健壮的实现。 为了正确的错误捕获,我建议您浏览一下linprog 源代码 。 from scipy.optimize import linprog import numpy as np def linprog_lb_wrapper(c, A_ub=None, b_ub=None, A_lb=None, b_lb=None, A_eq=None, b_eq=None, \ ...
  • 您的rbindlist(lapply(...))可以使用版本1.9.7的data.table替换为非equi连接 specialty.dt[ provider.dt, on = .(p1 <= prob, p2 > prob)] 这将specialty.dt直接连接到provider.dt ,使用p1 <= prob和prob < p2 。 参考 这是一个类似问题的列表 这是阿伦的一次演讲 Your rbindlist(lapply(...)) can be replaced with a non-equ ...

相关文章

更多

最新问答

更多
  • 获取MVC 4使用的DisplayMode后缀(Get the DisplayMode Suffix being used by MVC 4)
  • 如何通过引用返回对象?(How is returning an object by reference possible?)
  • 矩阵如何存储在内存中?(How are matrices stored in memory?)
  • 每个请求的Java新会话?(Java New Session For Each Request?)
  • css:浮动div中重叠的标题h1(css: overlapping headlines h1 in floated divs)
  • 无论图像如何,Caffe预测同一类(Caffe predicts same class regardless of image)
  • xcode语法颜色编码解释?(xcode syntax color coding explained?)
  • 在Access 2010 Runtime中使用Office 2000校对工具(Use Office 2000 proofing tools in Access 2010 Runtime)
  • 从单独的Web主机将图像传输到服务器上(Getting images onto server from separate web host)
  • 从旧版本复制文件并保留它们(旧/新版本)(Copy a file from old revision and keep both of them (old / new revision))
  • 西安哪有PLC可控制编程的培训
  • 在Entity Framework中选择基类(Select base class in Entity Framework)
  • 在Android中出现错误“数据集和渲染器应该不为null,并且应该具有相同数量的系列”(Error “Dataset and renderer should be not null and should have the same number of series” in Android)
  • 电脑二级VF有什么用
  • Datamapper Ruby如何添加Hook方法(Datamapper Ruby How to add Hook Method)
  • 金华英语角.
  • 手机软件如何制作
  • 用于Android webview中图像保存的上下文菜单(Context Menu for Image Saving in an Android webview)
  • 注意:未定义的偏移量:PHP(Notice: Undefined offset: PHP)
  • 如何读R中的大数据集[复制](How to read large dataset in R [duplicate])
  • Unity 5 Heighmap与地形宽度/地形长度的分辨率关系?(Unity 5 Heighmap Resolution relationship to terrain width / terrain length?)
  • 如何通知PipedOutputStream线程写入最后一个字节的PipedInputStream线程?(How to notify PipedInputStream thread that PipedOutputStream thread has written last byte?)
  • python的访问器方法有哪些
  • DeviceNetworkInformation:哪个是哪个?(DeviceNetworkInformation: Which is which?)
  • 在Ruby中对组合进行排序(Sorting a combination in Ruby)
  • 网站开发的流程?
  • 使用Zend Framework 2中的JOIN sql检索数据(Retrieve data using JOIN sql in Zend Framework 2)
  • 条带格式类型格式模式编号无法正常工作(Stripes format type format pattern number not working properly)
  • 透明度错误IE11(Transparency bug IE11)
  • linux的基本操作命令。。。