Is my adaptation of point-in-polygon (Jordan curve theorem) in Python correct?
Problem
I recently found a need to determine if my points are inside of a polygon. So I learned about this approach in C++ and adapted it to Python. However, I think the C++ code I was studying isn't quite right. I believe I have fixed it, but I am not quite sure, so I was hoping folks brighter than me might help me cast some light on this.
The theorem is super simple, and the idea is this: given a closed polygon with n sides, you cast a ray from your point in an arbitrary direction. If the point is inside, the ray will intersect the edges an odd number of times. Otherwise, the count will be even and the point is outside the polygon. Pretty freaking cool.
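For reference, the even-odd idea described above can be sketched as follows. This is a minimal sketch of the common horizontal-ray formulation, not the C++ code the question was adapted from. The function name point_in_polygon_even_odd and the square used to exercise it are assumptions made for illustration, and points lying exactly on an edge are not treated specially.

```python
def point_in_polygon_even_odd(px, py, xs, ys):
    """Even-odd test: cast a ray from (px, py) toward +x and count edge crossings.

    xs and ys are parallel lists of vertex coordinates (hypothetical layout,
    chosen to match the polygon_x / polygon_y lists used in the question).
    """
    inside = False
    n = len(xs)
    for i in range(n):
        x1, y1 = xs[i], ys[i]
        x2, y2 = xs[(i + 1) % n], ys[(i + 1) % n]
        # The edge can only be crossed if its endpoints lie on opposite sides
        # of the horizontal line y = py. The half-open comparison keeps a
        # shared vertex from being counted twice and skips horizontal edges.
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge meets the line y = py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside  # each crossing flips the parity
    return inside


# Axis-aligned square with corners (5,5) and (10,10), used only as an illustration.
square_x = [5, 10, 10, 5]
square_y = [10, 10, 5, 5]
print(point_in_polygon_even_odd(9, 9, square_x, square_y))   # True
print(point_in_polygon_even_odd(13, 5, square_x, square_y))  # False
```

Toggling a boolean on each crossing is equivalent to counting crossings and testing whether the count is odd.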
I had the following test cases:

    polygon_x = [5, 5, 11, 10]
    polygon_y = [5, 10, 5, 10]

    test1_x = 6
    test1_y = 6
    result1 = point_in_polygon(test1_x, test1_y, polygon_x, polygon_y)
    print(result1)

    test2_x = 13
    test2_y = 5
    result2 = point_in_polygon(test2_x, test2_y, polygon_x, polygon_y)
    print(result2)

The above would give me both false if I defined it as follows:

    if polygon_x[i] < polygon_x[(i+1) % length]:
        temp_x = polygon_x[i]
        temp_y = polygon_x[(i+1) % length]
    else:
        temp_x = polygon_x[(i+1) % length]
        temp_y = polygon_x[i]

This is wrong! I should be getting true for result1 and then false for result2. So clearly, something is funky.

The code I was reading in C++ makes sense except for the above. In addition, it failed my test case, which made me think that temp_y should be defined with polygon_y and not polygon_x. Sure enough, when I did this, my test case for (6,6) passes. It still fails when my points are on the line, but as long as I am inside the polygon, it will pass. Expected behavior.

Polygon code adapted to Python

    def point_in_polygon(self, target_x, target_y, polygon_x, polygon_y):
        print(polygon_x)
        print(polygon_y)
        #Variable to track how many times ray crosses a line segment
        crossings = 0
        temp_x = 0
        temp_y = 0
        length = len(polygon_x)
        for i in range(0,length):
            if polygon_x[i] < polygon_x[(i+1) % length]:
                temp_x = polygon_x[i]
                temp_y = polygon_y[(i+1) % length]
            else:
                temp_x = polygon_x[(i+1) % length]
                temp_y = polygon_y[i]
            print(str(temp_x) + ", " + str(temp_y))
            #check
            if target_x > temp_x and target_x <= temp_y and (target_y < polygon_y[i] or target_y <= polygon_y[(i+1)%length]):
                eps = 0.000001
                dx = polygon_x[(i+1) % length] - polygon_x[i]
                dy = polygon_y[(i+1) % length] - polygon_y[i]
                k = 0
                if abs(dx) < eps:
                    k = 999999999999999999999999999
                else:
                    k = dy/dx
                m = polygon_y[i] - k * polygon_x[i]
                y2 = k*target_x + m
                if target_y <= y2:
                    crossings += 1
        print(crossings)
        if crossings % 2 == 1:
            return True
        else:
            return False

Summary
Can someone please explain to me what the temp_x and temp_y approaches are doing? Also, is my fix of defining temp_x from polygon_x and temp_y from polygon_y the correct approach? I doubt it. Here is why.

What is going on for temp_x and temp_y doesn't quite make sense to me. For i = 0, clearly polygon_x[0] < polygon_x[1] is false, so we get temp_x = polygon_x[1] = 5 and temp_y = polygon_y[0] = 5. That is (5,5). This just happens to be one of my pairs. However, suppose I feed the algorithm my points out of order (by axis, pairwise integrity is always a must), something like:

    x = [5, 10, 10, 5]
    y = [10, 10, 5, 5]

In this case, when i = 0, polygon_x[0] < polygon_x[1] is true, so we get temp_x = polygon_x[0] = 5 and temp_y = polygon_y[1] = 10. Okay, by coincidence (5,10) is again one of my pairs. I also tested the point (9,9) against the "corrected" algorithm and it is still inside. In short, I am trying to find a counterexample for why my fix won't work, but I can't. If this is working, I need to understand what the method is doing, and I hope someone can help explain it to me.

Regardless of whether I am right or wrong, I would appreciate it if someone could help shed some better light on this problem. I'm even open to solving the problem in a more efficient way for n-polygons, but I want to make sure I am understanding the code correctly. As a coder, I am uncomfortable with a method that doesn't quite make sense.
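The hand-tracing above can be checked mechanically. The following is a small sketch that isolates the modified branch from the posted function and lists the (temp_x, temp_y) pair it produces for every edge. The helper name trace_temps is hypothetical and exists only for this illustration.

```python
def trace_temps(polygon_x, polygon_y):
    """Return the (temp_x, temp_y) pair the modified branch computes per edge.

    The branch is copied from the posted function; trace_temps itself is a
    hypothetical helper, not part of the original code.
    """
    length = len(polygon_x)
    pairs = []
    for i in range(length):
        j = (i + 1) % length  # index of the edge's second endpoint
        if polygon_x[i] < polygon_x[j]:
            pairs.append((polygon_x[i], polygon_y[j]))
        else:
            pairs.append((polygon_x[j], polygon_y[i]))
    return pairs


# Polygon from the test case in the question
print(trace_temps([5, 5, 11, 10], [5, 10, 5, 10]))
# [(5, 5), (5, 5), (10, 5), (5, 10)]

# The second vertex ordering discussed above
print(trace_temps([5, 10, 10, 5], [10, 10, 5, 5]))
# [(5, 10), (10, 10), (5, 5), (5, 5)]
```

For the first polygon the third pair, (10, 5), is not one of the input vertices, so the pairs do not always coincide with a vertex. Note also that the condition in the posted function compares target_x against both temp_x and temp_y.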
Thank you so much for listening to my thoughts above. Any suggestions are greatly welcome.
Source: https://stackoverflow.com/questions/39262210
Best answer
df['Height'] would return a Series. Then you should use df['Height'].argmax() or df['Height'].idxmax() to get the corresponding index. Links to the documentation:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.idxmax.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.argmax.html