Do thread creation operations imply happens-before relationships?

I know that locks can ensure happens-before relationships among threads. Does a thread creation operation itself imply a happens-before relationship? In other words, in the code below, can we ensure that the output of #2 is 1? Does this code have a data race?

#include <iostream>
#include <thread>

using namespace std;

void func(int *ptr)
{
  cout << *ptr << endl; // #2
}

int main()
{
  int data = 1; // #1
  thread t(func, &data);
  t.join();

  return 0;
}

Source: https://stackoverflow.com/questions/49460385
Updated: 2024-03-22 08:03

Best answer

We can leverage NumPy broadcasting for a vectorized solution: compare the start and end indices against a range array spanning the columns, which gives a mask marking every position in the output array that must be set to 1.

So, the solution would be something like this -

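import numpy as np

ncols = z.shape[1]
r = np.arange(ncols)  # column indices 0..ncols-1
# True where start <= column <= end for each row (both bounds inclusive)
mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
z[mask] = 1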

Sample run -

In [39]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
    ...: z = np.zeros(shape = [4,10], dtype = np.float32)

In [40]: ncols = z.shape[1]
    ...: r = np.arange(z.shape[1])
    ...: mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    ...: z[mask] = 1

In [41]: z
Out[41]: 
array([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 1., 0., 0., 0.]], dtype=float32)
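To make the broadcasting step concrete, here is a minimal sketch using the same sample input. The names `starts`, `ends` and `row_counts` are introduced only for illustration and are not part of the original answer. Indexing with `None` turns each column of `index` into a column vector, so comparing it against the 1-D range broadcasts to one boolean per (row, column) pair.

```python
import numpy as np

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
ncols = 10

# Illustrative names (assumptions, not from the original answer).
starts = index[:, 0, None]   # shape (4, 1): one start per row
ends = index[:, 1, None]     # shape (4, 1): one end per row
r = np.arange(ncols)         # shape (10,): column indices

# (4, 1) compared with (10,) broadcasts to (4, 10).
mask = (starts <= r) & (ends >= r)

# Number of columns selected in each row; both bounds are inclusive.
row_counts = mask.sum(axis=1)
print(mask.shape, row_counts)
```

Each row count should equal `end - start + 1` for that row, which matches the runs of ones in the sample output above.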

If z is always a zero-initialized array, we can get the output directly from mask (as an integer array here, rather than float32) -

z = mask.astype(int)

Sample run -

In [37]: mask.astype(int)
Out[37]: 
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
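If the float32 dtype of the original `z` matters, one option is to cast the mask to that dtype instead of `int`. The following is a small sketch under that assumption; `z_ref` and `z_direct` are illustrative names, not from the original answer.

```python
import numpy as np

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
r = np.arange(10)
mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)

# Reference result: assign into a zero-initialized float32 array.
z_ref = np.zeros((4, 10), dtype=np.float32)
z_ref[mask] = 1

# Direct result: cast the boolean mask to the target dtype.
z_direct = mask.astype(np.float32)

print(z_direct.dtype, np.array_equal(z_ref, z_direct))
```

This only works as a replacement when `z` starts out as all zeros; otherwise the masked assignment is still needed to preserve existing values.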

Benchmarking

Here we compare @hpaulj's foo0 with my foo4, both as listed in @hpaulj's post, for a varying number of columns. We start with 10 columns, as in the input sample, but with many more rows (10000 in the first two runs below, 1000 in the last two), and then increase the number of columns up to 1000.

Here are the timings -

In [14]: ncols = 10
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [15]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.27 ms per loop
1000 loops, best of 3: 594 µs per loop

In [16]: ncols = 100
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [17]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.49 ms per loop
100 loops, best of 3: 2.74 ms per loop

In [38]: ncols = 300
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [39]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 657 µs per loop
1000 loops, best of 3: 600 µs per loop

In [40]: ncols = 1000
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [41]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 673 µs per loop
1000 loops, best of 3: 1.78 ms per loop

Thus, the choice between the loopy version and the broadcasting-based vectorized one depends on the number of columns in the problem: broadcasting is faster for few columns, while the loop wins once the number of columns grows large.
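The definitions of foo0 and foo4 are not shown in this post, so the sketch below uses two hypothetical stand-ins, `fill_loop` and `fill_broadcast`, to illustrate the kind of comparison being made. They are assumptions about what a row-by-row loop and the broadcasting approach might look like, not the benchmarked functions themselves, and the timings will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def fill_loop(z, index):
    # Hypothetical loopy version: one slice assignment per row.
    for i, (start, end) in enumerate(index):
        z[i, start:end + 1] = 1  # empty slice when start > end
    return z

def fill_broadcast(z, index):
    # Broadcasting version: build the full boolean mask at once.
    r = np.arange(z.shape[1])
    mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)
    z[mask] = 1
    return z

rng = np.random.default_rng(0)
for ncols in (10, 100, 1000):
    index = rng.integers(0, ncols, size=(1000, 2))
    z = np.zeros((len(index), ncols), dtype=np.float32)

    # Both approaches should produce the same array.
    assert np.array_equal(fill_loop(z.copy(), index),
                          fill_broadcast(z.copy(), index))

    t_loop = timeit.timeit(lambda: fill_loop(z.copy(), index), number=20)
    t_bcast = timeit.timeit(lambda: fill_broadcast(z.copy(), index), number=20)
    print(f"ncols={ncols}: loop {t_loop:.4f}s, broadcast {t_bcast:.4f}s")
```

The loop does work proportional to the number of rows plus the cells actually written, whereas the broadcast version always materializes a rows-by-columns mask, which is one plausible reason the balance shifts as the column count grows.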
