Searching for an algorithm similar to producer-consumer

I would like to ask if anyone has an idea of the best (fastest) algorithm for the following scenario:

  • X processes generate a list of very large files. Each process generates one file at a time.
  • Y processes are notified when a file is ready. Each Y process has its own queue to collect the notifications.
  • At any given time, one X process notifies one Y process through a load balancer that uses the round-robin algorithm.
  • Each file has a size and, naturally, bigger files keep both X and Y busier.

Limitations

  • Once a file gets on a Y process it would be impractical to remove it and move it to another Y process.

I can't think of other limitations at the moment.

Disadvantages to this approach

  • Sometimes X falls behind (files are no longer pushed). This isn't really caused by the queueing system; no matter how I change it, X will still have slow and good periods.
  • Sometimes Y falls behind (a lot of files gather in the queues). Again, the same as above.
  • One Y process is busy with a very large file while it also has several small files in its queue that other Y processes could take on.
  • The notification itself goes over HTTP and is sometimes unreliable. Notifications fail, and debugging has not revealed anything.

There are some more details that would help to see the picture more clearly.

  • Y processes are DB threads/jobs
  • X processes are web apps
  • Once files reach the X processes, those processes also consume DB resources by querying it, which affects the producing side.

Now I considered the following approach:

  • X will produce files as before but will not notify Y. Instead, it will maintain a buffer (a table) holding the list of files.
  • Y will constantly poll the buffer for files, retrieve them itself, and store them in its own queue.
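
The pull-based variant in the two points above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the poster's implementation: `sqlite3` stands in for the real database (which is not named), and the table `file_buffer`, its columns, and the functions `publish` and `claim_next` are hypothetical names chosen for illustration. An idle Y claims one file at a time, so a per-Y queue is not needed in this version.

```python
import sqlite3


def make_buffer(conn):
    # Hypothetical schema: one row per finished file; claimed_by is NULL until a Y takes it.
    conn.execute(
        "CREATE TABLE file_buffer ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " claimed_by TEXT)"
    )


def publish(conn, name, size):
    """Called by an X process when a file is ready; no notification is sent."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO file_buffer (name, size) VALUES (?, ?)", (name, size)
        )


def claim_next(conn, worker):
    """Called by an idle Y process; returns (name, size), or None if nothing was claimed."""
    with conn:
        row = conn.execute(
            "SELECT id, name, size FROM file_buffer"
            " WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # buffer is empty
        # The "claimed_by IS NULL" guard keeps two workers from claiming the same row.
        cur = conn.execute(
            "UPDATE file_buffer SET claimed_by = ?"
            " WHERE id = ? AND claimed_by IS NULL",
            (worker, row[0]),
        )
        if cur.rowcount == 0:
            return None  # another worker won the race; the caller can retry
        return row[1], row[2]


conn = sqlite3.connect(":memory:")
make_buffer(conn)
publish(conn, "a.dat", 500)
publish(conn, "b.dat", 20)
print(claim_next(conn, "Y1"))  # ('a.dat', 500)
print(claim_next(conn, "Y2"))  # ('b.dat', 20)
print(claim_next(conn, "Y1"))  # None
```

Here files are claimed in arrival order (`ORDER BY id`); switching to `ORDER BY size` would give the smallest-file-first policy discussed further down.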

Would this change be practical? As I said, each Y process has its own queue, and it no longer seems efficient to keep it. If so, I'm still undecided on the next bit:

How to decide which files to fetch

I've read about the knapsack problem, and I think it would apply if I had the entire list of files from the beginning, which I don't. Actually, I do have the list and the size of each file, but I don't know when each file will be ready to be taken.

I've gone through the producer-consumer problem, but that centers on a fixed-size buffer and optimising it, whereas in this scenario the buffer is unbounded and I don't really care whether it is large or small.

The next best option would be a greedy approach where each Y process locks the smallest file and takes it. At first glance it appears to be the fastest approach, and I'm currently building a simulation to verify that, but a second opinion would be fantastic.
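
A simulation of the kind mentioned above might look like the following. It is a rough sketch with made-up numbers, not the poster's simulation: processing time is assumed to equal file size, all Y workers are assumed identical, and the function names `round_robin_push` and `shared_pool_pull` are hypothetical. It compares the current round-robin push with a shared buffer from which an idle worker pulls either the smallest or the oldest waiting file.

```python
import heapq


def round_robin_push(files, workers):
    """Current setup: files are pushed round-robin into per-worker FIFO queues.

    files: list of (ready_time, size). Returns (mean turnaround, makespan).
    """
    free_at = [0.0] * workers
    total = makespan = 0.0
    arrivals = sorted(files, key=lambda f: f[0])  # stable sort keeps arrival order
    for i, (ready, size) in enumerate(arrivals):
        w = i % workers  # the load balancer's choice
        finish = max(free_at[w], ready) + size
        free_at[w] = finish
        total += finish - ready
        makespan = max(makespan, finish)
    return total / len(files), makespan


def shared_pool_pull(files, workers, smallest_first=True):
    """Proposed setup: an idle worker pulls one file from a shared buffer."""
    arrivals = sorted(files, key=lambda f: f[0])
    n = len(arrivals)
    free_at = [0.0] * workers  # min-heap of times at which workers become free
    waiting = []               # heap of files that are ready but unclaimed
    i = 0
    total = makespan = 0.0
    for _ in range(n):
        now = heapq.heappop(free_at)
        if not waiting:  # nothing is ready: idle until the next arrival
            now = max(now, arrivals[i][0])
        while i < n and arrivals[i][0] <= now:
            ready, size = arrivals[i]
            key = size if smallest_first else ready
            heapq.heappush(waiting, (key, i, ready, size))  # i breaks ties by arrival
            i += 1
        _, _, ready, size = heapq.heappop(waiting)
        finish = max(now, ready) + size
        heapq.heappush(free_at, finish)
        total += finish - ready
        makespan = max(makespan, finish)
    return total / n, makespan


files = [(0, 10), (0, 1), (0, 1), (0, 1)]  # made-up workload: one big file, three small
print(round_robin_push(files, 2))                        # (6.0, 11.0)
print(shared_pool_pull(files, 2, smallest_first=True))   # (3.75, 11.0)
print(shared_pool_pull(files, 2, smallest_first=False))  # (4.0, 10.0)
```

On this toy workload, smallest-first gives the lowest mean turnaround, but the oldest-first pool finishes everything sooner because the large file starts earlier. Which policy counts as "fastest" therefore depends on which of the two measures matters.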

Update: Just to be sure that everyone gets the big picture, I'm linking a quickly drawn diagram here.

  • Jobs are independent of Processes. They run at their own speed and process as many files as they can.
  • When a Job finishes with a file, it sends an HTTP request to the LB.
  • Each Process queues the requests (files) coming from the LB.
  • The LB works on a round-robin rule.

Diagram


Source: https://stackoverflow.com/questions/15585304
Updated: 2024-04-23 13:04

Best answer

For iOS UIAutomation, Apple provides an API for running a task on the target's host:

performTaskWithPathArgumentsTimeout

Using this, we can have a bash script print out the contents of the file that we wanted to fetch in the first place.

The bash script can be as simple as this:

    #!/bin/bash
    # Print the contents of the file given as the first argument.
    FILE_NAME="$1"
    cat "$FILE_NAME"

Save it as, for example, FileReader.sh.

And in your automation script:

    var target = UIATarget.localTarget();
    var host = target.host();
    // executablePath, filePath and fileName are described below; 15 is the timeout in seconds.
    var result = host.performTaskWithPathArgumentsTimeout(executablePath, [filePath, fileName], 15);
    UIALogger.logDebug("exitCode: " + result.exitCode);
    UIALogger.logDebug("stdout: " + result.stdout);
    UIALogger.logDebug("stderr: " + result.stderr);

Here, executablePath is the path of the executable that runs the command:

    var executablePath = "/bin/sh";

filePath is the location of the FileReader.sh file created above. When executed, it writes the file's contents to standard output, which is what we need here. [Give the full absolute path of the file.]

fileName is the actual file to fetch contents from. [Give the full absolute path of the file.] In my case, I had a Contents.csv file that I had to read.

The last parameter is the timeout in seconds.

Hope this helps others trying to fetch contents (read files) while performing iOS UIAutomation.

References:

https://stackoverflow.com/a/19016573/344798

https://developer.apple.com/library/iOS/documentation/UIAutomation/Reference/UIAHostClassReference/UIAHost/UIAHost.html
