ImageMagick convert out of memory
I have a custom application running on CentOS 6.7 with 64 GB of RAM. It is basically a file crawler that calls the following bash script every time it finds a file matching certain file extensions (mainly TIFFs or multipage TIFFs). I can't say exactly how often the script is called or how many files are being processed, but it is on the order of thousands.
    #!/bin/bash
    IMAGE_INPUT=$1
    OUTPUT=$2
    TMP_FOLDER=/data/tesseract-tmp

    # generating a unique random file name
    TFN=`cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32`;

    # converting the image and putting the result into the TFN
    /usr/bin/convert -density 288 "$IMAGE_INPUT" -resize 75% -quality 100 -append jpeg:$TMP_FOLDER/$TFN;

    # extract text with tesseract and put it into a result file
    /usr/local/bin/tesseract $TMP_FOLDER/$TFN $TMP_FOLDER/$TFN.out;
    cp $TMP_FOLDER/$TFN.out.txt $OUTPUT;

    # returning the file content to std output
    cat $OUTPUT;
The temp files are cleaned up by a cron job.
I have noticed that after some time and a lot of calls to the script, top shows the gs and convert processes started by ImageMagick taking all the available memory and then starting to consume all the available swap space. If I don't kill those processes, the system runs out of memory and freezes.
How can I solve this? Is there a way to limit the amount of memory for a particular program (convert), or is it possible to queue the calls to the script?
N.B. I have seen that the convert command has a -limit option, but if I understand correctly, it applies to a single instance of the running process, while I would like to limit the memory usage of all running instances together.
Thanks
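One way the queuing asked about above might be sketched is with flock(1), assuming it is installed (it ships with util-linux on most Linux distributions). The run_serialized function and the LOCK_FILE path are hypothetical names introduced for illustration; they are not part of the original script. The idea is that every invocation waits for an exclusive lock, so only one wrapped command runs at a time.

```shell
#!/bin/bash
# Hypothetical lock file path (an assumption, not from the original script).
LOCK_FILE="${LOCK_FILE:-/tmp/convert-ocr.lock}"

# Run the given command while holding an exclusive lock on LOCK_FILE.
# Concurrent callers block in flock until the lock is released, which
# happens automatically when the subshell exits and fd 9 is closed.
run_serialized() {
    (
        flock -x 9 || exit 1
        "$@"
    ) 9>"$LOCK_FILE"
}

# Possible use inside the script above (commented out; the limit values
# are illustrative assumptions and should be tuned for the machine):
# run_serialized /usr/bin/convert -limit memory 2048MiB -limit map 4096MiB \
#     -density 288 "$IMAGE_INPUT" -resize 75% -quality 100 -append \
#     "jpeg:$TMP_FOLDER/$TFN"
```

With only one convert running at a time, the per-process -limit settings effectively bound ImageMagick's total memory use. Note that -limit governs ImageMagick's own pixel cache; a gs process started as a delegate is not covered by it, so serializing the calls is what keeps the number of gs processes down.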
Source: https://stackoverflow.com/questions/48423113
Best answer
As mentioned in the comments, periodic is part of Sidekiq Enterprise. If you want cron-like jobs in Sidekiq, you can use one of several gems:
- sidekiq-cron
- sidetiq - no longer maintained, but personally I really liked it
- clockwork - works with every queuing system
I am sure there are more plugins.