ImageMagick convert out of memory

I have a custom application running on CentOS 6.7 with 64 GB of RAM. It is basically a file crawler that calls the following bash script every time it finds a file matching certain file extensions (mainly TIFFs or multipage TIFFs). I can't say exactly how often it is called or how many files are being considered, but it is on the order of thousands.

#!/bin/bash

IMAGE_INPUT="$1"
OUTPUT="$2"
TMP_FOLDER=/data/tesseract-tmp

# generate a unique random file name (32 hex characters)
TFN=$(tr -cd 'a-f0-9' < /dev/urandom | head -c 32)
# convert the image and write the result to the temp file named by TFN
/usr/bin/convert -density 288 "$IMAGE_INPUT" -resize 75% -quality 100 -append "jpeg:$TMP_FOLDER/$TFN"
# extract text with tesseract; the result lands in $TFN.out.txt
/usr/local/bin/tesseract "$TMP_FOLDER/$TFN" "$TMP_FOLDER/$TFN.out"
cp "$TMP_FOLDER/$TFN.out.txt" "$OUTPUT"
# return the file content on standard output
cat "$OUTPUT"

The temp files are cleaned up by a cron job.

I have noticed that after some time and many calls to the script, top shows the gs and convert processes (ImageMagick) taking all the available memory and then starting to consume all the available swap space. If I don't kill those processes, the system runs out of memory and freezes.

How can I solve this? Is there a way to limit the amount of memory available to a particular program (convert), or is it possible to queue the calls to the script?
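
One way the queueing asked about here might be approached is with flock(1) from util-linux, which blocks until it holds a lock on a file and then runs a command. The sketch below is only an illustration under stated assumptions: the lock file path and the run_serialized helper name are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical lock file; any path writable by the crawler's user would do.
LOCK_FILE="${LOCK_FILE:-/tmp/ocr-convert.lock}"

# run_serialized CMD [ARGS...]
# Waits for an exclusive lock on LOCK_FILE, runs the command, and releases
# the lock when the command exits. Concurrent callers therefore run one
# at a time instead of all at once.
run_serialized() {
    flock -x "$LOCK_FILE" "$@"
}
```

With this helper, the conversion line of the script would be prefixed with run_serialized, so that at most one convert (and its gs child) runs at any moment. Fully serializing is the simplest choice; it trades throughput for a predictable memory ceiling.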

N.B. I have seen that convert has a -limit option, but if I understand correctly, it applies to a single running process, whereas I would like to limit the memory usage of all running instances combined.
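
For reference, a per-process use of -limit could look like the sketch below. The limited_convert helper, the CONVERT_BIN override, and the 1GB and 2GB values are assumptions chosen for illustration, not recommendations. These limits apply to ImageMagick's own pixel cache in each process; they are not expected to constrain the separate gs process that convert launches, and the total across instances is bounded only if the number of concurrent calls is bounded as well.

```shell
#!/bin/bash
# CONVERT_BIN is a hypothetical override hook; the default is the path
# used in the script above.
CONVERT_BIN="${CONVERT_BIN:-/usr/bin/convert}"
# Assumed caps for illustration; tune them to the machine and workload.
MEMORY_LIMIT="${MEMORY_LIMIT:-1GB}"
MAP_LIMIT="${MAP_LIMIT:-2GB}"

# limited_convert ARGS...
# Runs convert with per-process caps on heap memory and memory-mapped
# pixel cache, passing all remaining arguments through unchanged.
limited_convert() {
    "$CONVERT_BIN" -limit memory "$MEMORY_LIMIT" -limit map "$MAP_LIMIT" "$@"
}
```

When an image exceeds these caps, ImageMagick moves the pixel cache to disk, which is slower but keeps a single conversion from exhausting RAM.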

Thanks


Source: https://stackoverflow.com/questions/48423113
Updated: 2022-07-23 22:07

Best answer

As mentioned in the comments, periodic is part of Sidekiq Enterprise.

If you want to have cron-like jobs in Sidekiq, you can use one of multiple gems:

I am sure there are more plugins.
