Using Amazon MapReduce/Hadoop for Image Processing

I have a project that requires me to process a large number (1,000 to 10,000) of big (100 MB to 500 MB) images. The processing can be done with ImageMagick, but I was hoping to run it on Amazon's Elastic MapReduce platform (which I believe runs on Hadoop).
All of the examples I have found deal with text-based inputs (I have found the Word Count sample a billion times). I cannot find anything about this kind of work with Hadoop: starting with a set of files, performing the same action on each file, and then writing each result out as its own file.
I am pretty sure this can be done with this platform, and that it should be possible using Bash; I don't think I need to go to the trouble of creating a whole Java app or something, but I could be wrong.
I'm not asking for someone to hand me code, but if anyone has sample code or links to tutorials dealing with similar issues, it would be much appreciated.

Original: https://stackoverflow.com/questions/7816334

Updated: 2024-02-09 22:02

Best answer

I ran into this problem before, where I wanted to know whether my insert was successful or not. My short-term solution was to call count(*) on the table before and after the insert and compare the numbers.
I never found a way to determine which action was actually taken for either INSERT IGNORE or INSERT ... ON DUPLICATE KEY.
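
The count-before-and-after approach described above could be sketched roughly as follows. This is only an illustration, not the code I actually used: it assumes Python's built-in sqlite3 module as a stand-in for MySQL (SQLite spells the statement INSERT OR IGNORE rather than INSERT IGNORE), and the users table and the insert_and_report helper are hypothetical names.

```python
import sqlite3


def insert_and_report(conn, name):
    """Return True if the insert added a row, judged by COUNT(*) before and after."""
    cur = conn.cursor()
    before = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    # SQLite syntax; the MySQL equivalent would be INSERT IGNORE INTO ...
    cur.execute("INSERT OR IGNORE INTO users (name) VALUES (?)", (name,))
    after = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    conn.commit()
    return after > before


# Hypothetical table with a primary key so that duplicates are ignored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")

first = insert_and_report(conn, "alice")   # new row, count goes up
second = insert_and_report(conn, "alice")  # duplicate key, count unchanged
print(first, second)
```

Note that comparing counts is only trustworthy if nothing else writes to the table between the two COUNT(*) calls, which is part of why I considered it a short-term solution.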
