Using Amazon MapReduce/Hadoop for Image Processing
I have a project that requires me to process a large number (1,000 to 10,000) of big (100 MB to 500 MB) images. The processing can be done with ImageMagick, but I was hoping to run it on Amazon's Elastic MapReduce platform (which I believe runs on Hadoop).
All of the examples I have found deal with text-based input (I have come across the Word Count sample a billion times). I cannot find anything about this kind of work with Hadoop: starting with a set of files, performing the same action on each file, and then writing each result out as its own file.
I am pretty sure this can be done on this platform, and that it should be possible using Bash; I don't think I need to go to the trouble of writing a whole Java app or something, but I could be wrong.
I'm not asking for someone to hand me code, but if anyone has sample code or links to tutorials dealing with similar issues, it would be much appreciated...
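As a starting point for the "one file in, one file out" pattern described above, here is a minimal sketch of a mapper that could be used with Hadoop Streaming, which accepts any executable as a mapper. It assumes the job input is a plain text file listing one image path per line, that the images are already readable from the local file system, and that ImageMagick's `convert` is installed on every node. The names `build_command` and `run_mapper`, the `-resize 50%` operation, and the `_processed` output naming are hypothetical placeholders, not part of any Hadoop or EMR API.

```python
import os
import subprocess


def build_command(src_path, out_dir, operation=("-resize", "50%")):
    """Build an ImageMagick command line for a single image.

    The operation and the output naming scheme are placeholders.
    """
    base, ext = os.path.splitext(os.path.basename(src_path))
    dst_path = os.path.join(out_dir, base + "_processed" + ext)
    return ["convert", src_path, *operation, dst_path], dst_path


def run_mapper(lines, out_dir="out", runner=subprocess.check_call):
    """Treat every non-empty input line as the path of one image."""
    results = []
    for line in lines:
        src_path = line.strip()
        if not src_path:
            continue
        command, dst_path = build_command(src_path, out_dir)
        runner(command)  # check_call raises if the command fails
        # Streaming mappers emit tab-separated key/value pairs on stdout.
        print(f"{src_path}\t{dst_path}")
        results.append((src_path, dst_path))
    return results
```

In an actual mapper script the last line would be something like `run_mapper(sys.stdin)` (with `import sys`). On a real cluster the images would more likely live in S3 or HDFS, so each one would need to be copied to local disk before processing and the result copied back afterwards; that step is left out of this sketch.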
Source: https://stackoverflow.com/questions/7816334
Best answer
I ran into this problem before, where I wanted to know whether my insert was successful or not. My short-term solution was to call count(*) on the table before and after the insert and compare the numbers.
I never found a way to determine which action was actually taken for either INSERT IGNORE or INSERT ... ON DUPLICATE KEY.
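The count-before-and-after workaround can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in `sqlite3` module and SQLite's `INSERT OR IGNORE` as a stand-in for MySQL's `INSERT IGNORE`, and the `users` table and the `insert_and_report` helper are hypothetical.

```python
import sqlite3


def insert_and_report(conn, name):
    """Return True if the INSERT OR IGNORE actually added a row."""
    before = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    # SQLite analogue of MySQL's INSERT IGNORE (assumption for this sketch).
    conn.execute("INSERT OR IGNORE INTO users(name) VALUES (?)", (name,))
    after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    return after > before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")

first = insert_and_report(conn, "alice")   # new key, so a row is added
second = insert_and_report(conn, "alice")  # duplicate key, silently ignored
print(first, second)
```

Comparing counts is only reliable when no other session can write to the table between the two counts, which is probably why the answer calls it a short-term solution.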
Related Q&A
-
You can use the last_insert_rowid() function to refer to the ID (primary key) of the previously inserted row. Of course, you need to reorder the inserts: INSERT INTO forms(form_name) VALUES ('form1'); INSERT INTO formsData(formid,data) VALUES (last_insert_rowid(),'data for form1'); INSERT INTO forms(form_name) VALUES ('form2'); INSERT INTO formsData ...
-
Use the keyword VALUES to refer to the new values (see the documentation). INSERT INTO beautiful (name, age) VALUES ('Helen', 24), ('Katrina', 21), ('Samia', 22), ('Hui Ling', 25), ('Yumie', 29) ON DUPLICATE KEY UPDATE age = VALUES(age), ...
-
MySQL Insert/Update Multiple Rows Using ON DUPLICATE KEY [2022-04-13]
You forgot the values() keyword: INSERT INTO `buoy_stations` (`id`, `coords`, `name`, `owner`, `pgm`, `met`, `currents`) VALUES ('00922', 'Point(30,-90)','name 1','owner 1','pgm 1','y','y'), ('00923', 'Point(30,-90)','name 2','owner 2','pgm 2','y','y'), ('00924', 'Point(3 ... -
From the manual: With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row and 2 if an existing row is updated.
-
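INSERT INTO ... SELECT FROM ... ON DUPLICATE KEY UPDATE [2022-08-06]
MySQL will assume that the part before the equals sign refers to the columns named in the INSERT INTO clause, and that the second part refers to the SELECT columns. INSERT INTO lee(exp_id, created_by, location, animal, starttime, endtime, entct, inact, inadur, inadist, smlct, smldur, smldist, larct, lardur, lardis ... -
"INSERT IGNORE" vs "INSERT ... ON DUPLICATE KEY UPDATE" [2021-11-07]
I would recommend using INSERT...ON DUPLICATE KEY UPDATE. If you use INSERT IGNORE, the row won't actually be inserted if it results in a duplicate key. But the statement won't generate an error. It generates a warning instead. These cases include: Inserting a duplicate key in columns with PRIMARY KEY or UNIQUE constraints. Inserting a NULL into a column with a NOT NULL constraint. Inserting a row into a partitioned table when the values you insert don't map to a partition. If you use REPLACE, MySQL actually does a DELETE followed by an INSERT internally, which has some unexpected side effects: A new auto-increment I ... -
If you only need to insert a fixed number of rows at once, you can try this: knex.raw("INSERT INTO tablename (`col1`, `col2`, `col3`) VALUES (?, ?, ?), (?, ?, ?) ON DUPLICATE KEY UPDATE col2 = VALUES(`col2`)", ['val1', 'hello', 'world', 'val2', 'ohayo', 'minasan'], ); If you don't know how many you need to insert at once, you can write a helper that adds (?, ?, ...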
-
If you want uniqueness on the restaurant id/date then you need a unique index/constraint on those columns: create unique index idx_restaurant_views_2 on restaurant_views(restaurant_id, date) Then your code should work.
-
See this post: Yii INSERT ... ON DUPLICATE UPDATE. They advise against using this feature. But I wanted to use it, so I extended my own component from CDbCommand and added a method for ON DUPLICATE KEY UPDATE: public function insertDuplicate($table, $columns, $duplicates) { $params=array(); $names=array(); $placeholders=array(); ...
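
For the last_insert_rowid() suggestion in the related answers above, the following is a small runnable sketch using Python's built-in `sqlite3` module. The table and column names `forms(form_name)` and `formsData(formid, data)` come from that snippet; the `id INTEGER PRIMARY KEY` column and the column types are assumptions made so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema is assumed; only the column names used in the INSERTs are from the snippet.
conn.execute("CREATE TABLE forms (id INTEGER PRIMARY KEY, form_name TEXT)")
conn.execute("CREATE TABLE formsData (formid INTEGER, data TEXT)")

for name in ("form1", "form2"):
    # Insert the parent row first ...
    conn.execute("INSERT INTO forms(form_name) VALUES (?)", (name,))
    # ... so that last_insert_rowid() refers to it in the child insert.
    conn.execute(
        "INSERT INTO formsData(formid, data) VALUES (last_insert_rowid(), ?)",
        (f"data for {name}",),
    )

rows = conn.execute(
    "SELECT f.form_name, d.data FROM forms f "
    "JOIN formsData d ON d.formid = f.id ORDER BY f.id"
).fetchall()
print(rows)
```

The ordering matters because last_insert_rowid() always reports the most recent successful insert on the connection, so each child insert has to follow its parent insert directly.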