MapReduce using NIO on HDFS

I need to write some text into an HDFS file from all of the Mappers of a MapReduce job.

The text file is used as a lookup in the reducers, so it cannot travel along the regular path (context.write()).

Using the snippet below is slow, and it may produce file lock issues when it is run from different mappers.

I would love to use ByteBuffer and file locks (NIO). Is this possible in this framework? Any other ideas are also welcome.

The code snippet:

Path fname = ...
FileSystem fs = FileSystem.get(context.getConfiguration());
// try-with-resources closes the stream even if a write throws
try (FSDataOutputStream out = fs.create(fname)) {
    while (condition) out.write(...);
    out.flush();
}

Thanks for any ideas or help.

Raz


Source: https://stackoverflow.com/questions/27708753
Updated: 2023-10-18 12:10

Accepted answer

The contents of phone and org are completely different data structures, and you won't be able to cleanly deserialize both into a homogeneous format like the one in your example. The best option is to at least partially deserialize into a struct:

type data struct {
    Phone []map[string]string
    Org []map[string]map[string]string
}

This will at least deserialize all of the data, but it's still a bit messy; a slice of maps of maps is not a great data structure to work with. It's not clear from the question, but if any of the fields are fixed, you might want to codify those in types as well, for example:

type data struct {
    Phone []map[string]string
    Org []struct{
        Current map[string]string
    }
}

You can then deserialize to this type and use it much more easily:

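var person data
// Unmarshal returns an error for malformed or mismatched JSON; do not ignore it.
if err := json.Unmarshal([]byte(ii), &person); err != nil {
    log.Fatal(err)
}
fmt.Printf("%v\n", person.Phone)
fmt.Printf("%v\n", person.Org[0].Current)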
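For reference, the steps above could be put together into a self-contained program along the following lines. This is only a sketch: the JSON from the question is not reproduced here, so the `sample` input and its keys (`home`, `name`, `title`) are made up for illustration, and `parsePerson` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// data mirrors the struct from the answer: phone entries stay free-form maps,
// while each org entry is assumed to have a fixed "current" object.
type data struct {
	Phone []map[string]string
	Org   []struct {
		Current map[string]string
	}
}

// sample is a hypothetical input; the real JSON from the question is not shown.
const sample = `{
  "phone": [{"home": "555-0100"}],
  "org": [{"current": {"name": "Example Corp", "title": "Engineer"}}]
}`

// parsePerson is a hypothetical helper that wraps json.Unmarshal.
func parsePerson(raw string) (data, error) {
	var person data
	err := json.Unmarshal([]byte(raw), &person)
	return person, err
}

func main() {
	person, err := parsePerson(sample)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(person.Phone[0]["home"])
	fmt.Println(person.Org[0].Current["title"])
}
```

No struct tags are needed in this sketch because encoding/json matches JSON keys to exported field names case-insensitively, so "phone" fills Phone and "current" fills Current.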

Working playground example here: https://play.golang.org/p/5W-7RzPimZj

Note that I had to correct an error in the JSON: it was invalid due to a missing comma before "org".
