Apparent memory leak in Hadoop

I have an apparent memory leak in a Hadoop program I'm running. Specifically, I get the message "ERROR GC overhead limit exceeded", followed later by this exception:

attempt_201210041336_0765_m_0000000_1: Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError: GC overhead limit exceeded
attempt_201210041336_0765_m_0000000_1: at java.util.Vector.elements(Vector.java:292)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.helpers.AppenderAttachableImpl.getAllAppenders(AppenderAttachableImpl.java:84)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.Category.getAllAppenders(Category.java:415)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:256)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.Child$3.run(Child.java:157)

I'm running on what should be very small data sets in an initial trial, so I shouldn't be hitting any memory limit. More to the point, I don't want to change the Hadoop configuration; if the program can't run with the current configuration, the program needs to be rewritten.

Can anyone help me figure out how to diagnose this issue? Is there a command-line argument to get a stack trace of memory usage? Is there any other way of tracking this issue down?

P.S. I typed the error message by hand because I can't copy-paste from the system that has the issue. Please treat any typos as my mistake.

Edit: I ran the job a few more times. While I always get the "ERROR GC overhead limit exceeded" message, I don't always get the log4j stack trace. So the issue is probably not log4j; instead, log4j happened to fail because of a lack of memory caused by... something else?


Source: https://stackoverflow.com/questions/13647277
Updated: 2023-11-12 10:11

Best answer

The only way to be sure is to use a loop: read one character at a time and store it. If your allocated buffer becomes full, grow it by some suitable amount (growing by more than one byte at a time is recommended for performance; a classic rule of thumb is to double it).

Stop where you consider the string to end, perhaps at a line feed or EOF.
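
The loop described in this answer might look roughly like the sketch below. The answer does not name a language, so Java is used here only to match the question; the class name GrowingLineReader, the method readLine, and the initial capacity of 16 are hypothetical choices for illustration, not something taken from the answer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

public class GrowingLineReader {
    /**
     * Reads characters one at a time until a line feed or end of stream.
     * Returns null if the stream was already at EOF (assumed convention).
     */
    public static String readLine(Reader in) throws IOException {
        char[] buf = new char[16];   // hypothetical initial capacity
        int len = 0;
        boolean sawAny = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAny = true;
            if (c == '\n') {
                break;               // the string ends at a line feed
            }
            if (len == buf.length) {
                // Buffer is full: double it rather than growing by one.
                buf = Arrays.copyOf(buf, buf.length * 2);
            }
            buf[len++] = (char) c;
        }
        return sawAny ? new String(buf, 0, len) : null;
    }

    public static void main(String[] args) throws IOException {
        Reader in = new StringReader("short\na line longer than sixteen characters");
        String line;
        while ((line = readLine(in)) != null) {
            System.out.println(line);
        }
    }
}
```

Doubling keeps the total amount of copying proportional to the final length of the string, which is why it is the usual rule of thumb; growing by one element at a time would copy the whole buffer on every character.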
