Apparent memory leak in Hadoop

I have an apparent memory leak in a Hadoop program I'm running. Specifically, I get the message "ERROR GC overhead limit exceeded", followed later by this exception:
attempt_201210041336_0765_m_0000000_1: Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError: GC overhead limit exceeded
attempt_201210041336_0765_m_0000000_1: at java.util.Vector.elements(Vector.java:292)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.helpers.AppenderAttachableImpl.getAllAppenders(AppenderAttachableImpl.java:84)
attempt_201210041336_0765_m_0000000_1: at org.apache.log4j.Category.getAllAppenders(Category.java:415)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:256)
attempt_201210041336_0765_m_0000000_1: at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
 
I'm running on what should be very small data sets in an initial trial, so I shouldn't be hitting any memory limit. More to the point, I don't want to change the Hadoop configuration; if the program can't run with the current configuration, the program needs to be rewritten.
Can anyone help me figure out how to diagnose this issue? Is there a command-line argument to get a stack trace of memory usage? Is there any other way of tracking this down?
P.S. I typed the error message by hand because I can't copy-paste from the system that has the issue, so please put any typos down to my mistake.
Edit: an update. I ran the job a few more times; while I always get the "ERROR GC overhead limit exceeded" message, I don't always get the log4j stack trace. So the issue is probably not log4j itself; rather, log4j happened to fail because memory had already been exhausted by... something else?

Source: https://stackoverflow.com/questions/13647277

Updated: 2023-11-12 10:11

Best answer

The only way to be sure is to loop: read one character at a time and store it. If your allocated buffer becomes full, grow it by some suitable amount (growing by more than one byte at a time is recommended for performance; a classic rule of thumb is to double it).
Stop when you consider the string to have ended, perhaps at a line feed or EOF.
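
For what it's worth, the loop described above could look roughly like the following in C. This is only a sketch under some assumptions: the helper name read_line, its out_len parameter, and the starting capacity of 16 are all made up for illustration, and the line is taken to end at '\n' or EOF.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one line of arbitrary length from `stream`, one character at a time.
 * Hypothetical helper for illustration; the name and signature are assumptions.
 * Returns a heap-allocated, NUL-terminated string without the trailing '\n',
 * or NULL on allocation failure or when EOF is reached before any character
 * is read. The caller must free() the result. If `out_len` is non-NULL it
 * receives the length of the returned string.
 */
char *read_line(FILE *stream, size_t *out_len)
{
    assert(stream != NULL);

    size_t cap = 16;   /* starting capacity; an arbitrary choice */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(stream)) != EOF && c != '\n') {
        /* Buffer full (one byte is reserved for the NUL): double it. */
        if (len + 1 == cap) {
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
    }

    /* Nothing left to read at all: report end of input. */
    if (c == EOF && len == 0) {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}
```

A caller would typically do something like `char *s = read_line(stdin, NULL);`, use the string, and then free(s). Doubling keeps the number of realloc calls logarithmic in the line length, which is the performance point made above.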
