1. Hive on Spark job failure (SparkTask return code 3)
Error: Job failed with org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, hadoop104, executor 1): UnknownReason
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during runtime. Please check stacktrace for the root cause.
Solution:
① Check that the custom UDTF code is written correctly (see the sanity-check query after this list).
② Check that the SQL being executed is written correctly.
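To narrow down which of the two is at fault, one option is to first run the table-generating function against a single literal row, outside the full query. A minimal sketch in HiveQL, where the built-in explode/split stand in for the custom UDTF and the aliases are made up:

-- Sanity check: exercise the UDTF logic on one literal row, so a bug in the
-- function or in the LATERAL VIEW clause surfaces without running a full job.
-- Replace explode(split(...)) with your own UDTF.
SELECT tmp.word
FROM (SELECT 'a,b,c' AS line) t
LATERAL VIEW explode(split(t.line, ',')) tmp AS word;

If this minimal query already fails, the problem is in the UDTF itself; if it succeeds, look more closely at the full SQL statement.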
2. Memory overflow errors
1) JVM heap memory overflow: when heap memory is insufficient, exceptions like the following are typically thrown:
First: "java.lang.OutOfMemoryError: GC overhead limit exceeded";
Second: "Error: Java heap space";
Third: "running beyond physical memory limits. Current usage: 4.3 GB of 4.3 GB physical memory used; 7.4 GB of 13.2 GB virtual memory used. Killing container".
Solution: in the hive-env.sh file, set
export HADOOP_HEAPSIZE=4096
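Note that the third message above is YARN killing the container for exceeding its physical memory limit, so raising only the Hive-side heap may not be enough; the container/executor memory usually has to grow as well. A minimal sketch with illustrative values (property names assume Hive on Spark on YARN, or the MapReduce engine respectively):

-- Hive on Spark: these spark.* properties are forwarded to the Spark executors
-- (older Spark releases use spark.yarn.executor.memoryOverhead instead).
set spark.executor.memory=4g;
set spark.executor.memoryOverhead=1024;
-- MapReduce engine: raise the container size and the task JVM heap together.
set mapreduce.map.memory.mb=4096;
set mapreduce.map.java.opts=-Xmx3276m;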
2) Stack memory overflow: the exception thrown is java.lang.StackOverflowError.
This usually appears in SQL (a statement with so many condition combinations that it is parsed into deep, repeated recursive calls) or in MR code that recurses. The deep recursion makes the method call chain on the stack too long. Such an error generally means there is a problem with how the program or query is written; see the illustration below.
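For illustration, a hypothetical query shape that can trigger this, together with a flatter rewrite (the table and column names are made up):

-- Pattern that can overflow the parser/optimizer stack: one WHERE clause
-- built from thousands of OR'ed equality conditions.
SELECT * FROM orders WHERE id = 1 OR id = 2 OR id = 3;   -- ...continued for thousands of OR terms

-- Flatter rewrite that keeps the expression tree shallow:
SELECT * FROM orders WHERE id IN (1, 2, 3);              -- ...or join against a lookup table of the ids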