Contents
- Tez job fails with UnsatisfiedLinkError
- Troubleshooting
- Checking the UDF
- Client-side error
- Fix
Tez job fails with UnsatisfiedLinkError
A container on one of the nodes reported the following error:
2023-11-20 17:20:10,836 [ERROR] [TezChild] |tez.MapRecordSource|: java.lang.UnsatisfiedLinkError: com.xxxxxxx.Gx.SMxxxxx.sm4dec(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
    at com.xxxxxxx.Gx.SMxxxxx.sm4dec(Native Method)
    at com.xxxxxxx.xxx.udf.GenericUDFSM4Decrypt.evaluate(GenericUDFSM4Decrypt.java:47)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
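Troubleshooting
Checking the UDF
The stack trace shows that the failure occurs when the UDF is invoked: the top frames are the native method sm4dec, called from GenericUDFSM4Decrypt.evaluate.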
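Why the container log names only the native method, and not a missing library, can be illustrated with a small sketch. It assumes (this is not confirmed by the UDF source, which is not shown here) that the UDF's wrapper class loads its native library in a static initializer and catches the failure. In that case the class still initializes, the native method stays unbound, and the first call to it throws an UnsatisfiedLinkError that mentions only the method signature. The class name SwallowedLoadDemo and the library name hypothetical_missing_native_lib are made up for illustration; only sm4dec mirrors the name in the stack trace.

```java
class SwallowedLoadDemo {
    // Message of the load failure, kept so it can be inspected later.
    static String loadFailure;

    static {
        try {
            // Hypothetical library name; it is not expected to exist on any machine.
            System.loadLibrary("hypothetical_missing_native_lib");
        } catch (UnsatisfiedLinkError e) {
            // Swallowing the error lets the class initialize,
            // but the native method below is never bound.
            loadFailure = e.getMessage();
        }
    }

    // Stand-in for the UDF's native decrypt method (same name as in the stack trace).
    static native String sm4dec(String data, String key);

    // Calls the unbound native method and reports what the JVM throws.
    static String callNative() {
        try {
            return sm4dec("data", "key");
        } catch (UnsatisfiedLinkError e) {
            return "UnsatisfiedLinkError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("load failure: " + loadFailure);
        System.out.println("call result:  " + callNative());
    }
}
```

Running it prints two different messages: the load failure refers to java.library.path, while the call failure refers only to sm4dec. The exact wording of both messages varies between JDK versions.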
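Check the following, in order:
- Whether the UDF has been changed recently
- Whether the registered function and its jar are still usable
- In the application log, whether the container has fetched and cached the UDF jar
- Whether the UDF works when tested through both HiveServer2 and the Hive client, running the client test on the node that reported the error in the application log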
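Client-side error
The Tez jobs in question are run in client mode, through the Hive client. Logging in to the corresponding client machine and running the same test reproduces the error.
Compared with the stack trace above, the client output contains one additional key line: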
Cannot load Uxxx library:java.lang.UnsatisfiedLinkError: no smxxxx_x64 in java.library.path
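The "no ... in java.library.path" message comes from System.loadLibrary, which maps the short library name to a platform-specific file name (on Linux, name becomes libname.so) and looks for that file in the directories listed in java.library.path. The following is a minimal sketch of that lookup, useful for checking a node by hand. The class NativeLibLocator and its findLibrary method are hypothetical helpers, not part of Hive or Hadoop, and the sketch only approximates the JVM's search (it ignores the boot library path and class-loader-specific lookups). Note that the java.library.path seen by this standalone program may differ from the one the Hive client or the Tez container is launched with.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class NativeLibLocator {

    // Returns every file in the given path list that matches the mapped library name.
    static List<File> findLibrary(String libName, String libraryPath) {
        // e.g. "foo" -> "libfoo.so" on Linux; other platforms use other prefixes/suffixes.
        String fileName = System.mapLibraryName(libName);
        List<File> hits = new ArrayList<>();
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Placeholder name taken from the error message; pass the real name as an argument.
        String libName = args.length > 0 ? args[0] : "smxxxx_x64";
        String path = System.getProperty("java.library.path", "");
        System.out.println("Looking for " + System.mapLibraryName(libName));
        System.out.println("java.library.path = " + path);
        List<File> hits = findLibrary(libName, path);
        if (hits.isEmpty()) {
            System.out.println("Not found in any listed directory");
        } else {
            hits.forEach(f -> System.out.println("Found: " + f));
        }
    }
}
```

To test a specific directory, the path can be set explicitly when launching, for example with -Djava.library.path=${HADOOP_HOME}/lib/native.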
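Fix
This message shows that the shared library (.so file) used by the UDF cannot be found, so libxxx.so needs to be copied into ${HADOOP_HOME}/lib/native. After doing so, the test succeeds.
- If the UDF is used through HiveServer2, add the .so file to ${HADOOP_HOME}/lib/native and the UDF jar to ${HIVE_HOME}/lib/ on the server machine
- If the UDF is used through the Hive client, add the .so file to ${HADOOP_HOME}/lib/native and the UDF jar to ${HIVE_HOME}/lib/ on the client machine
- If jobs running on YARN need the UDF, the files can also be placed at an HDFS path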