No suitable driver found for jdbc in Spark

2021-11-14 apache-spark apache-spark-sql mysql jdbc

I am using

df.write.mode("append").jdbc("jdbc:mysql://ip:port/database", "table_name", properties)

to insert into a table in MySQL.

Also, I have added Class.forName("com.mysql.jdbc.Driver") in my code.
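
For context, here is a minimal sketch of what the full write path looks like (assuming df is an existing DataFrame; the user/password/driver entries in properties are illustrative assumptions, not values from the question):

import java.util.Properties

// Register the MySQL driver class on the driver JVM, as in the question.
Class.forName("com.mysql.jdbc.Driver")

// Hypothetical connection properties; adjust user/password to your setup.
val properties = new Properties()
properties.put("user", "my_user")
properties.put("password", "my_password")
properties.put("driver", "com.mysql.jdbc.Driver")

// Append the DataFrame rows to the target MySQL table over JDBC.
df.write.mode("append").jdbc("jdbc:mysql://ip:port/database", "table_name", properties)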

When I submit my Spark application:

spark-submit --class MY_MAIN_CLASS \
  --master yarn-client \
  --jars /path/to/mysql-connector-java-5.0.8-bin.jar \
  --driver-class-path /path/to/mysql-connector-java-5.0.8-bin.jar \
  MY_APPLICATION.jar

This yarn-client mode works for me.

But when I use yarn-cluster mode:

spark-submit --class MY_MAIN_CLASS \
  --master yarn-cluster \
  --jars /path/to/mysql-connector-java-5.0.8-bin.jar \
  --driver-class-path /path/to/mysql-connector-java-5.0.8-bin.jar \
  MY_APPLICATION.jar

It doesn't work. I also tried setting --conf:

spark-submit --class MY_MAIN_CLASS \
  --master yarn-cluster \
  --jars /path/to/mysql-connector-java-5.0.8-bin.jar \
  --driver-class-path /path/to/mysql-connector-java-5.0.8-bin.jar \
  --conf spark.executor.extraClassPath=/path/to/mysql-connector-java-5.0.8-bin.jar \
  MY_APPLICATION.jar

but still get the "No suitable driver found for jdbc" error.

Recommended Answer

There are 3 possible solutions:

  1. You might want to assemble your application with your build manager (Maven, SBT), so you won't need to add the dependencies to your spark-submit CLI (see the build.sbt sketch after the --jars example below).
  2. You can use the following option in your spark-submit CLI:

--jars $(echo ./lib/*.jar | tr ' ' ',')

Explanation: assuming you keep all your jars in a lib directory at your project root, this will pick up all of those libraries and add them to the application submission.
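
For option 1, a minimal build.sbt sketch could look like the following (the Spark and connector versions, and the use of the sbt-assembly plugin, are assumptions rather than part of the answer):

// build.sbt -- bundle the MySQL JDBC driver into a fat jar with sbt-assembly
name := "my-application"
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark itself is provided by the cluster at runtime
  "org.apache.spark" %% "spark-sql" % "2.4.8" % "provided",
  // MySQL JDBC driver, shipped inside the application jar
  "mysql" % "mysql-connector-java" % "5.1.49"
)

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

Running sbt assembly then produces a single jar that already contains the driver, so no --jars or --driver-class-path flags are needed at submit time.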

You can also try to configure these 2 variables, spark.driver.extraClassPath and spark.executor.extraClassPath, in the SPARK_HOME/conf/spark-defaults.conf file and set their values to the path of the jar file. Ensure that the same path exists on the worker nodes.
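
For example, the corresponding entries in spark-defaults.conf could look like this (reusing the jar path from the question; the jar must be present at that path on every node):

spark.driver.extraClassPath   /path/to/mysql-connector-java-5.0.8-bin.jar
spark.executor.extraClassPath /path/to/mysql-connector-java-5.0.8-bin.jar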
