How to read data from MariaDB using Spark and Java
I need to read a table from MariaDB by using Spark and Java.
I wrote Java code to read table data from the database. The connection is established successfully, but reading the data goes wrong. I am trying to read the table as a DataFrame, but the column names are shown as the column values in the result. The code is given below:
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

public class mariadb_to_csv {
    public static void main(String[] args) {
        // Load connection settings from config.properties on the classpath.
        Properties prop = new Properties();
        String resourceName = "config.properties";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try (InputStream resourceStream = loader.getResourceAsStream(resourceName)) {
            prop.load(resourceStream);
        } catch (IOException e) {
            e.printStackTrace();
        }

        SparkSession spark = SparkSession.builder()
                .appName("Java Spark SQL basic example")
                .config("spark.some.config.option", "some-value").getOrCreate();

        // Read the table over JDBC; the URL placeholders stand for the real host, port, and database.
        Dataset<Row> jdbcDF = spark.read().format("jdbc")
                .option("url", "jdbc:mariadb://${Hostname}:${Port}/${Database}")
                .option("driver", "org.mariadb.jdbc.Driver")
                .option("dbtable", "source_table")
                .option("user", "username")
                .option("password", "password")
                .load();

        jdbcDF.select(col("code"), col("name"), col("isActive"), col("createdByUser"), col("modifiedByUser")).show();
    }
}
In the result, every row contains the column names in place of the actual column values.
There seems to be a problem with the "mariadb" connector. Changing the host URL from "jdbc:mariadb://${Hostname}:${Port}/${Database}" to "jdbc:mysql://${Hostname}:${Port}/${Database}" solved the problem for me.
The MariaDB and Databricks documentation also use a "jdbc" connection URL when explaining how to read data from MariaDB using Spark.
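A minimal sketch of the URL change described above, written so that it runs without Spark on the classpath. The class name JdbcOptions, the helper mysqlSchemeOptions, and the host, port, and database values are assumptions made up for illustration; only the driver class, table name, and credentials placeholders come from the question. The likely reason the change helps is that Spark picks its SQL dialect from the URL prefix: with "jdbc:mysql" it quotes column names with backticks, whereas the generic dialect uses double quotes, which MariaDB by default treats as string literals and returns as values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JdbcOptions {

    /**
     * Builds the option map for Spark's JDBC reader, using the jdbc:mysql
     * scheme in the URL while keeping the MariaDB driver class from the question.
     * (Hypothetical helper, not part of Spark or of the MariaDB connector.)
     */
    static Map<String, String> mysqlSchemeOptions(String host, int port, String database,
                                                  String table, String user, String password) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", "jdbc:mysql://" + host + ":" + port + "/" + database);
        options.put("driver", "org.mariadb.jdbc.Driver");
        options.put("dbtable", table);
        options.put("user", user);
        options.put("password", password);
        return options;
    }

    public static void main(String[] args) {
        // Placeholder connection values; replace them with the real ones.
        Map<String, String> options = mysqlSchemeOptions(
                "localhost", 3306, "mydb", "source_table", "username", "password");
        options.forEach((key, value) -> System.out.println(key + " = " + value));

        // With Spark available, the map can be handed to the reader in one call:
        // Dataset<Row> jdbcDF = spark.read().format("jdbc").options(options).load();
    }
}
```

Whether the MariaDB driver accepts a "jdbc:mysql" URL depends on the connector version. As far as I know, MariaDB Connector/J 3.x only accepts that scheme when the URL also carries the permitMysqlScheme parameter, so it is worth checking the driver version if the connection fails after the change.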