Connecting Spark to a MySQL database

1. Install, start, and check the MySQL service.
netstat -tunlp (look for port 3306)

service mysql start
sudo netstat -tap | grep mysql
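The same check can also be done from Python before launching a Spark job. This is a minimal sketch, assuming MySQL listens on its default port 3306 on the local machine; the helper name `is_port_open` is made up for illustration and is not part of any library.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 3306 is MySQL's default port; adjust if your server uses another one.
    print("MySQL reachable:", is_port_open("127.0.0.1", 3306))
```

A True result only shows that something accepts connections on that port; it does not confirm that the listener is MySQL.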

2. Make the MySQL JDBC driver available to Spark.

cp /usr/local/hive/lib/mysql-connector-java-5.1.40-bin.jar /usr/local/spark/jars

pyspark \
  --jars /usr/local/spark/jars/mysql-connector-java-8.0.25/mysql-connector-java-8.0.25.jar \
  --driver-class-path /usr/local/spark/jars/mysql-connector-java-8.0.25/mysql-connector-java-8.0.25.jar
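When Spark is started from a standalone Python script rather than the pyspark shell, the same jar can be supplied through configuration. The sketch below assumes the jar path from the command above. `spark.jars` and `spark.driver.extraClassPath` are the Spark properties that correspond to `--jars` and `--driver-class-path`. The helper names `jdbc_jar_conf` and `build_session` are hypothetical.

```python
# Path taken from the pyspark command above; adjust to your installation.
MYSQL_JAR = (
    "/usr/local/spark/jars/mysql-connector-java-8.0.25/"
    "mysql-connector-java-8.0.25.jar"
)


def jdbc_jar_conf(jar_path: str) -> dict:
    """Spark properties equivalent to --jars and --driver-class-path."""
    return {
        "spark.jars": jar_path,
        "spark.driver.extraClassPath": jar_path,
    }


def build_session(jar_path: str = MYSQL_JAR):
    """Create a SparkSession with the MySQL connector jar on the classpath."""
    # Imported lazily so jdbc_jar_conf can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("spark-mysql")
    for key, value in jdbc_jar_conf(jar_path).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

These settings are likely to take effect only if no SparkSession is already running in the process, since the driver classpath is fixed once the JVM has started.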

3. Start the MySQL shell and create the database spark and the table student.

select * from student;

sudo mysql -u root -p

mysql> create database spark;
mysql> use spark;
mysql> create table student (id int(4), name char(20), gender char(4), age int(4));
mysql> alter table student change id id int auto_increment primary key;
mysql> insert into student values(1,'Xueqian','F',23);
mysql> insert into student values(2,'Weiliang','M',24);
mysql> select * from student;

4. Read data from the MySQL database with Spark.
spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/spark?useSSL=false") ...  .load()

>>> jdbcDF = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/spark").option("driver","com.mysql.cj.jdbc.Driver").option("dbtable", "student").option("user", "root").option("password", "hadoop").load()
>>> jdbcDF.show()
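The long chain of `.option(...)` calls can be collected into a dictionary and passed in one go with `DataFrameReader.options(**kwargs)`. This is a rough sketch using the same URL, driver class, and credentials as the call above; `mysql_jdbc_options` and `read_table` are hypothetical helper names.

```python
def mysql_jdbc_options(database, table, user, password,
                       host="localhost", port=3306):
    """Build the option dictionary for Spark's JDBC data source."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "driver": "com.mysql.cj.jdbc.Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_table(spark, options):
    """Load a MySQL table as a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()
```

Inside the pyspark shell, `read_table(spark, mysql_jdbc_options("spark", "student", "root", "hadoop")).show()` should be equivalent to the two lines above.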

5. Write data to the MySQL database with Spark.

studentDF.write.format('jdbc').option(…).mode('append').save()

>>> from pyspark.sql.types import Row
>>> from pyspark.sql.types import StructType
>>> from pyspark.sql.types import StructField
>>> from pyspark.sql.types import StringType
>>> from pyspark.sql.types import IntegerType
>>> studentRDD = spark.sparkContext.parallelize(["3 Rongcheng M 26","4 Guanhua M 27"]).map(lambda line : line.split(" "))
# Set up the schema information
>>> schema = StructType([StructField("name", StringType(), True),StructField("gender", StringType(), True),StructField("age",IntegerType(), True)])
>>> rowRDD = studentRDD.map(lambda p : Row(p[1].strip(), p[2].strip(),int(p[3])))
# Bind the Row objects to the schema, i.e. match the data to the schema
>>> studentDF = spark.createDataFrame(rowRDD, schema)
>>> prop = {}
>>> prop['user'] = 'root'
>>> prop['password'] = 'hadoop'
>>> prop['driver'] = "com.mysql.cj.jdbc.Driver"
>>> studentDF.write.jdbc("jdbc:mysql://localhost:3306/spark",'student','append', prop)
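The parsing step can be pulled out into a plain function, which makes it easy to check without a running cluster. Like the code above, this sketch drops the leading id so that MySQL's auto_increment column assigns it. `parse_student` and `append_students` are hypothetical names, and the schema is given as a datatype string rather than a StructType.

```python
def parse_student(line: str) -> tuple:
    """Turn 'id name gender age' into (name, gender, age); the id is dropped."""
    _id, name, gender, age = line.split()
    return (name, gender, int(age))


def append_students(spark, lines, url, properties, table="student"):
    """Append parsed student records to a MySQL table over JDBC."""
    rows = [parse_student(line) for line in lines]
    # Datatype string with the same three columns as the StructType above.
    df = spark.createDataFrame(rows, "name: string, gender: string, age: int")
    df.write.jdbc(url, table, mode="append", properties=properties)
```

In the pyspark shell this could be called as `append_students(spark, ["3 Rongcheng M 26", "4 Guanhua M 27"], "jdbc:mysql://localhost:3306/spark", prop)`, reusing the `prop` dictionary from the session above.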

 mysql> select * from student;
