INSERT ... ON DUPLICATE KEY UPDATE Syntax

2023-07-25 19:49:34

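ON DUPLICATE KEY UPDATE: not intended for batch operations, except in the form insert into t1 select * from t2 on duplicate key update k1=v1, k2=v2
DUPLICATE KEY: refers to a unique index (or the primary key). Whether the row being inserted duplicates an existing row is decided by the columns of that unique index.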

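1. Returning the auto-generated id in MyBatis

When we insert a row, the id is often auto-generated. How do we get back the id of the row just inserted?
In MySQL, we can add a selectKey element inside the insert to specify the type and value to return: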

<insert id="xxx" parameterType="xxx">
    <selectKey resultType="java.lang.Integer" order="AFTER" keyProperty="id">
        SELECT LAST_INSERT_ID() AS ID
    </selectKey>
    insert into ...
</insert>

Primary key backfill after insert:

<insert id="xxx"
        parameterType="role"
        useGeneratedKeys="true"
        keyProperty="id">
    insert into t_role (role_name, note) values (#{roleName}, #{note})
</insert>

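This way the role object we pass in does not need its id set; MyBatis handles it using the database's settings.
The benefit is that when MyBatis performs the insert, it backfills the id value into the JavaBean.

In the selectKey form, resultType is the type of the returned value, and ID is the id of the row just inserted.

In Oracle, a similar selectKey looks like this: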

<selectKey resultType="java.lang.Integer" order="BEFORE" keyProperty="id">

            SELECT LOGS_SEQ.nextval AS ID FROM DUAL

</selectKey>

http://blog.sina.com.cn/s/blog_9344098b01019r6v.html

<insert id="saveOrUpdate" >

  <selectKey keyProperty="count" resultType="int" order="BEFORE">

    select count(*) from country where id = #{id}

  </selectKey>

  <if test="count > 0">

    update country

    set countryname = #{countryname},countrycode = #{countrycode}

    where id = #{id}

  </if>

  <if test="count==0">

    insert into country values(#{id},#{countryname},#{countrycode})

  </if>

</insert>

http://www.cnblogs.com/softidea/p/6066106.html

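First, the background to this problem. Suppose there is a table that stores trade orders. Each order has a unique ID, a last-updated time, and some data, as follows: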

+-------+----------+------+-----+---------------------+-------+
| Field | Type     | Null | Key | Default             | Extra |
+-------+----------+------+-----+---------------------+-------+
| UID   | int(11)  | NO   | PRI | 0                   |       |
| Time  | datetime | NO   |     | 0000-00-00 00:00:00 |       |
| Data  | int(11)  | YES  |     | NULL                |       |
+-------+----------+------+-----+---------------------+-------+

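Rows in this table are both appended and updated: if an order does not exist, INSERT a new row; if it already exists, UPDATE it. Because there is no way to know before writing whether the row already exists, the usual approaches come down to these:

1. SELECT first, then decide whether to INSERT or UPDATE;

2. UPDATE directly, and INSERT if the affected-rows count is 0;

3. INSERT directly, and UPDATE if a primary-key conflict occurs.

All of these approaches have drawbacks. For MySQL, the best option is to use the INSERT ... ON DUPLICATE KEY UPDATE ... statement directly. For the test table above, the statement is: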

INSERT INTO test VALUES (1, '2016-1-1', 10) ON DUPLICATE KEY UPDATE Time='2016-1-1',Data=10;

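This inserts or updates the data in a single statement, and it worked well up to this point.

Later, because the query pattern changed, UID and Time had to form a composite primary key. The table structure then became: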

+-------+----------+------+-----+---------------------+-------+
| Field | Type     | Null | Key | Default             | Extra |
+-------+----------+------+-----+---------------------+-------+
| UID   | int(11)  | NO   | PRI | 0                   |       |
| Time  | datetime | NO   | PRI | 0000-00-00 00:00:00 |       |
| Data  | int(11)  | YES  |     | NULL                |       |
+-------+----------+------+-----+---------------------+-------+

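But then a problem appeared: once the Time value differs, the database treats the row as having a different primary key even when the UID is the same, so no primary-key conflict occurs and the statement above stops working as an upsert. Many rows with the same UID appeared in the database.

The fix turned out to be simple. According to the MySQL documentation, ON DUPLICATE KEY UPDATE detects conflicts through the primary key or a unique index, so creating a unique index on UID is enough. First create the index: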

CREATE UNIQUE INDEX IDX_UID ON test(UID);

Then test the inserts:

INSERT INTO test VALUES (1, '2016-1-1', 10) ON DUPLICATE KEY UPDATE Time='2016-1-1',Data=10;

INSERT INTO test VALUES (1, '2016-2-1', 20) ON DUPLICATE KEY UPDATE Time='2016-2-1',Data=20;

Checking the database shows that no extra rows were created; the single row has its Data column updated to 20. Success.
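The behavior described above can be reproduced with a small in-memory model. This is only a sketch in Python, not MySQL itself: the `upsert` and `run` helpers are hypothetical names, and the model assumes a row conflicts when it matches an existing row on all columns of any one unique key.

```python
# Toy in-memory model (not MySQL itself): a row conflicts when it matches an
# existing row on ALL columns of ANY unique key.
def upsert(rows, unique_keys, new_row, updates):
    """Insert new_row, or apply `updates` to the first conflicting row."""
    for row in rows:
        for key in unique_keys:
            if all(row[col] == new_row[col] for col in key):
                row.update(updates)
                return "updated"
    rows.append(dict(new_row))
    return "inserted"


def run(unique_keys):
    """Replay the two test statements against an empty table."""
    rows = []
    for time, data in [("2016-1-1", 10), ("2016-2-1", 20)]:
        new_row = {"UID": 1, "Time": time, "Data": data}
        upsert(rows, unique_keys, new_row, {"Time": time, "Data": data})
    return rows


# Composite primary key only: the second statement does not conflict.
composite_only = run([("UID", "Time")])
# Composite primary key plus a unique index on UID: the second statement updates.
with_uid_index = run([("UID", "Time"), ("UID",)])

print(len(composite_only))  # 2
print(with_uid_index)       # [{'UID': 1, 'Time': '2016-2-1', 'Data': 20}]
```

With only the composite key, the two statements leave two rows behind; adding the unique key on UID alone makes the second statement update the first row instead.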

http://boytnt.blog.51cto.com/966121/1736690/

LAST_INSERT_ID(), LAST_INSERT_ID(expr)

With no argument, LAST_INSERT_ID() returns a BIGINT UNSIGNED (64-bit) value representing the first automatically generated value successfully inserted for an AUTO_INCREMENT column as a result of the most recently executed INSERT statement. The value of LAST_INSERT_ID() remains unchanged if no rows are successfully inserted.

With an argument, LAST_INSERT_ID() returns an unsigned integer.

For example, after inserting a row that generates an AUTO_INCREMENT value, you can get the value like this:

mysql> SELECT LAST_INSERT_ID();

        ->

The currently executing statement does not affect the value of LAST_INSERT_ID(). Suppose that you generate an AUTO_INCREMENT value with one statement, and then refer to LAST_INSERT_ID() in a multiple-row INSERT statement that inserts rows into a table with its own AUTO_INCREMENT column. The value of LAST_INSERT_ID() will remain stable in the second statement; its value for the second and later rows is not affected by the earlier row insertions. (However, if you mix references to LAST_INSERT_ID() and LAST_INSERT_ID(expr), the effect is undefined.)

If the previous statement returned an error, the value of LAST_INSERT_ID() is undefined. For transactional tables, if the statement is rolled back due to an error, the value of LAST_INSERT_ID() is left undefined. For manual ROLLBACK, the value of LAST_INSERT_ID() is not restored to that before the transaction; it remains as it was at the point of the ROLLBACK.

http://dev.mysql.com/doc/refman/5.7/en/information-functions.html

14.2.5.3 INSERT ... ON DUPLICATE KEY UPDATE Syntax

If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, MySQL performs an UPDATE of the old row. For example, if column a is declared as UNIQUE and contains the value 1, the following two statements have similar effect:

INSERT INTO table (a,b,c) VALUES (1,2,3)

  ON DUPLICATE KEY UPDATE c=c+1;

UPDATE table SET c=c+1 WHERE a=1;

(The effects are not identical for an InnoDB table where a is an auto-increment column. With an auto-increment column, an INSERT statement increases the auto-increment value but UPDATE does not.)

The ON DUPLICATE KEY UPDATE clause can contain multiple column assignments, separated by commas.

With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.
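The 1 / 2 / 0 rule can be illustrated with a toy model. The sketch below is plain Python rather than a MySQL client call; `affected_rows` is a hypothetical helper, and it assumes a single unique column and the default client flags (no CLIENT_FOUND_ROWS).

```python
def affected_rows(table, key_col, new_row, updates):
    """Toy model of the per-row affected-rows value (default client flags)."""
    for row in table:
        if row[key_col] == new_row[key_col]:
            # Existing row: 2 if any assignment changes a value, else 0.
            changed = any(row[col] != val for col, val in updates.items())
            row.update(updates)
            return 2 if changed else 0
    # No conflict: the row is inserted as a new row.
    table.append(dict(new_row))
    return 1


table = []
first = affected_rows(table, "a", {"a": 1, "b": 2, "c": 3}, {"c": 30})   # new row
second = affected_rows(table, "a", {"a": 1, "b": 2, "c": 3}, {"c": 30})  # c: 3 -> 30
third = affected_rows(table, "a", {"a": 1, "b": 2, "c": 3}, {"c": 30})   # c already 30
print(first, second, third)  # 1 2 0
```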

If column b is also unique, the INSERT is equivalent to this UPDATE statement instead:

UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;

If a=1 OR b=2 matches several rows, only one row is updated. In general, you should try to avoid using an ON DUPLICATE KEY UPDATE clause on tables with multiple unique indexes.

You can use the VALUES(col_name) function in the UPDATE clause to refer to column values from the INSERT portion of the INSERT ... ON DUPLICATE KEY UPDATE statement. In other words, VALUES(col_name) in the ON DUPLICATE KEY UPDATE clause refers to the value of col_name that would be inserted, had no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The VALUES() function is meaningful only in INSERT ... UPDATE statements and returns NULL otherwise.
Example:

INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)

  ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

That statement is identical to the following two statements:

INSERT INTO table (a,b,c) VALUES (1,2,3)

  ON DUPLICATE KEY UPDATE c=3;

INSERT INTO table (a,b,c) VALUES (4,5,6)

  ON DUPLICATE KEY UPDATE c=9;
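To see why the multi-row statement reduces to c=3 and c=9, here is a small Python sketch of the same logic. It is a toy model under assumptions: `insert_on_duplicate` is a hypothetical helper, column a is the only unique column, and the `incoming` argument stands in for VALUES(col_name).

```python
def insert_on_duplicate(table, key_col, new_rows, update_fn):
    """Toy multi-row upsert: update_fn(existing, incoming) returns the
    assignments; `incoming` plays the role of VALUES(col_name)."""
    for incoming in new_rows:
        existing = next((r for r in table if r[key_col] == incoming[key_col]), None)
        if existing is None:
            table.append(dict(incoming))
        else:
            existing.update(update_fn(existing, incoming))


# Both keys already exist, so both incoming rows take the update path.
table = [{"a": 1, "b": 0, "c": 0}, {"a": 4, "b": 0, "c": 0}]
insert_on_duplicate(
    table,
    "a",
    [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}],
    lambda existing, incoming: {"c": incoming["a"] + incoming["b"]},
)
print([row["c"] for row in table])  # [3, 9]
```

Each conflicting row is updated using the values from its own incoming row, which is what makes VALUES() useful in multiple-row inserts.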

If a table contains an AUTO_INCREMENT column and INSERT ... ON DUPLICATE KEY UPDATE inserts or updates a row, the LAST_INSERT_ID() function returns the AUTO_INCREMENT value.

The DELAYED option is ignored when you use ON DUPLICATE KEY UPDATE.

Because the results of INSERT ... SELECT statements depend on the ordering of rows from the SELECT and this order cannot always be guaranteed, it is possible when logging INSERT ... SELECT ON DUPLICATE KEY UPDATE statements for the master and the slave to diverge. Thus, INSERT ... SELECT ON DUPLICATE KEY UPDATE statements are flagged as unsafe for statement-based replication. With this change, such statements produce a warning in the log when using statement-based mode and are logged using the row-based format when using MIXED mode. In addition, an INSERT ... ON DUPLICATE KEY UPDATE statement against a table having more than one unique or primary key is also marked as unsafe. (Bug #11765650, Bug #58637) See also Section 18.2.1.1, “Advantages and Disadvantages of Statement-Based and Row-Based Replication”.

http://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html

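INSERT INTO ... ON DUPLICATE KEY UPDATE and REPLACE INTO can both handle duplicate key values. What is the practical difference between them?
The precondition is that the table has a unique index or primary key.

1. REPLACE deletes the duplicate row first and then inserts. If the row has several columns and some are not given values in the insert, those columns end up empty (set to their default values) in the newly inserted row.
2. INSERT ... ON DUPLICATE KEY UPDATE performs an update when it finds a duplicate: the specified columns are updated on the existing row and the other columns keep their contents.

REPLACE therefore costs more than INSERT ... ON DUPLICATE KEY UPDATE, so in principle INSERT ... ON DUPLICATE KEY UPDATE should be preferred.

Partial test results:
Both statements reported 2 affected rows.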
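The difference between the two statements can be sketched with a toy model in Python. Everything here is illustrative: `replace_into`, `insert_or_update`, the `DEFAULTS` mapping, and the id/name/note columns are hypothetical, and the model assumes id is the only unique key.

```python
DEFAULTS = {"id": None, "name": None, "note": None}  # hypothetical column defaults


def replace_into(table, new_row):
    """Toy REPLACE: delete the conflicting row, then insert a fresh one."""
    table[:] = [r for r in table if r["id"] != new_row["id"]]
    table.append({**DEFAULTS, **new_row})


def insert_or_update(table, new_row, updates):
    """Toy INSERT ... ON DUPLICATE KEY UPDATE: update listed columns in place."""
    for row in table:
        if row["id"] == new_row["id"]:
            row.update(updates)
            return
    table.append({**DEFAULTS, **new_row})


t1 = [{"id": 1, "name": "old", "note": "keep me"}]
t2 = [{"id": 1, "name": "old", "note": "keep me"}]
replace_into(t1, {"id": 1, "name": "new"})
insert_or_update(t2, {"id": 1, "name": "new"}, {"name": "new"})
print(t1)  # [{'id': 1, 'name': 'new', 'note': None}]
print(t2)  # [{'id': 1, 'name': 'new', 'note': 'keep me'}]
```

In the model, the REPLACE path loses the note column because the old row is deleted, while the update path leaves it untouched.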

INSERT syntax

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]

       [INTO] tbl_name [(col_name,...)]

       VALUES ({expr | DEFAULT},...),(...),...

       [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]

       [INTO] tbl_name

       SET col_name={expr | DEFAULT}, ...

       [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

or:

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]

       [INTO] tbl_name [(col_name,...)]

       SELECT ...

       [ ON DUPLICATE KEY UPDATE col_name=expr, ... ]

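1. Using DELAYED

     Using delayed inserts

The DELAYED modifier applies to INSERT and REPLACE statements. When a DELAYED insert arrives, the server puts the row into a queue and immediately returns a status to the client, so the client can continue before the row has actually been inserted into the table. If readers are reading from the table, the queued rows are held until there are no readers. The server then begins inserting the rows in the delayed-row queue. While inserting, the server also checks whether new read requests have arrived and are waiting. If so, the delayed-row queue is suspended and the readers are allowed to proceed. When there are no readers left, the server resumes inserting the delayed rows. This continues until the queue is empty.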

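Points to note:

· INSERT DELAYED should be used only for INSERT statements that specify value lists. The server ignores DELAYED for INSERT DELAYED ... SELECT statements.

· The server ignores DELAYED for INSERT DELAYED ... ON DUPLICATE KEY UPDATE statements.

· Because the statement returns immediately, before the rows are inserted, you cannot use LAST_INSERT_ID() to get the AUTO_INCREMENT value that the statement might generate.

· DELAYED rows are not visible to SELECT statements until they have actually been inserted.

· DELAYED is ignored on slave replication servers, because DELAYED could cause the slave to have different data than the master.

Note that the queued rows are currently held only in memory until they are inserted into the table. This means that if you terminate mysqld forcibly (for example, with kill -9) or if mysqld dies unexpectedly, any queued rows that have not been written to disk are lost.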

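2. Using IGNORE

IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE behaves if there are duplicates on unique keys in the new table, or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back when a duplicate-key error occurs. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key, and the other conflicting rows are deleted. Incorrect values are adjusted to the closest acceptable value.

insert ignore into tb(...) value(...)

This way there is no need to check for existence first: the row is ignored if it already exists and added if it does not.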

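3. Using ON DUPLICATE KEY UPDATE

If you specify ON DUPLICATE KEY UPDATE and inserting the row would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed instead. For example, if column a is defined as UNIQUE and contains the value 1, the following two statements have the same effect:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)

       -> ON DUPLICATE KEY UPDATE c=c+1;

mysql> UPDATE table SET c=c+1 WHERE a=1;

The affected-rows value is 1 if the row is inserted as a new record and 2 if an existing record is updated.

Note: if column b is also unique, the INSERT is equivalent to this UPDATE statement instead:

mysql> UPDATE table SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;

If a=1 OR b=2 matches several rows, only one row is updated. In general, you should try to avoid using an ON DUPLICATE KEY clause on tables with multiple unique keys.

You can use the VALUES(col_name) function in the UPDATE clause to refer to column values from the INSERT portion of the INSERT ... UPDATE statement. In other words, VALUES(col_name) in the UPDATE clause refers to the value of col_name that would have been inserted had no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The VALUES() function is meaningful only in INSERT ... UPDATE statements and returns NULL otherwise.

Example:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)

       -> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

This statement has the same effect as the following two statements:

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)

       -> ON DUPLICATE KEY UPDATE c=3;

mysql> INSERT INTO table (a,b,c) VALUES (4,5,6)

       -> ON DUPLICATE KEY UPDATE c=9;

The DELAYED option is ignored when you use ON DUPLICATE KEY UPDATE.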

Summary: DELAYED acts as a fast insert for cases where timeliness matters little, improving insert performance.

http://yuninglovekefan.blog.sohu.com/263559230.html
