MySQL ignores trailing spaces in a column with a UNIQUE index

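My project needs to store user input together with any spacing to the left and right of a word such as 'apple'. If the user types ' apple' or 'apple ', whether there is one space or several on the left or right of the word, I need to store it exactly that way.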

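This field has the UNIQUE attribute. Inserting the word with spaces on the left works fine, but when I try to insert the word with spaces on the right, all the spaces to the right of the word are trimmed.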

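So I was thinking of appending a special character to the right of the word, after the spaces. But I am hoping there is a better solution to this problem.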

CREATE TABLE strings (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  string varchar(255) COLLATE utf8_bin NOT NULL,
  created_ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  UNIQUE KEY string (string)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Solution:

The problem is that MySQL ignores trailing spaces when comparing strings. See
http://dev.mysql.com/doc/refman/5.7/en/char.html

All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces.

For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.

(This applies to 5.7; the behavior changed in 8.0, see below.)

The section on the LIKE operator gives an example of this behavior (and shows that LIKE does respect trailing spaces):

mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
|          1 |             0 |
+------------+---------------+
1 row in set (0.00 sec)

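Unfortunately, the UNIQUE index seems to use the standard string comparison to check whether such a value already exists, and therefore ignores trailing spaces.
This is independent of using VARCHAR or CHAR: in both cases the insert is rejected because the unique check fails. If there is a way to make the UNIQUE check use LIKE semantics, I do not know of it.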
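
To make the effect on a unique check concrete, here is a small Python model. It is only a sketch, not MySQL's implementation: the names `pad_space_key`, `no_pad_key` and `UniqueColumn` are invented for illustration, and "PAD SPACE" is modeled simply as stripping trailing spaces before comparing.

```python
def pad_space_key(value: str) -> str:
    # Assumption: PAD SPACE comparison is modeled as "ignore trailing spaces".
    return value.rstrip(" ")


def no_pad_key(value: str) -> str:
    # NO PAD comparison: trailing spaces are significant, compare the value as-is.
    return value


class UniqueColumn:
    """Toy stand-in for a column with a UNIQUE index (hypothetical helper)."""

    def __init__(self, key_func):
        self._key_func = key_func
        self._seen = set()
        self.rows = []

    def insert(self, value: str) -> bool:
        # Returns False where MySQL would raise a duplicate-key error.
        key = self._key_func(value)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.rows.append(value)
        return True


pad_space = UniqueColumn(pad_space_key)
no_pad = UniqueColumn(no_pad_key)

values = ("a", "a ", " a")
results = {
    "pad_space": [pad_space.insert(v) for v in values],
    "no_pad": [no_pad.insert(v) for v in values],
}
print(results)
```

In this model the PAD SPACE column accepts 'a' and ' a' but rejects 'a ', which matches what the question describes: leading spaces are fine, trailing spaces collide.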

What you can do is store the value as VARBINARY:

mysql> create table test_ws ( `value` varbinary(255) UNIQUE );
Query OK, 0 rows affected (0.13 sec)

mysql> insert into test_ws (`value`) VALUES ('a');
Query OK, 1 row affected (0.08 sec)

mysql> insert into test_ws (`value`) VALUES ('a ');
Query OK, 1 row affected (0.06 sec)

mysql> SELECT CONCAT( '(', value, ')' ) FROM test_ws;
+---------------------------+
| CONCAT( '(', value, ')' ) |
+---------------------------+
| (a)                       |
| (a )                      |
+---------------------------+
2 rows in set (0.00 sec)

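You had better not sort alphabetically on this column, because the sort will be done on byte values, which is not what users expect (most users, anyway).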
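
A quick way to see the difference, sketched in Python rather than SQL: sorting on the UTF-8 bytes puts every uppercase ASCII letter before every lowercase one. The sample words are made up, and `str.casefold` is only a rough stand-in for what a case-insensitive collation would do.

```python
words = ["apple", "Banana", "cherry"]

# Byte-wise order, as a binary column would sort: 'B' (0x42) sorts before 'a' (0x61).
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))

# Rough approximation of a case-insensitive, "alphabetical" order.
casefold_order = sorted(words, key=str.casefold)

print(byte_order)
print(casefold_order)
```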

The alternative is to patch MySQL and write your own collation of type NO PAD. Not sure anyone wants to do that, but if you do, please let me know ;)

Edit: in the meantime, according to https://dev.mysql.com/doc/refman/8.0/en/char.html, MySQL has collations of type NO PAD:

Most MySQL collations have a pad attribute of PAD SPACE. The exceptions are Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD.

https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html

Unicode collations based on UCA versions later than 4.0.0 include the version in the collation name. Thus, utf8mb4_unicode_520_ci is based on UCA 5.2.0 weight keys, whereas utf8mb4_0900_ai_ci is based on UCA 9.0.0 weight keys.

So, if you try:

  create table test_ws ( `value` varchar(255) UNIQUE )
    character set utf8mb4 collate utf8mb4_0900_ai_ci;

you can insert values both with and without trailing spaces.

You can list all available NO PAD collations with:

 show collation where Pad_attribute='NO PAD';