Answers:
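In my guide, "How to support full Unicode in MySQL databases", these are the queries you can use to update the charset and collation of a database, table, or column:
For each database: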
ALTER DATABASE
database_name
CHARACTER SET = utf8mb4
COLLATE = utf8mb4_unicode_ci;
For each table:
ALTER TABLE
table_name
CONVERT TO CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
For each column:
ALTER TABLE
table_name
CHANGE column_name column_name
VARCHAR(191)
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
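(Don't blindly copy and paste this! The exact statement depends on the column type, maximum length, and other properties. The statement above is just an example for a VARCHAR column.)
Note, however, that you cannot fully automate the conversion from utf8 to utf8mb4. As described in step 4 of the guide mentioned above, you need to check the maximum length of columns and index keys, because the number you specify has a different meaning when utf8mb4 is used instead of utf8.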
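To see why the numbers change: utf8 reserves up to 3 bytes per character and utf8mb4 up to 4, so the same byte limit on an index key holds fewer characters. Here is a quick shell sketch of the arithmetic, assuming the 767-byte key limit that shows up in the error message further down. The helper name max_index_chars is made up for illustration.

```shell
# Maximum number of characters that fit in an index key of a given byte size.
# $1 = key limit in bytes (assumed here: 767)
# $2 = maximum bytes per character (3 for utf8, 4 for utf8mb4)
max_index_chars() {
    echo $(( $1 / $2 ))
}

max_index_chars 767 3   # utf8: prints 255
max_index_chars 767 4   # utf8mb4: prints 191
```

This is where the VARCHAR(191) in the column example comes from: a VARCHAR(255) column that was fully indexable under utf8 no longer fits in the same key under utf8mb4.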
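I have a solution that converts the database and tables by running a few commands. It also converts all columns of type varchar, text, tinytext, mediumtext, longtext, and char. You should back up your database first in case something goes wrong.
Copy the following code into a file called preAlterTables.sql: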
use information_schema;
SELECT concat("ALTER DATABASE `",table_schema,"` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;") as _sql
FROM `TABLES` where table_schema like "yourDbName" group by table_schema;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name,"` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;") as _sql
FROM `TABLES` where table_schema like "yourDbName" group by table_schema, table_name;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name, "` CHANGE `",column_name,"` `",column_name,"` ",data_type,"(",character_maximum_length,") CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci",IF(is_nullable="YES"," NULL"," NOT NULL"),";") as _sql
FROM `COLUMNS` where table_schema like "yourDbName" and data_type in ('varchar','char');
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name, "` CHANGE `",column_name,"` `",column_name,"` ",data_type," CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci",IF(is_nullable="YES"," NULL"," NOT NULL"),";") as _sql
FROM `COLUMNS` where table_schema like "yourDbName" and data_type in ('text','tinytext','mediumtext','longtext');
Replace all occurrences of "yourDbName" with the database you want to convert. Then run:
mysql -uroot < preAlterTables.sql | egrep '^ALTER' > alterTables.sql
This generates a new file, alterTables.sql, containing all the queries needed to convert the database. Run the following command to start the conversion:
mysql -uroot < alterTables.sql
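You can also adapt this to run across multiple databases by changing the condition on table_schema. For example, table_schema like "wiki_%" will convert all databases whose names start with wiki_. To convert all databases, replace the condition with table_type!='SYSTEM VIEW'.
Possible problems: I had some varchar(255) columns in MySQL keys. This causes an error: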
ERROR 1071 (42000) at line 2229: Specified key was too long; max key length is 767 bytes
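If that happens, you can simply shrink the column, for example to varchar(150), and rerun the command.
Please note: this answer converts the database to utf8mb4_unicode_ci rather than utf8mb4_bin, which is what the question asked for. But you can simply substitute it.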
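Swapping the collation can be done on the generated file rather than in the generator queries. This is a minimal sketch, assuming alterTables.sql was produced as described above. swap_collation is a hypothetical helper name, and it relies on collation names containing only letters, digits, and underscores.

```shell
# Rewrite one collation name into another in SQL read from stdin.
# $1 = collation to replace, $2 = replacement collation
swap_collation() {
    sed "s/$1/$2/g"
}

# Example on a single generated statement:
printf '%s\n' 'ALTER TABLE `db`.`t` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;' \
    | swap_collation utf8mb4_unicode_ci utf8mb4_bin
```

On the real file this would be something like swap_collation utf8mb4_unicode_ci utf8mb4_bin < alterTables.sql > alterTablesBin.sql, followed by a quick look at the result before running it.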
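I used the following shell script. It takes the database name as a parameter and converts all tables to another charset and collation (given by additional parameters, or by the defaults defined in the script).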
#!/bin/bash
# mycollate.sh <database> [<charset> <collation>]
# changes MySQL/MariaDB charset and collation for one database - all tables and
# all columns in all tables
DB="$1"
CHARSET="$2"
COLL="$3"
[ -n "$DB" ] || exit 1
[ -n "$CHARSET" ] || CHARSET="utf8mb4"
[ -n "$COLL" ] || COLL="utf8mb4_general_ci"
echo "$DB"
echo "ALTER DATABASE \`$DB\` CHARACTER SET $CHARSET COLLATE $COLL;" | mysql
# Quote variables and use read -r so unusual table names are passed through intact
echo "USE \`$DB\`; SHOW TABLES;" | mysql -s | (
    while read -r TABLE; do
        echo "$DB.$TABLE"
        echo "ALTER TABLE \`$TABLE\` CONVERT TO CHARACTER SET $CHARSET COLLATE $COLL;" | mysql "$DB"
    done
)
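Before letting a script like this touch a live server, it can help to print the statements it would run. Here is a small sketch of such a dry-run helper. quote_ident and alter_table_sql are names invented here, not part of the script above.

```shell
# Wrap an identifier in backticks, doubling any backtick inside it.
quote_ident() {
    printf '`%s`' "$(printf '%s' "$1" | sed 's/`/``/g')"
}

# Print (but do not execute) the conversion statement for one table.
# $1 = table name, $2 = charset, $3 = collation
alter_table_sql() {
    printf 'ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s;\n' \
        "$(quote_ident "$1")" "$2" "$3"
}

alter_table_sql users utf8mb4 utf8mb4_general_ci
```

Piping the printed statements into mysql only after reading them keeps the actual conversion a separate, deliberate step.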
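I ran into this situation; here is the approach I used to convert my database:
First, you need to edit my.cnf so that the default database connection (between the application and MySQL) is utf8mb4_unicode_ci compliant. Without this, characters such as emoji submitted by your application will not reach your tables with the right bytes/encoding (unless your application's DB connection parameters specify a utf8mb4 connection).
Instructions are given here.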
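For orientation, a commonly used set of my.cnf settings for this is sketched below as a shell helper that prints the fragment. Treat it as an assumption rather than the exact configuration the answer refers to. print_utf8mb4_cnf is a made-up name, and option availability varies by server version (character-set-client-handshake in particular is deprecated in recent MySQL 8.0 releases).

```shell
# Print a my.cnf fragment that makes utf8mb4 the default for clients and server.
# Review it and merge it into the existing sections of my.cnf by hand.
print_utf8mb4_cnf() {
    cat <<'EOF'
[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
EOF
}

print_utf8mb4_cnf
```

After restarting the server, SHOW VARIABLES LIKE 'character_set%' and SHOW VARIABLES LIKE 'collation%' should show whether the settings took effect.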
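Execute the following SQL (there is no need to prepare SQL to change individual columns; the ALTER TABLE statements will do that).
Before executing the code below, replace "DbName" with your actual database name.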
USE information_schema;
SELECT concat("ALTER DATABASE `",table_schema,
"` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;") as _sql
FROM `TABLES`
WHERE table_schema like "DbName"
GROUP BY table_schema;
SELECT concat("ALTER TABLE `",table_schema,"`.`",table_name,
"` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;") as _sql
FROM `TABLES`
WHERE table_schema like "DbName"
GROUP BY table_schema, table_name;
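Collect the output of the SQL above, save it in a .sql file, and execute it.
If you get an error like "#1071 - Specified key was too long; max key length is 1000 bytes." together with the name of the problematic table, it means the index key on some column of that table (which was supposed to be converted to a utf8mb4 string) would become too large. At up to 4 bytes per character, that varchar column has to be <= 250 characters for its index key to stay within 1000 bytes. Check the columns you have indexes on, and if one of them is a varchar > 250 (most likely 255), then:
Step 1: Check the data in that column to make sure the longest string in it is <= 250 characters.
Example query: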
select `id`,`username`, `email`,
length(`username`) as l1,
char_length(`username`) as l2,
length(`email`) as l3,
char_length(`email`) as l4
from jos_users
order by l4 Desc;
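Step 2: If the maximum character length of the data in the indexed column is <= 250, change the column length to 250. If that is not possible, remove the index on that column.
Step 3: Run the alter table query for that table again; the table should now convert to utf8mb4 successfully.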
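The columns to check can be narrowed down by asking information_schema which indexed string columns exceed the limit. The sketch below only prints the query. The helper name long_indexed_columns_sql and the choice to skip prefix indexes (sub_part IS NULL) are assumptions, and the database name is interpolated without escaping, so pass only trusted values.

```shell
# Print a query listing fully indexed char/varchar columns longer than a limit.
# $1 = database name, $2 = maximum allowed length in characters
long_indexed_columns_sql() {
    cat <<EOF
SELECT DISTINCT s.table_name, s.column_name, c.character_maximum_length
FROM information_schema.statistics AS s
JOIN information_schema.columns AS c
  ON c.table_schema = s.table_schema
 AND c.table_name = s.table_name
 AND c.column_name = s.column_name
WHERE s.table_schema = '$1'
  AND c.data_type IN ('char', 'varchar')
  AND c.character_maximum_length > $2
  AND s.sub_part IS NULL;
EOF
}

long_indexed_columns_sql DbName 250
```

Piping the output into the mysql client (for example long_indexed_columns_sql DbName 250 | mysql -uroot) should list the candidates for steps 1 and 2.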
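Cheers!
I wrote this guide: http://hanoian.com/content/index.php/24-automate-the-converting-a-mysql-database-character-set-to-utf8mb4
From my work, I found that altering the database and the tables alone is not enough. I had to go into each table and alter each text / mediumtext / varchar column as well.
Fortunately, I was able to write a script that inspects the metadata of MySQL databases, so it can loop through the tables and columns and alter them automatically.
Long indexes on MySQL 5.6:
You must have DBA / SUPER USER privileges to do the following. Set these database parameters:
innodb_large_prefix: ON innodb_file_format: Barracuda innodb_file_format_max: Barracuda
The answers to this question explain how to set the parameters above: https://stackoverflow.com/questions/35847015/mysql-change-innodb-large-prefix
Of course, my article also explains how to do it.
For MySQL 5.7 or later, innodb_large_prefix is ON by default, and innodb_file_format is Barracuda by default as well.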
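On MySQL 5.6 these parameters can also be set at runtime. The sketch below prints the statements, assuming an account with the SUPER privilege. large_prefix_sql is a made-up helper name. As far as I know, the larger index prefix only applies to tables using ROW_FORMAT=DYNAMIC or COMPRESSED, runtime settings do not survive a restart unless they are also added to my.cnf, and these variables were removed in MySQL 8.0, so treat this as specific to 5.6/5.7.

```shell
# Print the statements that enable large index prefixes on MySQL 5.6.
large_prefix_sql() {
    cat <<'EOF'
SET GLOBAL innodb_file_format = Barracuda;
SET GLOBAL innodb_file_format_max = Barracuda;
SET GLOBAL innodb_large_prefix = ON;
EOF
}

large_prefix_sql
```

The output can be reviewed and then piped into the mysql client, for example large_prefix_sql | mysql -uroot.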
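For anyone who might run into this problem, the best solution is to first modify the columns to a binary type, according to this table:
Then modify the column back to its original type with the desired charset.
For example: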
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] LONGBLOB;
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARCHAR(140) CHARACTER SET utf8mb4;
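I tried it on several latin1 tables and it preserved all the diacritics.
You can generate these statements for all affected columns with this query: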
SELECT
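-- VARBINARY requires an explicit length; the column's byte length is used here
CONCAT('ALTER TABLE ', TABLE_SCHEMA,'.', TABLE_NAME,' MODIFY ', COLUMN_NAME,' VARBINARY(', CHARACTER_OCTET_LENGTH, ');'),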
CONCAT('ALTER TABLE ', TABLE_SCHEMA,'.', TABLE_NAME,' MODIFY ', COLUMN_NAME,' ', COLUMN_TYPE,' CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
FROM information_schema.columns
WHERE TABLE_SCHEMA IN ('[TABLE_SCHEMA]')
AND COLUMN_TYPE LIKE 'varchar%'
AND (COLLATION_NAME IS NOT NULL AND COLLATION_NAME NOT LIKE 'utf%');
I made a script that does this more or less automatically:
<?php
/**
* Requires php >= 5.5
*
* Use this script to convert utf-8 data in utf-8 mysql tables stored via latin1 connection
* This is a PHP port from: https://gist.github.com/njvack/6113127
*
* BACKUP YOUR DATABASE BEFORE YOU RUN THIS SCRIPT!
*
* Once the script ran over your databases, change your database connection charset to utf8:
*
* $dsn = 'mysql:host=localhost;port=3306;charset=utf8';
*
* DON'T RUN THIS SCRIPT MORE THAN ONCE!
*
* @author hollodotme
*
* @author derclops since 2019-07-01
*
* I have taken the liberty to adapt this script to also do the following:
*
* - convert the database to utf8mb4
* - convert all tables to utf8mb4
* - actually then also convert the data to utf8mb4
*
*/
header('Content-Type: text/plain; charset=utf-8');
$dsn = 'mysql:host=localhost;port=3306;charset=utf8';
$user = 'root';
$password = 'root';
$options = [
\PDO::ATTR_CURSOR => \PDO::CURSOR_FWDONLY,
\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true,
\PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET latin1",
];
$dbManager = new \PDO( $dsn, $user, $password, $options );
$databasesToConvert = [ 'database1',/** database3, ... */ ];
$typesToConvert = [ 'char', 'varchar', 'tinytext', 'mediumtext', 'text', 'longtext' ];
foreach ( $databasesToConvert as $database )
{
    echo $database, ":\n";
    echo str_repeat( '=', strlen( $database ) + 1 ), "\n";
    $dbManager->exec( "USE `{$database}`" );
    echo "converting database to correct locale too ... \n";
    $dbManager->exec("ALTER DATABASE `{$database}` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci");
    $tablesStatement = $dbManager->query( "SHOW TABLES" );
    while ( ($table = $tablesStatement->fetchColumn()) )
    {
        echo "Table: {$table}:\n";
        echo str_repeat( '-', strlen( $table ) + 8 ), "\n";
        $columnsToConvert = [ ];
        $columsStatement = $dbManager->query( "DESCRIBE `{$table}`" );
        while ( ($tableInfo = $columsStatement->fetch( \PDO::FETCH_ASSOC )) )
        {
            $column = $tableInfo['Field'];
            echo ' * ' . $column . ': ' . $tableInfo['Type'];
            $type = preg_replace( "#\(\d+\)#", '', $tableInfo['Type'] );
            if ( in_array( $type, $typesToConvert ) )
            {
                echo " => must be converted\n";
                $columnsToConvert[] = $column;
            }
            else
            {
                echo " => not relevant\n";
            }
        }
        //convert table also!!!
        $convert = "ALTER TABLE `{$table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci";
        echo "\n", $convert, "\n";
        $dbManager->exec( $convert );
        $databaseErrors = $dbManager->errorInfo();
        if( !empty($databaseErrors[1]) ){
            echo "\n !!!!!!!!!!!!!!!!! ERROR OCCURED ".print_r($databaseErrors, true)." \n";
            exit;
        }
        if ( !empty($columnsToConvert) )
        {
            $converts = array_map(
                function ( $column )
                {
                    //return "`{$column}` = IFNULL(CONVERT(CAST(CONVERT(`{$column}` USING latin1) AS binary) USING utf8mb4),`{$column}`)";
                    return "`{$column}` = CONVERT(BINARY(CONVERT(`{$column}` USING latin1)) USING utf8mb4)";
                },
                $columnsToConvert
            );
            $query = "UPDATE IGNORE `{$table}` SET " . join( ', ', $converts );
            //alternative
            // UPDATE feedback SET reply = CONVERT(BINARY(CONVERT(reply USING latin1)) USING utf8mb4) WHERE feedback_id = 15015;
            echo "\n", $query, "\n";
            $dbManager->exec( $query );
            $databaseErrors = $dbManager->errorInfo();
            if( !empty($databaseErrors[1]) ){
                echo "\n !!!!!!!!!!!!!!!!! ERROR OCCURED ".print_r($databaseErrors, true)." \n";
                exit;
            }
        }
        echo "\n--\n";
    }
    echo "\n";
}
mysql -uroot -pThatrootPassWord < alterTables.sql
works fine. As you already pointed out, utf8mb4_bin is one of Nextcloud's recommendations.