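How do I convert an entire MySQL database's character set and collation to UTF-8?
Use utf8mb4 rather than utf8: utf8 supports only the Basic Multilingual Plane, not the full Unicode range. utf8mb4 requires MySQL 5.5.3 or later.
If you switch to utf8mb4, you also need to switch the collation to utf8mb4_unicode_ci, utf8mb4_unicode_520_ci, or the latest available version.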
Answers:
Use the ALTER DATABASE and ALTER TABLE commands.
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Or, if you are still on MySQL 5.5.2 or older, which does not support 4-byte UTF-8, use utf8 instead of utf8mb4:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
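The CONVERT TO technique assumes the text was stored correctly in some other character set (latin1, for example) and was not mangled (for example, UTF-8 bytes crammed into a latin1 column without being converted to latin1).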
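Make a backup first!
Then set the default character set on the database. This does not convert existing tables; it only sets the default for newly created tables.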
ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;
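Then convert the character set on all existing tables and their columns. This assumes that your current data really is in the current character set. If a column is declared with one character set but the data is actually stored in another, check the MySQL manual for how to handle that case.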
ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
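utf8_general_ci is no longer recommended best practice. As of MySQL 5.5.3 you should use utf8mb4 rather than utf8. Both refer to the UTF-8 encoding, but the older utf8 has a MySQL-specific limitation: it stores at most three bytes per character, so it cannot represent characters above U+FFFF.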
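The size limit behind this advice can be checked without a database. The following is a minimal sketch, assuming a POSIX shell with printf, wc and tr; byte_len is a hypothetical helper name, not a MySQL tool. It counts the UTF-8 bytes of a character: anything that needs four bytes lies outside the Basic Multilingual Plane and therefore needs utf8mb4.

```shell
# byte_len is a hypothetical helper: prints the number of bytes in its argument.
byte_len() {
  printf '%s' "$1" | wc -c | tr -d '[:space:]'
}

e_acute=$(printf '\303\251')          # U+00E9, inside the Basic Multilingual Plane
emoji=$(printf '\360\237\230\200')    # U+1F600, outside the Basic Multilingual Plane

echo "U+00E9 takes $(byte_len "$e_acute") bytes"
echo "U+1F600 takes $(byte_len "$emoji") bytes"
```

The first character fits into the three-byte utf8; the second one does not.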
If you are comfortable with the command-line shell, you can do this very quickly. Just fill in "dbname" :D
DB="dbname"
(
echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names \
| xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
) \
| mysql "$DB"
DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"
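A variant of the same idea, following the utf8mb4 recommendation from the comments: a minimal sketch, assuming a POSIX shell. gen_alter is a hypothetical helper name, and the default charset and collation are assumptions you can override. It reads table names with `while read -r` instead of xargs, because xargs gives quotes and backslashes in its input a special meaning, and it only prints the statements so that they can be reviewed first.

```shell
# gen_alter is a hypothetical helper: reads table names on stdin (one per line)
# and prints the ALTER statements for the database named in $1.
gen_alter() {
  db=$1
  charset=${2:-utf8mb4}                 # assumed default, override with $2
  collation=${3:-utf8mb4_unicode_ci}    # assumed default, override with $3
  printf 'ALTER DATABASE `%s` CHARACTER SET %s COLLATE %s;\n' "$db" "$charset" "$collation"
  while IFS= read -r table; do
    [ -n "$table" ] || continue         # skip empty lines
    printf 'ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;\n' "$table" "$charset" "$collation"
  done
}

# Dry run with two made-up table names; nothing is sent to MySQL here.
printf 'users\norder items\n' | gen_alter dbname
```

Once the output looks right, the table list can come from the real server, for example: mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | gen_alter "$DB" | mysql "$DB"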
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DB="dbname"
DB="db_name"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql -uuser -ppassword -hhost "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'SET foreign_key_checks = 0; ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql -uuser -ppassword -hhost "$DB"
You can generate the SQL that updates all tables with the following query:
SELECT CONCAT("ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CHARACTER SET utf8 COLLATE utf8_general_ci; ",
"ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; ")
AS alter_sql
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database_name';
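Capture the output and run it.
Arnold Daniels' answer above is more elegant.
WHERE TABLE_SCHEMA=webdb_playground gave me an unknown column error, but WHERE TABLE_SCHEMA="webdb_playground" succeeded. Mentioning it in case anyone else runs into the same thing.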
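The unknown column error appears because an unquoted name is parsed as an identifier rather than as a string. When the query is built from a shell script, the schema name can be turned into a literal first. This is a minimal sketch, assuming a POSIX shell with sed and a name that contains no backslashes; sql_string is a hypothetical helper name.

```shell
# sql_string is a hypothetical helper: wraps its argument in single quotes and
# doubles any embedded single quote, producing a SQL string literal.
# Assumption: the value contains no backslashes.
sql_string() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Build the WHERE clause with the schema name as a literal, not a bare identifier.
printf 'WHERE TABLE_SCHEMA = %s;\n' "$(sql_string webdb_playground)"
```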
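Before proceeding, make sure you have taken a full database backup!
Step 1: Database-level changes
Identify the collation and character set of your database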
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM
information_schema.SCHEMATA S
WHERE schema_name = 'your_database_name'
AND
(DEFAULT_CHARACTER_SET_NAME != 'utf8'
OR
DEFAULT_COLLATION_NAME not like 'utf8%');
Fix the database collation
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Step 2: Table-level changes
Identify the database tables with the wrong character set or collation
SELECT CONCAT(
'ALTER TABLE ', table_name, ' CHARACTER SET utf8 COLLATE utf8_general_ci; ',
'ALTER TABLE ', table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; ')
FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
WHERE C.collation_name = T.table_collation
AND T.table_schema = 'your_database_name'
AND
(C.CHARACTER_SET_NAME != 'utf8'
OR
C.COLLATION_NAME not like 'utf8%')
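Adjust the collation and character set of the table columns
Capture the SQL output of the query above and run it (it looks like the following):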
ALTER TABLE rma CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_products CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_products CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_report_period CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_report_period CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_reservation CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_reservation CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_supplier_return CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_supplier_return_history CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_history CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE rma_supplier_return_product CHARACTER SET utf8 COLLATE utf8_general_ci;ALTER TABLE rma_supplier_return_product CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
See: https://confluence.atlassian.com/display/CONFKB/How+to+Fix+the+Collation+and+Character+Set+of+a+MySQL+Database
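Use HeidiSQL. It is free and a very good database tool.
In the Tools menu, open the "Bulk table editor",
select the complete database or pick the tables to convert,
and execute.
It takes only a few seconds to convert a complete database from latin1 to utf8.
Works like a charm :)
HeidiSQL connects with utf8 by default, so when you inspect the table data, special characters should now show up as characters (æøå) rather than as encoded sequences.
The real pitfall when moving from latin1 to utf8 is making sure PDO connects with the utf8 charset. If it does not, you will insert garbage into the utf8 tables and get question marks all over your web page, which makes you think the table data is not utf8...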
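For PDO the connection charset is normally set through the charset part of the DSN. The same pitfall exists for the mysql command-line client, where it can be avoided with an option file. This is a minimal sketch, assuming a POSIX shell with mktemp; write_client_cnf is a hypothetical helper name, and utf8mb4 is an assumed target charset.

```shell
# write_client_cnf is a hypothetical helper: writes an option file that makes
# the mysql command-line client connect with utf8mb4.
write_client_cnf() {
  cat > "$1" <<'EOF'
[mysql]
default-character-set = utf8mb4
EOF
}

cnf=$(mktemp)
write_client_cnf "$cnf"
cat "$cnf"
# Usage (not run here): mysql --defaults-extra-file="$cnf" dbname
```

Note that --defaults-extra-file has to be the first option on the mysql command line.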
Inspired by @sdfor's comment, here is a bash script that does the job:
#!/bin/bash
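printf "### Converting MySQL character set ###\n\n"
# The value read here is used in the COLLATE clauses below, so enter a utf8 collation name.
printf "Enter the utf8 collation you want to set (e.g. utf8_unicode_ci): "
read -r CHARSET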
# Get the MySQL username
printf "Enter mysql username: "
read -r USERNAME
# Get the MySQL password
printf "Enter mysql password for user %s:" "$USERNAME"
read -rs PASSWORD
DBLIST=( mydatabase1 mydatabase2 )
printf "\n"
for DB in "${DBLIST[@]}"
do
(
echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
mysql "$DB" -u"$USERNAME" -p"$PASSWORD" -e "SHOW TABLES" --batch --skip-column-names \
| xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE `'"$CHARSET"'`;'
) \
| mysql "$DB" -u"$USERNAME" -p"$PASSWORD"
echo "$DB database done..."
done
echo "### DONE ###"
exit
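If the data is actually in a different character set, consider this excerpt from http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html
If the column has a nonbinary data type (CHAR, VARCHAR, TEXT), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.
Here is an example: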
ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;
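Make sure to choose the right collation, or you may run into unique key conflicts. For example, Éleanore and Eleanore may be considered the same in some collations.
Aside:
I ran into a situation where some characters "broke" in emails even though they were stored as UTF-8 in the database. If you send emails that contain utf8 data, you may also want to send the emails themselves as UTF-8.
In PHPMailer, just update this line: public $CharSet = 'utf-8';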
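Such conflicts can be looked for before converting. This is a minimal sketch, assuming a POSIX shell, a target character set of utf8mb4, and data that is correctly encoded in its current character set; dup_check_sql is a hypothetical helper name, and t1 and c1 are the placeholder names from the example above. It only prints the query, which groups the values as the target collation would compare them.

```shell
# dup_check_sql is a hypothetical helper: prints a query that lists values of
# column $2 in table $1 that compare equal under collation $3.
dup_check_sql() {
  printf 'SELECT CONVERT(`%s` USING utf8mb4) COLLATE %s AS k, COUNT(*) AS n FROM `%s` GROUP BY k HAVING n > 1;\n' "$2" "$3" "$1"
}

dup_check_sql t1 c1 utf8mb4_unicode_ci
```

Any row returned by the generated query is a group of values that would collide on a unique key over that column.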
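For databases with a large number of tables, you can use the following simple PHP script to update the character set of the database and all of its tables: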
$conn = mysqli_connect($host, $username, $password, $database);
// mysqli_connect() returns false on failure, so check the return value itself.
if (!$conn) {
die("Connection failed: " . mysqli_connect_error());
}
$alter_database_charset_sql = "ALTER DATABASE ".$database." CHARACTER SET utf8 COLLATE utf8_unicode_ci";
mysqli_query($conn, $alter_database_charset_sql);
$show_tables_result = mysqli_query($conn, "SHOW TABLES");
$tables = mysqli_fetch_all($show_tables_result);
foreach ($tables as $index => $table) {
$alter_table_sql = "ALTER TABLE ".$table[0]." CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci";
$alter_table_result = mysqli_query($conn, $alter_table_sql);
echo "<pre>";
var_dump($alter_table_result);
echo "</pre>";
}
DELIMITER $$
CREATE PROCEDURE `databasename`.`update_char_set`()
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE t_sql VARCHAR(256);
DECLARE tableName VARCHAR(128);
DECLARE lists CURSOR FOR SELECT table_name FROM `information_schema`.`TABLES` WHERE table_schema = 'databasename';
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = 1;
OPEN lists;
FETCH lists INTO tableName;
REPEAT
SET @t_sql = CONCAT('ALTER TABLE ', tableName, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci');
PREPARE stmt FROM @t_sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
FETCH lists INTO tableName;
UNTIL done END REPEAT;
CLOSE lists;
END$$
DELIMITER ;
CALL databasename.update_char_set();
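The safest way is to first modify the columns to a binary type and then modify them back to their original type with the desired character set.
Each column type has its own binary counterpart: CHAR to BINARY, VARCHAR to VARBINARY, TINYTEXT to TINYBLOB, TEXT to BLOB, MEDIUMTEXT to MEDIUMBLOB, LONGTEXT to LONGBLOB.
For example: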
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARBINARY(140);
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] MODIFY [COLUMN_NAME] VARCHAR(140) CHARACTER SET utf8mb4;
I tried this on several latin1 tables and it preserved all the diacritics.
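The type pairing can be scripted when the statements are generated outside of SQL. This is a minimal sketch, assuming a POSIX shell and a single-byte source character set such as latin1, so that the declared length is also enough bytes; binary_type_for is a hypothetical helper name. It takes a type as it appears in information_schema.COLUMNS.COLUMN_TYPE.

```shell
# binary_type_for is a hypothetical helper: maps a MySQL text column type to
# its binary counterpart, keeping the declared length where there is one.
binary_type_for() {
  case "$1" in
    'varchar('*')') printf 'varbinary(%s\n' "${1#"varchar("}" ;;
    'char('*')')    printf 'binary(%s\n' "${1#"char("}" ;;
    tinytext)       echo tinyblob ;;
    text)           echo blob ;;
    mediumtext)     echo mediumblob ;;
    longtext)       echo longblob ;;
    *)              echo "unsupported type: $1" >&2; return 1 ;;
  esac
}

binary_type_for 'varchar(140)'
binary_type_for text
```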
You can generate these statements for all affected columns with this query:
SELECT
CONCAT('ALTER TABLE ', TABLE_SCHEMA,'.', TABLE_NAME,' MODIFY ', COLUMN_NAME,' VARBINARY(', CHARACTER_OCTET_LENGTH, ');'),
CONCAT('ALTER TABLE ', TABLE_SCHEMA,'.', TABLE_NAME,' MODIFY ', COLUMN_NAME,' ', COLUMN_TYPE,' CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
FROM information_schema.columns
WHERE TABLE_SCHEMA IN ('[TABLE_SCHEMA]')
AND COLUMN_TYPE LIKE 'varchar%'
AND (COLLATION_NAME IS NOT NULL AND COLLATION_NAME NOT LIKE 'utf%');
After doing this for all your columns, do the same for all your tables:
ALTER TABLE [TABLE_SCHEMA].[TABLE_NAME] CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
To generate this statement for all your tables, use the following query:
SELECT
CONCAT('ALTER TABLE ', TABLE_SCHEMA, '.', TABLE_NAME, ' CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;')
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_COLLATION NOT LIKE 'utf8%'
and TABLE_SCHEMA in ('[TABLE_SCHEMA]');
Now that you have modified all your columns and tables, do the same on the database:
ALTER DATABASE [DATA_BASE_NAME] CHARSET = utf8mb4 COLLATE = utf8mb4_general_ci;
The only solution that worked for me: http://docs.moodle.org/23/en/Converting_your_MySQL_database_to_UTF8
mysqldump -uusername -ppassword -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B dbname > dump.sql
cp dump.sql dump-fixed.sql
vim dump-fixed.sql
:%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/
:%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
:wq
mysql -uusername -ppassword < dump-fixed.sql
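The interactive vim step can also be done non-interactively. This is a minimal sketch, assuming a POSIX shell with sed; fix_dump_charset is a hypothetical helper name. It applies the same two substitutions as the vim commands to a dump read from standard input.

```shell
# fix_dump_charset is a hypothetical helper: reads a dump on stdin and rewrites
# the latin1 defaults, mirroring the two vim substitutions.
fix_dump_charset() {
  sed -e 's/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/' \
      -e 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/'
}

# Small demonstration on one line of a made-up dump.
echo ') ENGINE=InnoDB DEFAULT CHARSET=latin1;' | fix_dump_charset
```

With a real dump this would be run as: fix_dump_charset < dump.sql > dump-fixed.sql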
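ALTER TABLE table_name CHARSET = 'utf8';
This is a simple query that worked for me; change table_name as needed. Note that it only changes the table's default character set. Existing columns are not converted; that requires CONVERT TO CHARACTER SET.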
I am just complementing @Jasny's answer for others who, like @Brian, have views in their database.
If you get an error like this:
ERROR 1347 (HY000) at line 17: 'dbname.table_name' is not of type 'BASE TABLE'
it is because you probably have views and need to exclude them. But when you try to exclude them, MySQL returns two columns instead of one.
SHOW FULL TABLES WHERE Table_Type = 'BASE TABLE';
-- table_name1 BASE TABLE
-- table_name2 BASE TABLE
So we have to adapt Jasny's command with awk to extract only the first column, which contains the table name.
DB="dbname"
(
echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'
mysql "$DB" -e "SHOW FULL TABLES WHERE Table_Type = 'BASE TABLE'" --batch --skip-column-names \
| awk '{print $1 }' \
| xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;'
) \
| mysql "$DB"
DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW FULL TABLES WHERE Table_Type = 'BASE TABLE'" --batch --skip-column-names | awk '{print $1 }' | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"
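One caveat with the awk step: awk splits on any whitespace, so a table name that contains a space is cut short. With --batch the mysql client separates columns with tabs, so cut can take the first field instead. This is a minimal sketch, assuming a POSIX shell; first_column is a hypothetical helper name and the rows are made up.

```shell
# first_column is a hypothetical helper: keeps only the first tab-separated field.
first_column() {
  cut -f1
}

# Made-up rows in the shape that "SHOW FULL TABLES" prints with --batch.
printf 'order items\tBASE TABLE\nusers\tBASE TABLE\n' | first_column
```

In the pipeline above, first_column would take the place of the awk command.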
Use utf8_unicode_ci rather than utf8_general_ci.