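I have spent the last 8 hours trying to import the output of 'mysqldump --compatible=postgresql' into PostgreSQL 8.4.9, and I have read at least 20 different threads, here and elsewhere, about this specific problem, but found no answer that actually works.
MySQL 5.1.52 data dump: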
mysqldump -u root -p --compatible=postgresql --no-create-info --no-create-db --default-character-set=utf8 --skip-lock-tables rt3 > foo
PostgreSQL 8.4.9 server as the destination
Loading the data with 'psql -U rt_user -f foo' reports the following (there are many of these; this is one example):
psql:foo:29: ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
According to the following, there are no NULL (0x00) characters in the input file.
database-dumps:rcf-temp1# sed 's/\x0/ /g' < foo > nonulls
database-dumps:rcf-temp1# sum foo nonulls
04730 2545610 foo
04730 2545610 nonulls
database-dumps:rcf-temp1# rm nonulls
Likewise, another check with Perl shows no NULLs:
database-dumps:rcf-temp1# perl -ne '/\000/ and print;' foo
database-dumps:rcf-temp1#
As the "HINT" in the error suggests, I have tried every way I could find to set 'client_encoding' to 'UTF8'. The setting takes effect, but it does nothing to solve my problem.
database-dumps:rcf-temp1# psql -U rt_user --variable=client_encoding=utf-8 -c "SHOW client_encoding;" rt3
client_encoding
-----------------
UTF8
(1 row)
database-dumps:rcf-temp1#
Perfect, yet:
database-dumps:rcf-temp1# psql -U rt_user -f foo --variable=client_encoding=utf-8 rt3
...
psql:foo:29: ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
...
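Barring the "According to Hoyle" correct answer, which would be fantastic to hear, and knowing that I really don't care about preserving any non-ASCII characters in this seldom-referenced data, what suggestions do you have?
Update: I get the same error at import time with an ASCII-only version of the same dump file. Truly mind-boggling: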
database-dumps:rcf-temp1# # convert any non-ASCII character to a space
database-dumps:rcf-temp1# perl -i.bk -pe 's/[^[:ascii:]]/ /g;' mysql5-dump.sql
database-dumps:rcf-temp1# sum mysql5-dump.sql mysql5-dump.sql.bk
41053 2545611 mysql5-dump.sql
50145 2545611 mysql5-dump.sql.bk
database-dumps:rcf-temp1# cmp mysql5-dump.sql mysql5-dump.sql.bk
mysql5-dump.sql mysql5-dump.sql.bk differ: byte 1304850, line 30
database-dumps:rcf-temp1# # GOOD!
database-dumps:rcf-temp1# psql -U postgres -f mysql5-dump.sql --variable=client_encoding=utf-8 rt3
...
INSERT 0 416
psql:mysql5-dump.sql:30: ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encod.
INSERT 0 455
INSERT 0 424
INSERT 0 483
INSERT 0 447
INSERT 0 503
psql:mysql5-dump.sql:36: ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encod.
INSERT 0 502
INSERT 0 507
INSERT 0 318
INSERT 0 284
psql:mysql5-dump.sql:41: ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encod.
INSERT 0 382
INSERT 0 419
INSERT 0 247
psql:mysql5-dump.sql:45: ERROR: invalid byte sequence for encoding "UTF8": 0x00
HINT: This error can also happen if the byte sequence does not match the encod.
INSERT 0 267
INSERT 0 348
^C
One of the tables in question is defined as:
Table "public.attachments"
Column | Type | Modifie
-----------------+-----------------------------+--------------------------------
id | integer | not null default nextval('atta)
transactionid | integer | not null
parent | integer | not null default 0
messageid | character varying(160) |
subject | character varying(255) |
filename | character varying(255) |
contenttype | character varying(80) |
contentencoding | character varying(80) |
content | text |
headers | text |
creator | integer | not null default 0
created | timestamp without time zone |
Indexes:
"attachments_pkey" PRIMARY KEY, btree (id)
"attachments1" btree (parent)
"attachments2" btree (transactionid)
"attachments3" btree (parent, transactionid)
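I am not at liberty to change the type of any part of the database schema. Doing so would likely break future upgrades of the software, among other things.
The likely problem column is 'content', of type 'text' (and perhaps similar columns in other tables). As I already know from previous research, PostgreSQL does not allow NULL (0x00) in 'text' values. However, see above, where neither sed nor Perl finds any NULL characters, and then further down, where I strip every non-ASCII character from the entire dump file and the import still chokes.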
head -29 foo | tail -1 | cat -v
might be useful.
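The sed and Perl checks in the question look only for a raw 0x00 byte. One possibility they do not cover, offered here as an assumption rather than a confirmed diagnosis, is that the dump spells a NUL inside a string literal as the two-character escape \0 (a backslash followed by the digit zero), which the server may turn into byte 0x00 when it parses the literal. The sketch below uses a hypothetical one-line file, sample.sql, standing in for foo, and shows how the two cases can be counted separately. The final sed step is a naive removal that suits only data you do not need to preserve exactly; it would also damage a literal that contains an escaped backslash followed by a zero.

```shell
# Hypothetical stand-in for the real dump: the literal contains the two
# characters backslash and zero, not a raw NUL byte.
cat > sample.sql <<'EOF'
INSERT INTO t VALUES (1,'abc\0def');
EOF

# Count raw NUL bytes: delete every byte that is not NUL, count the rest.
raw_nuls=$(tr -cd '\000' < sample.sql | wc -c | tr -d ' ')

# Count lines containing the literal two-character sequence backslash-zero.
escaped=$(grep -c -F '\0' sample.sql)

printf 'raw NUL bytes: %s, lines with escaped NUL: %s\n' "$raw_nuls" "$escaped"

# Naive cleanup (assumption: losing these sequences is acceptable).
sed 's/\\0//g' sample.sql > cleaned.sql
```

To try the same counts on the real dump, substitute foo for sample.sql; a zero raw count alongside a non-zero escaped count would be consistent with the symptoms described.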