Using JavaScript's atob to decode base64 doesn't properly decode UTF-8 strings

106

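I'm using the JavaScript window.atob() function to decode a base64-encoded string (specifically the base64-encoded content from the GitHub API). The problem is that I'm getting ASCII-encoded characters back (such as â¢ instead of ™). How can I properly handle the incoming base64-encoded stream so that it is decoded as UTF-8?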

javascript encoding utf-8

— brandonscript

3

The MDN page you linked to has a paragraph that begins with "For use with Unicode or UTF-8 strings".

— Pointy, 2015

1

Are you on Node? There are better solutions there than atob.

— Bergi, 2015

268

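There's a great article in Mozilla's MDN docs that describes exactly this issue:

The "Unicode Problem": Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if a character exceeds the range of an 8-bit byte (0x00~0xFF). There are two possible methods to solve this problem:

the first one is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it;

the second one is to convert the UTF-16 DOMString to a UTF-8 array of characters and then encode it.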
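
As a rough illustration of the second approach, the sketch below converts the string to UTF-8 bytes with TextEncoder before calling btoa, and reverses the steps with TextDecoder. The helper names toBase64Utf8 and fromBase64Utf8 are made up for this example and do not come from the MDN article; it assumes an environment where TextEncoder, TextDecoder, atob and btoa are globals (current browsers, recent Node.js).

```javascript
// Hypothetical helpers for illustration; not part of the MDN article.
function toBase64Utf8(str) {
    // UTF-16 string -> UTF-8 bytes
    const bytes = new TextEncoder().encode(str);
    // bytes -> "binary string" (one character per byte), which btoa accepts
    let binary = '';
    for (const b of bytes) {
        binary += String.fromCharCode(b);
    }
    return btoa(binary);
}

function fromBase64Utf8(b64) {
    // base64 -> binary string -> bytes -> decoded as UTF-8
    const bytes = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
    return new TextDecoder().decode(bytes);
}

// atob alone returns one character per byte, which is why "™" turns into mojibake
console.log(atob('4oSi').length);     // 3 (the bytes 0xE2 0x84 0xA2)
console.log(fromBase64Utf8('4oSi'));  // "™"
console.log(toBase64Utf8('™'));       // "4oSi"
```

The loop is used instead of spreading the whole array into String.fromCharCode so that very large inputs do not run into argument-count limits.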

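A note on previous solutions: the MDN article originally suggested using unescape and escape to solve the Character Out Of Range exception problem, but they have since been deprecated. Some other answers here suggest working around this with decodeURIComponent and encodeURIComponent alone; this has proven to be unreliable and unpredictable. The most recent update to this answer uses modern JavaScript functions to improve speed and modernize the code.

If you'd like to save yourself some time, you could also consider using a library:

js-base64 (NPM, great for Node.js)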
base64-js

Encoding UTF8 ⇢ base64

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

Decoding base64 ⇢ UTF8

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"

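The pre-2018 solution (functional, and probably better supported in older browsers, but not up to date)

This was the recommendation at the time, taken directly from MDN, with some additional TypeScript compatibility via @MA-Maddin: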

// Encoding UTF8 ⇢ base64

function b64EncodeUnicode(str) {
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode(parseInt(p1, 16))
    }))
}

b64EncodeUnicode('✓ à la mode') // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n') // "Cg=="

// Decoding base64 ⇢ UTF8

function b64DecodeUnicode(str) {
    return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
    }).join(''))
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU=') // "✓ à la mode"
b64DecodeUnicode('Cg==') // "\n"

The original solution (deprecated)

This used escape and unescape (which are now deprecated, though they still work in all modern browsers):

function utf8_to_b64( str ) {
    return window.btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
    return decodeURIComponent(escape(window.atob( str )));
}

// Usage:
utf8_to_b64('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64_to_utf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"

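One last thing: I first ran into this problem when calling the GitHub API. To get this to work properly on (Mobile) Safari, I actually had to strip all whitespace from the base64 source before I could even decode it. Whether this is still relevant in 2017, I don't know: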

function b64_to_utf8( str ) {
    str = str.replace(/\s/g, '');    
    return decodeURIComponent(escape(window.atob( str )));
}
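
For context, the GitHub contents API typically returns the base64 payload with line breaks in it, so stripping whitespace before decoding is a cheap safeguard. Here is a minimal sketch of the same idea without escape, assuming TextDecoder is available; decodeGitHubContent is a hypothetical name, and the sample string is just the test value used above with newlines inserted:

```javascript
// Hypothetical helper: strips whitespace, then decodes the bytes as UTF-8.
function decodeGitHubContent(content) {
    const clean = content.replace(/\s/g, '');
    const bytes = Uint8Array.from(atob(clean), c => c.charCodeAt(0));
    // fatal: true throws a TypeError on malformed UTF-8 instead of inserting U+FFFD
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

console.log(decodeGitHubContent('4pyTIMOgIGxh\nIG1vZGU=\n')); // "✓ à la mode"
```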

— brandonscript

1

From w3schools.com/jsref/jsref_unescape.asp: "The unescape() function was deprecated in JavaScript version 1.5. Use decodeURI() or decodeURIComponent() instead."

— Tedd Hansen

1

You saved my day, bro.

— Mr Neo

2

Update: Solution #1 from MDN's "The Unicode Problem" has been fixed; b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); now correctly outputs "✓ à la mode".

2

Another way to decode: decodeURIComponent(atob('4pyTIMOgIGxhIG1vZGU=').split('').map(x => '%' + x.charCodeAt(0).toString(16)).join('')). Not the most performant code, but it shows the essence.

— daniel.gindi

2

Use return String.fromCharCode(parseInt(p1, 16)); for TypeScript compatibility.

— Martin Schneider

20

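Things change. The escape/unescape methods have been deprecated.

You can URI-encode the string before you Base64-encode it. Note that this does not produce Base64-encoded UTF-8, but rather Base64-encoded URL-encoded data. Both sides must agree on the same encoding.

See a working example here: http://codepen.io/anon/pen/PZgbPW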

// encode string
var base64 = window.btoa(encodeURIComponent('€ 你好 æøåÆØÅ'));
// decode string
var str = decodeURIComponent(window.atob(base64));
// str is now === '€ 你好 æøåÆØÅ'
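
To make the caveat about agreeing on an encoding concrete, the following sketch compares the output of this approach with base64 of the raw UTF-8 bytes. It assumes TextEncoder is available as a global; the variable names are only for illustration.

```javascript
const text = '€';

// Approach from this answer: base64 of the percent-encoded text
const uriThenB64 = btoa(encodeURIComponent(text));   // base64 of "%E2%82%AC"

// Base64 of the raw UTF-8 bytes, which is what most other tools expect
const utf8Bytes = new TextEncoder().encode(text);
const utf8B64 = btoa(String.fromCharCode(...utf8Bytes));

console.log(uriThenB64); // "JUUyJTgyJUFD"
console.log(utf8B64);    // "4oKs"

// A plain base64 decoder on the receiving side only recovers the percent-encoded form
console.log(atob(uriThenB64)); // "%E2%82%AC"
console.log(decodeURIComponent(atob(uriThenB64)) === text); // true
```

The two base64 strings differ, so a receiver that expects ordinary base64-encoded UTF-8 will not get the original text back unless it also applies decodeURIComponent.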

For the OP's problem, a third-party library such as js-base64 should solve it.

— Tedd Hansen

1

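I'd like to point out that you're not producing the base64 of the input string, but the base64 of its URI-encoded form. So if you send it off, the other party cannot decode it as "base64" and get the original string.

— Riccardo Galli, 2017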

3

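You are correct; I have updated the text to point that out. Thanks. The alternatives seem to be implementing base64 yourself, using a third-party library (such as js-base64), or receiving "Error: Failed to execute 'btoa' on 'Window': The string to be encoded contains characters outside of the Latin1 range."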

— Tedd Hansen

14

If treating strings as bytes is more your thing, you can use the following functions:

function u_atob(ascii) {
    return Uint8Array.from(atob(ascii), c => c.charCodeAt(0));
}

function u_btoa(buffer) {
    var binary = [];
    var bytes = new Uint8Array(buffer);
    for (var i = 0, il = bytes.byteLength; i < il; i++) {
        binary.push(String.fromCharCode(bytes[i]));
    }
    return btoa(binary.join(''));
}


// example, it works also with astral plane characters such as '𝒞'
var encodedString = new TextEncoder().encode('✓');
var base64String = u_btoa(encodedString);
console.log('✓' === new TextDecoder().decode(u_atob(base64String)))

— Riccardo Galli

1

Thanks. Your answer was crucial in helping me get this working, which took me several days. +1. stackoverflow.com/a/51814273/470749

— Ryan

For a faster and more cross-browser solution (with essentially the same output), see stackoverflow.com/a/53433503/5601591

— Jack Giffin

u_atob and u_btoa use functions available in every browser since IE10 (2012), which looks solid to me (if you're referring to TextEncoder, that's only used in the example).

— Riccardo Galli

4

Here is the 2018 updated solution as described in the Mozilla development resources.

To encode from Unicode to B64

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);
    }));
}

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

To decode from B64 to Unicode

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"

— Manuel G

3

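I assume that people may want a solution that produces a widely usable base64 URI. Visit data:text/plain;charset=utf-8;base64,4pi44pi54pi64pi74pi84pi+4pi/ to see a demonstration (copy the data URI, open a new tab, paste the data URI into the address bar, then press Enter to go to the page). Despite the fact that this URI is base64-encoded, the browser is still able to recognize the high code points and decode them properly. The minified encoder + decoder is 1058 bytes (+Gzip → 589 bytes).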

!function(e){"use strict";function h(b){var a=b.charCodeAt(0);if(55296<=a&&56319>=a)if(b=b.charCodeAt(1),b===b&&56320<=b&&57343>=b){if(a=1024*(a-55296)+b-56320+65536,65535<a)return d(240|a>>>18,128|a>>>12&63,128|a>>>6&63,128|a&63)}else return d(239,191,189);return 127>=a?inputString:2047>=a?d(192|a>>>6,128|a&63):d(224|a>>>12,128|a>>>6&63,128|a&63)}function k(b){var a=b.charCodeAt(0)<<24,f=l(~a),c=0,e=b.length,g="";if(5>f&&e>=f){a=a<<f>>>24+f;for(c=1;c<f;++c)a=a<<6|b.charCodeAt(c)&63;65535>=a?g+=d(a):1114111>=a?(a-=65536,g+=d((a>>10)+55296,(a&1023)+56320)):c=0}for(;c<e;++c)g+="\ufffd";return g}var m=Math.log,n=Math.LN2,l=Math.clz32||function(b){return 31-m(b>>>0)/n|0},d=String.fromCharCode,p=atob,q=btoa;e.btoaUTF8=function(b,a){return q((a?"\u00ef\u00bb\u00bf":"")+b.replace(/[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g,h))};e.atobUTF8=function(b,a){a||"\u00ef\u00bb\u00bf"!==b.substring(0,3)||(b=b.substring(3));return p(b).replace(/[\xc0-\xff][\x80-\xbf]*/g,k)}}(""+void 0==typeof global?""+void 0==typeof self?this:self:global)

Below is the source code used to generate it.

var fromCharCode = String.fromCharCode;
var btoaUTF8 = (function(btoa, replacer){"use strict";
    return function(inputString, BOMit){
        return btoa((BOMit ? "\xEF\xBB\xBF" : "") + inputString.replace(
            /[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g, replacer
        ));
    }
})(btoa, function(nonAsciiChars){"use strict";
    // make the UTF string into a binary UTF-8 encoded string
    var point = nonAsciiChars.charCodeAt(0);
    if (point >= 0xD800 && point <= 0xDBFF) {
        var nextcode = nonAsciiChars.charCodeAt(1);
        if (nextcode !== nextcode) // NaN because string is 1 code point long
            return fromCharCode(0xef/*11101111*/, 0xbf/*10111111*/, 0xbd/*10111101*/);
        // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
        if (nextcode >= 0xDC00 && nextcode <= 0xDFFF) {
            point = (point - 0xD800) * 0x400 + nextcode - 0xDC00 + 0x10000;
            if (point > 0xffff)
                return fromCharCode(
                    (0x1e/*0b11110*/<<3) | (point>>>18),
                    (0x2/*0b10*/<<6) | ((point>>>12)&0x3f/*0b00111111*/),
                    (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
                    (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
                );
        } else return fromCharCode(0xef, 0xbf, 0xbd);
    }
    if (point <= 0x007f) return nonAsciiChars;
    else if (point <= 0x07ff) {
        return fromCharCode((0x6<<5)|(point>>>6), (0x2<<6)|(point&0x3f));
    } else return fromCharCode(
        (0xe/*0b1110*/<<4) | (point>>>12),
        (0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
        (0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    );
});

Then, to decode the base64 data, either fetch the data over HTTP as a data URI or use the function below.

var clz32 = Math.clz32 || (function(log, LN2){"use strict";
    return function(x) {return 31 - log(x >>> 0) / LN2 | 0};
})(Math.log, Math.LN2);
var fromCharCode = String.fromCharCode;
var atobUTF8 = (function(atob, replacer){"use strict";
    return function(inputString, keepBOM){
        inputString = atob(inputString);
        if (!keepBOM && inputString.substring(0,3) === "\xEF\xBB\xBF")
            inputString = inputString.substring(3); // eradicate UTF-8 BOM
        // 0xc0 => 0b11000000; 0xff => 0b11111111; 0xc0-0xff => 0b11xxxxxx
        // 0x80 => 0b10000000; 0xbf => 0b10111111; 0x80-0xbf => 0b10xxxxxx
        return inputString.replace(/[\xc0-\xff][\x80-\xbf]*/g, replacer);
    }
})(atob, function(encoded){"use strict";
    var codePoint = encoded.charCodeAt(0) << 24;
    var leadingOnes = clz32(~codePoint);
    var endPos = 0, stringLen = encoded.length;
    var result = "";
    if (leadingOnes < 5 && stringLen >= leadingOnes) {
        codePoint = (codePoint<<leadingOnes)>>>(24+leadingOnes);
        for (endPos = 1; endPos < leadingOnes; ++endPos)
            codePoint = (codePoint<<6) | (encoded.charCodeAt(endPos)&0x3f/*0b00111111*/);
        if (codePoint <= 0xFFFF) { // BMP code point
          result += fromCharCode(codePoint);
        } else if (codePoint <= 0x10FFFF) {
          // https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
          codePoint -= 0x10000;
          result += fromCharCode(
            (codePoint >> 10) + 0xD800,  // highSurrogate
            (codePoint & 0x3ff) + 0xDC00 // lowSurrogate
          );
        } else endPos = 0; // to fill it in with INVALIDs
    }
    for (; endPos < stringLen; ++endPos) result += "\ufffd"; // replacement character
    return result;
});

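The advantage of being more standard is that this encoder and decoder are more widely applicable, because their output can be used in a valid URL that displays correctly. Observe.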

(function(window){
    "use strict";
    var sourceEle = document.getElementById("source");
    var urlBarEle = document.getElementById("urlBar");
    var mainFrameEle = document.getElementById("mainframe");
    var gotoButton = document.getElementById("gotoButton");
    var parseInt = window.parseInt;
    var fromCodePoint = String.fromCodePoint;
    var parse = JSON.parse;
    
    function unescape(str){
        return str.replace(/\\u[\da-f]{0,4}|\\x[\da-f]{0,2}|\\u{[^}]*}|\\[bfnrtv"'\\]|\\0[0-7]{1,3}|\\\d{1,3}/g, function(match){
          try{
            if (match.startsWith("\\u{"))
              return fromCodePoint(parseInt(match.slice(2,-1),16));
            if (match.startsWith("\\u") || match.startsWith("\\x"))
              return fromCodePoint(parseInt(match.substring(2),16));
            if (match.startsWith("\\0") && match.length > 2)
              return fromCodePoint(parseInt(match.substring(2),8));
            if (/^\\\d/.test(match)) return fromCodePoint(+match.slice(1));
          }catch(e){return "\ufffd".repeat(match.length)}
          return parse('"' + match + '"');
        });
    }
    
    function whenChange(){
      try{ urlBarEle.value = "data:text/plain;charset=UTF-8;base64," + btoaUTF8(unescape(sourceEle.value), true);
      } finally{ gotoURL(); }
    }
    sourceEle.addEventListener("change",whenChange,{passive:1});
    sourceEle.addEventListener("input",whenChange,{passive:1});
    
    // IFrame Setup:
    function gotoURL(){mainFrameEle.src = urlBarEle.value}
    gotoButton.addEventListener("click", gotoURL, {passive: 1});
    function urlChanged(){urlBarEle.value = mainFrameEle.src}
    mainFrameEle.addEventListener("load", urlChanged, {passive: 1});
    urlBarEle.addEventListener("keypress", function(evt){
      if (evt.key === "Enter") evt.preventDefault(), urlChanged();
    }, {passive: 1});
    
        
    var fromCharCode = String.fromCharCode;
    var btoaUTF8 = (function(btoa, replacer){
		    "use strict";
        return function(inputString, BOMit){
        	return btoa((BOMit?"\xEF\xBB\xBF":"") + inputString.replace(
        		/[\x80-\uD7ff\uDC00-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]?/g, replacer
    		));
    	}
    })(btoa, function(nonAsciiChars){
		"use strict";
    	// make the UTF string into a binary UTF-8 encoded string
    	var point = nonAsciiChars.charCodeAt(0);
    	if (point >= 0xD800 && point <= 0xDBFF) {
    		var nextcode = nonAsciiChars.charCodeAt(1);
    		if (nextcode !== nextcode) { // NaN because string is 1code point long
    			return fromCharCode(0xef/*11101111*/, 0xbf/*10111111*/, 0xbd/*10111101*/);
    		}
    		// https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae
    		if (nextcode >= 0xDC00 && nextcode <= 0xDFFF) {
    			point = (point - 0xD800) * 0x400 + nextcode - 0xDC00 + 0x10000;
    			if (point > 0xffff) {
    				return fromCharCode(
    					(0x1e/*0b11110*/<<3) | (point>>>18),
    					(0x2/*0b10*/<<6) | ((point>>>12)&0x3f/*0b00111111*/),
    					(0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
    					(0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    				);
    			}
    		} else {
    			return fromCharCode(0xef, 0xbf, 0xbd);
    		}
    	}
    	if (point <= 0x007f) { return nonAsciiChars; }
    	else if (point <= 0x07ff) {
    		return fromCharCode((0x6<<5)|(point>>>6), (0x2<<6)|(point&0x3f/*00111111*/));
    	} else {
    		return fromCharCode(
    			(0xe/*0b1110*/<<4) | (point>>>12),
    			(0x2/*0b10*/<<6) | ((point>>>6)&0x3f/*0b00111111*/),
    			(0x2/*0b10*/<<6) | (point&0x3f/*0b00111111*/)
    		);
    	}
    });
    setTimeout(whenChange, 0);
})(window);

img:active{opacity:0.8}

<center>
<textarea id="source" style="width:66.7vw">Hello \u1234 W\186\0256ld!
Enter text into the top box. Then the URL will update automatically.
</textarea><br />
<div style="width:66.7vw;display:inline-block;height:calc(25vw + 1em + 6px);border:2px solid;text-align:left;line-height:1em">
<input id="urlBar" style="width:calc(100% - 1em - 13px)" /><img id="gotoButton" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABsAAAAeCAMAAADqx5XUAAAAclBMVEX///9NczZ8e32ko6fDxsU/fBoSQgdFtwA5pAHVxt+7vLzq5ex23y4SXABLiiTm0+/c2N6DhoQ6WSxSyweVlZVvdG/Uz9aF5kYlbwElkwAggACxs7Jl3hX07/cQbQCar5SU9lRntEWGum+C9zIDHwCGnH5IvZAOAAABmUlEQVQoz7WS25acIBBFkRLkIgKKtOCttbv//xdDmTGZzHv2S63ltuBQQP4rdRiRUP8UK4wh6nVddQwj/NtDQTvac8577zTQb72zj65/876qqt7wykU6/1U6vFEgjE1mt/5LRqrpu7oVsn0sjZejMfxR3W/yLikqAFcUx93YxLmZGOtElmEu6Ufd9xV3ZDTGcEvGLbMk0mHHlUSvS5svCwS+hVL8loQQyfpI1Ay8RF/xlNxcsTchGjGDIuBG3Ik7TMyNxn8m0TSnBAK6Z8UZfp3IbAonmJvmsEACum6aNv7B0CnvpezDcNhw9XWsuAr7qnRg6dABmeM4dTgn/DZdXWs3LMspZ1KDMt1kcPJ6S1icWNp2qaEmjq6myx7jbQK3VKItLJaW5FR+cuYlRhYNKzGa9vF4vM5roLW3OSVjkmiGJrPhUq301/16pVKZRGFYWjTP50spTxBN5Z4EKnSonruk+n4tUokv1aJSEl/MLZU90S3L6/U6o0J142iQVp3HcZxKSo8LfkNRCtJaKYFSRX7iaoAAUDty8wvWYR6HJEepdwAAAABJRU5ErkJggg==" style="width:calc(1em + 4px);line-height:1em;vertical-align:-40%;cursor:pointer" />
<iframe id="mainframe" style="width:66.7vw;height:25vw" frameBorder="0"></iframe>
</div>
</center>


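In addition to being very standardized, the code snippets above are also very fast. Instead of an indirect chain in which the data has to be converted several times between various forms (as in Riccardo Galli's answer), the snippets above are as direct as possible. They use only one simple, fast String.prototype.replace call to process the data when encoding, and only one to process it when decoding. Another plus (especially for big strings) is that String.prototype.replace lets the browser handle the underlying memory management of resizing the string automatically, which gives a significant performance boost, especially in evergreen browsers such as Chrome and Firefox that heavily optimize String.prototype.replace. Finally, the icing on the cake: for Latin-script-only users, strings that contain no code points above 0x7f are extra fast to process, because the replacement algorithm leaves the string unmodified.

I have created a GitHub repository for this solution at https://github.com/anonyco/BestBase64EncoderDecoder/

— Jack Giffin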

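Can you elaborate on what you mean by "user-created way" vs. "interpretable by the browser"? What is the added value of using this solution over what Mozilla recommends?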

— brandonscript

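@brandonscript Mozilla is not the same as MDN. MDN is user-created content. The page on MDN that recommends your solution is user-created content, not content created by a browser vendor.

— Jack Giffin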

Is your solution vendor-created? If so, I'd suggest giving credit to the origin. If not, then it is also user-created, and no different from MDN's answer?

— brandonscript

@brandonscript Good point. You are correct. I removed that piece of text. Also, check out the demo I added.

— Jack Giffin

3

The complete article that works for me: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Base64_encoding_and_decoding

The part that encodes from Unicode/UTF-8 is

function utf8_to_b64( str ) {
   return window.btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
   return decodeURIComponent(escape(window.atob( str )));
}

// Usage:
utf8_to_b64('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64_to_utf8('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"

This is one of the most commonly used methods today.

— Rika

This is the same as the accepted answer.

— brandonscript

0

A small correction: unescape and escape are deprecated, so:

function utf8_to_b64( str ) {
    return window.btoa(decodeURIComponent(encodeURIComponent(str)));
}

function b64_to_utf8( str ) {
     return decodeURIComponent(encodeURIComponent(window.atob(str)));
}


function b64_to_utf8( str ) {
    str = str.replace(/\s/g, '');    
    return decodeURIComponent(encodeURIComponent(window.atob(str)));
}

— Darkves

2

It looks like the doc link now says something different from this, and suggests a regex solution to handle it.

— brandonscript

2

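This won't work, because encodeURIComponent is the inverse of decodeURIComponent, i.e. it simply undoes the conversion. See stackoverflow.com/a/31412163/1534459 for a detailed explanation of what happens with escape and unescape.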

— bodo '16

1

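@canaaerus I don't understand your comment? escape and unescape are deprecated, so I just swap them with the [decode|encode]URIComponent functions :-) Everything works fine. Read the question first.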

— Darkves '16

1

@Darkves: The reason encodeURIComponent is used is to correctly handle (the whole range of) Unicode strings. So, e.g., window.btoa(decodeURIComponent(encodeURIComponent('€'))) gives Error: String contains an invalid character, because it is the same as window.btoa('€') and btoa cannot encode €.

— bodo '16

2

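@Darkves: Yes, that's correct. But you can't swap escape for encodeURIComponent and unescape for decodeURIComponent, because the encode and the escape methods don't do the same thing. The same goes for decode and unescape. I originally made the same mistake, by the way. Notice that if you take a string, URI-encode it, and then URI-decode it, you get back exactly the string you put in, so doing that is pointless. When you unescape a string that was encoded with encodeURIComponent, you don't get back the string you put in, which is why it works with escape/unescape but not with your version.

— Stefan Steiger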
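
The point made in the comments above can be checked directly. This is a minimal sketch, assuming an environment that still provides the legacy global unescape alongside btoa (browsers and Node.js currently do):

```javascript
const s = '€';

// encodeURIComponent followed by decodeURIComponent is a round trip: nothing changes
console.log(decodeURIComponent(encodeURIComponent(s)) === s); // true

// unescape turns each %XX into one character, so the result is the UTF-8 byte string
const byteString = unescape(encodeURIComponent(s));
console.log(byteString.length);  // 3
console.log(btoa(byteString));   // "4oKs"

// btoa on the unchanged string still fails, as noted in the comments
let threw = false;
try {
    btoa(decodeURIComponent(encodeURIComponent(s)));
} catch (e) {
    threw = true;
}
console.log(threw); // true
```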

0

Here is some future-proof code for browsers that may lack escape/unescape(). Note that IE 9 and older don't support atob/btoa(), so you'd need to use custom base64 functions for them.

// Polyfill for escape/unescape
if( !window.unescape ){
    window.unescape = function( s ){
        return s.replace( /%([0-9A-F]{2})/g, function( m, p ) {
            return String.fromCharCode( '0x' + p );
        } );
    };
}
if( !window.escape ){
    window.escape = function( s ){
        var chr, hex, i = 0, l = s.length, out = '';
        for( ; i < l; i ++ ){
            chr = s.charAt( i );
            if( chr.search( /[A-Za-z0-9\@\*\_\+\-\.\/]/ ) > -1 ){
                out += chr; continue; }
            hex = s.charCodeAt( i ).toString( 16 );
            out += '%' + ( hex.length % 2 != 0 ? '0' : '' ) + hex;
        }
        return out;
    };
}

// Base64 encoding of UTF-8 strings
var utf8ToB64 = function( s ){
    return btoa( unescape( encodeURIComponent( s ) ) );
};
var b64ToUtf8 = function( s ){
    return decodeURIComponent( escape( atob( s ) ) );
};

A more comprehensive example of UTF-8 encoding and decoding can be found here: http://jsfiddle.net/47zwb41o/

— Beejor

-1

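If you are still facing the issue after trying the solutions above, try the following, which covers the case where escape is not supported in TypeScript.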

blob = new Blob(["\ufeff", csv_content]); // this will make symbols to appears in excel

For csv_content, you can try the following.

function b64DecodeUnicode(str: any) {        
        return decodeURIComponent(atob(str).split('').map((c: any) => {
            return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
        }).join(''));
    }

— Diwakar