Tag: Unicode

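Converting UTF-8 data to a proper string: If I receive a UTF-8 string over a socket (or from any other external source), I would like to get it as a properly parsed string object. The following code shows what I mean: var str='21\r\nJust a demo string \xC3\xA4\xC3\xA8-should not be anymore parsed'; // Find CRLF var i=str.indexOf('\r\n'); // Parse size up until CRLF var x=parseInt(str.slice(0, i), 10); // Read size bytes var s=str.substr(i+2, x); console.log(s); This code should print Just a demo string äè but because the UTF-8 data is not parsed correctly, it only gets as far as the first Unicode character: Just a demo string Ã¤ Does anyone have an idea how to convert this correctly?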
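
One possible approach, sketched under two assumptions that the question does not confirm: the incoming string holds one byte per character (a "binary" or latin1 string), and the length prefix counts decoded characters rather than bytes. The helper name `decodeBinaryString` is hypothetical. The idea is to reinterpret the characters as bytes, decode them as UTF-8, and only then apply the length prefix.

```javascript
// Assumption: every char code in `raw` is a single byte (0x00-0xFF).
function decodeBinaryString(raw) {
  // 'latin1' maps each character to one byte; the bytes are then read as UTF-8.
  return Buffer.from(raw, 'latin1').toString('utf8');
}

const raw = '21\r\nJust a demo string \xC3\xA4\xC3\xA8-should not be anymore parsed';
const decoded = decodeBinaryString(raw);

// Find CRLF, parse the size in front of it, then read that many characters.
const i = decoded.indexOf('\r\n');
const size = parseInt(decoded.slice(0, i), 10);
const s = decoded.slice(i + 2, i + 2 + size);

console.log(s);
```

If the prefix counts bytes instead, the slicing would have to happen on the `Buffer` before decoding, not on the decoded string.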

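Unable to convert iTunes XML playlists in node.js: A while ago I wrote a quick little Node command-line utility to convert XML-format iTunes playlists to M3U, XSPF, and so on, so that I could use them on my Android device at work. (*I have a 25-gigabyte music collection; doubleTwist et al. just keel over when they try to sync with my Mac.) At first this worked fine, but as the music collection grew I ran into a problem: no media player could find any file with non-English Unicode characters in its name, such as ñ, í, and almost all Japanese kanji. Not every such character causes the problem, but most of them do. Because iTunes file paths are partially URL-encoded (and do not need to match the constraints of the target format), and have to be partially replaced with the correct path on the target machine, I have the following code to process the file path (irrelevant parts stripped): let location; // need try/catch because some track names contain unescaped '%' that // cause the decode function to throw. try { location = decodeURIComponent(x.location.slice(7)); } catch (e) { // function references a hash of about 200 url encodings and // replaces occurrences of them in the path, poor man's […]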
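
As an illustration of the unescaped '%' problem mentioned in the code comment, here is a minimal sketch of a lenient decoder. The function name `decodeLenient` is hypothetical and is not part of the question's utility. It decodes each run of `%XX` escapes as a unit, so multi-byte UTF-8 sequences stay together, and leaves stray `%` characters untouched.

```javascript
// Decode only well-formed runs of %XX escapes; anything else is left as-is.
function decodeLenient(encoded) {
  return encoded.replace(/(?:%[0-9A-Fa-f]{2})+/g, (run) => {
    try {
      return decodeURIComponent(run);
    } catch (e) {
      // The run is not valid UTF-8 (for example a lone %E9): keep it unchanged.
      return run;
    }
  });
}

const example = decodeLenient('100%%20na%C3%AFve');
console.log(example);
```

A run that mixes valid and invalid escapes is kept undecoded as a whole, which is a limitation of this sketch.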

Offset when reading code points: Summary: I am currently writing an ActionScript 3 lexer that converts source code into tokens. I chose to interpret the input by code points, using a string with optional surrogate pairs wrapped in a UString class. Under the hood, I use a UStringPos class to cache the last read position. I have tested how it scans the identifier "huehuehue" … 'use strict'; import {Lexer} from 'core/Lexer'; import {UString} from 'utils/UString'; import ErrorHandler from 'core/ErrorHandler'; const errorHandler = new ErrorHandler(true); // Tell the length to the `Lexer` manually. const lexer = new Lexer( new UString('huehuehue'), 9, errorHandler); // Scan first token lexer.next(); const id = lexer.lookahead.value; console.log( id, id.length […]
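
For comparison, a minimal sketch of walking a string by code point while tracking the code-unit offset. It does not use the question's `UString` or `UStringPos` classes, whose implementations are not shown; the generator name `codePoints` is hypothetical. The point it illustrates is that the offset advances by two code units after a surrogate pair and by one otherwise.

```javascript
// Yield each code point together with the code-unit index where it starts.
function* codePoints(str) {
  let index = 0;
  while (index < str.length) {
    const cp = str.codePointAt(index);
    yield { index, cp };
    // Code points above U+FFFF occupy two UTF-16 code units.
    index += cp > 0xFFFF ? 2 : 1;
  }
}

const seen = Array.from(codePoints('a\u{1F600}b'));
console.log(seen);
```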

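Node JS detect string encoding: How can I detect the encoding of a string in Node JS and convert the string to a valid Unicode string? For example, how do I detect a CP437-encoded string and convert it to a valid Unicode string? Input: ¡¡¡¡¡¡¡¡ Output: Quiénhaengañado I would like to detect the encoding type dynamically and convert the string to a valid Unicode string. Thanks in advance.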

Removing unicode characters from Cheerio.js content: I am using cheeriojs to scrape content from a web page that has the following HTML. Although the PM's office could neither confirm nor deny this, the spokesperson, John Doe said the meeting took place on Sunday. “The outcome will be made public in due course,” John said in an SMS yesterday. I can get at the content of interest through class and id selectors, as shown below: $('.top-stories .line.more').each(function(i, el){ //Do something… let content = $(this).next().html(); }); Once I have captured the content of interest, I use a regular expression to "clean" it, as follows: […]

NodeJS Decrypt des3 Unicode: I have the following code snippet: var crypto = require("crypto"); var iv = new Buffer('d146ec4ce3f955cb', "hex"); var key = new Buffer('dc5c3319dc25c1f6f11f6a792a6dd28864c9dd48be26c2e4', "hex"); var encrypted = new Buffer('6A57201D19B07ABFAE74B453BA46381C', "hex"); var cipher = crypto.createDecipheriv('des3', key, iv); var result = cipher.update(encrypted); result += cipher.final(); console.log("result: " + result); The result is "Password". This snippet works well for ASCII-based words. However, I have some unicode passwords, for example this Pi: UU__3185CDAA15C1CDED I tried using this value, both as is and with the "UU__" removed, but got nowhere. I also tried passing the encrypted data like this: var encrypted = new Buffer('UU__3185CDAA15C1CDED', "utf16le"); and var result […]
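
A hedged round-trip sketch of the general technique: collect the decrypted output as bytes with `Buffer.concat`, then decode those bytes with an explicit text encoding, instead of appending Buffers to a string. It assumes the plaintext was encoded as UTF-16LE before encryption, which the question does not confirm, and it does not attempt to decode the `UU__` value itself. The helper names are hypothetical, and `des-ede3-cbc` is used as the explicit cipher name in place of `des3`.

```javascript
const crypto = require('crypto');

// Key and IV taken from the question's snippet.
const key = Buffer.from('dc5c3319dc25c1f6f11f6a792a6dd28864c9dd48be26c2e4', 'hex');
const iv = Buffer.from('d146ec4ce3f955cb', 'hex');

// Hypothetical helper: encode the text to bytes first, then encrypt the bytes.
function encrypt3des(plainText, textEncoding) {
  const cipher = crypto.createCipheriv('des-ede3-cbc', key, iv);
  return Buffer.concat([cipher.update(Buffer.from(plainText, textEncoding)), cipher.final()]);
}

// Hypothetical helper: decrypt to bytes, then decode with the same text encoding.
function decrypt3des(encrypted, textEncoding) {
  const decipher = crypto.createDecipheriv('des-ede3-cbc', key, iv);
  const bytes = Buffer.concat([decipher.update(encrypted), decipher.final()]);
  return bytes.toString(textEncoding);
}

// Assumption: the producer of the ciphertext used UTF-16LE for non-ASCII text.
const roundTrip = decrypt3des(encrypt3des('\u03c0', 'utf16le'), 'utf16le');
console.log(roundTrip);
```

If the real ciphertext was produced with a different text encoding, only the `textEncoding` argument would need to change.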

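Converting a Windows-1252 hex value to Unicode in Windows: Let's say I have a string containing the Windows-1252 hex value of a character, and I want to produce the proper Unicode character: const asciiHex = '85' //represents hellip parseInt(asciiHex, 16) //I get 133 as expected I can't use String.fromCharCode at this point, because it expects a Unicode code rather than an ASCII one (in Unicode, hellip is 8230 decimal). Does anyone know of a simple conversion? By the way, I am doing this in Node 6.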
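
One dependency-free way to do this is a lookup table, sketched below. Windows-1252 differs from Latin-1 (whose bytes equal their Unicode code points) only in the range 0x80 to 0x9F, so a 32-entry table covers the difference. The names `CP1252_HIGH` and `cp1252HexToChar` are hypothetical, and the five bytes that Windows-1252 leaves undefined are mapped to the control code point of the same value, which is a choice made for this sketch.

```javascript
// Unicode code points for Windows-1252 bytes 0x80..0x9F.
// Undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to themselves here.
const CP1252_HIGH = [
  0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
  0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178,
];

function cp1252HexToChar(hex) {
  const byte = parseInt(hex, 16);
  if (!(byte >= 0 && byte <= 0xFF)) {
    throw new RangeError('Expected a single byte in hex, got: ' + hex);
  }
  // Outside 0x80..0x9F, Windows-1252 matches Latin-1 and therefore Unicode.
  const codePoint = byte >= 0x80 && byte <= 0x9F ? CP1252_HIGH[byte - 0x80] : byte;
  return String.fromCharCode(codePoint);
}

const ellipsis = cp1252HexToChar('85');
console.log(ellipsis, ellipsis.charCodeAt(0));
```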

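Regular expression for validating that UTF-8 contains only "normal" characters: In my project, users can register a publicly viewable nickname. I want to allow the name to contain characters from any script (Arabic, Latin, Cyrillic, Japanese, etc.), but to reject control characters, punctuation, and non-letter characters (such as ✇ or ✈). I have found plenty of examples that filter alphanumeric characters for individual scripts, but I don't want to spend days digging through code tables to handle every script by hand. Any suggestions?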
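
A minimal sketch using Unicode property escapes, assuming a JavaScript runtime that supports them (ES2018, the `u` flag). The pattern accepts letters (`\p{L}`), combining marks (`\p{M}`), and numbers (`\p{N}`) from any script; whether digits or spaces should be allowed is a policy decision the question leaves open. The names `NICKNAME_PATTERN` and `isValidNickname` are hypothetical.

```javascript
// Letters, combining marks, and numbers from any script; nothing else.
const NICKNAME_PATTERN = /^[\p{L}\p{M}\p{N}]+$/u;

function isValidNickname(name) {
  return NICKNAME_PATTERN.test(name);
}

const checks = ['Русский', 'José', '日本語', '✈', 'a!b', 'tab\tname'].map(isValidNickname);
console.log(checks);
```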

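Reading a badly encoded JSON file in PHP or node.js: I am trying to parse http://www.spetnik.com/files/alerts.json. Chrome seems to handle it fine, and if I download it with wget I can view it in VIM. However, parsing fails when I try it with node.js or PHP. I have tried all sorts of things, including mb_convert_encoding, and nothing has worked. What is the simplest way to parse this JSON?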
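
The actual encoding of that file is not stated, so the following is only a sketch of one common cause: a byte order mark (BOM) or UTF-16 content that `JSON.parse` chokes on when the bytes are read as plain UTF-8. The function name `parseJsonBuffer` is hypothetical. It inspects the first bytes, decodes accordingly, and then parses.

```javascript
// Decode a Buffer according to its BOM (if any), then parse it as JSON.
function parseJsonBuffer(buf) {
  let text;
  if (buf.length >= 2 && buf[0] === 0xFF && buf[1] === 0xFE) {
    // UTF-16 little-endian BOM
    text = buf.subarray(2).toString('utf16le');
  } else if (buf.length >= 3 && buf[0] === 0xEF && buf[1] === 0xBB && buf[2] === 0xBF) {
    // UTF-8 BOM
    text = buf.subarray(3).toString('utf8');
  } else {
    // Assumption: no BOM means plain UTF-8.
    text = buf.toString('utf8');
  }
  return JSON.parse(text);
}

// Sample input built in memory for illustration only.
const sample = Buffer.concat([Buffer.from([0xFF, 0xFE]), Buffer.from('{"ok":true}', 'utf16le')]);
const parsed = parseJsonBuffer(sample);
console.log(parsed);
```

In practice the Buffer would come from reading the downloaded file without specifying an encoding.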

node.js nerve framework unicode response: Code: var nerve = require("./nerve"); var sitemap = [ ["/", function(req, res) { res.respond("Русский"); }] ]; nerve.create(sitemap).listen(8100); The browser displays: CAA:89 What is the correct way to do this?
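
The displayed text looks like what results when each character is cut down to its low byte. The sketch below reproduces that effect with Node's `Buffer` and contrasts it with UTF-8 encoding. It does not use nerve, and whether nerve writes responses this way is an assumption, not something the question confirms.

```javascript
const text = 'Русский';

// Encoding as 'latin1' keeps only the low byte of each UTF-16 code unit.
const truncated = Buffer.from(text, 'latin1').toString('latin1');

// Encoding as UTF-8 preserves the text: two bytes per Cyrillic letter here.
const utf8Bytes = Buffer.from(text, 'utf8');
const restored = utf8Bytes.toString('utf8');

console.log(JSON.stringify(truncated));
console.log(utf8Bytes.length, restored);
```

If this is the cause, the response body would need to be written as UTF-8 and served with a `Content-Type` header that declares `charset=utf-8`.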