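PHP code for converting UTF-16 to UTF-8
Today I ran into a customer's CA certificate whose fields were all written in Chinese, which was annoying enough, and on top of that they were stored as "Unicode", obviously by a program running on Windows. After thinking it over: what Windows calls Unicode is UTF-16, so converting UTF-16 to UTF-8 should be all it takes. Dumping the certificate with openssl shows the following: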
Subject: C=N-V\xFD, ST=mYl_w\x01, …….
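openssl prints the printable bytes as-is and the rest as `\xHH` escapes, so the text has to be turned back into raw bytes before any decoding. A minimal sketch, assuming the text contains only `\xHH` escapes and no other escaping; the helper name `unescape_openssl` is made up for illustration:

```php
<?php
// Hypothetical helper: turn literal "\xHH" escapes (as printed by openssl)
// back into the raw bytes they stand for.
function unescape_openssl(string $text): string
{
    return preg_replace_callback(
        '/\\\\x([0-9A-Fa-f]{2})/',          // a literal backslash, "x", two hex digits
        function (array $m): string {
            return chr(hexdec($m[1]));      // hex digits -> single byte
        },
        $text
    );
}

// Single quotes keep the backslash literal, just like text copied from openssl.
$raw = unescape_openssl('N-V\xFD');
echo bin2hex($raw), "\n";                   // 4e2d56fd
```

Read as big-endian 16-bit units, those four bytes are 0x4E2D and 0x56FD.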
So how do we decode it? I lifted a decoding function out of a JSON library written by a foreign developer:
- /**
- * convert a string from one UTF-16 char to one UTF-8 char
- *
- * Normally should be handled by mb_convert_encoding, but
- * provides a slower PHP-only method for installations
- * that lack the multibyte string extension.
- *
- * @param string $utf16 UTF-16 character
- * @return string UTF-8 character
- * @access private
- */
- function utf162utf8($utf16)
- {
- // oh please oh please oh please oh please oh please
- if(function_exists('mb_convert_encoding')) {
- return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16');
- }
- $bytes = (ord($utf16[0]) << 8) | ord($utf16[1]);
- switch(true) {
- case ((0x7F & $bytes) == $bytes):
- // this case should never be reached, because we are in ASCII range
- // see: http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
- return chr(0x7F & $bytes);
- case (0x07FF & $bytes) == $bytes:
- // return a 2-byte UTF-8 character
- // see: http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
- return chr(0xC0 | (($bytes >> 6) & 0x1F))
- . chr(0x80 | ($bytes & 0x3F));
- case (0xFFFF & $bytes) == $bytes:
- // return a 3-byte UTF-8 character
- // see: http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
- return chr(0xE0 | (($bytes >> 12) & 0x0F))
- . chr(0x80 | (($bytes >> 6) & 0x3F))
- . chr(0x80 | ($bytes & 0x3F));
- }
- // ignoring UTF-32 for now, sorry
- return '';
- }
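To see what the 3-byte branch of the function does, it helps to trace one character by hand. The sketch below repeats the same shifts and masks for the two bytes `N-` (0x4E 0x2D); the variable names are only for illustration:

```php
<?php
// Combine the two bytes big-endian, as the function above does.
$code = (ord('N') << 8) | ord('-');       // 0x4E2D

// 3-byte UTF-8 layout: 1110xxxx 10xxxxxx 10xxxxxx
$b1 = 0xE0 | (($code >> 12) & 0x0F);      // top 4 bits
$b2 = 0x80 | (($code >> 6) & 0x3F);       // middle 6 bits
$b3 = 0x80 | ($code & 0x3F);              // low 6 bits

printf("U+%04X -> %02X %02X %02X\n", $code, $b1, $b2, $b3);
// U+4E2D -> E4 B8 AD

$utf8 = chr($b1) . chr($b2) . chr($b3);   // the character 中 in UTF-8
```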
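After that it is simple. Every character in the Basic Multilingual Plane, which includes these Chinese characters, takes exactly two bytes in UTF-16, so it is enough to process the string two bytes at a time (characters outside that plane use four-byte surrogate pairs, which this loop does not handle):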
- // raw bytes of the C field: 0x4E2D 0x56FD, big-endian
- $s = "N-V\xfd";
- for ($i = 0, $step = 2; $i < strlen($s); $i += $step) {
- $t = substr($s, $i, $step); // one 2-byte UTF-16 unit
- $ret = utf162utf8($t);
- echo $ret;
- }
It works: the output is 中国 (China), which is the value of C, the country field.
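The two-byte loop above is fine for this certificate. For strings that might contain characters outside the Basic Multilingual Plane, here is a sketch of a pure-PHP converter that takes the whole string at once and also combines surrogate pairs. It assumes big-endian input without a byte order mark, and the name `utf16be_to_utf8` is made up for illustration:

```php
<?php
// Hypothetical helper: big-endian UTF-16 (no BOM) to UTF-8, in pure PHP.
function utf16be_to_utf8(string $s): string
{
    if (strlen($s) < 2) {
        return '';
    }
    $units = array_values(unpack('n*', $s));   // 16-bit big-endian code units
    $n = count($units);
    $out = '';
    for ($i = 0; $i < $n; $i++) {
        $cp = $units[$i];
        if ($cp >= 0xD800 && $cp <= 0xDBFF && $i + 1 < $n
            && $units[$i + 1] >= 0xDC00 && $units[$i + 1] <= 0xDFFF) {
            // high surrogate followed by low surrogate: combine into one code point
            $cp = 0x10000 + (($cp - 0xD800) << 10) + ($units[$i + 1] - 0xDC00);
            $i++;
        } elseif ($cp >= 0xD800 && $cp <= 0xDFFF) {
            $cp = 0xFFFD;                      // unpaired surrogate: replacement character
        }
        if ($cp < 0x80) {
            $out .= chr($cp);
        } elseif ($cp < 0x800) {
            $out .= chr(0xC0 | ($cp >> 6))
                  . chr(0x80 | ($cp & 0x3F));
        } elseif ($cp < 0x10000) {
            $out .= chr(0xE0 | ($cp >> 12))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        } else {
            $out .= chr(0xF0 | ($cp >> 18))
                  . chr(0x80 | (($cp >> 12) & 0x3F))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        }
    }
    return $out;
}

echo utf16be_to_utf8("N-V\xfd"), "\n";         // 中国
```

Where the mbstring extension is installed, `mb_convert_encoding($s, 'UTF-8', 'UTF-16BE')` should give the same result in a single call.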