Core_UTF8

UTF8

A port of phputf8 to a unified set of files. Provides multi-byte aware replacement string functions.

For UTF-8 support to work correctly, the following requirements must be met:

PCRE needs to be compiled with UTF-8 support (--enable-utf8)
Support for Unicode properties is highly recommended (--enable-unicode-properties)
UTF-8 conversion will be much more reliable if the iconv extension is loaded
The mbstring extension is highly recommended, but must not be overloading string functions
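
The requirements above can be checked at runtime. The following is a minimal sketch, not part of MyQEE: the function name `utf8_requirements_report` and the array keys are hypothetical, and it relies only on standard PHP calls (`preg_match`, `extension_loaded`, `ini_get`).

```php
<?php
// Hypothetical helper (not part of MyQEE): reports which of the
// requirements listed above the running PHP build satisfies.
function utf8_requirements_report()
{
    $sample = "\xC3\xB1"; // U+00F1 encoded as UTF-8

    return array(
        // The /u modifier only works when PCRE was built with UTF-8 support.
        'pcre_utf8' => (bool) @preg_match('/^.$/u', $sample),
        // \pL only works when PCRE was built with Unicode property support.
        'pcre_unicode_properties' => (bool) @preg_match('/^\pL$/u', $sample),
        'iconv' => extension_loaded('iconv'),
        'mbstring' => extension_loaded('mbstring'),
        // The value 2 (MB_OVERLOAD_STRING) means string functions are overloaded.
        // The setting no longer exists in PHP 8, where ini_get() returns false.
        'string_functions_not_overloaded' => ((int) ini_get('mbstring.func_overload') & 2) === 0,
    );
}

foreach (utf8_requirements_report() as $name => $ok) {
    echo str_pad($name, 34), $ok ? 'ok' : 'missing', PHP_EOL;
}
```

A report like this is probably most useful in an install or diagnostics script, since the checks do not change between requests.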

This file is licensed differently from the rest of MyQEE. As a port of phputf8, this file is released under the LGPL.

API - Core_UTF8

UTF8::$called

UTF8::clean - Recursively cleans arrays, objects, and strings. Removes ASCII control codes and converts to the requested charset while silently discarding incompatible characters.

UTF8::is_ascii - Tests whether a string contains only 7-bit ASCII bytes.

UTF8::strip_ascii_ctrl - Strips out device control codes in the ASCII range.

UTF8::strip_non_ascii - Strips out all non-7bit ASCII bytes.

UTF8::transliterate_to_ascii - Replaces special/accented UTF-8 characters by ASCII-7 "equivalents".

UTF8::strlen - Returns the length of the given string. This is a UTF8-aware version of strlen.

UTF8::strpos - Finds position of first occurrence of a UTF-8 string. This is a UTF8-aware version of strpos.

UTF8::strrpos - Finds position of last occurrence of a char in a UTF-8 string. This is a UTF8-aware version of strrpos.

UTF8::substr - Returns part of a UTF-8 string. This is a UTF8-aware version of substr.

UTF8::substr_replace - Replaces text within a portion of a UTF-8 string. This is a UTF8-aware version of substr_replace.

UTF8::strtolower - Makes a UTF-8 string lowercase. This is a UTF8-aware version of strtolower.

UTF8::strtoupper - Makes a UTF-8 string uppercase. This is a UTF8-aware version of strtoupper.

UTF8::ucfirst - Makes a UTF-8 string's first character uppercase. This is a UTF8-aware version of ucfirst.

UTF8::ucwords - Makes the first character of every word in a UTF-8 string uppercase.

UTF8::strcasecmp - Case-insensitive UTF-8 string comparison. This is a UTF8-aware version of strcasecmp.

UTF8::str_ireplace - Returns a string or an array with all occurrences of search in subject (ignoring case) replaced with the given replace value.

UTF8::stristr - Case-insensitive UTF-8 version of strstr. Returns all of the input string from the first occurrence of needle to the end.

UTF8::strspn - Finds the length of the initial segment matching mask. This is a UTF8-aware version of strspn.

UTF8::strcspn - Finds the length of the initial segment not matching mask. This is a UTF8-aware version of strcspn.

UTF8::str_pad - Pads a UTF-8 string to a certain length with another string. This is a UTF8-aware version of str_pad.

UTF8::str_split - Converts a UTF-8 string to an array. This is a UTF8-aware version of str_split.

UTF8::strrev - Reverses a UTF-8 string. This is a UTF8-aware version of [strrev](http://php.net/strrev).

UTF8::trim - Strips whitespace (or other UTF-8 characters) from the beginning and end of a string.

UTF8::ltrim - Strips whitespace (or other UTF-8 characters) from the beginning of a string.

UTF8::rtrim - Strips whitespace (or other UTF-8 characters) from the end of a string.

UTF8::ord - Returns the Unicode ordinal for a character. This is a UTF8-aware version of ord.

UTF8::to_unicode - Takes a UTF-8 string and returns an array of ints representing the Unicode characters.

UTF8::from_unicode - Takes an array of ints representing the Unicode characters and returns a UTF-8 string.

author: 呼吸二氧化碳 jonwang@myqee.com
category: MyQEE
package: System
subpackage: Core
copyright: Copyright © 2008-2013 myqee.com
license: http://www.myqee.com/license.html

UTF8::clean( $var , $charset = 'UTF-8')

Recursively cleans arrays, objects, and strings. Removes ASCII control codes and converts to the requested charset while silently discarding incompatible characters.

UTF8::clean($_GET); // Clean GET data

This method requires the iconv extension.

Parameters

Parameter	Type	Description	Default
`$var`	`mixed`	Variable to clean
`$charset`	`string`	Character set, defaults to UTF-8	string(5) "UTF-8"

Return value

mixed

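UTF8::is_ascii( $str )

Tests whether a string contains only 7-bit ASCII bytes.

Parameters

Parameter	Type	Description	Default
`$str`	`mixed`	String or array of strings to check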

UTF8::strip_ascii_ctrl( $str )

Strips out device control codes in the ASCII range.

$str = UTF8::strip_ascii_ctrl($str);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	String to clean

Return value

string
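
As an illustration of what stripping control codes can look like, here is a minimal sketch built on `preg_replace`. The function name and the exact character ranges are assumptions, not taken from the MyQEE source: this version removes the C0 control codes and DEL but keeps tab, line feed, and carriage return.

```php
<?php
// Hypothetical sketch, not the MyQEE implementation.
// Removes ASCII control codes except tab (0x09), LF (0x0A), and CR (0x0D).
function strip_ascii_ctrl_sketch($str)
{
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/', '', $str);
}

$dirty = "a\x00b\x07c\tdone\n";
var_dump(strip_ascii_ctrl_sketch($dirty) === "abc\tdone\n"); // bool(true)
```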

UTF8::transliterate_to_ascii( $str , $case = 0)

Replaces special/accented UTF-8 characters by ASCII-7 "equivalents".

$ascii = UTF8::transliterate_to_ascii($utf8);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	String to transliterate
`$case`	`integer`	-1 lowercase only, +1 uppercase only, 0 both cases	integer 0

Return value

string

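UTF8::strlen( $str )

Returns the length of the given string. This is a UTF8-aware version of strlen.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	String being measured for length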

UTF8::strpos( $str , $search , $offset = 0)

Finds position of first occurrence of a UTF-8 string. This is a UTF8-aware version of strpos.

$position = UTF8::strpos($str, $search);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Haystack
`$search`	`string`	Needle
`$offset`	`integer`	Offset from which character in haystack to start searching	integer 0

Return value

integer position of needle
boolean FALSE if the needle is not found
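
The practical difference from the native function is the unit of the returned position: `strpos` counts bytes, while a UTF-8 aware version counts characters. The sketch below illustrates this using PCRE only. `utf8_strpos_sketch` is a hypothetical name, the approach is not necessarily the one MyQEE uses, and the input is assumed to be valid UTF-8 with a non-negative offset.

```php
<?php
// Hypothetical sketch, not the MyQEE implementation.
// Assumes valid UTF-8 input and a non-negative $offset.
function utf8_strpos_sketch($str, $search, $offset = 0)
{
    // Translate the character offset into a byte offset.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $byte_offset = strlen(implode('', array_slice($chars, 0, $offset)));

    $byte_pos = strpos($str, $search, $byte_offset);
    if ($byte_pos === false) {
        return false;
    }

    // Count the characters that precede the match.
    return preg_match_all('/./us', substr($str, 0, $byte_pos));
}

$str = "na\xC3\xAFve caf\xC3\xA9"; // "naïve café"
var_dump(strpos($str, "\xC3\xA9"));             // int(10), a byte offset
var_dump(utf8_strpos_sketch($str, "\xC3\xA9")); // int(9), a character offset
```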

UTF8::strrpos( $str , $search , $offset = 0)

Finds position of last occurrence of a char in a UTF-8 string. This is a UTF8-aware version of strrpos.

$position = UTF8::strrpos($str, $search);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Haystack
`$search`	`string`	Needle
`$offset`	`integer`	Offset from which character in haystack to start searching	integer 0

Return value

integer position of needle
boolean FALSE if the needle is not found

UTF8::substr( $str , $offset , $length = null)

Returns part of a UTF-8 string. This is a UTF8-aware version of substr.

$sub = UTF8::substr($str, $offset);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$offset`	`integer`	Offset
`$length`	`integer`	Length limit	null

Return value

string
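
To show why a character-aware `substr` matters, the following sketch splits the string into whole characters before slicing. It is an illustration under assumptions, not the MyQEE implementation: `utf8_substr_sketch` is a hypothetical name, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical sketch, not the MyQEE implementation. Assumes valid UTF-8 input.
function utf8_substr_sketch($str, $offset, $length = null)
{
    // Split into whole characters, slice, and join again.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    return implode('', array_slice($chars, $offset, $length));
}

$str = "h\xC3\xA9llo"; // "héllo"
var_dump(substr($str, 0, 2) === "h\xC3");                 // bool(true): the é is cut in half
var_dump(utf8_substr_sketch($str, 0, 2) === "h\xC3\xA9"); // bool(true): "hé"
```

Splitting the whole string is simple but costs memory proportional to its length, so this approach is best suited to short strings.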

UTF8::substr_replace( $str , $replacement , $offset , $length = null)

Replaces text within a portion of a UTF-8 string. This is a UTF8-aware version of substr_replace.

$str = UTF8::substr_replace($str, $replacement, $offset);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$replacement`	`string`	Replacement string
`$offset`	`integer`	Offset
`$length`	`integer`	Length of the portion to replace	null

Return value

string

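UTF8::strtolower( $str )

Makes a UTF-8 string lowercase. This is a UTF8-aware version of strtolower.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Mixed case string

UTF8::strtoupper( $str )

Makes a UTF-8 string uppercase. This is a UTF8-aware version of strtoupper.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Mixed case string

UTF8::ucfirst( $str )

Makes a UTF-8 string's first character uppercase. This is a UTF8-aware version of ucfirst.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Mixed case string

UTF8::ucwords( $str )

Makes the first character of every word in a UTF-8 string uppercase.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Mixed case string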

UTF8::strcasecmp( $str1 , $str2 )

Case-insensitive UTF-8 string comparison. This is a UTF8-aware version of strcasecmp.

$compare = UTF8::strcasecmp($str1, $str2);

Parameters

Parameter	Type	Description	Default
`$str1`	`string`	String to compare
`$str2`	`string`	String to compare

Return value

integer less than 0 if str1 is less than str2
integer greater than 0 if str1 is greater than str2
integer 0 if they are equal

UTF8::str_ireplace( $search , $replace , $str , & $count = null)

Returns a string or an array with all occurrences of search in subject (ignoring case) replaced with the given replace value. This is a UTF8-aware version of str_ireplace.

This function is very slow compared to the native version. Avoid using it when possible.

Parameters

Parameter	Type	Description	Default
`$search`	`string|array`	Text to replace
`$replace`	`string|array`	Replacement text
`$str`	`string|array`	Subject text
`$count`	`integer`	Number of matched and replaced needles, returned via this parameter, which is passed by reference	null

Return value

string if the input was a string
array if the input was an array

UTF8::stristr( $str , $search )

Case-insensitive UTF-8 version of strstr. Returns all of the input string from the first occurrence of needle to the end. This is a UTF8-aware version of stristr.

$found = UTF8::stristr($str, $search);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$search`	`string`	Needle

Return value

string matched substring if found
FALSE if the substring was not found

UTF8::strspn( $str , $mask , $offset = null, $length = null)

Finds the length of the initial segment matching mask. This is a UTF8-aware version of strspn.

$found = UTF8::strspn($str, $mask);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$mask`	`string`	Mask for search
`$offset`	`integer`	Start position of the string to examine	null
`$length`	`integer`	Length of the string to examine	null

Return value

integer length of the initial segment that contains characters in the mask

UTF8::strcspn( $str , $mask , $offset = null, $length = null)

Finds the length of the initial segment not matching mask. This is a UTF8-aware version of strcspn.

$found = UTF8::strcspn($str, $mask);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$mask`	`string`	Mask for search
`$offset`	`integer`	Start position of the string to examine	null
`$length`	`integer`	Length of the string to examine	null

Return value

integer length of the initial segment that contains characters not in the mask

UTF8::str_pad( $str , $final_str_length , $pad_str = ' ', $pad_type = 1)

Pads a UTF-8 string to a certain length with another string. This is a UTF8-aware version of str_pad.

$str = UTF8::str_pad($str, $length);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$final_str_length`	`integer`	Desired string length after padding
`$pad_str`	`string`	String to use as padding	string(1) " "
`$pad_type`	`integer`	Padding type: STR_PAD_RIGHT, STR_PAD_LEFT, or STR_PAD_BOTH	integer 1

Return value

string

UTF8::str_split( $str , $split_length = 1)

Converts a UTF-8 string to an array. This is a UTF8-aware version of str_split.

$array = UTF8::str_split($str);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$split_length`	`integer`	Maximum length of each chunk	integer 1

Return value

array
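
The native `str_split` produces one element per byte, which breaks multi-byte characters apart. A minimal sketch of a character-based split follows. `utf8_str_split_sketch` is a hypothetical name, the technique (PCRE splitting plus `array_chunk`) is an assumption rather than the MyQEE implementation, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical sketch, not the MyQEE implementation. Assumes valid UTF-8 input.
function utf8_str_split_sketch($str, $split_length = 1)
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($split_length <= 1) {
        return $chars;
    }

    // Group the characters into chunks and join each chunk back into a string.
    return array_map(function ($chunk) {
        return implode('', $chunk);
    }, array_chunk($chars, $split_length));
}

$str = "h\xC3\xA9llo"; // "héllo"
var_dump(count(str_split($str)));             // int(6): one element per byte
var_dump(count(utf8_str_split_sketch($str))); // int(5): one element per character
```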

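UTF8::strrev( $str )

Reverses a UTF-8 string. This is a UTF8-aware version of strrev.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	String to be reversed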

UTF8::trim( $str , $charlist = null)

Strips whitespace (or other UTF-8 characters) from the beginning and end of a string. This is a UTF8-aware version of trim.

$str = UTF8::trim($str);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$charlist`	`string`	String of characters to remove	null

Return value

string

UTF8::ltrim( $str , $charlist = null)

Strips whitespace (or other UTF-8 characters) from the beginning of a string. This is a UTF8-aware version of ltrim.

$str = UTF8::ltrim($str);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$charlist`	`string`	String of characters to remove	null

Return value

string

UTF8::rtrim( $str , $charlist = null)

Strips whitespace (or other UTF-8 characters) from the end of a string. This is a UTF8-aware version of rtrim.

$str = UTF8::rtrim($str);

Parameters

Parameter	Type	Description	Default
`$str`	`string`	Input string
`$charlist`	`string`	String of characters to remove	null

Return value

string

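UTF8::ord( $chr )

Returns the Unicode ordinal for a character. This is a UTF8-aware version of ord.

Parameters

Parameter	Type	Description	Default
`$chr`	`string`	UTF-8 encoded character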

UTF8::to_unicode( $str )

Takes a UTF-8 string and returns an array of ints representing the Unicode characters. Astral planes are supported, i.e. the ints in the output can be > 0xFFFF. Occurrences of the BOM are ignored. Surrogates are not allowed.

$array = UTF8::to_unicode($str);

The Original Code is Mozilla Communicator client code. The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. Ported to PHP by Henri Sivonen, hsivonen@iki.fi (see http://hsivonen.iki.fi/php-utf8/). Slight modifications to fit with the phputf8 library by Harry Fuecks, hfuecks@gmail.com.

Parameters

Parameter	Type	Description	Default
`$str`	`string`	UTF-8 encoded string

Return value

array unicode code points
FALSE if the string is invalid
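
The decoding described above can be sketched in a few lines. This is not the ported Mozilla decoder: the hypothetical `utf8_to_code_points_sketch` below delegates validation to PCRE, which rejects malformed sequences (assumed here to include encoded surrogates), and then rebuilds each code point from the payload bits of its bytes.

```php
<?php
// Hypothetical sketch, not the MyQEE implementation.
function utf8_to_code_points_sketch($str)
{
    // preg_split() with /u fails on malformed UTF-8 and returns false.
    $chars = @preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return false;
    }

    $out = array();
    foreach ($chars as $chr) {
        $b = array_values(unpack('C*', $chr)); // the bytes of one character
        switch (count($b)) {
            case 1:
                $cp = $b[0];
                break;
            case 2:
                $cp = (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
                break;
            case 3:
                $cp = (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
                break;
            default:
                $cp = (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                    | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
        }
        if ($cp !== 0xFEFF) { // skip the byte order mark
            $out[] = $cp;
        }
    }

    return $out;
}

// "hé€" followed by U+1F600, a code point above 0xFFFF
var_dump(utf8_to_code_points_sketch("h\xC3\xA9\xE2\x82\xAC\xF0\x9F\x98\x80")
    === array(0x68, 0xE9, 0x20AC, 0x1F600)); // bool(true)
```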

UTF8::from_unicode( $arr )

Takes an array of ints representing the Unicode characters and returns a UTF-8 string. Astral planes are supported, i.e. the ints in the input can be > 0xFFFF. Occurrences of the BOM are ignored. Surrogates are not allowed.

$str = UTF8::from_unicode($array);

Parameters

Parameter	Type	Description	Default
`$arr`	`array`	Unicode code points representing a string

Return value

string utf8 string of characters
boolean FALSE if a code point cannot be found
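
For illustration, the encoding direction can be sketched as follows. The function name `code_points_to_utf8_sketch` is hypothetical and the code is a simplified stand-in for the ported routine: it skips the BOM, returns FALSE for surrogates and out-of-range values, and otherwise emits one to four bytes per code point.

```php
<?php
// Hypothetical sketch, not the MyQEE implementation.
function code_points_to_utf8_sketch(array $arr)
{
    $out = '';
    foreach ($arr as $cp) {
        if ($cp === 0xFEFF) {
            continue; // the byte order mark is ignored
        }
        if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
            return false; // out of range, or a surrogate
        }
        if ($cp < 0x80) {
            $out .= chr($cp);
        } elseif ($cp < 0x800) {
            $out .= chr(0xC0 | ($cp >> 6))
                  . chr(0x80 | ($cp & 0x3F));
        } elseif ($cp < 0x10000) {
            $out .= chr(0xE0 | ($cp >> 12))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        } else {
            $out .= chr(0xF0 | ($cp >> 18))
                  . chr(0x80 | (($cp >> 12) & 0x3F))
                  . chr(0x80 | (($cp >> 6) & 0x3F))
                  . chr(0x80 | ($cp & 0x3F));
        }
    }

    return $out;
}

var_dump(code_points_to_utf8_sketch(array(0x68, 0xE9, 0x20AC, 0x1F600))
    === "h\xC3\xA9\xE2\x82\xAC\xF0\x9F\x98\x80"); // bool(true)
```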
