选择语言 :

Core_UTF8

UTF8

A port of phputf8 to a unified set of files. Provides multi-byte aware replacement string functions.

For UTF-8 support to work correctly, the following requirements must be met:

  • PCRE needs to be compiled with UTF-8 support (--enable-utf8)
  • Support for Unicode properties is highly recommended (--enable-unicode-properties)
  • UTF-8 conversion will be much more reliable if the iconv extension is loaded
  • The mbstring extension is highly recommended, but must not be overloading string functions
This file is licensed differently from the rest of MyQEE. As a port of phputf8, this file is released under the LGPL.

API - Core_UTF8

  • UTF8::$called
  • UTF8::clean - Recursively cleans arrays, objects, and strings. Removes ASCII control
  • UTF8::is_ascii - Tests whether a string contains only 7-bit ASCII bytes. This is used to
  • UTF8::strip_ascii_ctrl - Strips out device control codes in the ASCII range.
  • UTF8::strip_non_ascii - Strips out all non-7bit ASCII bytes.
  • UTF8::transliterate_to_ascii - Replaces special/accented UTF-8 characters by ASCII-7 "equivalents".
  • UTF8::strlen - Returns the length of the given string. This is a UTF8-aware version
  • UTF8::strpos - Finds position of first occurrence of a UTF-8 string. This is a
  • UTF8::strrpos - Finds position of last occurrence of a char in a UTF-8 string. This is
  • UTF8::substr - Returns part of a UTF-8 string. This is a UTF8-aware version
  • UTF8::substr_replace - Replaces text within a portion of a UTF-8 string. This is a UTF8-aware
  • UTF8::strtolower - Makes a UTF-8 string lowercase. This is a UTF8-aware version
  • UTF8::strtoupper - Makes a UTF-8 string uppercase. This is a UTF8-aware version
  • UTF8::ucfirst - Makes a UTF-8 string's first character uppercase. This is a UTF8-aware
  • UTF8::ucwords - Makes the first character of every word in a UTF-8 string uppercase.
  • UTF8::strcasecmp - Case-insensitive UTF-8 string comparison. This is a UTF8-aware version
  • UTF8::str_ireplace - Returns a string or an array with all occurrences of search in subject
  • UTF8::stristr - Case-insenstive UTF-8 version of strstr. Returns all of input string
  • UTF8::strspn - Finds the length of the initial segment matching mask. This is a
  • UTF8::strcspn - Finds the length of the initial segment not matching mask. This is a
  • UTF8::str_pad - Pads a UTF-8 string to a certain length with another string. This is a
  • UTF8::str_split - Converts a UTF-8 string to an array. This is a UTF8-aware version of
  • UTF8::strrev - Reverses a UTF-8 string. This is a UTF8-aware version of [strrev](http://php.net/strrev).
  • UTF8::trim - Strips whitespace (or other UTF-8 characters) from the beginning and
  • UTF8::ltrim - Strips whitespace (or other UTF-8 characters) from the beginning of
  • UTF8::rtrim - Strips whitespace (or other UTF-8 characters) from the end of a string.
  • UTF8::ord - Returns the unicode ordinal for a character. This is a UTF8-aware
  • UTF8::to_unicode - Takes an UTF-8 string and returns an array of ints representing the Unicode characters.
  • UTF8::from_unicode - Takes an array of ints representing the Unicode characters and returns a UTF-8 string.
author
呼吸二氧化碳 jonwang@myqee.com
category
MyQEE
package
System
subpackage
Core
copyright
Copyright © 2008-2013 myqee.com
license
http://www.myqee.com/license.html

UTF8::clean( $var , $charset = 'UTF-8')

Recursively cleans arrays, objects, and strings. Removes ASCII control codes and converts to the requested charset while silently discarding incompatible characters.

UTF8::clean($_GET); // Clean GET data

This method requires Iconv

参数列表

参数 类型 描述 默认值
$var mixed Variable to clean
$charset string Character set, defaults to UTF-8 string(5) "UTF-8"
返回值
  • mixed

UTF8::is_ascii( $str )

Tests whether a string contains only 7-bit ASCII bytes. This is used to determine when to use native functions or UTF-8 functions.

$ascii = UTF8::is_ascii($str);

参数列表

参数 类型 描述 默认值
$str mixed String or array of strings to check
返回值
  • boolean

UTF8::strip_ascii_ctrl( $str )

Strips out device control codes in the ASCII range.

$str = UTF8::strip_ascii_ctrl($str);

参数列表

参数 类型 描述 默认值
$str string String to clean
返回值
  • string

UTF8::strip_non_ascii( $str )

Strips out all non-7bit ASCII bytes.

$str = UTF8::strip_non_ascii($str);

参数列表

参数 类型 描述 默认值
$str string String to clean
返回值
  • string

UTF8::transliterate_to_ascii( $str , $case = 0)

Replaces special/accented UTF-8 characters by ASCII-7 "equivalents".

$ascii = UTF8::transliterate_to_ascii($utf8);

参数列表

参数 类型 描述 默认值
$str string String to transliterate
$case integer -1 lowercase only, +1 uppercase only, 0 both cases integer 0
返回值
  • string

UTF8::strlen( $str )

Returns the length of the given string. This is a UTF8-aware version of strlen.

$length = UTF8::strlen($str);

参数列表

参数 类型 描述 默认值
$str string String being measured for length
返回值
  • integer

UTF8::strpos( $str , $search , $offset = 0)

Finds position of first occurrence of a UTF-8 string. This is a UTF8-aware version of strpos.

$position = UTF8::strpos($str, $search);

参数列表

参数 类型 描述 默认值
$str string Haystack
$search string Needle
$offset integer Offset from which character in haystack to start searching integer 0
返回值
  • integer position of needle
  • boolean FALSE if the needle is not found

UTF8::strrpos( $str , $search , $offset = 0)

Finds position of last occurrence of a char in a UTF-8 string. This is a UTF8-aware version of strrpos.

$position = UTF8::strrpos($str, $search);

参数列表

参数 类型 描述 默认值
$str string Haystack
$search string Needle
$offset integer Offset from which character in haystack to start searching integer 0
返回值
  • integer position of needle
  • boolean FALSE if the needle is not found

UTF8::substr( $str , $offset , $length = null)

Returns part of a UTF-8 string. This is a UTF8-aware version of substr.

$sub = UTF8::substr($str, $offset);

参数列表

参数 类型 描述 默认值
$str string Input string
$offset integer Offset
$length integer Length limit null
返回值
  • string

UTF8::substr_replace( $str , $replacement , $offset , $length = null)

Replaces text within a portion of a UTF-8 string. This is a UTF8-aware version of substr_replace.

$str = UTF8::substr_replace($str, $replacement, $offset);

参数列表

参数 类型 描述 默认值
$str string Input string
$replacement string Replacement string
$offset integer Offset
$length unknown null
返回值
  • string

UTF8::strtolower( $str )

Makes a UTF-8 string lowercase. This is a UTF8-aware version of strtolower.

$str = UTF8::strtolower($str);

参数列表

参数 类型 描述 默认值
$str string Mixed case string
返回值
  • string

UTF8::strtoupper( $str )

Makes a UTF-8 string uppercase. This is a UTF8-aware version of strtoupper.

参数列表

参数 类型 描述 默认值
$str string Mixed case string
返回值
  • string

UTF8::ucfirst( $str )

Makes a UTF-8 string's first character uppercase. This is a UTF8-aware version of ucfirst.

$str = UTF8::ucfirst($str);

参数列表

参数 类型 描述 默认值
$str string Mixed case string
返回值
  • string

UTF8::ucwords( $str )

Makes the first character of every word in a UTF-8 string uppercase. This is a UTF8-aware version of ucwords.

$str = UTF8::ucwords($str);

参数列表

参数 类型 描述 默认值
$str string Mixed case string
返回值
  • string

UTF8::strcasecmp( $str1 , $str2 )

Case-insensitive UTF-8 string comparison. This is a UTF8-aware version of strcasecmp.

$compare = UTF8::strcasecmp($str1, $str2);

参数列表

参数 类型 描述 默认值
$str1 string String to compare
$str2 string String to compare
返回值
  • integer less than 0 if str1 is less than str2
  • integer greater than 0 if str1 is greater than str2
  • integer 0 if they are equal

UTF8::str_ireplace( $search , $replace , $str , & $count = null)

Returns a string or an array with all occurrences of search in subject (ignoring case) and replaced with the given replace value. This is a UTF8-aware version of str_ireplace.

This function is very slow compared to the native version. Avoid

using it when possible.

参数列表

参数 类型 描述 默认值
$search string|array Text to replace
$replace string|array Replacement text
$str string|array Subject text
$count integer Number of matched and replaced needles will be returned via this parameter which is passed by reference null
返回值
  • string if the input was a string
  • array if the input was an array

UTF8::stristr( $str , $search )

Case-insenstive UTF-8 version of strstr. Returns all of input string from the first occurrence of needle to the end. This is a UTF8-aware version of stristr.

$found = UTF8::stristr($str, $search);

参数列表

参数 类型 描述 默认值
$str string Input string
$search string Needle
返回值
  • string matched substring if found
  • FALSE if the substring was not found

UTF8::strspn( $str , $mask , $offset = null, $length = null)

Finds the length of the initial segment matching mask. This is a UTF8-aware version of strspn.

$found = UTF8::strspn($str, $mask);

参数列表

参数 类型 描述 默认值
$str string Input string
$mask string Mask for search
$offset integer Start position of the string to examine null
$length integer Length of the string to examine null
返回值
  • integer length of the initial segment that contains characters in the mask

UTF8::strcspn( $str , $mask , $offset = null, $length = null)

Finds the length of the initial segment not matching mask. This is a UTF8-aware version of strcspn.

$found = UTF8::strcspn($str, $mask);

参数列表

参数 类型 描述 默认值
$str string Input string
$mask string Mask for search
$offset integer Start position of the string to examine null
$length integer Length of the string to examine null
返回值
  • integer length of the initial segment that contains characters not in the mask

UTF8::str_pad( $str , $final_str_length , $pad_str = ' ', $pad_type = 1)

Pads a UTF-8 string to a certain length with another string. This is a UTF8-aware version of str_pad.

$str = UTF8::str_pad($str, $length);

参数列表

参数 类型 描述 默认值
$str string Input string
$final_str_length integer Desired string length after padding
$pad_str string String to use as padding string(1) " "
$pad_type string Padding type: STR_PAD_RIGHT, STR_PAD_LEFT, or STR_PAD_BOTH integer 1
返回值
  • string

UTF8::str_split( $str , $split_length = 1)

Converts a UTF-8 string to an array. This is a UTF8-aware version of str_split.

$array = UTF8::str_split($str);

参数列表

参数 类型 描述 默认值
$str string Input string
$split_length integer Maximum length of each chunk integer 1
返回值
  • array

UTF8::strrev( $str )

Reverses a UTF-8 string. This is a UTF8-aware version of strrev.

$str = UTF8::strrev($str);

参数列表

参数 类型 描述 默认值
$str string String to be reversed
返回值
  • string

UTF8::trim( $str , $charlist = null)

Strips whitespace (or other UTF-8 characters) from the beginning and end of a string. This is a UTF8-aware version of trim.

$str = UTF8::trim($str);

参数列表

参数 类型 描述 默认值
$str string Input string
$charlist string String of characters to remove null
返回值
  • string

UTF8::ltrim( $str , $charlist = null)

Strips whitespace (or other UTF-8 characters) from the beginning of a string. This is a UTF8-aware version of ltrim.

$str = UTF8::ltrim($str);

参数列表

参数 类型 描述 默认值
$str string Input string
$charlist string String of characters to remove null
返回值
  • string

UTF8::rtrim( $str , $charlist = null)

Strips whitespace (or other UTF-8 characters) from the end of a string. This is a UTF8-aware version of rtrim.

$str = UTF8::rtrim($str);

参数列表

参数 类型 描述 默认值
$str string Input string
$charlist string String of characters to remove null
返回值
  • string

UTF8::ord( $chr )

Returns the unicode ordinal for a character. This is a UTF8-aware version of ord.

$digit = UTF8::ord($character);

参数列表

参数 类型 描述 默认值
$chr string UTF-8 encoded character
返回值
  • integer

UTF8::to_unicode( $str )

Takes an UTF-8 string and returns an array of ints representing the Unicode characters. Astral planes are supported i.e. the ints in the output can be > 0xFFFF. Occurrences of the BOM are ignored. Surrogates are not allowed.

$array = UTF8::to_unicode($str);

The Original Code is Mozilla Communicator client code. The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. Ported to PHP by Henri Sivonen hsivonen@iki.fi, see http://hsivonen.iki.fi/php-utf8/ Slight modifications to fit with phputf8 library by Harry Fuecks hfuecks@gmail.com

参数列表

参数 类型 描述 默认值
$str string UTF-8 encoded string
返回值
  • array unicode code points
  • FALSE if the string is invalid

UTF8::from_unicode( $arr )

Takes an array of ints representing the Unicode characters and returns a UTF-8 string. Astral planes are supported i.e. the ints in the input can be > 0xFFFF. Occurrances of the BOM are ignored. Surrogates are not allowed.

$str = UTF8::to_unicode($array);

The Original Code is Mozilla Communicator client code. The Initial Developer of the Original Code is Netscape Communications Corporation. Portions created by the Initial Developer are Copyright (C) 1998 the Initial Developer. Ported to PHP by Henri Sivonen hsivonen@iki.fi, see http://hsivonen.iki.fi/php-utf8/ Slight modifications to fit with phputf8 library by Harry Fuecks hfuecks@gmail.com.

参数列表

参数 类型 描述 默认值
$arr array Unicode code points representing a string
返回值
  • string utf8 string of characters
  • boolean FALSE if a code point cannot be found