Portable ASCII is written in PHP (PHP 7+) and works without "mbstring", "iconv", or any other extra encoding PHP extension on your server.
The benefit of Portable ASCII is that it is easy to use and easy to bundle.
The project is based on ...
If you prefer a more object-oriented way to edit strings, take a look at voku/Stringy. It is a fork of "danielstjules/Stringy" that uses the "Portable ASCII" class and adds some extra methods.
// Portable ASCII
use voku\helper\ASCII;
ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii'
// voku/Stringy
use Stringy\Stringy as S;
$stringy = S::create('déjà σσς iıii');
$stringy->toTransliterate(); // 'deja sss iiii'
composer require voku/portable-ascii
I need ASCII character handling in several classes. Previously I added these functions to "Portable UTF-8", but this repo is more modular and portable because it has no dependencies.
Example: ASCII::to_ascii()
echo ASCII::to_ascii('�Düsseldorf�', 'de');
// will output
// Duesseldorf
echo ASCII::to_ascii('�Düsseldorf�', 'en');
// will output
// Dusseldorf
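The language-dependent result above can be pictured as a per-language lookup table followed by a cleanup pass. The following is only a minimal sketch of that idea, not the library's implementation: `sketch_to_ascii()` and the tiny `SKETCH_MAPS` table are hypothetical names invented for illustration, and the real replacement tables are far larger.

```php
<?php
// Hypothetical, heavily reduced mapping tables (not the library's data).
const SKETCH_MAPS = [
    'de' => ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss'],
    'en' => ['ä' => 'a', 'ö' => 'o', 'ü' => 'u', 'ß' => 'ss'],
];

function sketch_to_ascii(string $str, string $language = 'en'): string
{
    // Fall back to the 'en' table for languages this sketch does not know.
    $map = SKETCH_MAPS[$language] ?? SKETCH_MAPS['en'];

    // strtr() works on bytes, so no "mbstring" or "iconv" is needed here.
    $str = strtr($str, $map);

    // Remove every remaining byte outside the 7-bit ASCII range.
    return (string) preg_replace('/[^\x00-\x7F]/', '', $str);
}

echo sketch_to_ascii('Düsseldorf', 'de'), "\n"; // Duesseldorf
echo sketch_to_ascii('Düsseldorf', 'en'), "\n"; // Dusseldorf
```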
The API of the "ASCII" class consists of small static methods.
↑ Returns a replacement array for ASCII methods.
EXAMPLE:
$array = ASCII::charsArray();
var_dump($array['ru']['б']); // 'b'
Parameters:
bool $replace_extra_symbols [optional] <p>Add some more replacements e.g. "£" with " pound ".</p>
Return:
array
↑ Returns a replacement array for ASCII methods with a mix of multiple languages.
EXAMPLE:
$array = ASCII::charsArrayWithMultiLanguageValues();
var_dump($array['b']); // ['β', 'б', 'ဗ', 'ბ', 'ب']
Parameters:
bool $replace_extra_symbols [optional] <p>Add some more replacements e.g. "£" with " pound ".</p>
Return:
array <p>An array of replacements.</p>
↑ Returns a replacement array for ASCII methods with one language.
For example, German will map 'ä' to 'ae', while other languages will simply return e.g. 'a'.
EXAMPLE:
$array = ASCII::charsArrayWithOneLanguage('ru');
$tmpKey = \array_search('yo', $array['replace']);
echo $array['orig'][$tmpKey]; // 'ё'
Parameters:
string $language [optional] <p>Language of the source string e.g.: en, de_at, or de-ch.
(default is 'en') | ASCII::*_LANGUAGE_CODE</p>
bool $replace_extra_symbols [optional] <p>Add some more replacements e.g. "£" with " pound ".</p>
bool $asOrigReplaceArray [optional] <p>TRUE === return {orig: string[], replace: string[]}
array</p>
Return:
array <p>An array of replacements.</p>
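To show how a table in the `{orig: string[], replace: string[]}` shape might be consumed, here is a small sketch that uses only PHP built-ins. The two-entry `$table` and the helper names `sketch_apply_table()` and `sketch_lookup_original()` are hypothetical and chosen for illustration; a real table returned by the library contains many more entries.

```php
<?php
// Hypothetical two-entry table in the {orig, replace} shape.
$table = [
    'orig'    => ['ё', 'б'],
    'replace' => ['yo', 'b'],
];

function sketch_apply_table(array $table, string $str): string
{
    // Build a "character => replacement" map and apply it in a single pass.
    return strtr($str, array_combine($table['orig'], $table['replace']));
}

function sketch_lookup_original(array $table, string $replacement): ?string
{
    // Reverse lookup: find the original character for a given replacement.
    $key = array_search($replacement, $table['replace'], true);

    return $key === false ? null : $table['orig'][$key];
}

echo sketch_apply_table($table, 'ёб'), "\n";     // yob
echo sketch_lookup_original($table, 'yo'), "\n"; // ё
```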
↑ Returns a replacement array for ASCII methods with multiple languages.
EXAMPLE:
$array = ASCII::charsArrayWithSingleLanguageValues();
$tmpKey = \array_search('hnaik', $array['replace']);
echo $array['orig'][$tmpKey]; // '၌'
Parameters:
bool $replace_extra_symbols [optional] <p>Add some more replacements e.g. "£" with " pound ".</p>
bool $asOrigReplaceArray [optional] <p>TRUE === return {orig: string[], replace: string[]}
array</p>
Return:
array <p>An array of replacements.</p>
↑ Accepts a string and removes all non-UTF-8 characters from it, plus extras if needed.
Parameters:
string $str <p>The string to be sanitized.</p>
bool $normalize_whitespace [optional] <p>Set to true, if you need to normalize the
whitespace.</p>
bool $keep_non_breaking_space [optional] <p>Set to true, to keep non-breaking-spaces, in
combination with
$normalize_whitespace</p>
bool $normalize_msword [optional] <p>Set to true, if you need to normalize MS Word chars
e.g.: "…"
=> "..."</p>
bool $remove_invisible_characters [optional] <p>Set to false, if you do not want to remove invisible
characters e.g.: "\0"</p>
Return:
string <p>A clean UTF-8 string.</p>
↑ Get all languages from the constants "ASCII::.*LANGUAGE_CODE".
Parameters: nothing
Return:
string[]
↑ Checks if a string is 7 bit ASCII.
EXAMPLE:
ASCII::is_ascii('白'); // false
Parameters:
string $str <p>The string to check.</p>
Return:
bool <p>
<strong>true</strong> if it is ASCII<br>
<strong>false</strong> otherwise
</p>
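A 7-bit ASCII check can be expressed without any encoding extension by looking for bytes at or above 0x80. This is a minimal sketch of that technique; `sketch_is_ascii()` is a hypothetical name and not necessarily how the library implements the check.

```php
<?php
function sketch_is_ascii(string $str): bool
{
    // Any byte >= 0x80 means the string is not pure 7-bit ASCII.
    // An empty string contains no such byte and therefore counts as ASCII.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(sketch_is_ascii('abc')); // bool(true)
var_dump(sketch_is_ascii('白'));  // bool(false)
```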
↑ Returns a string with smart quotes, ellipsis characters, and dashes from Windows-1252 (commonly used in Word documents) replaced by their ASCII equivalents.
EXAMPLE:
ASCII::normalize_msword('„Abcdef…”'); // '"Abcdef..."'
Parameters:
string $str <p>The string to be normalized.</p>
Return:
string <p>A string with normalized characters for commonly used chars in Word documents.</p>
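The replacement of typographic characters can be sketched as a plain lookup table. The table below is a hypothetical subset chosen for illustration (the library's own list may differ), and `sketch_normalize_msword()` is not part of the library. The `\u{...}` escapes require PHP 7.0 or later.

```php
<?php
function sketch_normalize_msword(string $str): string
{
    // Hypothetical subset of typographic characters and their ASCII stand-ins.
    return strtr($str, [
        "\u{201E}" => '"',   // double low-9 quotation mark
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
    ]);
}

echo sketch_normalize_msword("\u{201E}Abcdef\u{2026}\u{201D}"), "\n"; // "Abcdef..."
```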
↑ Normalize the whitespace.
EXAMPLE:
ASCII::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"
Parameters:
string $str <p>The string to be normalized.</p>
bool $keepNonBreakingSpace [optional] <p>Set to true, to keep non-breaking-spaces.</p>
bool $keepBidiUnicodeControls [optional] <p>Set to true, to keep non-printable (for the web)
bidirectional text chars.</p>
bool $normalize_control_characters [optional] <p>Set to true, to replace e.g. LINE- and PARAGRAPH-SEPARATOR with "\n" and LINE TABULATION with "\t".</p>
Return:
string <p>A string with normalized whitespace.</p>
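The behaviour in the example above (special spaces become a plain space, bidirectional control characters disappear, and the non-breaking space is optionally kept) can be sketched as follows. The character lists are a small hypothetical selection and `sketch_normalize_whitespace()` is an illustrative name, so treat this as an approximation of the idea rather than the library's code.

```php
<?php
function sketch_normalize_whitespace(string $str, bool $keepNonBreakingSpace = false): string
{
    // Hypothetical subset of Unicode space characters.
    $whitespace = ["\u{2002}", "\u{2003}", "\u{2009}", "\u{202F}", "\u{3000}"];
    if (!$keepNonBreakingSpace) {
        $whitespace[] = "\u{00A0}"; // NO-BREAK SPACE
    }
    $str = str_replace($whitespace, ' ', $str);

    // Hypothetical subset of bidirectional control characters, removed entirely.
    $bidi = ["\u{200E}", "\u{200F}", "\u{202A}", "\u{202B}", "\u{202C}", "\u{202D}", "\u{202E}"];

    return str_replace($bidi, '', $str);
}

$input = "abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC";
var_dump(sketch_normalize_whitespace($input, true) === "abc-\xc2\xa0-öäü- -"); // bool(true)
```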
↑ Remove invisible characters from a string.
For example, this prevents null characters from being sandwiched between ASCII characters, as in Java\0script.
Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php
Parameters:
string $str
bool $url_encoded
string $replacement
bool $keep_basic_control_characters
Return:
string
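A sketch in the spirit of the CodeIgniter helper referenced above: control characters (and, optionally, their URL-encoded forms) are stripped repeatedly until nothing changes, so that a removal cannot expose a new match. The name `sketch_remove_invisible()` is hypothetical, and this version always replaces with an empty string, which is a simplification compared with the parameter list above.

```php
<?php
function sketch_remove_invisible(string $str, bool $urlEncoded = true): string
{
    $patterns = [];
    if ($urlEncoded) {
        $patterns[] = '/%0[0-8bcef]/i'; // URL-encoded 00-08, 11, 12, 14, 15
        $patterns[] = '/%1[0-9a-f]/i';  // URL-encoded 16-31
        $patterns[] = '/%7f/i';         // URL-encoded 127
    }
    // Raw control characters, keeping tab, line feed and carriage return.
    $patterns[] = '/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/';

    // Repeat until a pass removes nothing: deleting one sequence can join
    // the surrounding characters into a new one, e.g. "%0%000" -> "%00".
    do {
        $str = (string) preg_replace($patterns, '', $str, -1, $count);
    } while ($count > 0);

    return $str;
}

echo sketch_remove_invisible("Java\0script"), "\n";     // Javascript
echo sketch_remove_invisible('Java%0%000script'), "\n"; // Javascript
```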
↑ Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed by default. The language or locale of the source string can be supplied for language-specific transliteration in any of the following formats: en, en_GB, or en-GB. For example, passing "de" results in "äöü" mapping to "aeoeue" rather than "aou" as in other languages.
EXAMPLE:
ASCII::to_ascii('�Düsseldorf�', 'en'); // Dusseldorf
Parameters:
string $str <p>The input string.</p>
string $language [optional] <p>Language of the source string.
(default is 'en') | ASCII::*_LANGUAGE_CODE</p>
bool $remove_unsupported_chars [optional] <p>Whether or not to remove the
unsupported characters.</p>
bool $replace_extra_symbols [optional] <p>Add some more replacements e.g. "£" with " pound
".</p>
bool $use_transliterate [optional] <p>Use ASCII::to_transliterate() for unknown chars.</p>
bool|null $replace_single_chars_only [optional] <p>Single char replacement is better for the
performance, but some languages need to replace more than one char
at the same time. | NULL === auto-setting, depending on the
language</p>
Return:
string <p>A string that contains only ASCII characters.</p>
↑ Convert given string to safe filename (and keep string case).
EXAMPLE:
ASCII::to_filename('שדגשדג.png', true); // 'shdgshdg.png'
Parameters:
string $str
bool $use_transliterate <p>ASCII::to_transliterate() is used by default - unsafe characters are
simply replaced with hyphen otherwise.</p>
string $fallback_char
Return:
string <p>A string that contains only safe characters for a filename.</p>
↑ Converts the string into a URL slug. This includes replacing non-ASCII characters with their closest ASCII equivalents, removing remaining non-ASCII and non-alphanumeric characters, and replacing whitespace with $separator. The separator defaults to a single dash, and the string is also converted to lowercase. The language of the source string can also be supplied for language-specific transliteration.
Parameters:
string $str
string $separator [optional] <p>The string used to replace whitespace.</p>
string $language [optional] <p>Language of the source string.
(default is 'en') | ASCII::*_LANGUAGE_CODE</p>
array<string, string> $replacements [optional] <p>A map of replaceable strings.</p>
bool $replace_extra_symbols [optional] <p>Add some more replacements e.g. "£" with "
pound ".</p>
bool $use_str_to_lower [optional] <p>Use "string to lower" for the input.</p>
bool $use_transliterate [optional] <p>Use ASCII::to_transliterate() for unknown
chars.</p>
Return:
string <p>A string that has been converted to a URL slug.</p>
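The steps described for the slug conversion (transliterate, lowercase, drop unsupported characters, replace whitespace with the separator) can be sketched with PHP built-ins alone. `sketch_slugify()` and its five-entry map are hypothetical; the sketch also assumes a single-character separator and ignores the language and extra-symbol options.

```php
<?php
function sketch_slugify(string $str, string $separator = '-'): string
{
    // 1) Transliterate with a tiny, hypothetical lowercase-only map.
    $str = strtr($str, ['ä' => 'a', 'ö' => 'o', 'ü' => 'u', 'é' => 'e', 'à' => 'a']);

    // 2) Lowercase the ASCII letters.
    $str = strtolower($str);

    // 3) Remove everything except ASCII letters, digits, whitespace and the separator.
    $sep = preg_quote($separator, '/');
    $str = (string) preg_replace('/[^a-z0-9\s' . $sep . ']+/', '', $str);

    // 4) Collapse runs of whitespace and separators into one separator.
    $str = (string) preg_replace('/[\s' . $sep . ']+/', $separator, $str);

    // 5) Trim separators from both ends.
    return trim($str, $separator);
}

echo sketch_slugify('Déjà vu, Düsseldorf!'), "\n";      // deja-vu-dusseldorf
echo sketch_slugify('Déjà vu, Düsseldorf!', '_'), "\n"; // deja_vu_dusseldorf
```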
↑ Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed unless instructed otherwise.
EXAMPLE:
ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii'
Parameters:
string $str <p>The input string.</p>
string|null $unknown [optional] <p>Character to use if a character is unknown. (default is '?')
But you can also use NULL to keep the unknown chars.</p>
bool $strict [optional] <p>Use "transliterator_transliterate()" from PHP-Intl</p>
Return:
string <p>A string that contains only ASCII characters.</p>
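The role of the `$unknown` parameter can be illustrated with a short sketch: known characters are mapped, and every remaining non-ASCII character is either replaced by `$unknown` or kept when `$unknown` is NULL. `sketch_to_transliterate()` and the five-entry `SKETCH_TRANSLIT_MAP` are hypothetical, and the nullable type hint assumes PHP 7.1 or later.

```php
<?php
// Hypothetical, tiny transliteration table (not the library's data).
const SKETCH_TRANSLIT_MAP = ['é' => 'e', 'à' => 'a', 'σ' => 's', 'ς' => 's', 'ı' => 'i'];

function sketch_to_transliterate(string $str, ?string $unknown = '?'): string
{
    $str = strtr($str, SKETCH_TRANSLIT_MAP);

    // NULL means: keep characters that have no known replacement.
    if ($unknown === null) {
        return $str;
    }

    // With the /u modifier each match is one whole code point, not one byte.
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',
        static function (array $match) use ($unknown): string {
            return $unknown;
        },
        $str
    );

    // preg_replace_callback() returns NULL on invalid UTF-8; this sketch
    // then simply returns the partially mapped string.
    return $result ?? $str;
}

echo sketch_to_transliterate('déjà σσς iıii'), "\n"; // deja sss iiii
echo sketch_to_transliterate('a白b'), "\n";          // a?b
echo sketch_to_transliterate('a白b', null), "\n";    // a白b
```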
1) Composer is a prerequisite for running the tests.
composer install
2) The tests can be executed by running this command from the root directory:
./vendor/bin/phpunit
For support and donations please visit GitHub | Issues | PayPal | Patreon.
For status updates and release announcements please visit Releases | Twitter | Patreon.
For professional support please contact me.
Released under the MIT License - see LICENSE.txt for details.