[ACCEPTED]-PHP Transliteration-transliteration
You can use iconv, which has a special transliteration encoding.
When the string "//TRANSLIT" is appended to tocode, transliteration is activated. This means that when a character cannot be represented in the target character set, it can be approximated through one or several characters that look similar to the original character.
-- http://www.gnu.org/software/libiconv/documentation/libiconv/iconv_open.3.html
See here for a complete example that matches your use case.
If you are using iconv, make sure your locale is set correctly before you attempt the transliteration; otherwise some characters will not be transliterated correctly:
setlocale(LC_CTYPE, 'en_US.UTF8');
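To make the two steps above concrete, here is a minimal sketch that sets the locale and then calls iconv with //TRANSLIT. The helper name to_ascii_iconv is made up for illustration, and the sketch assumes the en_US.UTF8 locale is installed; the exact output depends on the iconv implementation and locale of your system, so none is shown.

```php
<?php
// Hypothetical helper, not part of PHP or any library.
function to_ascii_iconv($text)
{
    // iconv's transliteration depends on LC_CTYPE, so set it first.
    // setlocale() returns false if the requested locale is not installed.
    setlocale(LC_CTYPE, 'en_US.UTF8');

    // Returns the converted string, or false if the conversion fails.
    // The @ suppresses the notice iconv() raises on failure.
    return @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
}

$result = to_ascii_iconv('Crème brûlée');
var_dump($result);
```

Always check the return value for false before using it, since some iconv implementations reject characters they cannot approximate.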
This will convert as many foreign characters as possible (including Cyrillic, Chinese, Arabic, etc.) to their A-z equivalents:
$AzString = transliterator_transliterate('Any-Latin;Latin-ASCII;', $foreignString);
You might need to install the PHP Intl extension first.
If you are stuck with a development and release environment that doesn't support PHP 5.4 or newer, you should use either iconv or a custom transliteration library.
As for iconv, I find it extremely unhelpful, especially on Arabic or Cyrillic alphabets. I would go for the Transliterator class built into PHP 5.4 or a custom transliteration class.
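One way to cope with both kinds of environment is to prefer the Intl function when it exists and fall back to iconv otherwise. The following is only a sketch under that assumption; the function name to_ascii is hypothetical, and type declarations are omitted so that it also parses on old PHP versions.

```php
<?php
// Hypothetical helper: prefers Intl, falls back to iconv, else returns the input.
function to_ascii($text)
{
    // Intl's transliterator is available from PHP 5.4 with the intl extension.
    if (function_exists('transliterator_transliterate')) {
        $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
        if ($result !== false) {
            return $result;
        }
    }

    // Fallback: iconv transliteration, which is locale dependent and less complete.
    if (function_exists('iconv')) {
        $result = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
        if ($result !== false) {
            return $result;
        }
    }

    // Nothing available or both failed: return the text unchanged.
    return $text;
}

echo to_ascii('abc'), "\n";
```

The two branches will not always agree on non-Latin input, so pick one path per project if you need stable output such as URL slugs.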
One of the solutions posted mentioned a custom library, which I did not test.
When I was using Drupal, I loved its transliteration module, so I recently ported it to make it usable without Drupal.
You can download it here and use as follows:
<?php
include "JTransliteration.php";
$mombojombotext = "誓曰:『時日害喪?予及女偕亡。』民欲與之偕亡,雖有";
$nonmombojombotex = JTransliteration::transliterate($mombojombotext);
echo $nonmombojombotex;
?>
Note: I'm reposting this from another similar question in the hope that it's helpful to others.
I ended up writing a PHP library based on URLify.js from the Django project, since I found iconv() to be too incomplete. You can find it here:
https://github.com/jbroadway/urlify
It handles Latin characters as well as Greek, Turkish, Russian, Ukrainian, Czech, Polish, and Latvian.
<?php
/**
* @author bulforce[]gmail.com # 2011
* Simple class to attempt transliteration of Bulgarian Latin text into Bulgarian Cyrillic text
*/
// Usage:
// $text = "yagoda i yundola";
// $tl = new Transliterate();
// echo $tl->lat_to_cyr($text); //ягода и юндола
class Transliterate {
private $cyr_identical = array("а", "б", "в", "в", "г", "д", "е", "ж", "з", "и", "к", "л", "м", "н", "о", "п", "р", "с", "т", "у", "ф", "х", "ц", "ъ", "я");
private $lat_identical = array("a", "b", "v", "w", "g", "d", "e", "j", "z", "i", "k", "l", "m", "n", "o", "p", "r", "s", "t", "u", "f", "h", "c", "y", "q");
private $cyr_fricative = array("ж", "ч", "ш", "щ", "ц", "я", "ю", "я", "ю");
private $lat_fricative = array("zh", "ch", "sh", "sht", "ts", "ia", "iu", "ya", "yu");
public function __construct() {
$this->identical_to_upper();
$this->fricative_to_variants();
}
public function lat_to_cyr($str) {
for ($i = 0; $i < count($this->cyr_fricative); $i++) {
$c_cyr = $this->cyr_fricative[$i];
$c_lat = $this->lat_fricative[$i];
$str = str_replace($c_lat, $c_cyr, $str);
}
for ($i = 0; $i < count($this->cyr_identical); $i++) {
$c_cyr = $this->cyr_identical[$i];
$c_lat = $this->lat_identical[$i];
$str = str_replace($c_lat, $c_cyr, $str);
}
return $str;
}
private function identical_to_upper() {
foreach ($this->cyr_identical as $k => $v) {
$this->cyr_identical[] = mb_strtoupper($v, 'UTF-8');
}
foreach ($this->lat_identical as $k => $v) {
$this->lat_identical[] = mb_strtoupper($v, 'UTF-8');
}
}
private function fricative_to_variants() {
foreach ($this->lat_fricative as $k => $v) {
// This handles all chars to Upper
$this->lat_fricative[] = mb_strtoupper($v, 'UTF-8');
$this->cyr_fricative[] = mb_strtoupper($this->cyr_fricative[$k], 'UTF-8');
// This handles variants
// TODO: fix the 3 letter sounds
// $v is a string, so use strlen(); count() on a string throws a TypeError in PHP 8
for ($i = 0; $i < strlen($v); $i++) {
$v[$i] = mb_strtoupper($v[$i], 'UTF-8');
$this->lat_fricative[] = $v;
if ($i == 0) {
$this->cyr_fricative[] = mb_strtoupper($this->cyr_fricative[$k], 'UTF-8');
} else {
$this->cyr_fricative[] = $this->cyr_fricative[$k];
}
$v[$i] = mb_strtolower($v[$i], 'UTF-8');
}
}
}
}
For Composer users, there is Slugify:
https://github.com/cocur/slugify
use Cocur\Slugify\Slugify;
$slugify = new Slugify();
echo $slugify->slugify('Hello World!'); // hello-world
//You can also change the separator used by Slugify:
echo $slugify->slugify('Hello World!', '_'); // hello_world
//The library also contains Cocur\Slugify\SlugifyInterface. Use this interface whenever you need to type hint an instance of Slugify.
//To add additional transliteration rules you can use the addRule() method.
$slugify->addRule('i', 'ey');
echo $slugify->slugify('Hi'); // hey
Try this one
function Unaccent( $string ) {
$transliterator = Transliterator::createFromRules(':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC;', Transliterator::FORWARD);
$normalized = $transliterator->transliterate($string);
return $normalized;
}
The problem with your query is that it is a very hard thing to do. Not all glyphs in most languages have a-z equivalents. All glyphs have phonetic equivalents (but these are words, not letters). If you are dealing only with Latin-based languages, things are a little easier, but you still have issues with things like I-mutation.
Your best solution would be to come up with a crude list of phonetic sounds -> a-z equivalents. It won't be perfect, but without any more information on your exact requirements it is hard to develop a solution.
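Such a crude list can be sketched with a plain array and strtr(). The mapping below is a made-up sample, not a complete or authoritative table, and the function name crude_translit is hypothetical; the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sample table: sound/glyph => a-z approximation. Extend as needed.
$map = array(
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
    'ж' => 'zh', 'ч' => 'ch', 'ш' => 'sh', 'щ' => 'sht',
);

function crude_translit($text, array $map)
{
    // strtr() with an array tries the longest keys first and never
    // rescans text it has already replaced, so entries cannot chain.
    return strtr($text, $map);
}

echo crude_translit('Größe', $map), "\n"; // Groesse
```

Characters that are missing from the table pass through unchanged, which makes gaps in the list easy to spot in the output.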
Nice libraries found at:
1) https://github.com/ashtokalo/php-translit (many languages, though some are missing)
2) https://github.com/fre5h/transliteration (only for Russian and Ukrainian)