CakePHP encoding problem: storing an uppercase S with a caron (Š) saves fine in the database but causes errors when processed by Cake


I'm working on a site that stores information about cuneiform tablets. We use special characters to transliterate Semitic languages.

In my script, I create a list of terms from the transliteration of each tablet.

My problem is with Š: the script creates two different terms because it thinks there is a space in the word, because of the way Cake treats the special character.

Example:

Partial contents of a tablet:

  1. utu-diŠ-nu-il2

Terms for the tablet after being processed by the script:

utu-diŠ, -nu-il2

It should be:

utu-diŠ-nu-il2

When I print the contents of the array during processing, I see:

  1. utu-di� -nu-il2

So it seems that incorrect parsing of the text creates a space, which the script then interprets as two words instead of one.
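One plausible explanation, offered as an assumption rather than a confirmed diagnosis: in UTF-8, Š (U+0160) is stored as the two bytes 0xC5 0xA0, and 0xA0 on its own is the non-breaking space in Latin-1. If a byte-oriented whitespace replacement treats 0xA0 as whitespace, the character gets cut in half. The sketch below simulates that with `str_replace()` rather than `preg_replace()`, because whether `\s` matches 0xA0 without the `/u` modifier depends on the PCRE build and locale.

```php
<?php
// Š written as explicit UTF-8 bytes so the example does not depend on file encoding.
$term = "utu-di\xC5\xA0-nu-il2";

// Show the two bytes that make up Š.
echo bin2hex("\xC5\xA0"), "\n";

// Simulation (assumption): a byte-level replace that treats 0xA0 as whitespace.
$broken = str_replace("\xA0", ' ', $term);

// Splitting on spaces now yields two terms; the first ends in a lone 0xC5 byte.
$pieces = explode(' ', $broken);
foreach ($pieces as $piece) {
    echo bin2hex($piece), "\n";
}
```

The lone 0xC5 left at the end of the first piece would be consistent with both the black lozenge in the output and the `'\xC5'` reported by MySQL.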

In the database, the text is fine...

I get these errors:

Warning (512): SQL Error: 1366: Incorrect string value: '\xC5' for column 'term' at row 1 [CORE\cake\libs\model\datasources\dbo_source.php, line 684]

Query: INSERT INTO terms (term, lft, rght) VALUES ('utu-di�', 449, 450)

Query: INSERT INTO terms (term, lft, rght) VALUES ('a�', 449, 450)

Query: INSERT INTO terms (term, lft, rght) VALUES ('xdi�', 449, 450)
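MySQL rejects these values because each term ends in an incomplete UTF-8 sequence. As a rough sketch, such terms could be caught before they reach the database. `partitionByUtf8Validity()` is a hypothetical helper written for this illustration, not part of CakePHP, and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: separate valid UTF-8 terms from broken ones.
function partitionByUtf8Validity(array $terms)
{
    $valid = array();
    $invalid = array();
    foreach ($terms as $term) {
        if (mb_check_encoding($term, 'UTF-8')) {
            $valid[] = $term;
        } else {
            // Store the hex dump so the offending bytes are visible in logs.
            $invalid[] = bin2hex($term);
        }
    }
    return array($valid, $invalid);
}

// Sample input: one truncated term, one plain ASCII term, one intact term with Š.
$sample = array("utu-di\xC5", "-nu-il2", "a\xC5\xA0");
list($valid, $invalid) = partitionByUtf8Validity($sample);

print_r($valid);
print_r($invalid);
```

This only detects the damage; it does not repair it. The fix still has to happen where the string is split.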

Does anybody know how to make this work?

Thanks!

Added info:

    $terms = $this->data['tablet']['translit'];
    // Double quotes are required so PHP interprets \r, \n and \t as control characters.
    $terms = str_replace(array("\r\n", "\r", "\n", "\n\r", "\t"), ' ', $terms);
    $terms = trim($terms, chr(173));
    print_r($terms);
    $terms = preg_replace('/\s+/', ' ', $terms);
    $terms = explode(' ', $terms);
    $terms = array_map('trim', $terms);
    // Tokens that must not be stored as terms (line numbers, markers, placeholders).
    $anti_terms = array(
        '@tablet', '1.', '2.', '3.', '4.', '5.', '6.', '7.', '8.', '9.', '10.',
        '11.', '12.', '13.', '14.', '15.', '16.', '17.', '18.', '19.', '20.',
        'rev.', 'obv.', '@obverse', '@reverse',
        'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9',
        "\r", "\n", "\r\n", "\t", '', ' ', null, chr(173), 'x', '[x]', '[...]'
    );
    foreach ($terms as $key => $term) {
        if (in_array($term, $anti_terms) || is_numeric($term)) {
            unset($terms[$key]);
        }
    }

If I put print_r before the preg_replace call, the Š is good; if I put it after, it displays as a black lozenge. So I guess the preg function is the problem!


I just found this: http://www.php.net/manual/fr/function.preg-replace.php#84385

But it seems that mb_ereg_replace() causes the same problem as preg_replace()...


Solution:

    mb_internal_encoding("UTF-8");
    mb_regex_encoding("UTF-8");
    $terms = mb_ereg_replace('\s+', ' ', $terms);

And the error is gone!
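An alternative that is not what I used above, but which should address the same issue, is the `/u` pattern modifier. It makes the preg functions treat both the pattern and the subject as UTF-8, so the bytes of Š are read as one character and never as whitespace. A minimal sketch, with made-up sample input and Š written as explicit bytes:

```php
<?php
// Sample transliteration text containing Š plus mixed whitespace.
$terms = "utu-di\xC5\xA0-nu-il2\r\n  a\xC5\xA0\txdi\xC5\xA0";

// /u puts PCRE in UTF-8 mode; PREG_SPLIT_NO_EMPTY drops empty pieces.
$list = preg_split('/\s+/u', trim($terms), -1, PREG_SPLIT_NO_EMPTY);

print_r($list);
```

Note that with `/u` the preg functions fail (`preg_split()` returns false) if the subject is not valid UTF-8, so it is worth checking the return value.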
