php - Need assistance with regexp and cyrillic -
I've got the following problem.

I have a regular expression that I did not write myself:

    "|(?!<.*?)\b$old_text\b(?![^<>]*?>)|s"

It works wonderfully for finding $old_text in $text, but if $old_text is, for example, "ОртоЦентр", it won't find it.

I'm fairly sure the \b boundaries refer to word characters as a regular expression sees them without the Cyrillic alphabet, so I tried to adapt it:

    \[wа-я]+$old_text\[wа-я]+

or

    \wа-я$old_text\wа-я

I also tried something using a Unicode range:

    |(?!<.*?)\x{0410}$old_text\x{042f}(?![^<>]*?>)|

I also tried the Cyrillic thing, which I'm sure I'm not using correctly:

    "|(?!<.*?)\b{cyrillic}$old_text\b{cyrillic}(?![^<>]*?>)|si"

Maybe I'm heading in the right direction? None of these work. Can some genius please assist me? Thanks in advance.
Update:

    "|(?!<.*?)\p{cyrillic}+\b$old_text\b(?![^<>]*?>)|si"

Update, here's the PHP code:

    $text = "bar foo <p> barfoo </p> foobar ОртоЦентр bar bar";
    $old_text = "ОртоЦентр";
    $new_text = '<a href="http://foo.bar">ОртоЦентр</a>';
    $limit = '-1';
    $replaced = preg_replace( "|(?!<.*?)(\p{cyrillic}+$old_text\b)(?![^<>]*?>)|si", $new_text, $text, $limit );
As I understand your question, you want to replace words such as ОртоЦентр or aaaОртоЦентрzzz with <a href="http://foo.bar">...</a>, where ... is the matching word.
From your initial regex it looks like this should be done "outside tags".
To work with Unicode you need to specify the u (PCRE_UTF8) modifier. Both the pattern and the input are then expected to be valid UTF-8. The next example also uses the i (caseless) modifier.
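To see the effect of the u modifier in isolation, here is a minimal sketch that runs the same \b pattern with and without it. It assumes the source file is saved as UTF-8 and that PHP runs under its default locale; the variable names are only illustrative. Without u the match is expected to fail, because the bytes of a Cyrillic letter are not treated as word characters.

```php
<?php
// Assumption: this file and the strings in it are encoded as UTF-8.
$text = "foobar ОртоЦентр bar";

// Without u the subject is handled byte by byte; the bytes that make up a
// Cyrillic letter are not word characters, so \b sees no boundary here.
$withoutU = preg_match('~\bОртоЦентр\b~', $text);

// With u, pattern and subject are treated as UTF-8 and \b is Unicode-aware.
$withU = preg_match('~\bОртоЦентр\b~u', $text);

var_dump($withoutU, $withU);
```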
I would skip tags with <[^>]*>(*SKIP)(*F), or (|) match the word with any amount (*) of \p{L} Unicode letters before and after it, \b\p{L}*word\p{L}*\b, and capture that. A sample pattern would be:

    ~<[^>]*>(*SKIP)(*F)|\b(\p{L}*ОртоЦентр\p{L}*)\b~ui

Test at regex101.com (see the explanation on the right side).
And a PHP sample with variables:

    $txt = "bar foo <p> barfoo </p> foobar aОртоЦентрz bar bar";
    $w = "ОртоЦентр";
    $s = '~<[^>]*>(*SKIP)(*F)|\b(\p{L}*'.preg_quote($w,'~').'\p{L}*)\b~ui';
    $r = '<a href="http://foo.bar">\1</a>';
    $replaced = preg_replace($s, $r, $txt);

Test at eval.in
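To check the "outside tags" behaviour, the same pattern can be wrapped in a small function and fed input where the word also appears inside a tag attribute. This is only a sketch: link_word and its parameters are hypothetical names, not part of the answer, and it assumes $href contains no $ or \ characters, since preg_replace would interpret those in the replacement string.

```php
<?php
// Hypothetical helper: wraps every word containing $word in a link and
// leaves anything inside <...> untouched.
function link_word(string $text, string $word, string $href): string
{
    // First alternative consumes a whole tag and discards it via (*SKIP)(*F);
    // second alternative captures the word plus any adjacent Unicode letters.
    $pattern = '~<[^>]*>(*SKIP)(*F)|\b(\p{L}*' . preg_quote($word, '~') . '\p{L}*)\b~ui';

    // Assumption: $href has no "$" or "\" that preg_replace would interpret.
    $replacement = '<a href="' . $href . '">$1</a>';

    // preg_replace returns null on error (e.g. invalid UTF-8); fall back to the input.
    return preg_replace($pattern, $replacement, $text) ?? $text;
}

$in = '<p title="ОртоЦентр">aОртоЦентрz</p>';
echo link_word($in, 'ОртоЦентр', 'http://foo.bar'), "\n";
```

The occurrence inside the title attribute should be left alone, while the text between the tags gets wrapped in the link.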
 php regex unicode 
 
  