php - Need assistance with regexp and cyrillic -
I've got the following problem.
I have a regular expression that I did not write myself:
"|(?!<.*?)\b$old_text\b(?![^<>]*?>)|s"
It finds $old_text in $text wonderfully,
but if $old_text is, for example,
"ОртоЦентр"
it won't find it.
I'm sure the \b boundaries are the problem, since they don't cover the Cyrillic alphabet in this regular expression, so I tried to adapt it:
\[wа-я]+$old_text\[wа-я]+
or
\wа-я$old_text\wа-я
I also tried something using a Unicode range:
|(?!<.*?)\x{0410}$old_text\x{042f}(?![^<>]*?>)|
I also tried the Cyrillic property, which I'm sure I'm not using correctly:
"|(?!<.*?)\b{cyrillic}$old_text\b{cyrillic}(?![^<>]*?>)|si"
Maybe this is closer to the right direction? None of them work. Can some genius please assist me? Thanks in advance.
Update:
"|(?!<.*?)\p{cyrillic}+\b$old_text\b(?![^<>]*?>)|si"
Update, here's the PHP code:
$text = "bar foo <p> barfoo </p> foobar ОртоЦентр bar bar";
$old_text = "ОртоЦентр";
$new_text = '<a href="http://foo.bar">ОртоЦентр</a>';
$limit = '-1';
$replaced = preg_replace( "|(?!<.*?)(\p{cyrillic}+$old_text\b)(?![^<>]*?>)|si", $new_text, $text, $limit );
As I understand the question, you want to replace words such as ОртоЦентр or
aaaОртоЦентрzzz
with
<a href="http://foo.bar">...</a>
where ... is the matching word.
From the initial regex it looks like this should be done "outside tags".
To work with Unicode you need to specify the u
(PCRE_UTF8) modifier. Both the pattern and the input are then expected to be valid UTF-8. The next example also uses the i
caseless modifier.
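To illustrate why the modifier matters, here is a minimal sketch comparing the same \b pattern with and without u. It assumes the script file and the strings are saved as UTF-8; the variable names are only for illustration, and the result without u can depend on the locale PHP runs under.

```php
<?php
// Assumption: this file and both strings are UTF-8 encoded.
$text = "foobar ОртоЦентр bar";
$word = "ОртоЦентр";

// Without u, PCRE works on single bytes. The bytes of a Cyrillic letter
// are normally not "word" characters, so \b finds no boundary next to them.
$withoutU = preg_match('~\b' . preg_quote($word, '~') . '\b~', $text);

// With u, pattern and subject are read as UTF-8 and PHP makes \b and \w
// Unicode-aware, so the Cyrillic word has proper boundaries.
$withUtf8 = preg_match('~\b' . preg_quote($word, '~') . '\b~u', $text);

// Adding i makes the match caseless for Cyrillic letters as well.
$caseless = preg_match('~\bортоцентр\b~ui', $text);

// Typically prints int(0), int(1), int(1).
var_dump($withoutU, $withUtf8, $caseless);
```

If preg_match returns false with u, the input is probably not valid UTF-8; preg_last_error() can confirm that.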
I would skip tags: <[^>]*>(*SKIP)(*F)
or |
match the word with any amount *
of \p{L}
Unicode letters before and after: \b\p{L}*word\p{L}*\b
and capture it. A sample pattern would be:
~<[^>]*>(*SKIP)(*F)|\b(\p{L}*ОртоЦентр\p{L}*)\b~ui
Test it at regex101.com (see the explanation on the right side).
And a PHP sample with variables:
$txt = "bar foo <p> barfoo </p> foobar aОртоЦентрz bar bar";
$w = "ОртоЦентр";
// Skip whole tags, otherwise capture the word with any letters around it.
$s = '~<[^>]*>(*SKIP)(*F)|\b(\p{L}*'.preg_quote($w,'~').'\p{L}*)\b~ui';
$r = '<a href="http://foo.bar">\1</a>';
$replaced = preg_replace($s, $r, $txt);
Test it at eval.in
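If you need this in more than one place, the same idea could be wrapped in a small function. This is only a sketch: linkWord and its parameters are hypothetical names, not part of the code above, and it uses preg_replace_callback so that a $ or \ in the URL cannot be mistaken for a backreference. The sample input shows that an occurrence inside a tag attribute is left alone.

```php
<?php
// Hypothetical helper (name and signature are assumptions for illustration):
// wraps every word containing $word in a link, skipping anything inside <...>.
function linkWord(string $text, string $word, string $url): string
{
    // Same pattern as in the answer; preg_quote escapes the word and the ~ delimiter.
    $pattern = '~<[^>]*>(*SKIP)(*F)|\b(\p{L}*' . preg_quote($word, '~') . '\p{L}*)\b~ui';
    $href = htmlspecialchars($url, ENT_QUOTES);

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($href): string {
            // $m[1] is the captured word including surrounding letters.
            return '<a href="' . $href . '">' . $m[1] . '</a>';
        },
        $text
    );

    // preg_replace_callback returns null on error (e.g. invalid UTF-8 input);
    // in that case this sketch simply returns the text unchanged.
    return $result ?? $text;
}

$html = '<p title="ОртоЦентр">aОртоЦентрz and foo</p>';
$linked = linkWord($html, 'ОртоЦентр', 'http://foo.bar');
echo $linked, "\n";
// prints: <p title="ОртоЦентр"><a href="http://foo.bar">aОртоЦентрz</a> and foo</p>
```

Note that this only treats <...> as a tag; it is not an HTML parser, so text inside script blocks or comments would still be matched.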
php regex unicode