#5071
Yutaka Emura
Keymaster

jugaor wrote:
Hi, I tried several versions (5 up to 7 beta) and found the following ‘bugs’ in both manual and script searches (Spanish texts):

a por (eeFindReplaceOnlyWord)
matches “creería por”, “CAMPAÑA POR”, etc. (i.e., it breaks the words at the accented vowels or “Ñ”/“ñ”)

any accented vowel (eeFindReplaceOnlyWord)
matches “diseñé”, “ENSEÑÓ”, etc. (i.e., it breaks words ending in an accented vowel at the “Ñ”/“ñ”)

In manual searches (with an open document), it matches all the accented vowels inside words despite “Search Only Word” (i.e. it matches “cómprale”, “mamá”, “después”, etc.)

(?!es |son )esta(s?)(!|?)
ignores the first negative subexpression (i.e., it matches “esta!” / “esta?” / “estas!” / “estas?”), even though I use the ‘eeFindReplaceRegExp Or eeFindReplaceOnlyWord’ options

If I simplify the expression
(?!es) esta(!|?)
(?!es )esta(!|?)
or
(?!son) estas(!|?)
(?!son )estas(!|?)

it has the same behavior. However,
(¡|¿)esta(s?)(?! es| son)
correctly excludes the intended cases.

If you need more information, please email me.
TIA.
jugaor

As for your first question, previous versions of EmEditor did not check Unicode characters (character code > U+0080) when matching whole words, for the sake of speed. However, I will add a routine to check some Latin characters (ch >= 0x00c0 && ch <= 0x02b8) in the next beta version. This addition will not cover all Unicode characters, but it should still improve "whole word" accuracy in most cases without sacrificing much speed.
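To illustrate the idea, here is a minimal sketch in Python of a whole-word check that treats the quoted range as word characters. It is not EmEditor's actual code: the function names `is_word_char` and `find_whole_word` are hypothetical, the search is case-sensitive, and the text is assumed to use precomposed characters (for example "í" as the single code point U+00ED).

```python
def is_word_char(ch, extended=True):
    """Return True if ch counts as part of a word.

    ASCII letters, digits and underscore always count. With extended=True,
    code points U+00C0..U+02B8 (the range quoted above) also count.
    """
    if ch.isascii():
        return ch.isalnum() or ch == "_"
    return extended and 0x00C0 <= ord(ch) <= 0x02B8


def find_whole_word(text, word, extended=True):
    """Return start offsets where word occurs with no word char on either side."""
    hits = []
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        before_ok = start == 0 or not is_word_char(text[start - 1], extended)
        after_ok = end == len(text) or not is_word_char(text[end], extended)
        if before_ok and after_ok:
            hits.append(start)
        start = text.find(word, start + 1)
    return hits


# ASCII-only boundaries: the "í" before "a por" is not seen as a letter,
# so "creería por" is reported as a whole-word hit.
ascii_only = find_whole_word("creería por", "a por", extended=False)

# Extended range: "í" is a word character, so the hit is rejected.
with_latin = find_whole_word("creería por", "a por", extended=True)
```

The range check is a single pair of comparisons per boundary character, which is why it costs little. Note that the range also includes a few non-letters such as "×" (U+00D7) and "÷" (U+00F7), so it is an approximation rather than a full Unicode letter test.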

I am not sure I understand your second question, but there are two unnecessary spaces in your regular expression: (?!es |son )esta(s?)(!|?)

One between “s” and “|”, and the other between “n” and “)”.

Does removing these spaces solve your issue?
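
For comparison, the following sketch shows how these patterns behave in Python's `re` module. This is only a stand-in: EmEditor's regular expression engine may differ in details, the sample strings are made up, and the "?" in the final group has to be escaped as `\?` for Python. The lookbehind variant is an assumption about what the expression may be intended to do, not a confirmed fix.

```python
import re

samples = ["es esta!", "son estas?", "mira esta!"]

# Negative lookahead placed before "esta": it tests the text that FOLLOWS
# the current position. At a match that text is "esta...", which is never
# "es " or "son ", so the lookahead always passes.
lookahead = re.compile(r"(?!es |son )esta(s?)(!|\?)")

# Same pattern without the spaces: "esta" itself begins with "es",
# so the lookahead fails wherever "esta" starts.
no_space = re.compile(r"(?!es|son)esta(s?)(!|\?)")

# Negative lookbehind: tests the text that PRECEDES the current position.
# Two separate assertions are used because Python requires each
# lookbehind to have a fixed width.
lookbehind = re.compile(r"(?<!es )(?<!son )esta(s?)(!|\?)")


def matched(pattern, texts):
    """Return the texts in which the pattern matches somewhere."""
    return [t for t in texts if pattern.search(t)]


lookahead_hits = matched(lookahead, samples)    # all three samples
no_space_hits = matched(no_space, samples)      # none
lookbehind_hits = matched(lookbehind, samples)  # only "mira esta!"
```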