Viewing 2 posts - 1 through 2 (of 2 total)
  • #9606
    Deipotent
    Participant

    Related to my original post about a regex for finding duplicate lines:

    http://www.emeditor.com/modules/newbb/viewtopic.php?topic_id=3&forum=3&post_id=5812#forumpost5812

    and my subsequent enhancement request about ignoring \r in a regex:

    http://www.emeditor.com/modules/newbb/viewtopic.php?topic_id=1811&post_id=5888&forum=4#forumpost5888

    it would appear there is a bug when “\r” appears in a regex and is followed by a regex operator like ? or * (and probably others).

    From playing around with it, it looks like EmEditor just removes the “\r”, so when there is a regex like

    ^(.*)(\r?\n\1)+$

    EmEditor silently strips out “\r”, leaving

    ^(.*)(?\n\1)+$

    which is incorrect.
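    To illustrate why the stripped pattern is a problem, here is a minimal sketch in Python. It assumes the stripping behaves like a plain textual removal of the two-character escape, and it uses Python's re module as a stand-in, which is not the engine EmEditor uses, so the exact error may differ. The helper name compiles is made up for this example.

```python
import re

original = r"^(.*)(\r?\n\1)+$"

# Assumption: model the stripping as a plain removal of the "\r" escape text.
stripped = original.replace(r"\r", "")


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(stripped)
print(compiles(original), compiles(stripped))
```

    In this stand-in the original pattern compiles and the stripped one is rejected, because the orphaned ? now sits directly after the opening parenthesis.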

    A few possible solutions involve:

    1) Checking whether the “\r” is followed by an operator that applies to the previous character (e.g. ? or *), and stripping that operator as well.

    2) Replacing the “\r” with “(\r)” seems to solve the problem, but that would then affect back references.

    3) If possible, telling the regex engine to ignore “\r”.

    Option (3) may be the best option if available, since it will probably handle all cases properly.
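    For what it's worth, option (1) could be sketched roughly as below. This is only an illustration in Python, not how EmEditor does or should implement it; strip_cr and _CR_WITH_QUANTIFIER are hypothetical names. It removes each “\r” escape together with a directly following quantifier, and leaves an escaped backslash followed by a literal r alone. It does not try to handle “\r” inside character classes such as [\r], which is one of the cases that makes this harder than it looks.

```python
import re

# Hypothetical helper pattern: a CR escape plus any quantifier attached to it.
_CR_WITH_QUANTIFIER = re.compile(
    r"""
    (?<!\\)((?:\\\\)*)            # even run of preceding backslashes, kept as group 1
    \\r                           # the CR escape itself
    (?:[?*+]|\{\d+(?:,\d*)?\})?   # optional quantifier: ? * + {n} {n,} {n,m}
    [?+]?                         # optional lazy or possessive suffix
    """,
    re.VERBOSE,
)


def strip_cr(pattern: str) -> str:
    """Remove CR escapes and the quantifiers that applied to them."""
    return _CR_WITH_QUANTIFIER.sub(r"\1", pattern)


print(strip_cr(r"^(.*)(\r?\n\1)+$"))
```

    With the duplicate-line regex this yields ^(.*)(\n\1)+$, which is the pattern one would have written by hand using “\n” only.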

    Either way, it’s not as easy as it would first seem.

    #9647
    Yutaka Emura
    Keymaster

    Hi Deipotent,

    As you wrote, EmEditor strips out ‘\r’ from regular expressions when you use Find. This is because a new line is always represented as ‘\n’, not ‘\r\n’, regardless of whether ‘\r’, ‘\n’, or ‘\r\n’ is used for a new line. This is currently the specified behavior, because in earlier versions many users were confused about whether they needed to specify ‘\r’ or ‘\r\n’ for a new line.

    When you specify a new line, please use ‘\n’, and not ‘\r’.
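    As a rough illustration of this convention outside EmEditor, the sketch below normalizes every line ending to ‘\n’ first and then runs the duplicate-line regex written with ‘\n’ only. It uses Python's re module, and the names normalize_newlines and duplicate_runs are made up for the example; it only models the behavior described above and is not EmEditor's actual implementation.

```python
import re

# Duplicate-line regex written with "\n" only, as recommended above.
DUPLICATES = re.compile(r"^(.*)(\n\1)+$", re.MULTILINE)


def normalize_newlines(text: str) -> str:
    """Map CRLF and lone CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def duplicate_runs(text: str) -> list[str]:
    """Return the content of each line that is repeated on consecutive lines."""
    return [m.group(1) for m in DUPLICATES.finditer(normalize_newlines(text))]


print(duplicate_runs("a\r\nb\r\nb\r\nc\r\n"))
```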

    Thank you!
