Deipotent
Posted on: 8/25/2011 1:27 pm
Regex bug related to ignoring \r when regex operator appears after it
Related to my original post about a regex for finding duplicate lines:

http://www.emeditor.com/modules/newbb/viewtopic.php?topic_id=3&forum=3&post_id=5812#forumpost5812

and my subsequent enhancement request about ignoring \r in a regex:

http://www.emeditor.com/modules/newbb/viewtopic.php?topic_id=1811&post_id=5888&forum=4#forumpost5888

there appears to be a bug when "\r" in a regex is followed by a regex operator such as ? or * (and probably others).

From playing around with it, it looks like EmEditor just removes the "\r", so when there is a regex like

^(.*)(\r?\n\1)+$


EmEditor silently strips out "\r", leaving

^(.*)(?\n\1)+$


which is incorrect: the ? no longer has a preceding "\r" to apply to.
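To make the effect concrete, here is a minimal sketch in Python rather than in EmEditor itself. It assumes the behaviour I observed can be modelled as a plain textual removal of "\r" from the pattern; that is my guess at what happens, not a description of EmEditor's actual code. Python's `re` module stands in for the real regex engine, so the exact error may differ.

```python
import re

original = r"^(.*)(\r?\n\1)+$"

# The original pattern finds a line followed by one or more copies of itself,
# whether the line ending is \n or \r\n.
match = re.search(original, "foo\r\nfoo", re.MULTILINE)

# Assumption: model the observed behaviour as simply deleting the two
# characters backslash + r from the pattern text.
stripped = original.replace(r"\r", "")
print(stripped)  # ^(.*)(?\n\1)+$

# The leftover "?" now directly follows "(", so the pattern is no longer
# the one the user wrote. Python's engine rejects it outright.
try:
    re.compile(stripped)
    stripped_is_valid = True
except re.error:
    stripped_is_valid = False

print(stripped_is_valid)  # False
```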

A few possible solutions involve:

1) Checking whether the "\r" is followed by an operator that applies to the previous character (e.g. ? or *), and stripping that operator as well.

2) Replacing the "\r" with "(\r)" seems to solve the problem, but that would then affect back references.

3) If possible, tell the regex engine to ignore "\r".

Option (3) may be the best option if available, since it will probably handle all cases properly.
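Option (1) could be sketched roughly as follows, again in Python as a stand-in. The function name `strip_cr` and the set of operators it recognises (?, *, +, {n}, {n,}, {n,m}, with an optional lazy or possessive suffix) are my own assumptions for illustration. It does not treat character classes specially, so something like "[\r]" would still be mangled, which is part of why this is harder than it looks.

```python
import re

# Matches any escape pair (backslash + one character), optionally followed
# by a quantifier. Matching every escape pair, not just \r, means an escaped
# backslash followed by a literal "r" is not mistaken for \r.
_ESCAPE = re.compile(r"\\(.)((?:[?*+]|\{\d+(?:,\d*)?\})[?+]?)?", re.DOTALL)


def strip_cr(pattern: str) -> str:
    """Hypothetical helper: drop \\r together with any quantifier on it."""
    def repl(m: re.Match) -> str:
        # Remove the whole match only when the escaped character is "r";
        # every other escape (and its quantifier) is kept unchanged.
        return "" if m.group(1) == "r" else m.group(0)

    return _ESCAPE.sub(repl, pattern)


cleaned = strip_cr(r"^(.*)(\r?\n\1)+$")
print(cleaned)  # ^(.*)(\n\1)+$

# For option (2), a non-capturing group (where the engine supports one)
# keeps the quantifier attached without adding a numbered group.
grouped = re.compile(r"^(.*)((?:\r)?\n\1)+$")
print(grouped.groups)  # 2
```

With the quantifier removed along with the "\r", the pattern stays valid and still finds duplicate lines in text that uses bare "\n" line endings.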

Either way, it's not as easy as it would first seem.