- June 26, 2025 at 4:36 am #30329
Patrick C
Participant
I’ve posted this before, but it may have been overlooked because the context was mixed up.
The syntax highlighting issue arises when a regex uses quantifiers, i.e. * or + or {3,} etc.
Example:
Two regex matches are active; all string and comment matches are disabled.
″.*?″
regex match 1/2
%%.*$
regex match 2/2
Issue:
EmEditor stops matching ″.*?″ as soon as it encounters %%.*$, even though the regex match for ″.*?″ has not yet been completed.
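For comparison, here is a minimal Python sketch of the behaviour being asked for: a single left-to-right pass in which the match that starts first is allowed to run to completion before any other rule is tried. This is only an illustration and not a description of how EmEditor works internally. The rule names and the `highlight` helper are hypothetical, and ASCII double quotes stand in for the quote characters used above.

```python
import re

# Hypothetical rule names; the two patterns mirror the example above.
RULES = [
    ("string", re.compile(r'".*?"')),
    ("comment", re.compile(r'%%.*$', re.MULTILINE)),
]

def highlight(text):
    """Return (rule_name, matched_text) pairs; the earliest match wins."""
    spans = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in RULES:
            m = pattern.search(text, pos)
            # Keep the match that starts first; ties go to the earlier rule.
            if m and (best is None or m.start() < best[1].start()):
                best = (name, m)
        if best is None:
            break
        name, m = best
        spans.append((name, m.group()))
        # Resume after the completed match so nothing can start inside it.
        pos = max(m.end(), pos + 1)
    return spans

print(highlight('x = "50 %% done" %% trailing comment'))
```

With this scheme the `%%` inside the quoted text stays part of the string, and only the trailing `%% trailing comment` is picked up by the second rule.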
Would this be difficult to fix?
Because it is a serious limitation. One could write significantly more accurate syntax highlighters than what is currently possible.
June 26, 2025 at 4:07 pm #30330
Yutaka Emura
Keymaster
Your observation is correct: highlighting with regex allows overlapping.
June 28, 2025 at 9:34 am #30338
Patrick C
Participant
Is there any way to work around this?
Because I often fall foul of it.
June 29, 2025 at 9:57 am #30339
Yutaka Emura
Keymaster
If %% always appears at the beginning of a line, you can use:
^%%.*$
Otherwise, I don’t think there’s a good way to write a regex for this case. I’ll consider adding an option to control how regex is applied in situations like this.
June 30, 2025 at 12:18 am #30340
Patrick C
Participant
The %% is just for the sake of illustration.
I got the idea of using regular expressions for highlighting from other highlighters.
In the case of highlighting JavaScript, these are:
[1] https://github.com/pygments/pygments/blob/master/pygments/lexers/javascript.py
[2] https://github.com/speed-highlight/core/blob/main/src/languages/js.js
and several others.
To highlight JavaScript regex literals, /…/, one can, for example, use (from [1]):
\/((?!\/)[^\r\n\\]|\\.)+\/[dgimsuy]*
And to highlight template literals, ‵…‵:
‵(?:(?!‵|${).)*?(?:‵|\${)
}(?:(?!‵|${).)*?(?:‵|\${))
The problem is that these two interfere:
I’ve written several variable length regex based highlighters for Python, Javascript, PowerShell and more. These work well, but only up to the point where they don’t overlap with another variable length regex highlight definition.
“I’ll consider adding an option to control how regex is applied in situations like this.”
This would be awesome 😃.
I realise that I’m just one customer, so please first focus on what’s most important for EmEditor rather than my request. Should you find the time, then I’ll greatly appreciate the effort.
Thank you, Yutaka!
July 30, 2025 at 9:35 pm #30348
Yutaka Emura
Keymaster
EmEditor v25.3 preview 1 (25.2.901) introduces experimental support for using special keywords in regex highlight strings to set additional conditions.
– (?#_text_c==n) applies the highlight only if the text at the start of the match is already using color ‘n’.
– (?#_text_c!=n) applies the highlight only if the text at the start of the match is *not* using color ‘n’.
Here, n is an integer representing a specific color, as defined in plugin.h. For example:
#define SMART_COLOR_NORMAL 0
#define SMART_COLOR_HILITE_4 17
// ...and so on
To set a condition, one of these keywords must appear at the very beginning of your highlight string.
For instance, instead of just writing %%.*$, you could use
(?#_text_c!=17)%%.*$
to highlight lines starting with %% only if they don’t already use the color corresponding to n = 17 (SMART_COLOR_HILITE_4).
Hope this helps clarify how to use this new feature!
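To make the condition concrete, here is a rough Python model of it. It is a sketch under stated assumptions, not EmEditor's implementation: each character carries a color number, rules are applied one after another, and a conditional rule only colors a match if the color at the start of that match passes the test. The `apply_rule` helper, the retry step after a failed condition, and `DEMO_COMMENT_COLOR` are all hypothetical; only the two `SMART_COLOR_*` values come from the post above.

```python
import re

SMART_COLOR_NORMAL = 0      # value quoted from plugin.h in the post above
SMART_COLOR_HILITE_4 = 17   # value quoted from plugin.h in the post above
DEMO_COMMENT_COLOR = 99     # hypothetical value, NOT taken from plugin.h

# Recognises the condition prefix, e.g. "(?#_text_c!=17)".
CONDITION = re.compile(r'\(\?#_text_c(==|!=)(\d+)\)')

def apply_rule(line, colors, highlight_string, color):
    """Colour the matches of one rule in place, honouring an optional condition."""
    cond = CONDITION.match(highlight_string)
    pattern = re.compile(highlight_string[cond.end():] if cond else highlight_string)
    pos = 0
    while pos < len(line):
        m = pattern.search(line, pos)
        if m is None or m.start() == m.end():
            break
        if cond:
            wants_equal = cond.group(1) == "=="
            is_equal = colors[m.start()] == int(cond.group(2))
            if wants_equal != is_equal:
                pos = m.start() + 1  # assumption: retry further along the line
                continue
        colors[m.start():m.end()] = [color] * (m.end() - m.start())
        pos = m.end()
    return colors

line = '"50 %% done" %% note'
colors = [SMART_COLOR_NORMAL] * len(line)
apply_rule(line, colors, r'".*?"', SMART_COLOR_HILITE_4)
apply_rule(line, colors, r'(?#_text_c!=17)%%.*$', DEMO_COMMENT_COLOR)
print(colors)
```

In this model the `%%` inside the quoted text keeps color 17, because the condition fails at that position, while the trailing `%% note` still receives the comment color.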
July 31, 2025 at 3:19 am #30350
Patrick C
Participant
Wow 😃
I’ll test this tomorrow (Friday) and give feedback.
Thank you very much Yutaka!
August 1, 2025 at 10:44 am #30351
Patrick C
Participant
Essentially I need the following (using the regex matching example at the top):
(?#_text_c==0)″.*?″
(?#_text_c==0)%%.*$
I.e. apply the highlight only when the start of the match uses SMART_COLOR_NORMAL = 0
This works fantastically well, but only on odd lines:
→ Line 1, 3, 5 and 7 are correct.
My test file is UTF-8 with LF as line terminator (no CR).
I cannot thank you enough for taking time for this!
August 1, 2025 at 9:07 pm #30352
Yutaka Emura
Keymaster
I’ve fixed this issue on preview 3 (25.2.903).
August 2, 2025 at 9:50 am #30353
Patrick C
Participant
Thank you! 🙏
I’ve just adapted my JavaScript highlighter template and the results are fantastic:
String and regex literals now render really well 😃.
With respect to single line highlighting EmEditor now is perfect for my needs.
The only thing I do not have a solution for is multiline matching.
As an example: For the JavaScript multiline comment /*…*/
one could use the regex
(?#_text_c==0)\/\*.*?\*\/
with the /s flag:
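As a quick illustration of what the /s flag changes, here is a small Python sketch. Python's `re.DOTALL` plays the role of /s; the `(?#_text_c==0)` prefix is left out because it is specific to EmEditor, and the `find_comments` helper and the sample text are made up for this example.

```python
import re

# The comment pattern from the post, without the (?#_text_c==0) prefix.
COMMENT = r'/\*.*?\*/'

def find_comments(source, dot_matches_newline):
    """Return all /*...*/ matches, optionally letting "." match newlines."""
    flags = re.DOTALL if dot_matches_newline else 0
    return re.findall(COMMENT, source, flags)

source = "let a = 1; /* first line\nsecond line */ let b = 2;"
print(find_comments(source, False))  # "." stops at the line break
print(find_comments(source, True))   # "." may cross the line break
```

Without the flag the two-line comment is not found at all, because `.` cannot cross the line break; with the flag the whole comment is returned as one match.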
Should something like a /s flag or a directive
#Keyword color=10, …, regexp=on, multiline=on
be possible, then EmEditor’s highlighter would be one of the best I’ve ever seen.
August 11, 2025 at 9:49 am #30365
Yutaka Emura
Keymaster
While we’re on the subject of the (?#_text_c==n) and (?#_text_c!=n) syntax, are there any known issues with it?
August 11, 2025 at 10:09 am #30366
Patrick C
Participant
Yes, thank you for asking!
While the formatting is a lot better, there is the following shortcoming:
Rather than not applying a rule, (?^#_text_c==0) only postpones a rule’s formatting until c==0.
This can lead to incorrect formatting.
Example case (simplified regex):
Rule 1) Format JavaScript strings
(?^#_text_c==0)".*?"
and
Rule 2) Format JavaScript regex literals
(?^#_text_c==0)\/.+?\/
On line 2:
The formatting between the «; "» is incorrect.
And rule 1 is not applied to «"a string"».
If it were possible to set (?^#_text_c==0) to ignore (i.e. not apply) the rule rather than just postpone its formatting, then this shortcoming would be solved.
August 11, 2025 at 1:34 pm #30367
Yutaka Emura
Keymaster
If (?^#_text_c==0) were ignored, the literal “a string” would also be ignored. One fix is to use the regex (?#_text_c==0)"[^/]*?".
August 12, 2025 at 1:50 am #30370
Patrick C
Participant
(?#_text_c==0)"[^/]*?" isn’t exactly a fix, as it makes it impossible for the string to contain a /.
Side note:
The regexes I use in the example are intentionally simplified for the sake of illustration.
The regex to match a string actually is (?#_text_c==0)".*?(?<!\\)", which in the example is simplified to (?#_text_c==0)".*?".
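The difference between the two string patterns can be checked with a short Python sketch. It assumes Python's `re` behaves like EmEditor's regex engine for these constructs, which may not hold in every detail; the `(?#_text_c==0)` prefix is omitted and the sample line is made up.

```python
import re

# Both patterns are quoted from the thread, minus the (?#_text_c==0) prefix.
WORKAROUND = re.compile(r'"[^/]*?"')
FULL = re.compile(r'".*?(?<!\\)"')

sample = r'path = "a/b"; msg = "say \"hi\"";'

print(WORKAROUND.findall(sample))
print(FULL.findall(sample))
```

In Python, the second pattern picks up both strings, including the one with escaped quotes. The first pattern cannot match "a/b" because of the slash, and instead pairs the closing quote of one string with the opening quote of the next.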