Viewing 11 posts - 1 through 11 (of 11 total)
  • #9063

    Deipotent
    Participant

    One feature I’ve needed recently is the ability to extract text matching a Regular Expression, with the extracted text shaped by a Replacement Expression (i.e. so you can use back references from the Find RE).

    For example, some text I wish to match is as follows:

    http://www.emeditor.com/123/321

    where the numbers can change. When extracted, I want the numbered parts to be swapped, so in this example, the text extracted to the new document would be:

    http://www.emeditor.com/321/123

    So, I use the following Find Regular Expression:

    http://www.emeditor.com/(\d{3})/(\d{3})

    and want to use the following Replacement Expression to extract the text to a new line:

    http://www.emeditor.com/\2/\1\n

    It would be great if there were an easy way to do this from the Replace dialog, the Replace in Files dialog, and the Find bar. For Replace in Files, it would be handy to have an option to either put all matches in a single new file, or create a separate new document for each file.

    Also, maybe the Find plugin could be extended to become a Find/Replace/Extract plugin.

    With this feature, text could be extracted/mined from files.
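Outside the editor, the "single new file vs. separate documents" choice can be sketched in a short script. This is only an illustration in Python, not anything EmEditor provides: `extract_from_files` and its `combine` flag are hypothetical names, and the two temporary files are made-up sample data.

```python
import re
import tempfile
from pathlib import Path

def extract_from_files(paths, find_pattern, replace_template, combine=True):
    """Extract expanded matches from each file.

    combine=True  -> one flat list holding every extracted string.
    combine=False -> a dict mapping each file path to its own list.
    """
    regex = re.compile(find_pattern)
    per_file = {}
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        # Match.expand fills \1, \2, ... in the template from this match's groups.
        per_file[str(path)] = [m.expand(replace_template) for m in regex.finditer(text)]
    if combine:
        return [s for matches in per_file.values() for s in matches]
    return per_file

# Made-up sample files, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "a.txt"
    b = Path(tmp) / "b.txt"
    a.write_text("x http://www.emeditor.com/123/321 y", encoding="utf-8")
    b.write_text("http://www.emeditor.com/456/654", encoding="utf-8")
    find = r"http://www\.emeditor\.com/(\d{3})/(\d{3})"
    replace = r"http://www.emeditor.com/\2/\1"
    combined = extract_from_files([a, b], find, replace)
    separate = extract_from_files([a, b], find, replace, combine=False)

print(combined)
```

Returning the per-file dictionary keeps the source of each extracted string available, which is roughly what "separate new documents for each file" would preserve.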

    #9065

    CrashNBurn
    Member

    Find/Replace should be able to enable the “Output bar” to indicate what should be placed there:
    1) All Found Text, and 2) All Replaced Text
    or just 2) All Replaced Text.

    I thought it was possible, but it appears only the “Find in Files” allows usage of the Output bar for searches.

    #9066

    Yutaka Emura
    Keymaster

    Hello Deipotent and CrashNBurn,

    I will consider those options in future versions.
    Thanks for your inputs!

    #9087

    Deipotent
    Participant

    Further to my suggestion, it would be useful if the Output bar could be used to display the extracted text. Even better would be if this could be updated in real time, similar to the incremental search, so you can immediately see the extracted text as the Find/Replacement expression is modified.

    This would make it far easier to create the correct Find/Replacement expressions, since it would give immediate feedback on what is matched (ie. the current incremental search feature), along with what is extracted.

    While this real-time updating is probably not feasible for Find/Replace in Files with the option “Look in Subfolders” enabled, it could come in handy when the “Look in Subfolders” is not enabled.

    #9088

    Deipotent
    Participant

    A few more suggestions related to extract text, plus Find/Replace In Files:

    1) Ability to exclude file name from search results (ie. the opposite of the “Display File Names Only” option)

    2) Similar to (1), but also option to only show matched/extracted text (ie. not the whole line).

    3) Incremental Search option on Find/Replace In Files, which will apply to currently active document, to aid creation of Find Expression without the need to switch to normal Find/Replace windows.

    4) Option to leave Find/Replace in Files window open after pressing Find button.

    #9090

    CrashNBurn
    Member

    I dunno about the live-regex/Incremental search for “In Files”

    But XNews.exe has a very useful tool built-in, “Test Regex”

    Pattern: (Regex Pattern).*(goes here).?
    Test String: Example string for Regex Pattern to match goes here.
    Result: ***Match***
    Matched String and sub-expressions:

    00) Regex Pattern to match goes here.
    01) Regex Pattern
    02) goes here

    The dialog has a [Match] and [Close] button, and a [x] Match case, checkbox.
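A rough equivalent of that dialog can be sketched in Python. This is an approximation under stated assumptions, not XNews's actual implementation: `match_groups` is a hypothetical helper, and the `match_case` parameter only imitates the "Match case" checkbox.

```python
import re

def match_groups(pattern, test_string, match_case=True):
    """Return the whole match followed by each sub-expression, or None if no match."""
    flags = 0 if match_case else re.IGNORECASE
    m = re.search(pattern, test_string, flags)
    if m is None:
        return None
    # Group 0 is the whole match; groups 1..n are the parenthesised sub-expressions.
    return [m.group(i) for i in range(m.re.groups + 1)]

groups = match_groups(r"(Regex Pattern).*(goes here).?",
                      "Example string for Regex Pattern to match goes here.")
if groups is None:
    print("No match")
else:
    print("***Match***")
    for i, g in enumerate(groups):
        print(f"{i:02d}) {g}")
```

With the pattern and test string from the post, this lists the same three entries as the dialog output above.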

    #9204

    Deipotent
    Participant

    CrashNBurn wrote: “I dunno about the live-regex/Incremental search for ‘In Files’”

    A tool called BareGrep offers live-regex/Incremental search for files, so it’s feasible, particularly given the powerful nature of computers these days. Admittedly, an option to disable it should be present though, if not required.

    CrashNBurn wrote: “But XNews.exe has a very useful tool built-in, ‘Test Regex’”

    EmEditor already has this functionality for Find, except it will match against the current document, rather than a Test String. Given this, I don’t see a need for a separate “Test Regex” tool.

    #9206

    CrashNBurn
    Member

    Merely finding text in a given document (valid regex) is in no way the same functionality as displaying how a given regex breaks up into its various pieces.

    These days if a particular regex is being difficult, I’ll likely just Run a 3 line AHK script and messageBox the output.

    #18285

    Deipotent
    Participant

    In the hope that I can finally persuade Yutaka to fully implement text extraction in EmEditor, I thought I’d revisit this thread :) Let me try to explain what I’m after:

    EE v14.0 beta 1 added the ability to “Display only matched strings” in the Output bar for Find In Files. This allows text to be extracted from files matching the Find pattern (including RegEx). This is useful, but it can easily be made a lot more powerful/useful.

    1) Output option to “Display only strings matching Replace pattern” – this allows you to extract text matching a pattern, but manipulate the matched text with the Replace pattern. For example, suppose I have the following text in a file:

    AJCKDDSVCDhttp://www.emeditor.com/123/321 osajknosdnkojsdnfksd
    http://www.emeditor.com/223/321 gfdbfdgbnfdgfdgfdg
    fdgfdgfdgdfghttp://www.emeditor.com/323/321 fdgfdgfdgfdgfdghttp://www.emeditor.com/423/321 fdgfdgfdg
    http://www.emeditor.com/523/321 fdgfdgfdgfdgfdgfdg

    and want to extract all the URLs, but switch the two path components (i.e. the 1st URL becomes http://www.emeditor.com/321/123). I might use a Find pattern of

    http://www.emeditor.com/(\d{3})/(\d{3})

    and Replace pattern of

    http://www.emeditor.com/\2/\1

    If I could specify an Output option of “Display only strings matching Replace pattern”, then I would see the following in the Output Bar:

    http://www.emeditor.com/321/123
    http://www.emeditor.com/321/223
    http://www.emeditor.com/321/323
    http://www.emeditor.com/321/423
    http://www.emeditor.com/321/523

    So, I’ve managed to extract all the URLs and also switch over the path components of each URL.
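Until such an option exists, the behaviour described above can be approximated with a script. The following is a minimal Python sketch, not EmEditor functionality: `extract_with_template` is a hypothetical helper name, and it assumes Python's `re` module, where `Match.expand` applies a replacement template to a single match. The dots in the Find pattern are escaped here so they match literal dots only.

```python
import re

def extract_with_template(text, find_pattern, replace_template):
    """Return one expanded string per match, ignoring all unmatched text."""
    regex = re.compile(find_pattern)
    # Match.expand substitutes \1, \2, ... in the template with the captured groups.
    return [m.expand(replace_template) for m in regex.finditer(text)]

sample = """AJCKDDSVCDhttp://www.emeditor.com/123/321 osajknosdnkojsdnfksd
http://www.emeditor.com/223/321 gfdbfdgbnfdgfdgfdg
fdgfdgfdgdfghttp://www.emeditor.com/323/321 fdgfdgfdgfdgfdghttp://www.emeditor.com/423/321 fdgfdgfdg
http://www.emeditor.com/523/321 fdgfdgfdgfdgfdgfdg
"""

find = r"http://www\.emeditor\.com/(\d{3})/(\d{3})"
replace = r"http://www.emeditor.com/\2/\1"

extracted = extract_with_template(sample, find, replace)
for line in extracted:
    print(line)
```

Because the matches are collected with `finditer` rather than rewritten in place, a line containing two URLs yields two output lines, and the surrounding text is dropped.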

    #18287

    Deipotent
    Participant

    – Often, you may just want to extract text from the current document (or all open documents), so the “Use Output Bar” option and “Output Options drop-down” should be added to the Find and Replace dialogs.

    – To aid in regex construction, and to see what will be matched, the Output Bar should be updated with matches when “Incremental Search” is enabled (similar to how the text matching the Find pattern is updated as you type into the Find box).

    – Incremental Search option on Find/Replace In Files, which will apply to currently active document, to aid creation of Find Expression without the need to switch to normal Find/Replace windows.

    – Option to leave Find/Replace in Files window open after pressing Find button.

    – As you may want to do further processing on extracted text, a “Copy to new document” button in the Output Bar header (to the right of the text “Output”) would make this easy. ie. one click would create a new document containing the contents of the Output Bar.

    Member CrashNBurn also wanted the Output option, “Display only text matching find pattern and also text matching Replace pattern”

    As already mentioned, EmEditor already has the underlying capabilities needed for all of the above, e.g. when the “Use Output Bar” checkbox state changes, show or hide the Output Bar; when incremental matches are being highlighted in the document, update the Output Bar (if enabled) with the matches.

    #18288

    Deipotent
    Participant

    To allow extraction of a lot of text, it might be worth replacing the current Output Bar editor control with the same control used for editing documents (which supports massive files), although I would be happy without this.
