tungwaiyip
Posted on: 11/7/2007 11:11 am
Not too shy to talk
Joined: 12/8/2006
From:
Posts: 33
[b29] Find in Files very slow to search a large number of files
I'm searching for a string among 2500 files spread over a directory tree. The performance numbers:

Another program: 2 seconds
EmEditor: 9 minutes

This translates to 4.6 files per second for EmEditor vs. 1250 files per second for the other program.

Another interesting observation: there are two types of files, one with the .rst extension and one with the .html extension. The search seems to go much faster with .html files. It takes 65 seconds to search through 1000 .html files, which translates to about 15 files per second. Not great, but much better than the average. The .html files actually make up 75% of the total file size of the original search.
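
The throughput figures above follow from simple division. As a quick check, here is a minimal Python sketch using only the numbers reported in this post; the variable and function names are illustrative.

```python
# Figures reported in the post above.
total_files = 2500
emeditor_seconds = 9 * 60  # 9 minutes
other_seconds = 2
html_files = 1000
html_seconds = 65


def files_per_second(files, seconds):
    """Average search throughput in files per second."""
    return files / seconds


emeditor_rate = files_per_second(total_files, emeditor_seconds)
other_rate = files_per_second(total_files, other_seconds)
html_rate = files_per_second(html_files, html_seconds)

print(f"EmEditor, all files:  {emeditor_rate:.1f} files/s")
print(f"Other program:        {other_rate:.1f} files/s")
print(f"EmEditor, .html only: {html_rate:.1f} files/s")
print(f"Slowdown factor:      {emeditor_seconds / other_seconds:.0f}x")
```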
Yutaka
Posted on: 11/7/2007 11:27 am
Webmaster
Joined: 9/28/2006
From: Redmond
Posts: 2397
Re: [b29] Find in Files very slow to search a large number of files
Thanks for the description, but it would be more helpful if you could tell me which encodings those files are written in, and whether you use regular expressions or escape sequences including newlines. A possible reason for the delay is that EmEditor works internally in Unicode, and it supports Unicode-aware regular expressions. Please write exactly what you search for and what the files are like; you can also email me sample files and more details at tech@emurasoft.com
Thanks!


----------------
Yutaka Emura
Developer of EmEditor
http://www.emeditor.com/

Yutaka
Posted on: 11/7/2007 11:33 am
Webmaster
Joined: 9/28/2006
From: Redmond
Posts: 2397
Re: [b29] Find in Files very slow to search a large number of files
Also, make sure you write which options (such as Regular Expressions, Match Case, etc.) you checked in the Find in Files dialog box. Using different options can make a significant difference in search speed. Thanks!



tungwaiyip
Posted on: 11/7/2007 12:16 pm
Not too shy to talk
Joined: 12/8/2006
From:
Posts: 33
Re: [b29] Find in Files very slow to search a large number of files
I've figured it out. The problem is in the "Encoding" listbox. The default "Configured Encoding" is the slowest. Choosing anything other than "Configured Encoding", even if you pick something random like Korean, will give you good performance.

The other options I have set are "Look in subfolder" and "Use Escape Sequence". Unfortunately I cannot post my data files, but that is irrelevant: I repeated the search over 1000 files from an open source project and the result was the same.
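
A search with a fixed encoding over a directory tree can be timed outside EmEditor to get a rough baseline. The following Python sketch is not EmEditor's implementation; `find_in_files` and `timed_search` are hypothetical helper names, and the plain substring match stands in for whatever matching the editor actually performs.

```python
import os
import time


def find_in_files(root, needle, encoding):
    """Return paths under root whose decoded text contains needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # A fixed encoding means no per-file detection step;
                # undecodable bytes are replaced rather than raising.
                with open(path, encoding=encoding, errors="replace") as f:
                    if needle in f.read():
                        hits.append(path)
            except OSError:
                continue  # skip unreadable files
    return hits


def timed_search(root, needle, encoding):
    """Run find_in_files and return (hits, elapsed_seconds)."""
    start = time.perf_counter()
    hits = find_in_files(root, needle, encoding)
    return hits, time.perf_counter() - start
```

For example, `timed_search("docs", "some string", "utf-8")` returns the matching paths and the elapsed time, which can be divided into the file count to get files per second.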
Yutaka
Posted on: 11/8/2007 8:27 am
Webmaster
Joined: 9/28/2006
From: Redmond
Posts: 2397
Re: [b29] Find in Files very slow to search a large number of files
When "Configured Encoding" is selected in the "Find in Files" dialog box, EmEditor uses the encoding configured in the File tab of the configuration properties associated with the file extension you are searching. By default, the Text configuration uses the "System Default" encoding with UTF-8 and Unicode signature detection. The UTF-8 detection can make the search slower. If "Detect All" is also checked in the File tab of the configuration properties, it may become even slower. Please let me know which options are checked in the File tab of the associated Configuration Properties.

Also, in what encoding are the files you search encoded? UTF-8, Western European (CP:1252), or something else?
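
To illustrate why detection can cost more than a fixed encoding, here is a minimal Python sketch of signature (BOM) detection followed by UTF-8 detection. It is an assumption about how such detection generally works, not EmEditor's actual code; `detect_encoding` and the `cp1252` fallback are hypothetical choices for the example, and UTF-32 signatures are not handled.

```python
import codecs


def detect_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess an encoding: signature first, then a full UTF-8 validation pass."""
    # Cheap check: a signature (BOM) only requires looking at the first bytes.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"
    # Expensive check: without a signature, confirming UTF-8 means
    # validating every byte of the file before the search can start.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return fallback
```

Note that pure ASCII data is also valid UTF-8, so in this sketch an ASCII file without a signature always goes through the full validation pass before it is searched.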



tungwaiyip
Posted on: 11/8/2007 9:58 am
Not too shy to talk
Joined: 12/8/2006
From:
Posts: 33
Re: [b29] Find in Files very slow to search a large number of files
In the File tab I have these options selected:

* Prompt if null char found
* Prompt if invalid char
* show file name w/full path
* Detect BOM
* Detect UTF-8
* Prompt at inconsistent returns
* Opening Encoding: UTF-8

The files I have searched should all be ASCII files.

Even if EmEditor is doing more detection, it just doesn't sound right to me that one program can search all the files in 2 seconds while EmEditor takes 9 minutes. Also, if I select any specific encoding, say UTF-8, the search time drops dramatically to a few seconds.

Of course, I now have a workaround: select a specific encoding in the dialog, as described above.
Yutaka
Posted on: 11/8/2007 10:27 am
Webmaster
Joined: 9/28/2006
From: Redmond
Posts: 2397
Re: [b29] Find in Files very slow to search a large number of files
You should change the Opening Encoding in the File tab of the configuration properties from "UTF-8" to "System Default Encoding". That should solve this issue.



shaohao
Posted on: 11/8/2007 4:39 pm
Not too shy to talk
Joined: 11/12/2006
From:
Posts: 21
Re: [b29] Find in Files very slow to search a large number of files
You'd better use registry mode (all settings saved in the registry) rather than INI file mode (all settings saved in .ini files).

INI file mode will slow down searching in files.
tungwaiyip
Posted on: 11/9/2007 8:53 am
Not too shy to talk
Joined: 12/8/2006
From:
Posts: 33
Re: [b29] Find in Files very slow to search a large number of files
This surely is a bug, isn't it? There is no explanation for why it would take so long with certain combinations of encoding settings.

Besides, I tried your recommendation. It doesn't help. Picking "utf-8" or any other encoding in the Find in Files dialog, however, does solve the problem.

Is it true that choosing .ini files for configuration slows things down? That option is one of the things that makes me love EmEditor 7.
Yutaka
Posted on: 11/9/2007 9:10 am
Webmaster
Joined: 9/28/2006
From: Redmond
Posts: 2397
Re: [b29] Find in Files very slow to search a large number of files
Please try beta 32, and you should find better performance. Thanks!


