codepage autodetection – EmEditor (Text Editor)

Viewing 7 posts - 1 through 7 (of 7 total)

Author
Posts
October 14, 2006 at 12:38 am #3888
andres99
Participant
I am currently looking for a text editor that suits all my needs. Basically, EmEditor does. The free SuperEdi program does too, but I prefer EmEditor for its configurability.
I think a valuable addition to EmEditor would be codepage autodetection when a document is opened (without the program asking the user). SuperEdi already has this (and it seems to work).
I have downloaded the trial version of EmEditor Standard (which is smaller, and I like it better), and it seems that EmEditor Standard already has some form of codepage detection (basically, it lets the user select the codepage). An option for fully automatic detection would be very nice, because I have files in different codepages and I sometimes have to go through the detect-codepage dialog. I would prefer automatic selection, with an option to turn it off in case the codepage is not detected correctly.
I am probably going to purchase EmEditor Standard, but I’d like to know what the developers think about adding codepage autodetection in the future. Thank you.
October 14, 2006 at 12:52 am #3889
Yutaka Emura
Keymaster
You can disable the Auto Detect result dialog box:
Select Customize on the Tools menu, select the File tab, and clear the Always Show Detect All Result check box.
This is checked by default because auto-detection can sometimes make mistakes, especially with small files. If that is not a problem for you, you can clear this option. If you see a mistake, you can always reload the file in the correct encoding by double-clicking the encoding on the status bar.
October 14, 2006 at 2:48 pm #3890
andres99
Participant
Yes, I know that, but what I meant was more a feature to “improve” autodetection.
For example, my native codepage on Windows XP is Windows-1257 (Baltic). For convenience, the “Opening encoding” for plain-text files is set to “System Default” (i.e. 1257).
I also work with some other codepages like Windows-1251 (Cyrillic), ISO-8859-15 (Latin 9, Pan European), ISO-8859-1, KOI-8 (Cyrillic) and, of course, UTF-8 / UTF-16.
Now, when I open UTF files, there is no problem. Everything is detected automatically. There is no problem either when I open Windows-1257 files.
But when I open a Windows-1251 file (no matter whether I have Always Show Detect All Result checked or not), EmEditor displays it as Windows-1257.
Another example: I have an ISO-8859-1 text file, and when I open it, EmEditor again thinks it is Windows-1257. When I use Reload as -> Detect All, EmEditor correctly guesses that the file is ISO-8859-1.
When I open the same files in SuperEdi, they are displayed correctly right away (SuperEdi detects the codepages without asking me).
Now, what I meant was that EmEditor could have an option to automatically display a file in the most probable codepage (which I can already see in DetectAll results).
More specifically, when I open the Windows-1251 file, which is displayed incorrectly as Windows-1257, when I press “Detect All”, the box already shows that this file is most probably Windows-1251. (EmEditor actually knows the correct or most probable codepage!) However, EmEditor still opens this in Windows-1257.
Since EmEditor actually seems to know the correct codepage but for some reason does not open the file with it (maybe because the opening encoding for text files is set to “System Default”), I thought of something like this:
There could be an option in EmEditor called “Always open files with autodetected codepage” or “Autodetect codepages without asking” (meaning without displaying the Detect All dialog). In that case, EmEditor would run Detect All internally, find that the most probable codepage for the file is, e.g., Windows-1251, and then apply this codepage.
It seems that EmEditor is capable of this anyway (because Detect All already displays the correct codepage; it is just not applied).
I know that there could be mistakes, but if EmEditor detects the codepage incorrectly, I can always use “Reopen” or uncheck the autodetection option. So far I have checked about 100 files, and EmEditor’s Detect All has always found the right codepage. Therefore, I think mistakes are not very likely (of course, they can sometimes occur).
I do not have any big problems with this at present, because I can always use the Auto Detect dialog, but it would be much more convenient to open any file right away with the correct (or most probable) codepage, without asking the user. Just a matter of convenience :)
In practical terms, since “Detect All” finds the right codepage in most cases anyway, it would be more convenient to have files opened with the autodetected codepage (rather than always running “Detect All” manually). In those rare cases when a mistake occurs (or may occur), one can use “Detect All” (from “Reload as”).
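The behaviour requested above (run a "detect all" pass internally, then apply the top-ranked codepage without showing a dialog) can be sketched roughly as follows. This is only an illustration in Python and not EmEditor's actual detection algorithm, which the thread does not describe. The names `CANDIDATES`, `score`, `detect_all`, and `open_autodetected` are hypothetical, and the scoring rule is a deliberately crude assumption: Cyrillic text consists almost entirely of non-ASCII letters, while Latin-script text is mostly ASCII.

```python
import codecs

# Hypothetical candidate list: (Python codec name, script its letters should spell).
CANDIDATES = [("cp1257", "latin"), ("cp1251", "cyrillic")]


def score(data: bytes, encoding: str, script: str) -> float:
    """Toy plausibility score in [0, 1]; 0.0 if the bytes do not decode."""
    try:
        text = data.decode(encoding)
    except UnicodeDecodeError:
        return 0.0
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.5  # no letters, so nothing to judge by
    non_ascii = sum(ord(ch) > 127 for ch in letters) / len(letters)
    # Assumption: Cyrillic text is almost entirely non-ASCII letters,
    # while Latin-script text is mostly ASCII with a few accented letters.
    return non_ascii if script == "cyrillic" else 1.0 - non_ascii


def detect_all(data: bytes) -> list[tuple[str, float]]:
    """Rank candidate encodings from most to least plausible."""
    if data.startswith(codecs.BOM_UTF8):
        return [("utf-8-sig", 1.0)]
    try:
        data.decode("utf-8")
        return [("utf-8", 1.0)]  # valid UTF-8 (including plain ASCII)
    except UnicodeDecodeError:
        pass
    ranked = [(enc, score(data, enc, script)) for enc, script in CANDIDATES]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def open_autodetected(data: bytes) -> tuple[str, str]:
    """Decode with the top-ranked encoding, without asking the user."""
    encoding, _ = detect_all(data)[0]
    return encoding, data.decode(encoding, errors="replace")
```

For example, `open_autodetected(raw_bytes)` returns the chosen codec name together with the decoded text, so the caller can still show the choice on a status bar and let the user reload with a different encoding. A heuristic this simple cannot tell Windows-1257 from ISO-8859-1, since both are Latin-script codepages; a real detector would need per-language statistics for that.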
October 14, 2006 at 3:55 pm #3891
Yutaka Emura
Keymaster
Could you please send me a few Windows-1251 files, or any other files that cause the problem, zipped as a .zip attachment? Please also include a list explaining which file should be opened in which encoding. My email address is [email protected]. I will need to reproduce your problem here to fix this issue. Thanks!
October 14, 2006 at 7:02 pm #3892
andres99
Participant
I have sent you the files, please let me know :)
Thanks!
October 15, 2006 at 3:41 am #3895
andres99
Participant
My bad! This can already be done in EmEditor as Mr. Yutaka Emura has now explained to me. Sorry for bringing up a void topic.
October 15, 2006 at 4:15 am #3896
Yutaka Emura
Keymaster
No problem. For the rest of us, EmEditor already has a feature to auto-detect without displaying the result dialog box:
First, select Properties for Current Configuration (or All Configurations) on the Tools menu, select the File tab, and check the Detect All check box. Second, select Customize on the Tools menu, select the File tab, and clear the Always Show Detect All Result check box. Now you can open a file with Auto Detect but without the Detect All Result dialog box.