Forum Replies Created

Viewing 5 posts - 1 through 5 (of 5 total)
  • in reply to: Any way to separate data by bytes? #17872
    no1
    Participant

    Thank you, Yutaka.

    I have many large files (thousands of lines or more per file, about 200 columns per line, with widths that vary from column to column), so I’m afraid operation macros might not fit my case.

    I wrote a script macro with the help of others. Could you (or anyone) take a look and see whether anything in it should be changed, and whether the regular expressions could be optimized? (From the few I tested, shorter expressions seem to run faster.)
    Because the files are large, efficiency and stability both matter.

    Currently there is only one problem, but it is fatal:
    The regular expression ^ also matches the position after this character:
    U+0085 <control> : NEXT LINE
    How can I resolve it?

    // The macro relies on Binary (ASCII View) mode, where each byte is shown as one character.
    if (document.Encoding != eeEncodingBinary) {
    	alert("TSV ASCII\n\nUse this macro in Binary (ASCII View) mode only!");
    	Quit();
    }
    
    // Default column widths in bytes, comma-separated.
    var str = "8,50,30,30,30,50,50,50,50,30,15,8,1,6,25,25,55,55,55,55,12,30,11,3,30,24,55,55,55,55,12,30,11,3,30,55,55,55,55,3,30,30,10,3,1,1,1,30,1,10,4,1,6,3,8,10,6,30,10,6,6,1,1,1,1,1,1,1,1,30,30,30,1,8,8,8,8,8,8,8,8,8,8,8,8,8,1,10,3,30,4,5,7,7,4,3,30,5,19,19,19,19,20,7,8,9,19,19,8,10,30,50,5,35,3,30,3,30,4,30,4,30,25,4,50,3,30,3,30,30,40,30,40,30,40,30,40,30,40,30,40,6,8,50,50,50,5,25,6,6,70,20,1,5,5,3,10,30,25,5,30,8,3,3,8,8,3,5,3,8,8,6,8,2,2,8,12,3,2,8,1,5,5,1";
    str = prompt("[TSV ASCII]    Enter width of each column:    (e.g. 10,5,1,7)", str);
    if (!str) Quit();	// nothing was entered
    var arr = str.split(",");
    
    Redraw = false;
    
    // n is the offset of the next delimiter, counting the tabs already inserted.
    var n = 0;
    for (var i = 0; i < arr.length - 1; i++) {
    	n = n + parseInt(arr[i], 10);
    	// Capture the first n characters and write them back followed by a tab.
    	// (Replacing the match with "\\t" alone would overwrite the data.)
    	document.selection.Replace("^(.{" + n + "})", "\\1\\t", eeReplaceAll | eeFindReplaceRegExp);
    	n = n + 1;	// account for the tab just inserted
    
    	// Monitor/Break:
    	document.HighlightFind = false;
    	Redraw = true;
    	if (!confirm("Monitor/Break:\n\n" + i + " " + arr[i] + " " + n + " done.\n\nContinue?")) Quit();
    	Redraw = false;
    }
    
    // Trim the padding spaces in front of each tab.
    document.selection.Replace(" +\t", "\t", eeReplaceAll | eeFindReplaceRegExp);
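
    To make the offset arithmetic in the loop easier to check, here is a minimal sketch in plain JavaScript, meant to run outside EmEditor (for example in Node.js). The function name buildPassPatterns is hypothetical and only lists the pattern each pass would use; it does not touch any document.

```javascript
// Hypothetical helper: lists the regular expression used on each pass
// of the macro's loop for a given set of column widths.
function buildPassPatterns(widths) {
  const patterns = [];
  let n = 0;
  // The last column needs no delimiter after it, so stop one short.
  for (let i = 0; i < widths.length - 1; i++) {
    n += widths[i];
    patterns.push("^.{" + n + "}");
    n += 1; // the tab inserted on this pass shifts every later column by one
  }
  return patterns;
}

const passPatterns = buildPassPatterns([8, 30, 15, 10, 13]);
console.log(passPatterns.join("\n"));
```

    For the widths 8,30,15,10,13 the four passes use ^.{8}, ^.{39}, ^.{55} and ^.{66}: each offset is the sum of the widths so far plus one for every tab already inserted.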
    
    in reply to: Any way to separate data by bytes? #17853
    no1
    Participant

    Drag and drop the picture to see it at its original size.

    in reply to: Any way to separate data by bytes? #17849
    no1
    Participant

    Let me give an example with some pictures:

    The bytes are continuous in the file. To make it clear in the picture, I broke the stream at 0D0A.

    The data fields are fixed length in bytes. The red solid lines are where I want to insert the delimiter bytes.

    It would be best if there were a way to insert the delimiter bytes directly into the byte stream, which is exactly what I want, but I don’t know of one. So for now I have to open the files in a text editor and process the text by characters.
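
    As one possible way to work on the byte stream directly, here is a minimal sketch in JavaScript, assuming the data is processed outside the editor (for example in Node.js) and that records end with 0D0A. The function name insertDelimiters is hypothetical, and the tab byte (09) is an assumed choice of delimiter.

```javascript
// Hypothetical helper: copies the byte stream and inserts a tab byte at each
// fixed-width field boundary of every CRLF-terminated record.
function insertDelimiters(bytes, widths) {
  const TAB = 0x09, CR = 0x0d, LF = 0x0a;
  const out = [];
  let col = 0;               // byte offset within the current record
  let field = 0;             // index of the field being copied
  let fieldEnd = widths[0];  // byte offset at which the current field ends
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    // Only the byte pair 0D 0A ends a record; every other byte is data.
    if (b === CR && bytes[i + 1] === LF) {
      out.push(CR, LF);
      i++;
      col = 0;
      field = 0;
      fieldEnd = widths[0];
      continue;
    }
    // At a field boundary, emit the delimiter (none after the last field).
    while (field < widths.length - 1 && col === fieldEnd) {
      out.push(TAB);
      field++;
      fieldEnd += widths[field];
    }
    out.push(b);
    col++;
  }
  return Uint8Array.from(out);
}

// First record of the example, with the accented letters written as escapes.
const input = new TextEncoder().encode(
  "Field 1 Field 2 \u0103\u0115\u012D\u014F\u016D \u00E2\u00EA\u00EE\u00F4\u00FB " +
  "Field3(15bytes)Field 4   Field 5, etc.\r\n"
);
const delimited = new TextDecoder().decode(insertDelimiters(input, [8, 30, 15, 10, 13]));
console.log(JSON.stringify(delimited));
```

    Because the widths are counted in bytes and only 0D0A is treated as a record end, multi-byte characters and bytes such as 85 are copied through as ordinary data. The sketch keeps the whole output in memory, so a very large file would need to be read and written in chunks instead.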

    Although the fields are fixed length in bytes, the number of characters in a field can vary. (I highlighted some corresponding bytes and characters with different colors.)

    UltraEdit’s Convert to Character Delimited command can only handle text by characters. (And I don’t know whether regular expressions can work by bytes.) So I insert ! after each multi-byte character, one for each extra byte it occupies.
    Now the number of characters equals the number of original bytes, which lets UltraEdit’s Convert to Character Delimited command insert the delimiters at the right positions.
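
    The padding step can be sketched in JavaScript as follows, assuming UTF-8 data and a runtime that provides TextEncoder (for example Node.js). The function name padToByteCount is hypothetical, and the sketch assumes one ! is added per extra byte.

```javascript
// Hypothetical helper: after each character, append one "!" for every
// UTF-8 byte beyond the first, so characters and bytes line up one to one.
function padToByteCount(text) {
  const encoder = new TextEncoder();
  let padded = "";
  for (const ch of text) { // iterates by code point, not by UTF-16 unit
    const byteLength = encoder.encode(ch).length;
    padded += ch + "!".repeat(byteLength - 1);
  }
  return padded;
}

// "tab" followed by the first two CJK characters of the example (3 bytes each).
const sample = "tab\u8BF8\u5982";
const paddedSample = padToByteCount(sample);
console.log(paddedSample);
```

    After padding, the number of characters in paddedSample equals the number of UTF-8 bytes in sample. In a real file the pad character would have to be one that never occurs in the data, so that it can be removed safely afterwards.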

    Any better solutions/tools are welcome.

    The example text and its Hex(UTF-8):
    (Column Width: 8,30,15,10,13 bytes)

    Field 1 Field 2 ăĕĭŏŭ âêîôû Field3(15bytes)Field 4   Field 5, etc.
    123     Any unicode string without tab诸如此类             Field 5, etc.
    12345678[---This field is 30 bytes---]エトセトラ[10 bytes]Field 5, etc.
    
    4669656C642031204669656C64203220C483C495C4ADC58FC5AD20C3A2C3AAC3AEC3B4C3BB204669656C64332831356279746573294669656C6420342020204669656C6420352C206574632E0D0A3132332020202020416E7920756E69636F646520737472696E6720776974686F757420746162E8AFB8E5A682E6ADA4E7B1BB202020202020202020202020204669656C6420352C206574632E0D0A31323334353637385B2D2D2D54686973206669656C642069732033302062797465732D2D2D5DE382A8E38388E382BBE38388E383A95B31302062797465735D4669656C6420352C206574632E0D0A

    By the way, about the picture above, in case anyone is interested:
    All the colored highlighting and lines in the text were done within EmEditor, not with an image editor.

    Thank you, Yutaka, for the new features.
    The User-Defined Guides are useful.

    in reply to: Any way to separate data by bytes? #17843
    no1
    Participant

    I’ll explain more with some pictures later.

    For now, please read this:
    http://www.ultraedit.com/forums/viewtopic.php?f=2&t=14711

    Thank you all!

    in reply to: Any way to separate data by bytes? #17817
    no1
    Participant

    But I need a macro or something…
    There are about 200 columns. And the file is large.
