#7156

Yutaka Emura
Keymaster

Hellados wrote:
This is a good macro for small files, but it is very slow for me 🙁
I have 50-100 MB txt files, and I need to remove duplicate lines (words) from more than 406,000 words, and this macro runs very slowly 🙁
My PC's performance is very good: Intel Core 2 Duo E8400, 2 GB Corsair RAM, 1 TB HDD.
What can I do?
🙁

I optimized the macro. Please try the version below. It also shows its current progress on the status bar.


// Pairs a line's text with its original (1-based) line number.
function Pair( i, s )
{
    this.index = i;
    this.str = s;
}

nLines = document.GetLines();

// Create an empty array; push() below appends one Pair per line.
// (new Array( nLines ) would pre-fill nLines empty slots in front of the pushed items.)
a = new Array();

status = "Reading lines...";

// Fill the array a with all lines (with returns) in the document.
for( i = 1; i <= nLines; i++ ) {
    if( (i % 1000) == 0 ){
        status = "Reading lines: " + String(i) + "/" + String(nLines);
    }
    var pair = new Pair( i, document.GetLine( i, eeGetLineWithNewLines ) );
    a.push( pair );
}

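status = "Sorting lines...";

// Sort by text so that identical lines become adjacent.
// Ties are broken by line number, so the first occurrence comes first.
a.sort( function(a,b){
    if( a.str > b.str ){
        return 1;
    }
    if( a.str < b.str ){
        return -1;
    }
    return a.index - b.index;
});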

// Mark duplicate elements: index 0 means "delete this line".
for( i = 1; i < nLines; i++ ){
    if( (i % 10) == 0 ){
        status = "Deleting duplicate lines: " + String(i + 1) + "/" + String(nLines);
    }
    if( a[i].str == a[i-1].str ){
        a[i].index = 0; // disable
    }
}

status = "Sorting lines again...";

a.sort( function(a,b){
    return a.index - b.index;
});

// Join the remaining lines in their original order.
var str = "";
for( i = 0; i < nLines; i++ ){
    if( a[i].index != 0 ){
        if( (i % 1000) == 0 ){
            status = "Joining lines: " + String(i + 1) + "/" + String(nLines);
        }
        str += a[i].str;
    }
}

// Replace the entire document with new elements
document.selection.SelectAll();
document.selection.Text = str;
status = "Duplicate lines deleted.";
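
If you want to check the logic outside EmEditor, here is a minimal stand-alone sketch of the same sort / mark / sort-again idea on a plain array of strings. It is only an illustration: removeDuplicateLines is a hypothetical helper name that does not exist in the macro above, and it does not touch the document or the status bar.

```javascript
// Stand-alone sketch of the sort / mark / re-sort idea; no EmEditor objects are used.
// removeDuplicateLines is a hypothetical helper name, not part of the macro above.
function removeDuplicateLines( lines )
{
    var a = [];
    var i;

    // Remember each line's original position (1-based, so 0 can mean "deleted").
    for( i = 0; i < lines.length; i++ ){
        a.push( { index: i + 1, str: lines[i] } );
    }

    // Sort by text; ties are broken by position so the first occurrence comes first.
    a.sort( function( x, y ){
        if( x.str > y.str ) return 1;
        if( x.str < y.str ) return -1;
        return x.index - y.index;
    });

    // Equal lines are now adjacent: mark every repeat after the first.
    for( i = 1; i < a.length; i++ ){
        if( a[i].str == a[i-1].str ){
            a[i].index = 0;
        }
    }

    // Restore the original order; marked entries (index 0) move to the front.
    a.sort( function( x, y ){ return x.index - y.index; } );

    // Keep only the unmarked lines.
    var kept = [];
    for( i = 0; i < a.length; i++ ){
        if( a[i].index != 0 ){
            kept.push( a[i].str );
        }
    }
    return kept;
}

// Example: the first "b" and the first "a" are kept, later repeats are dropped.
var result = removeDuplicateLines( [ "b\n", "a\n", "b\n", "c\n", "a\n" ] );
```

The sketch returns the kept lines as an array, so the final text would be result.join( "" ). Joining once at the end may be faster than repeated += on very large files, but I have not measured that here.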