RedLZSS lzss compression/decompression


see http://michael.dipperstein.com/lzss/

and http://en.wikipedia.org/wiki/Lempel-Ziv-Storer-Szymanski


a dictionary look-up compressor that performs best with long repeating patterns.  e.g. [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]

only works with 8bit Integer (0-255)


see also: RedBase64 RedHuffman RedLZ77 RedLZ78 RedLZW RedRLE


*compress(array)

expects an array of 8bit integers

returns a string of binary numbers

*decompress(string)

expects a string of binary numbers

returns an array of 8bit integers

*binaryStringToBytes(string)

converts a binary string into 8-bit bytes.

up to 7 zeros might be padded automatically at the end.  the number of zeros is stored in pad.

pad is semi-private and used when converting back to bytes.

*bytesToBinaryString(array)

converts an array of 8-bit bytes into a binary string.

<>pad

semi-private counter for number of padded zeros.

set by binaryStringToBytes and read by bytesToBinaryString.

<>window

maximum sliding window size.  default= 4096

<>length

maximum length for matching pattern.  default= 32



//--

a= [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3];

b= RedLZSS.compress(a);

RedLZSS.binaryStringToBytes(b).size/a.size; //compressed to 29%

c= RedLZSS.decompress(b);

a==c



a= "AABCBBABC".ascii;

b= RedLZSS.compress(a);

c= RedLZSS.decompress(b);

c.collect{|x| x.asAscii}.join;

a==c



a= "abracadabra".ascii;

b= RedLZSS.compress(a);

c= RedLZSS.decompress(b);

a==c



a= "JOEYNJOEYNJOEYJOEYNJOEYNJOEYJOEYNJOEYNJOEYJOEYNJOEYNJOEY".ascii;

b= RedLZSS.compress(a);

c= RedLZSS.binaryStringToBytes(b);

c.size/a.size; //compressed to 27%

d= RedLZSS.bytesToBinaryString(c);

e= RedLZSS.decompress(d);

e.collect{|x| x.asAscii}.join;

a==e



a= "Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.".ascii;

b= RedLZSS.compress(a);

RedLZSS.binaryStringToBytes(b).size/a.size; //compressed to 99%

c= RedLZSS.decompress(b);

c.collect{|x| x.asAscii}.join;

a==c



a= {|i| [0, 0, 0, 1].choose}.dup(5000);

RedLZSS.window= 512; //decrease window and length to speed up compression

RedLZSS.length= 16;

b= RedLZSS.compress(a); //very slow to compress large arrays

RedLZSS.binaryStringToBytes(b).size/a.size; //compressed to around 22% (also 22% with default window and length values)

c= RedLZSS.decompress(b); //fast to decompress

a==c