Message Compressor
Satellite communication is expensive! I get 5 texts a day, and that includes sending, receiving, and weather forecasts. As you might remember from the days before 4G, the character limit on a text can be as low as 148 GSM characters, with longer messages being split into multiple pages and counting as multiple texts.
But there's a lot of unused space in that character set. How often do you and I communicate using the uppercase Greek alphabet? How often does the character sequence zQ appear in one of our messages?
I built a compressor to pack as much information into those 148 characters as possible. A Huffman prefix code tree is built using the frequency of each word from a 2016 Wikipedia dump. More common words (e.g. "the") are assigned shorter binary codes, and less common words (e.g. "transglycosylation") are assigned longer ones.
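To make the idea concrete, here is a minimal sketch of building a word-level Huffman code in Python. It is not the compressor's actual code: the function name `build_huffman_codes` and the toy frequency table are assumptions for illustration, and the counts are made up rather than taken from the Wikipedia dump.

```python
import heapq
from itertools import count


def build_huffman_codes(freqs):
    """Map each word to a binary prefix code; frequent words get shorter codes."""
    tiebreak = count()  # unique counter so the heap never has to compare tree nodes
    heap = [(f, next(tiebreak), word) for word, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single word still needs a one-bit code
        return {heap[0][2]: "0"}
    # Repeatedly merge the two least frequent nodes into one subtree.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    # Walk the tree: leaves are words (str), internal nodes are (left, right) tuples.
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:
            codes[node] = prefix
    return codes


# Hypothetical counts, for illustration only.
toy_freqs = {"the": 1000, "of": 600, "satellite": 40, "transglycosylation": 1}
codes = build_huffman_codes(toy_freqs)
bits = "".join(codes[w] for w in ["the", "satellite", "of", "the"])
```

With these toy counts, "the" ends up with a 1-bit code and the two rare words with 3-bit codes. Because no code is a prefix of another, the concatenated bit string can be decoded without separators between words.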
The resulting binary is packed into 7-bit characters from the GSM character set for maximum information transmission.
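At 7 bits per character, a 148-character text carries 148 × 7 = 1036 bits. The packing step could look roughly like the sketch below. The helper names `pack_septets` and `unpack_septets` are hypothetical, and the sketch stops at 7-bit integer values: mapping each value to a printable GSM character needs the GSM 03.38 default alphabet table, which is omitted here. It also assumes the decoder is given the original bit length so it can discard the padding.

```python
def pack_septets(bits):
    """Split a bit string into 7-bit values (0..127), zero-padding the last one."""
    padded = bits + "0" * (-len(bits) % 7)
    return [int(padded[i:i + 7], 2) for i in range(0, len(padded), 7)]


def unpack_septets(septets, bit_length):
    """Rebuild the bit string and drop the padding added by pack_septets."""
    bits = "".join(format(s, "07b") for s in septets)
    return bits[:bit_length]


message_bits = "1011001110"           # 10 bits, e.g. a few concatenated word codes
septets = pack_septets(message_bits)  # two 7-bit values, the second one padded
restored = unpack_septets(septets, len(message_bits))
```

A real encoder would also have to decide how to handle values that are awkward to send as characters, such as 0x1B, which the GSM alphabet reserves as the escape to its extension table.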
I'd appreciate it if you could use the compressor for any long messages! If the page compression percentage below is above 0%, the compression saves me money.
Unknown Words (Not in Dictionary)
These words will be escaped and encoded raw, reducing compression efficiency:
Compression Results
Compression Statistics
Original Length: | 0 |
Encoded Length: | 0 |
Compression Ratio: | 0.0% |
Original SMS Pages: | 0 |
Encoded SMS Pages: | 0 |
Page Compression: | 0.0% |
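For reference, the statistics above could be computed along these lines. This is a sketch under assumptions, not the page's actual formulas: it assumes lengths are measured in characters, that every page holds a flat 148 characters, and that both percentages mean "share saved relative to the original". The names `sms_pages`, `percent_saved`, and `compression_stats` are hypothetical.

```python
import math

PAGE_LIMIT = 148  # characters per text, from the limit quoted earlier (assumed flat per page)


def sms_pages(length, limit=PAGE_LIMIT):
    """Number of texts needed for a message of the given length."""
    return math.ceil(length / limit)


def percent_saved(before, after):
    """Percentage reduction from before to after; 0.0 for an empty message."""
    return 0.0 if before == 0 else 100.0 * (before - after) / before


def compression_stats(original_length, encoded_length):
    original_pages = sms_pages(original_length)
    encoded_pages = sms_pages(encoded_length)
    return {
        "Original Length": original_length,
        "Encoded Length": encoded_length,
        "Compression Ratio": percent_saved(original_length, encoded_length),
        "Original SMS Pages": original_pages,
        "Encoded SMS Pages": encoded_pages,
        "Page Compression": percent_saved(original_pages, encoded_pages),
    }


stats = compression_stats(300, 140)  # a 3-page message squeezed into 1 page
```

Under these assumptions, a message only saves a text when the encoded length crosses a page boundary: shrinking 300 characters to 140 drops it from 3 pages to 1, while shrinking 140 characters to 100 saves nothing.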