How to decode Base64 to original String?

What is the best way to create a String in Java that will take up the least amount of kilobytes?

  • UPDATE: Although this question is a good discussion of String compression, it is a bad idea to try to store a large 64kb record in DynamoDB, because you are charged by the kb that you store. The better way to store large data is to keep the lookup data in DynamoDB, put the large data in S3, and combine the two sources when you want to bring a result back.

    I need to get a String (a very, very big block of JSON) down to the smallest physical size possible so that I can get it under 64kb. So I am trying to find the most efficient way to encode and decode a String to reduce its size.

    As an example, let's say I had this Donut JSON, but instead of the 7 toppings there were thousands, enough to create a String of more than 64kb. What can I do to reduce the size to less than 64kb (apart from removing white space)?

    My motivation is to store it in Amazon DynamoDB, which has a limit of 64kb per item: http://aws.amazon.com/dynamodb/faqs/#Is_there_a_limit_on_the_number_of_attributes_an_item_can_have

    {
        "id": "0001",
        "type": "donut",
        "name": "Cake",
        "ppu": 0.55,
        "batters": {
            "batter": [
                { "id": "1001", "type": "Regular" },
                { "id": "1002", "type": "Chocolate" },
                { "id": "1003", "type": "Blueberry" },
                { "id": "1004", "type": "Devil's Food" }
            ]
        },
        "topping": [
            { "id": "5001", "type": "None" },
            { "id": "5002", "type": "Glazed" },
            { "id": "5005", "type": "Sugar" },
            { "id": "5007", "type": "Powdered Sugar" },
            { "id": "5006", "type": "Chocolate with Sprinkles" },
            { "id": "5003", "type": "Chocolate" },
            { "id": "5004", "type": "Maple" }
        ]
    }

  • Answer:

    I'd probably use java.util.zip. For the kind of string you're talking about, you'll probably get compression of at least 10x and perhaps more. The code is pretty simple:

        ByteArrayOutputStream buffer = new ByteArrayOutputStream(65536);
        ZipOutputStream z = new ZipOutputStream(buffer);
        z.putNextEntry(new ZipEntry("json"));   // a zip stream needs an entry before any write
        byte[] bytes = string.getBytes(StandardCharsets.UTF_8);
        z.write(bytes, 0, bytes.length);
        z.close();                              // finishes the entry and flushes everything
        byte[] compressed = buffer.toByteArray();

    Now compressed contains your bytes. If compressed.length > 65536, then you're screwed and you need to think of a different solution. You reverse the process to get your string back:

        InputStream is = new ByteArrayInputStream(compressed);
        ZipInputStream z = new ZipInputStream(is);
        z.getNextEntry();                       // position the stream on the single entry
        byte[] bytes = z.readAllBytes();        // Java 9+; on older JDKs, read in a loop until -1
        String string = new String(bytes, StandardCharsets.UTF_8);

    Note that the decompressed data can be larger than 64kb, so a single read into a fixed 65536-byte array is not enough.

    If that's not sufficient, you need to parse the JSON string into a real data structure and store that instead. That'll be even more efficient in both space and time (and a better way to access the data than error-prone access by name, which forces a lot of code changes every time you change the JSON format), but the zip approach is simple, generic, and already solved.
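For a single blob of text, the gzip classes in the same java.util.zip package avoid the zip-entry bookkeeping. The following is a minimal round-trip sketch under that assumption, not the answer author's code; the class name `GzipJson`, the helper `fits`, and the generated test JSON are made up for illustration, and the 64 KB constant is the per-item limit quoted in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipJson {
    // Per-item limit quoted in the question (64 KB); an assumption of this sketch.
    static final int LIMIT_BYTES = 64 * 1024;

    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read the buffer only afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            // read() may return fewer bytes than requested, so loop until end of stream.
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    static boolean fits(byte[] compressed) {
        return compressed.length <= LIMIT_BYTES;
    }

    public static void main(String[] args) {
        // Build a large, repetitive JSON document similar in shape to the donut example.
        StringBuilder sb = new StringBuilder("{\"topping\":[");
        for (int i = 0; i < 5000; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"id\":\"").append(5000 + i).append("\",\"type\":\"Glazed\"}");
        }
        String json = sb.append("]}").toString();

        byte[] packed = compress(json);
        System.out.println(json.length() + " chars -> " + packed.length + " bytes, fits: " + fits(packed));
        System.out.println("round trip ok: " + decompress(packed).equals(json));
    }
}
```

The compressed bytes would have to be stored as a binary attribute for the size check to be meaningful; encoding them as text first makes them larger again.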

Joshua Engel at Quora


Other answers

If your JSON has a fixed structure, the best way would be to use a popular binary serialization tool to store your data in DynamoDB. For instance, you can define a protocol buffer message (http://code.google.com/p/protobuf/) or a Thrift struct (http://thrift.apache.org/) and get a very compact binary serialization.
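To give a feel for why a fixed-structure binary encoding is smaller, here is a hand-rolled sketch using only the JDK's DataOutputStream. It is a stand-in for illustration, not protocol buffers or Thrift, and its byte layout is not compatible with either. The class name `ToppingCodec` and the assumption that topping ids fit in an unsigned 16-bit number are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class ToppingCodec {
    // Layout: 4-byte count, then per topping a 2-byte id and a length-prefixed string.
    static byte[] encode(Map<Integer, String> toppings) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(toppings.size());
            for (Map.Entry<Integer, String> e : toppings.entrySet()) {
                out.writeShort(e.getKey());  // assumes ids are in the range 0..65535
                out.writeUTF(e.getValue());  // 2-byte length prefix + modified UTF-8 bytes
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    static Map<Integer, String> decode(byte[] data) {
        Map<Integer, String> toppings = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                int id = in.readUnsignedShort();
                String type = in.readUTF();
                toppings.put(id, type);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return toppings;
    }

    public static void main(String[] args) {
        Map<Integer, String> toppings = new LinkedHashMap<>();
        toppings.put(5001, "None");
        toppings.put(5002, "Glazed");
        byte[] packed = encode(toppings);
        System.out.println(packed.length + " bytes, decoded: " + decode(packed));
    }
}
```

The field names "id" and "type" are never written, which is where most of the saving comes from. A real serialization library adds schema evolution and cross-language support that this sketch does not attempt.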

Soheil Hassas Yeganeh

In Java, strings are stored as UTF-16 in memory. You might be able to reduce a string's size if you create your own class that stores strings as UTF-8. But in this case you're going to dump the dictionary into DynamoDB, so the size of the string inside the Java process is irrelevant. The language in which you do the processing is irrelevant too. The only shortcut you could take is to remove the keys and assume that in every JSON document the order and meaning of every value is the same. You could probably implode the topping part of the dictionary into a string like "(id,type)5001|None,5002|Glazed,5005|Sugar...".
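The "implode" idea can be sketched as follows. This is a minimal illustration that assumes ids and types never contain the ',' or '|' separator characters; the class name `ToppingImploder` and its method names are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ToppingImploder {
    // The header records the field order once instead of repeating keys per item.
    static final String HEADER = "(id,type)";

    static String implode(Map<String, String> toppings) {
        StringBuilder sb = new StringBuilder(HEADER);
        boolean first = true;
        for (Map.Entry<String, String> e : toppings.entrySet()) {
            if (!first) sb.append(',');
            sb.append(e.getKey()).append('|').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    static Map<String, String> explode(String packed) {
        Map<String, String> toppings = new LinkedHashMap<>();
        String body = packed.substring(HEADER.length());
        if (body.isEmpty()) {
            return toppings;
        }
        for (String pair : body.split(",")) {
            int bar = pair.indexOf('|');  // assumes exactly one '|' per pair
            toppings.put(pair.substring(0, bar), pair.substring(bar + 1));
        }
        return toppings;
    }

    public static void main(String[] args) {
        Map<String, String> toppings = new LinkedHashMap<>();
        toppings.put("5001", "None");
        toppings.put("5002", "Glazed");
        toppings.put("5005", "Sugar");
        String packed = implode(toppings);
        System.out.println(packed);
        System.out.println(explode(packed));
    }
}
```

If a value could contain a separator, the format would need escaping or different delimiters, at which point a real serialization format is probably the safer choice.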

Cristian Andreica

Using Snappy (http://code.google.com/p/snappy-java/) together with Base64 encoding works nicely. The version below uses java.util.Base64 (Java 8+) in place of sun.misc.BASE64Encoder and BASE64Decoder, which are internal classes that newer JDKs no longer ship:

    public String squash(String squashThis) throws IOException {
        byte[] compressed = Snappy.compress(squashThis.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(compressed);
    }

    public String unSquash(String unSquashThis) throws IOException {
        byte[] uncompressed = Snappy.uncompress(Base64.getDecoder().decode(unSquashThis));
        return new String(uncompressed, StandardCharsets.UTF_8);
    }
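One thing to keep in mind with the Base64 step is that it turns every 3 bytes into 4 characters, so it gives back part of what compression saved. A small JDK-only sketch of that arithmetic follows; the class name `Base64Overhead` and the helper `encodedLength` are hypothetical, and the 64 KB figure is the limit quoted in the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Overhead {
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Length of padded Base64 output for n input bytes: 4 * ceil(n / 3).
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] raw = "donut".getBytes(StandardCharsets.UTF_8);
        String text = encode(raw);
        String back = new String(decode(text), StandardCharsets.UTF_8);
        System.out.println(text + " -> " + back);

        // 48 KB of compressed bytes already fills a 64 KB budget once Base64-encoded.
        System.out.println(encodedLength(48 * 1024) + " characters for 48 KB of input");
    }
}
```

So if the value is stored as a Base64 string, the compressed payload has to stay under roughly 48 KB; storing the raw compressed bytes as a binary attribute avoids that overhead.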

Matt Wood
