Why does Python use str.encode and str.decode conventions?
Concerning str.encode and str.decode in Python 2.7: why do we decode a string to make it unicode? It seems more logical to encode a string to unicode, considering that a string is an array of 8-bit characters. I'm therefore encoding, in some cases, two or more characters into one unicode character. Likewise, if I encode a unicode character, I'm (according to Python 2.7) converting one character into one or more. It seems more logical here to decode a unicode character. I just don't get the logic of these conventions and was hoping someone had an explanation to make this concept a little easier to grasp.
Answer:
There are two separate concepts: "byte string" and "character string". The former is a sequence of bytes (8-bit values), whereas the latter is a sequence of characters (an abstract concept representing a unit of text). The former is ideal for representing raw binary data; the latter is ideal for representing text.
In Python 2.x, character strings are implemented by the unicode type, and byte strings are implemented by the str type. Despite its name, "str" is not for representing general character strings. Part of your confusion may arise from this. In Python 3.x, character strings are implemented by the str type, and byte strings are implemented by the bytes type.
The mapping between a character string and a byte string is called an "encoding" (UTF-8, for example). Regardless of which encoding is used, converting from a character string to binary data is always called "encoding", and converting from binary data back into a character string is always called "decoding".
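As a rough illustration of the direction of each call, here is a minimal sketch. It assumes Python 3, where str is the character string and bytes is the byte string; in Python 2.x the same round trip would start from a unicode object and produce a str. The sample word and variable names are only for illustration.

```python
# Assumes Python 3: str holds characters, bytes holds raw 8-bit values.
text = "caf\u00e9"                  # "cafe" with an acute accent: 4 characters

data = text.encode("utf-8")         # characters -> bytes: "encoding"
round_trip = data.decode("utf-8")   # bytes -> characters: "decoding"

print(type(text).__name__, len(text))   # str 4
print(type(data).__name__, len(data))   # bytes 5 (the accented letter needs 2 bytes)
print(round_trip == text)               # True
```

The character count and the byte count differ, which is why the two types are kept separate.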
Xuan Luo at Quora
Other answers
Unicode characters are not things with a fixed "physical" representation. There are a number of schemes for converting Unicode characters into actual bits and bytes, and these schemes are called encodings. When you use encode(), you are encoding a Unicode string into some bytes. When you decode, you are decoding the Unicode string that was encoded in those bytes. To encode is "to format (electronic data) according to a standard format", so converting a Unicode code point to bytes according to, say, UTF-8 is encoding, while converting UTF-8 bytes into whatever your internal representation for Unicode happens to be is decoding. So, yes, you could bend over backwards and say "I am decoding these UTF-8 bytes into the UTF-16 (or whatever) the Python interpreter is using internally", but that's not reasonable, because you are not even specifying the output format when you decode :-)
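To make the "no fixed physical representation" point concrete, the following sketch (assuming Python 3; the euro sign is just a convenient sample character) encodes one code point with three different encodings. Note that decode() names only the format of the incoming bytes and takes no output format.

```python
# Assumes Python 3. One abstract character, several byte representations.
text = "\u20ac"  # the euro sign, a single code point (U+20AC)

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-be")

print(utf8.hex())    # e282ac
print(utf16.hex())   # ac20
print(utf32.hex())   # 000020ac

# Decoding names only the source encoding; the result is always a str.
same = utf8.decode("utf-8")
print(same == text)  # True

# Decoding with the wrong encoding "succeeds" here but yields different text.
wrong = utf8.decode("latin-1")
print(len(wrong))    # 3
```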
Roberto Alsina
Unicode is the preferred representation of text. From that point of view, encoding means converting the text to a particular character encoding, for example to print it to a console or send it over a network connection. Decoding is the opposite, i.e., converting a sequence of bytes back to Unicode. This is consistent with the general use of the terms "encode" and "decode". For example, if you want to save a Python data structure to a file or send it over the network, you first encode it as JSON, YAML, pickle, or some other format, and later decode it to get the data structure back.
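The analogy can be sketched with the standard json module, assuming Python 3. The record contents are made up for illustration. Both steps move from the richer in-memory form toward a serialized form on the way out, and back again on the way in.

```python
import json

# Hypothetical sample data, used only to illustrate the round trip.
record = {"name": "Zo\u00eb", "visits": 3}

as_text = json.dumps(record)         # data structure -> JSON text (encode)
as_bytes = as_text.encode("utf-8")   # text -> bytes (encode)

# Reverse each step in the opposite order to get the structure back.
restored = json.loads(as_bytes.decode("utf-8"))
print(restored == record)            # True
```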
Larry Rosenstein