What Is Base64 Encoding and How Does It Work?

Hello friends, you have probably come across the term Base64 encoding here and there. It is one of several encoding schemes, and it is worth a closer look.

Today we are going to discuss Base64 and how it is used.

Base64 Encoding

In simple terms, Base64 encoding is used in environments where, perhaps for legacy reasons, the storage or transfer of data is limited to ASCII characters.

According to Wikipedia:

Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation.

When you have some binary data that you want to ship across a network, you generally don’t just stream the raw bits and bytes over the wire, because some media are made for streaming text. Some protocols may interpret your binary data as control characters (like a modem does), or your data could get mangled because the underlying protocol thinks you’ve entered a special character combination (like how FTP translates line endings).

So to get around this, people encode the binary data into characters. Base64 is one of these types of encodings.

Image data and file data are often transferred as Base64-encoded text.
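To make the point above concrete, here is a minimal sketch that encodes a few awkward bytes and decodes them again. The sample byte values and the class name `RoundTripSketch` are made up for this illustration; `Convert.ToBase64String` and `Convert.FromBase64String` are the standard .NET calls.

```csharp
using System;
using System.Linq;

static class RoundTripSketch
{
    // Encodes raw bytes as Base64 text, then decodes the text back to bytes.
    public static byte[] RoundTrip(byte[] raw, out string text)
    {
        text = Convert.ToBase64String(raw);
        return Convert.FromBase64String(text);
    }

    static void Main()
    {
        // Hypothetical sample data: NUL, LF, CR, 0xFF and 0x80 are all
        // bytes that a text-oriented channel might alter or reject.
        byte[] raw = { 0x00, 0x0A, 0x0D, 0xFF, 0x80 };

        byte[] decoded = RoundTrip(raw, out string text);

        Console.WriteLine(text);                       // only A-Z, a-z, 0-9, +, / and =
        Console.WriteLine(raw.SequenceEqual(decoded)); // True when nothing was lost
    }
}
```

The encoded text contains only printable characters, so it can pass through a channel that would not handle the raw bytes safely.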

How does it work?

Base64 encoding takes the input three bytes at a time, each byte consisting of eight bits, and represents those 24 bits as four printable characters from the ASCII standard.

The 24 bits of each three-byte block are split into four groups of 6 bits, and each 6-bit group (a value from 0 to 63) is mapped to one character. Padding is added if the number of data bytes is not a multiple of 3.
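The three-bytes-to-four-characters idea can be sketched with a few bit shifts. This is only an illustration under my own naming (`GroupSketch`, `EncodeGroup`), not how the .NET library is implemented; the alphabet string is the standard Base64 character set, and the sample bytes for “Man” are my own choice.

```csharp
using System;

static class GroupSketch
{
    // The standard Base64 alphabet: index 0 is 'A', index 63 is '/'.
    const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Packs three bytes into one 24-bit number, then reads it back
    // as four 6-bit values, each used as an index into the alphabet.
    public static string EncodeGroup(byte b0, byte b1, byte b2)
    {
        int bits = (b0 << 16) | (b1 << 8) | b2;
        char[] chars =
        {
            Alphabet[(bits >> 18) & 0x3F],
            Alphabet[(bits >> 12) & 0x3F],
            Alphabet[(bits >> 6) & 0x3F],
            Alphabet[bits & 0x3F]
        };
        return new string(chars);
    }

    static void Main()
    {
        // "Man" is the bytes 77, 97, 110 in ASCII.
        Console.WriteLine(EncodeGroup(77, 97, 110));
        // The built-in encoder should print the same four characters.
        Console.WriteLine(Convert.ToBase64String(new byte[] { 77, 97, 110 }));
    }
}
```

Since three full bytes are supplied, no padding is involved in this sketch.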

Now let’s see how it works with examples:

Encoding “A” with UTF-7 (1 byte)

I’ve written some very simple code to show this with an example:

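using System;

class Program
{
    static void Main(string[] args)
    {
        string myData = "A";
        // UTF-7 encodes "A" as the single byte 65 (Encoding.UTF7 is marked obsolete in .NET 5 and later).
        byte[] dataBytes = System.Text.Encoding.UTF7.GetBytes(myData);
        // Print the Base64 text for those bytes.
        Console.WriteLine(System.Convert.ToBase64String(dataBytes));
        Console.ReadLine(); // wait for Enter so the console window stays open
    }
}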

In this first example I am converting “A”. Before discussing how it works, let’s look at the output. If I run this code, the output is as shown below.

[Image: Base64 encoding single-byte output]

Now let’s see how this is calculated.

If you are not familiar with encoding and decoding in C#, you can check this.

In the first line of code, I assign the character “A” to a variable.

In the next line, I use UTF-7 encoding to get the data bytes for “A”.

[Image: Base64 encoding single-byte code]

  1. As we can see, UTF-7 gives us 1 data byte, i.e. “65”.

Binary of 65 = 01000001

  2. As discussed, the number of data bytes should be a multiple of 3. Our data has only one byte, so we need to add 2 bytes of padding.

2 bytes of padding = 00000000 00000000

So the total data bytes are: 01000001 00000000 00000000
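The padding rule above comes down to a little modular arithmetic. A minimal sketch, with the helper name `PadBytesNeeded` being my own:

```csharp
using System;

static class PaddingSketch
{
    // Number of zero bytes needed to reach the next multiple of three.
    public static int PadBytesNeeded(int byteCount)
    {
        return (3 - byteCount % 3) % 3;
    }

    static void Main()
    {
        for (int n = 1; n <= 6; n++)
        {
            Console.WriteLine(n + " data byte(s) -> " + PadBytesNeeded(n) + " padding byte(s)");
        }
    }
}
```

The outer `% 3` is there so that an input which is already a multiple of three gets 0 padding bytes rather than 3.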

  3. Now split the bits above into groups of 6.

First group – 010000

Second group – 010000

Third group – 000000

Fourth group – 000000

  4. Now calculate the decimal value of each group.

First group – 010000 – 16 (Decimal)

Second group – 010000 – 16 (Decimal)

Third group – 000000 – This group contains only padding bits, so it becomes the padding symbol “=”

Fourth group – 000000 – This group contains only padding bits, so it becomes the padding symbol “=”
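The binary-to-decimal conversion can also be checked in code rather than by hand. A small sketch, assuming the groups are held as strings of 0s and 1s; `ToDecimal` is a hypothetical helper name, while `Convert.ToInt32(string, int)` is a standard .NET overload.

```csharp
using System;

static class SextetValueSketch
{
    // Interprets a string of '0' and '1' characters as a base-2 number.
    public static int ToDecimal(string bits)
    {
        return Convert.ToInt32(bits, 2);
    }

    static void Main()
    {
        // 010000 is the group from the example; the other two show the range limits.
        string[] groups = { "010000", "000000", "111111" };
        foreach (string g in groups)
        {
            Console.WriteLine(g + " = " + ToDecimal(g));
        }
    }
}
```

Six bits can only hold values from 0 to 63, which is why the encoding needs exactly 64 characters.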

  5. Now use the chart below to get the character for each decimal value.

[Image: Base64 character set chart. Values 0 to 25 map to A to Z, 26 to 51 map to a to z, 52 to 61 map to 0 to 9, 62 maps to “+” and 63 maps to “/”.]

First group – 010000 – 16 (Decimal) – Q

Second group – 010000 – 16 (Decimal) – Q

Third group – 000000 – Padding only, so the symbol is “=”

Fourth group – 000000 – Padding only, so the symbol is “=”

So the calculated Base64 string is “QQ==”.
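The whole hand calculation can be reproduced in a short program. This is a sketch that deliberately follows the manual procedure above (pad with zero bytes, write out the bits, cut into groups of six, look up each value), so it is much slower than the built-in encoder and is meant only for checking the arithmetic. The names `ManualBase64Sketch` and `Encode` are my own.

```csharp
using System;
using System.Linq;
using System.Text;

static class ManualBase64Sketch
{
    const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    public static string Encode(byte[] data)
    {
        // Pad with zero bytes up to a multiple of three.
        int padBytes = (3 - data.Length % 3) % 3;
        byte[] padded = data.Concat(new byte[padBytes]).ToArray();

        // Write every byte as 8 binary digits in one long string.
        string bits = string.Concat(
            padded.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));

        var result = new StringBuilder();
        int groupCount = bits.Length / 6;
        for (int i = 0; i < groupCount; i++)
        {
            // The last padBytes groups hold nothing but padding bits.
            if (i >= groupCount - padBytes)
            {
                result.Append('=');
                continue;
            }
            // Decimal value of the 6-bit group, then the alphabet lookup.
            int value = Convert.ToInt32(bits.Substring(i * 6, 6), 2);
            result.Append(Alphabet[value]);
        }
        return result.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Encode(new byte[] { 65 }));                 // the "A" example
        Console.WriteLine(Convert.ToBase64String(new byte[] { 65 })); // built-in, for comparison
    }
}
```

Both lines should print the same string, which is a quick way to confirm that the manual steps agree with `Convert.ToBase64String`.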

UTF-7 Multibyte Data

Let’s consider an example where we get multiple bytes. Change the code to use a character from a different language in place of “A”; the rest of the code stays the same.

Let’s look at the output. If I run this code, the output is as shown below.

[Image: Base64 encoding multibyte output]

Now let’s see how this is calculated.

In the first line of code, I assign a character from a different language to the variable.

In the next line, I use UTF-7 encoding to get the data bytes.

[Image: Base64 encoding multibyte code]

  1. As we can see, UTF-7 gives us 5 data bytes, i.e. “43”, “65”, “75”, “73”, “45”. Their bits are calculated below.

43-00101011

65-01000001

75-01001011

73-01001001

45-00101101
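The bit patterns listed above can be generated instead of worked out by hand. A minimal sketch, assuming the five byte values from the list; `ToBits` is a hypothetical helper name, and `Convert.ToString(byte, int)` is the standard .NET overload for base conversion.

```csharp
using System;

static class BitsSketch
{
    // Formats one byte as exactly eight binary digits.
    public static string ToBits(byte value)
    {
        // Convert.ToString drops leading zeros, so pad back to 8 digits.
        return Convert.ToString(value, 2).PadLeft(8, '0');
    }

    static void Main()
    {
        byte[] dataBytes = { 43, 65, 75, 73, 45 };
        foreach (byte b in dataBytes)
        {
            Console.WriteLine(b + "-" + ToBits(b));
        }
    }
}
```

The `PadLeft` call matters: without it, 43 would print as 101011 and the 6-bit grouping that follows would be misaligned.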

  2. As discussed, the number of data bytes should be a multiple of 3. Our data has 5 bytes, so we need to add 1 byte of padding.

1 byte of padding = 00000000

So the total data bytes are:

00101011 01000001 01001011 01001001 00101101 00000000

  3. Now split the bits above into groups of 6.

First group – 001010

Second group – 110100

Third group – 000101

Fourth group – 001011

Fifth group – 010010

Sixth group – 010010

Seventh group – 110100

Eighth group – 000000
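The regrouping is easy to get wrong by hand because the 6-bit boundaries do not line up with the byte boundaries. Here is a small sketch that does the split on a bit string; the class and method names are hypothetical, and the input is the six bytes from the padding step written without spaces.

```csharp
using System;
using System.Collections.Generic;

static class GroupingSketch
{
    // Cuts a bit string into consecutive groups of six characters.
    // Assumes the length is a multiple of six, as it is after padding.
    public static List<string> SplitIntoSextets(string bits)
    {
        var groups = new List<string>();
        for (int i = 0; i + 6 <= bits.Length; i += 6)
        {
            groups.Add(bits.Substring(i, 6));
        }
        return groups;
    }

    static void Main()
    {
        // Bytes 43, 65, 75 followed by bytes 73, 45 and the zero padding byte.
        string bits = "001010110100000101001011" + "010010010010110100000000";
        Console.WriteLine(string.Join(" ", SplitIntoSextets(bits)));
    }
}
```

Six bytes are 48 bits, so the split always produces eight groups here.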

  4. Now calculate the decimal value of each group.

First group – 001010 – 10

Second group – 110100 – 52

Third group – 000101 – 5

Fourth group – 001011 – 11

Fifth group – 010010 – 18

Sixth group – 010010 – 18

Seventh group – 110100 – 52

Eighth group – 000000 – This group contains only padding bits, so it becomes the padding symbol “=”

  5. Now use the chart above to get the character for each decimal value.

First group – 001010 – 10 – K

Second group – 110100 – 52 – 0

Third group – 000101 – 5 – F

Fourth group – 001011 – 11 – L

Fifth group – 010010 – 18 – S

Sixth group – 010010 – 18 – S

Seventh group – 110100 – 52 – 0

Eighth group – 000000 – Padding only, so the symbol is “=”

So the calculated Base64 string is “K0FLSS0=”.
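As a final check, the five bytes from this walk-through can be fed straight to the built-in encoder and decoded again. This sketch starts from the byte values rather than from the original character, so it does not depend on `Encoding.UTF7`; the class name is made up. Read as ASCII, the five bytes are the text “+AKI-”, which matches the UTF-7 convention of wrapping a non-ASCII character between a plus sign and a hyphen.

```csharp
using System;
using System.Text;

static class MultiByteCheck
{
    static void Main()
    {
        // The five data bytes from the walk-through.
        byte[] dataBytes = { 43, 65, 75, 73, 45 };

        // Encode with the built-in method and compare with the manual result.
        string encoded = Convert.ToBase64String(dataBytes);
        Console.WriteLine(encoded);

        // Decoding returns the original five bytes.
        byte[] decoded = Convert.FromBase64String(encoded);
        Console.WriteLine(string.Join(",", decoded));

        // Read as ASCII, those bytes are the text that UTF-7 produced.
        Console.WriteLine(Encoding.ASCII.GetString(decoded));
    }
}
```

Note that 5 input bytes became 8 output characters: Base64 output is always a multiple of four characters long.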