
Encoding using C#

Hi Friends, we discussed encoding and decoding in our previous article, so in this article we are going to discuss how we can implement encoding using C#.

Let’s summarize encoding\decoding:

Computers don’t understand these characters. Computers understand only one language, i.e. 0\1. These 0s and 1s are electric signals which are used to maintain a state in computer memory. This state can be accessed later on and transformed into desired results.

Every character which we type or see on a computer is saved somewhere in the form of 0s and 1s. For example, if I type my name “Deepak Gera”, then this name will be converted into a stream of 0s and 1s using some algorithm, and this stream will be stored somewhere in the computer.

Later on, when I try to access my name, this stream will be read from the memory location and transformed back into characters using the same algorithm which was used previously for the transformation.

“The process of transforming characters into stream of bytes is called as Encoding”

“The process of transforming encoded bytes into characters is called as Decoding”

Encoding using C#

Create a console application and write the following code in the Program.cs file.

using System;

class Program
{
    static void Main(string[] args)
    {
        string myData = "A";
        byte[] encodedData = Encode(myData);
        // Print the encoded bytes; printing the array directly would only show its type name.
        Console.WriteLine($"Encoded Data: {BitConverter.ToString(encodedData)}");

        string origData = Decode(encodedData);
        Console.WriteLine($"Original Data: {origData}");
        Console.ReadLine();
    }

    // Encodes the given text into a byte array using the UTF-8 schema.
    public static byte[] Encode(string text)
    {
        byte[] dataBytes = System.Text.Encoding.UTF8.GetBytes(text);
        return dataBytes;
    }

    // Decodes the given byte array back into a string using the UTF-8 schema.
    public static string Decode(byte[] dataBytes)
    {
        string returntext = System.Text.Encoding.UTF8.GetString(dataBytes);
        return returntext;
    }
}

The above code has 2 methods: one method for encoding in C# and another for decoding.

As you can see, I’ve used the UTF-8 encoding schema, so when I debug this code, I get 1 byte for the character “A”. The ASCII code for “A” is 65. If I use UTF-7, I’ll get the same result, as “A” is also encoded in 1 byte in UTF-7.


Let’s check another character which takes 2 bytes. The “¢” symbol takes 2 bytes, so let’s check this with the UTF-8 schema. When we debug, we can see that this character takes 2 bytes.

If we encode the same character using the UTF-7 schema, it will be converted using symbols from ASCII, so it will take more bytes. That is why UTF-7 is considered less efficient for multi-byte characters.
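You can verify this yourself by adding the following lines to the Main method above (a minimal sketch; the byte values in the comments are what UTF-8 defines for this character):

byte[] utf8Bytes = System.Text.Encoding.UTF8.GetBytes("¢");
Console.WriteLine(utf8Bytes.Length);                    // 2
Console.WriteLine(BitConverter.ToString(utf8Bytes));    // C2-A2

byte[] utf7Bytes = System.Text.Encoding.UTF7.GetBytes("¢");
Console.WriteLine(utf7Bytes.Length);                    // more than 2, since ¢ is spelled out with ASCII symbols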


In the same way you can perform encoding in C# using different schemas.

ASCII

byte[] dataBytes = System.Text.Encoding.ASCII.GetBytes(text);

UTF-16 (Little Endian)

byte[] dataBytes = System.Text.Encoding.Unicode.GetBytes(text);

UTF-16 (Big Endian)

byte[] dataBytes = System.Text.Encoding.BigEndianUnicode.GetBytes(text);

In little endian machines, the least significant byte of the binary representation of a multi-byte datatype is stored first. On the other hand, in big endian machines, the most significant byte is stored first. You can read more in our article on Big Endian\Little Endian.

UTF-32

byte[] dataBytes = System.Text.Encoding.UTF32.GetBytes(text);

 

 

You can try it yourself and see how the results of these schemas differ.
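As a quick sketch, the following console program (the sample text is just an illustration) prints how many bytes each schema produces for the same string:

using System;
using System.Text;

class EncodingComparison
{
    static void Main()
    {
        string text = "Deepak ¢";   // sample text containing one non-ASCII character

        // ASCII cannot represent "¢" and substitutes '?', so it stays at 1 byte per character.
        Console.WriteLine($"ASCII : {Encoding.ASCII.GetBytes(text).Length} bytes");
        Console.WriteLine($"UTF-8 : {Encoding.UTF8.GetBytes(text).Length} bytes");
        Console.WriteLine($"UTF-16: {Encoding.Unicode.GetBytes(text).Length} bytes");
        Console.WriteLine($"UTF-32: {Encoding.UTF32.GetBytes(text).Length} bytes");
    }
}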


Little and Big Endian Mystery

Hello friends, we’ve already discussed encoding and its types, so let’s now discuss a little bit about the “Little and Big Endian Mystery”.

It is a small topic, but it plays an important role whenever we talk about data transfer and storage.

What are these Little and Big Endian?

Both are ways to store multi-byte data types, e.g. int, float, etc.

In little endian machines, the least significant byte of the binary representation of a multi-byte datatype is stored first. On the other hand, in big endian machines, the most significant byte is stored first.

Their difference is similar to the difference between English and Arabic.

English is written and read from left to right, while Arabic from right to left.

Suppose an integer is stored as 4 bytes. Then a variable x with value 0x01234567 will be stored as 01 23 45 67 (lowest address first) on a big endian machine, and as 67 45 23 01 on a little endian machine.

Big Endian’s Advantages

Easier for (most) humans to read:

When examining memory values, the bytes appear in the same order as the written number. This sometimes also applies to serializing/deserializing values when communicating with networks.

Easier sign checking:

By checking the byte at offset 0 we can easily check the sign.

Easier comparison:

Useful in arbitrary-precision math, as numbers are compared from the most significant digit.

No need for endianness conversion:

No conversion needed when sending/receiving data to/from the network. This is less useful because network adapters can already swap bytes and copy them to memory in the correct order without the help of the CPU, and most modern CPUs have the ability to swap bytes themselves.

Little Endian’s Advantages

Easier parity checking:

Parity checking is easy: by checking the byte at offset 0, we can see whether the number is odd or even.

Easier for some people to read:

Arabic, Hebrew and many other languages write from right to left so they read numbers in little-endian order. Some languages also read number values in little-endian order (like 134 as 4 units, 3 tens and 1 hundred), so it’s easier to know how big the current digit is and the thousand separator will be less useful.

Natural in computation:

  • Mathematics operations mostly work from least to most significant digit, so it’s much easier to work in little endian.
  • This is extremely useful in Arbitrary-precision arithmetic (or any operations that are longer than the architecture’s natural word size like doing 64-bit maths on 32-bit computers) because it would be much more painful to read the digits backwards and do operations.
  • It’s also useful in situations where a computer has limited memory bandwidth (like some 32-bit ARM microcontrollers with a 16-bit bus, or the Intel 8088 with 16-bit registers but an 8-bit data bus). The 32-bit CPU can then do the math 16 bits at a time by reading a half word at address A and adding it while still reading the remaining half word at A+2, then doing the final add, instead of waiting for both reads to finish and then adding from the LSB.

Always reads as the same value:

It always reads the same value if read with a size less than or equal to the written size.

For example, 20 = 0x14 written as a 64-bit value into memory at address A will be stored as 14 00 00 00 00 00 00 00, and will always be read as 20 regardless of whether 8, 16, 32 or 64-bit reads are used (or actually any read with length <= 64 bits at address A, like 24, 48 or 40 bits). This can be extended to arbitrarily longer types.

In a big-endian system you have to know the size in which the value was written in order to read it correctly. For example, to get the least significant byte you need to read the byte at A+n-1 (where n is the length of the write in bytes) instead of at A.

This property also makes it easy to cast the value to a smaller type, like int32 to int16, because the int16 value always lies at the beginning of the int32.

How to check Endianness?

Execute the program below on your machine and you’ll be able to check.

#include <stdio.h>

/* Prints the bytes of the given memory area in the order they are stored. */
void byte_order(char *start, int num)
{
    int i;
    for (i = 0; i < num; i++)
        printf("%.2x ", (unsigned char)start[i]);

    printf("\n");
}

int main()
{
    int n = 0x01234567;
    byte_order((char *)&n, sizeof(n));
    return 0;
}

The above program, when run on a Big Endian machine, produces “01 23 45 67” as output, while on a Little Endian machine it produces “67 45 23 01”.
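If you prefer to stay in C#, a minimal sketch is below; BitConverter exposes the machine’s endianness directly, and GetBytes returns the bytes in memory order:

using System;

class EndiannessCheck
{
    static void Main()
    {
        int n = 0x01234567;
        byte[] bytes = BitConverter.GetBytes(n);     // bytes in the order they are stored in memory

        Console.WriteLine(BitConverter.IsLittleEndian ? "Little Endian machine" : "Big Endian machine");
        Console.WriteLine(BitConverter.ToString(bytes)); // 67-45-23-01 on a little endian machine
    }
}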

Final Note

Both big and little endian have their advantages and disadvantages. Even if one were clearly superior (which is not the case), there is no way that any legacy architecture would ever be able to switch endianness. You can have a look into more details about Endianness on Wiki.

 


What are Encoding Schemas, Types & Differences

Hello friends, today we are going to discuss encoding schemas, their usage, and their differences. Although this is a simple topic, one can still easily get confused with the types and the pros\cons of the various encoding schemas.

We’ll cover the basic information about this topic, so let’s move step by step.

What is Encoding\Decoding

We all communicate with each other using some language, e.g. English, Spanish, German, etc. Each language has a set of characters which we use to write, read and understand.

Do you think that a computer is also able to understand these characters in this simple form?

The answer is No.

Computers don’t understand these characters. Computers understand only one language, i.e. 0\1. These 0s and 1s are signals which are used to maintain a state in computer memory. This state can be accessed later on and transformed into desired results.

Every character which we type or see on a computer is saved somewhere in the form of 0s and 1s. For example, if I type my name “Deepak Gera”, then this name will be converted into a stream of 0s and 1s using some algorithm, and this stream will be stored somewhere in the computer.

Later on, when I try to access my name, this stream will be read from the memory location and transformed back into characters using the same algorithm which was used previously for the transformation.

“The process of transforming characters into stream of bytes is called as Encoding”

“The process of transforming encoded bytes into characters is called as Decoding”

Microsoft says: “Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. In contrast, decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.”

Most popular types of Encoding Schemas

There are many encoding schemas which were discovered\evolved over time based on requirements and the shortcomings of previous ones. Let’s discuss the most popular schemas and see how they evolved.

ASCII

ASCII stands for American Standard Code for Information Interchange. This code is basically used for identifying characters and numerals on a keyboard. ASCII codes are stored as 8-bit sequences and are used for identifying letters, numbers and special characters.

ASCII uses 7 bits to represent a character. By using 7 bits, we can have a maximum of 2^7, i.e. 128 distinct characters. The last (8th) bit was used as a parity bit for error detection.

The ASCII chart shows the mapping between a character and its corresponding numeric value. This numeric value is then converted into a 7-bit binary value and stored\transferred.


But 128 characters are not enough to cover complete languages, so extended ASCII started using the 8th bit as well to encode more characters (to support “é” in French, for example). Using just one extra bit doubled the size of the original ASCII table to map up to 256 characters (2^8 = 256 characters).

The extended range is used for accented letters and other special characters on top of the original 128.


UTF

Extended ASCII solves the problem for languages that are based on the Latin alphabet. But what about the other languages which have completely different character sets? How will those languages be encoded?

That’s the reason behind Unicode. Unicode doesn’t contain every character from every language, but it sure contains a gigantic number of characters.

You can check the entire Unicode character set here.

In the Unicode standard, a character is identified by a “Code Point”. The Unicode character set is divided into 17 groups. Each group is a continuous block of 65,536 (2^16) code points and is called a “Plane”.

There are 17 planes, identified by the numbers 0 to 16.

Plane-0 is called as BMP (Basic Multilingual Plane)

Basic Multilingual Plane:

The first plane, plane 0, the Basic Multilingual Plane (BMP) contains characters for almost all modern languages, and a large number of symbols. A primary objective for the BMP is to support the unification of prior character sets as well as characters for writing. Most of the assigned code points in the BMP are used to encode Chinese, Japanese, and Korean (CJK) characters.

Unicode is a superset of the ASCII character set. We need schemas which can transform a set of characters, via their Unicode code points, into bytes. Below are a few popular schemas which are used here.

UTF-7, UTF-8, UTF-16, UTF-32

Let’s discuss these

UTF-7

UTF-7 is an encoding that is used to encode Unicode characters by using only the ASCII characters. This encoding has the advantage that even in environments or operating systems that understand only 7-bit ASCII, Unicode characters can be represented and transferred.

For example, some Internet protocols, such as SMTP for email, only allow the 128 ASCII characters; bytes with the high bit set are not allowed. All of the other UTF encodings use at least 8 bits, so they cannot be used directly for such purposes.

All those characters which exist in ASCII are converted to their normal ASCII codes. All other characters are encoded and represented with ASCII characters as well. The + marks the beginning of such an encoding; the – (or any other character which cannot occur in the encoding) marks the end.

The German word for cheese, “Käse”, for instance, would be coded as K+AOQ-se. The ASCII characters K, s and e stay the same, while the ä is converted to AOQ (a sequence of other ASCII characters). The beginning and the end of this encoding are marked with + and -. Decoding happens the same way.
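As a small illustration in C# (note that Encoding.UTF7 is marked obsolete in recent .NET versions, precisely because of the robustness issues mentioned in the cons below, so treat this purely as a demo):

using System;
using System.Text;

class Utf7Demo
{
    static void Main()
    {
        string word = "Käse";

        // UTF-7 represents the non-ASCII 'ä' using only ASCII bytes.
        byte[] utf7Bytes = Encoding.UTF7.GetBytes(word);
        Console.WriteLine(Encoding.ASCII.GetString(utf7Bytes)); // K+AOQ-se
        Console.WriteLine(utf7Bytes.Length + " bytes");
    }
}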

Pros:

Compatibility with ASCII character set.

Cons:

Because of issues with robustness and security, you should not use UTF-7 encoding in 8-bit environments where UTF-8 encoding can be used instead.

UTF-8\UTF-16\UTF-32

The main difference between the UTF-8, UTF-16 and UTF-32 character encodings is how many bytes they require to represent a character in memory.

UTF-8 uses a minimum of one byte, while UTF-16 uses a minimum of 2 bytes. If a character’s code point is greater than 127 (the maximum ASCII value), UTF-8 may take 2, 3 or 4 bytes, but UTF-16 will take either two or four bytes. On the other hand, UTF-32 is a fixed-width encoding scheme and always uses 4 bytes to encode a Unicode code point.

The fundamental difference between UTF-32 and UTF-8/UTF-16 is that the former is a fixed-width encoding scheme, while the latter duo are variable-length encodings.
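A small sketch to see this in practice; the sample characters below (an ASCII letter, a BMP character above ASCII, and a character outside the BMP) are just illustrative:

using System;
using System.Text;

class UtfWidthDemo
{
    static void Main()
    {
        string[] samples = { "A", "€", "\U0001D11E" };   // "\U0001D11E" is a musical symbol outside the BMP

        foreach (string s in samples)
        {
            // UTF-8 varies between 1 and 4 bytes, UTF-16 is 2 or 4 bytes, UTF-32 is always 4 bytes.
            Console.WriteLine($"U+{char.ConvertToUtf32(s, 0):X4}: " +
                              $"UTF-8 = {Encoding.UTF8.GetBytes(s).Length}, " +
                              $"UTF-16 = {Encoding.Unicode.GetBytes(s).Length}, " +
                              $"UTF-32 = {Encoding.UTF32.GetBytes(s).Length} bytes");
        }
    }
}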

UTF-8 pros:

  • Basic ASCII characters like digits, Latin characters with no accents, etc. occupy one byte which is identical to US-ASCII representation. This way all US-ASCII strings become valid UTF-8, which provides decent backwards compatibility in many cases.
  • No null bytes appear in the encoded text (except for the NUL character itself), which allows the use of null-terminated strings; this introduces a great deal of backwards compatibility too.
  • UTF-8 is independent of byte order, so you don’t have to worry about Big Endian / Little Endian issue.

UTF-8 cons:

  • Many common characters have different lengths, which slows down indexing by codepoint and makes calculating a codepoint count expensive.
  • Even though byte order doesn’t matter, sometimes UTF-8 still has BOM (byte order mark) which serves to notify that the text is encoded in UTF-8, and also breaks compatibility with ASCII software even if the text only contains ASCII characters. Microsoft software (like Notepad) especially likes to add BOM to UTF-8.

UTF-16 pros:

  • BMP (basic multilingual plane) characters, including Latin, Cyrillic, most Chinese, and Japanese can be represented with 2 bytes. This speeds up indexing and calculating codepoint count in case the text does not contain supplementary characters.
  • Even if the text has supplementary characters, they are still represented by pairs of 16-bit values, which means that the total length is still divisible by two and allows 16-bit chars to be used as the primitive component of the string.

Note: .NET strings are UTF-16 because that’s an excellent fit with the operating system encoding, so no conversion is required.

But why UTF-16?

This is because of history. Windows became a Unicode operating system at its core in 1993. Back then, Unicode still only had a code space of 65,536 codepoints, these days called the UCS. At that time two bytes were enough to cover all these characters, so UCS-2 was adopted as the standard.

To maintain compatibility with the Windows UCS-2 encoding, UTF-16 was adopted as the new standard for in-memory representation.

UTF-16 cons:

  • Lots of null bytes in US-ASCII strings, which means no null-terminated strings and a lot of wasted memory.
  • Using it as a fixed-length encoding “mostly works” in many common scenarios (especially in US / EU / countries with Cyrillic alphabets / Israel / Arab countries / Iran and many others), often leading to broken support where it doesn’t. This means the programmers have to be aware of surrogate pairs and handle them properly in cases where it matters!
  • It’s variable length, so counting or indexing codepoints is costly, though less so than with UTF-8.

UTF-32 pros:

  • It has a fixed length, i.e. 4 bytes per code point, which makes indexing by codepoint fast.

UTF-32 cons:

  • It takes the most memory (4 bytes for every code point) compared to UTF-8/16.

 

UTF-8 is the default for text storage and transfer because it is a relatively compact form for most languages (some languages are more compact in UTF-16 than in UTF-8). Each specific language may have a dedicated encoding that is more efficient still.

UTF-16 is used for in-memory strings because it is faster per character to parse and maps directly to Unicode character class and other tables. All string functions in Windows use UTF-16 and have for years.

That’s all about encoding now. In upcoming articles we’ll discuss more about encoding implementation.


HTTP 1.0 vs HTTP 1.1 – Bandwidth Optimization

Hi Friends, as part of “differences between HTTP 1.0 vs HTTP 1.1”, we are now going to cover another important difference in HTTP 1.1 – bandwidth optimization.

Let’s discuss the HTTP 1.1 bandwidth optimization changes.

Bandwidth Optimization

As we know, bandwidth is precious when it comes to serving clients better and more efficiently. If we are not utilizing bandwidth in an effective way, it may cause issues while transferring data. Inefficient use of bandwidth increases network latency, which decreases site performance.

So we should be very careful when deciding on the mechanisms through which we can preserve bandwidth and use it effectively.

Bandwidth optimization is also one of the key changes which were made in HTTP 1.1.

What was in HTTP 1.0

We had a few issues in HTTP 1.0 due to which there were higher chances of bandwidth wastage.

Sending large requests to server:

Every server is restricted to accepting requests up to a specific size. If a request is larger than this limit, the server returns an error code. But the server returns the error code only after bandwidth has been consumed by this large request, so all of this consumed bandwidth is a waste. There is no way in HTTP 1.0 for the client and server to negotiate before sending a large request.

Sending large responses to client:

There could be cases where the client needs only a part of a big document or image, but in HTTP 1.0 there is no way for the server to send only partial data. The server sends the entire data to the client. This is another example of bandwidth wastage which could be saved.

Optimizations in HTTP 1.1

As discussed, there are cases where the client needs only a part of a big document or image. In HTTP 1.0 there was no way to achieve this, but in HTTP 1.1 it is possible. Let’s discuss how we can achieve this.

Range Requests:

Using range requests, a client can send a request and mention in it the range of bytes it wants to get. If the server supports range requests, it will transfer only the range of bytes requested by the client.

Following is an example of a range request (the host and path shown are placeholders):
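GET /large-image.jpg HTTP/1.1
Host: www.example.com
Range: bytes=0-1023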

In the above example we can see that the range from byte 0 to byte 1023 is requested from the server.

If a response contains a range rather than the entire resource, it carries the 206 (Partial Content) status code. This code prevents HTTP/1.0 proxy caches from accidentally treating the response as a full one and then using it as a cached response to a subsequent request. In a range response, the Content-Range header indicates the offset and length of the returned range, and the new multipart/byteranges MIME type allows the transmission of multiple ranges in one message.

Range requests can be used in a variety of ways, such as:

  1. To read the initial part of an image, to determine its geometry and therefore do page layout without loading the entire image.
  2. To complete a response transfer that was interrupted (either by the user or by a network failure); in other words, to convert a partial cache entry into a complete response.
  3. To read the tail of a growing object.
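As a sketch, a client could issue such a range request from C# using HttpClient (the URL is a placeholder):

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class RangeRequestDemo
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://www.example.com/large-image.jpg");
            request.Headers.Range = new RangeHeaderValue(0, 1023);   // ask for the first 1024 bytes only

            HttpResponseMessage response = await client.SendAsync(request);
            Console.WriteLine((int)response.StatusCode);              // 206 if the server honoured the range
            byte[] part = await response.Content.ReadAsByteArrayAsync();
            Console.WriteLine(part.Length + " bytes received");
        }
    }
}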

Expect and 100 (Continue):

When the client sends a request to the server, it can include an Expect header with the 100-continue expectation. This request is just to get approval on whether the client can continue or not. If the server is fine with the client’s expectation, it returns a 100 (Continue) response and the client can continue. But if the server cannot meet the expectation, it can send a 4xx status code, and the client should not continue with the request.

Let’s consider one example:

Assume that we have a client which wants to transmit a video file to the server. This file may be large and the server may not be able to process it. To check whether this file length is acceptable to the server, the client sends a request with the Expect header.

A client sends a request with Expect header and waits for the server to respond before sending the message body.

PUT /somewhere/fun HTTP/1.1
Host: origin.example.com
Content-Type: video/h264
Content-Length: 1234567890987
Expect: 100-continue

The server now checks the request headers and may respond with a 100 (Continue) response to instruct the client to go ahead and send the message body, or it will send a 417 (Expectation Failed) status if any of the expectations cannot be met.

This way the server is able to save bandwidth. If this had been HTTP 1.0, the request would have been transferred to the server with the entire body; the bandwidth would have been consumed, and only then would the server have rejected the request, which is a complete waste of bandwidth.
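In C#, HttpClient can be asked to perform this handshake via the ExpectContinue property; below is a rough sketch (the URL and payload are placeholders):

using System;
using System.Net.Http;
using System.Threading.Tasks;

class ExpectContinueDemo
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            // Send "Expect: 100-continue" and wait for the server's go-ahead
            // before transmitting the (potentially large) request body.
            client.DefaultRequestHeaders.ExpectContinue = true;

            var body = new ByteArrayContent(new byte[1024]);   // stand-in for a large payload
            HttpResponseMessage response = await client.PutAsync("https://origin.example.com/somewhere/fun", body);
            Console.WriteLine(response.StatusCode);
        }
    }
}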

 

In next articles we’ll discuss about few more differences between http 1.0 and 1.1.


HTTP 1.0 vs HTTP 1.1 – Caching

Hi Friends, as we are discussing the differences between HTTP 1.0 and HTTP 1.1, I am trying to cover one difference per article so that it is easy for us to grasp them. HTTP 1.1 caching is an important aspect to learn, as it impacts web behavior.

Previously we discussed the compatibility changes which were made as part of HTTP 1.1.

In this article we are going to discuss the HTTP 1.1 caching changes.

Caching

Caching is a technique through which you can preserve the most frequently required resources on the server\client. Once these resources are preserved, whenever a client requests the same resource again, rather than processing it again, the server (or the cache) returns the preserved resource.

This technique gives few benefits.

  1. The server does not need to process the same resource again, so server performance improves.
  2. As this is a preserved resource, fewer packets need to travel back to the origin server, which reduces network traffic and improves latency for other requests.
  3. Due to less network congestion, the server is able to handle more requests, which makes it a cost-effective solution.

Caching was there in HTTP 1.0 also, but it was not at a mature level, so it caused a couple of issues.

Caching in HTTP 1.0

The HTTP 1.0 caching mechanism worked well, but it had many shortcomings. It did not allow either servers or clients to give full and explicit instructions to caches. Therefore, this caching was not well-specified, because of which we had the following issues.

  1. Incorrect caching of some responses that should not have been cached. Due to this, responses were unexpected.
  2. Failure to cache some responses that could have been cached. This caused performance problems.

Expires Header:

HTTP/1.0 provided a simple caching mechanism. An origin server may mark a response, using the Expires header, with a time until which a cache could return the response.
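For example, a response might carry a header like the following (the date is just illustrative):

Expires: Thu, 01 Dec 2022 16:00:00 GMT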

After the above-mentioned date and time, the cached copy expires and will not be served to the client without first being validated with the origin server.

If-Modified-Since, Last-Modified headers:

A cache may check the current validity of a response using what is known as a conditional request: it may include an If-Modified-Since header in a request for the resource, specifying the value given in the cached response’s Last-Modified header.

Below is the syntax for this header
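If-Modified-Since: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT

For example (the date is illustrative): If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT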

 

The If-Modified-Since request HTTP header makes the request conditional: the server will send back the requested resource, with a 200 status, only if it has been last modified after the given date. If the resource has not been modified since, the response will be a 304 (Not modified) without any body; the Last-Modified header will contain the date of last modification. 

Pragma: no-cache header:

The Pragma: no-cache header is for the client to indicate that a request should not be satisfied from a cache.

Every time, the response will be processed fresh by the origin server.

Caching in HTTP/1.1

HTTP 1.1 caching attempts to clarify the concepts behind caching and to provide more mature mechanisms for caching. It retains the basic HTTP 1.0 caching design, plus it provides new features and more careful specifications of the existing features.

In HTTP 1.1, a cache entry is fresh until it reaches its expiration time. Once it expires, it does not have to be deleted from the cache, but the cache normally must revalidate it with the origin server before returning it in response to a subsequent request. However, the protocol allows both origin servers and end-user clients to override this basic rule.

ETag, If-None-Match Headers:


The ETag or entity tag is part of HTTP caching, which allows a client to make conditional requests. This allows caches to be more efficient, and saves bandwidth, as a web server does not need to send a full response if the content has not changed. 

When a URL is retrieved, the web server will return the resource’s current representation along with its corresponding ETag value, which is placed in an HTTP response header “ETag” field:

ETag: “686897696a7c8745rt5”

The client may then decide to cache the representation, along with its ETag. Later, if the client wants to retrieve the same URL resource again, it will first determine whether the local cached version of the URL has expired (through the Cache-Control and the Expires headers). If the cache has not expired, it will retrieve the local cached resource. If it determines that the cache has expired (is stale), then the client will contact the server and send its previously saved copy of the ETag along with the request in an “If-None-Match” field.

If-None-Match: “686897696a7c8745rt5”

On this subsequent request, the server may now compare the client’s ETag with the ETag for the current version of the resource. If the ETag values match, meaning that the resource has not changed, then the server may send back a very short response with a HTTP 304 Not Modified status. The 304 status tells the client that its cached version is still good and that it should use that.

However, if the ETag values do not match, meaning the resource has likely changed, then a full response including the resource’s content is returned, just as if ETags were not being used. In this case the client may decide to replace its previously cached version with the newly returned representation of the resource and the new ETag.
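As a sketch, here is how a client could send this conditional request from C# (the URL and ETag value are placeholders):

using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class ETagDemo
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://www.example.com/resource");
            // Send the previously saved ETag; the server replies 304 if the resource is unchanged.
            request.Headers.IfNoneMatch.Add(new EntityTagHeaderValue("\"686897696a7c8745rt5\""));

            HttpResponseMessage response = await client.SendAsync(request);
            if (response.StatusCode == HttpStatusCode.NotModified)
                Console.WriteLine("Cached copy is still good (304 Not Modified).");
            else
                Console.WriteLine("Resource changed; use the new body and save the new ETag.");
        }
    }
}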

The Cache-Control Header:

These are the standard Cache-Control directives that can be used by the client in an HTTP request. There are many variations of this directive. Below is the directive list for requests.

  • Cache-Control: max-age=<seconds>
  • Cache-Control: max-stale [=<seconds>]
  • Cache-Control: min-fresh=<seconds>
  • Cache-Control: no-cache
  • Cache-Control: no-store
  • Cache-Control: no-transform
  • Cache-Control: only-if-cached

Below is the directive list for responses.

  • Cache-Control: must-revalidate
  • Cache-Control: no-cache
  • Cache-Control: no-store
  • Cache-Control: no-transform
  • Cache-Control: public
  • Cache-Control: private
  • Cache-Control: proxy-revalidate
  • Cache-Control: max-age=<seconds>
  • Cache-Control: s-maxage=<seconds>

In upcoming sessions, we’ll discuss all these directives in detail.

The Vary header:

A cache finds a cache entry by using a key value in a lookup algorithm. HTTP/1.0 uses just the requested URL as the cache key. But this is not a perfect model, as sometimes the response may vary not only based on the URL but also based on one or more request headers (such as Accept-Language and Accept-Charset).

To support this type of caching, HTTP/1.1 includes the Vary response header. This header field carries a list of the relevant selecting request-header fields that participated in the selection of the response variant. Only if a new request exactly matches the cached request on those headers is the cached resource returned; otherwise the server does its normal processing to return the resource.

In addition to the above headers, there are a few more important headers, but those are generally used in more complex scenarios. We’ll cover those headers (e.g. If-Unmodified-Since, If-Match, etc.) separately.

That’s all about caching. Please note that this is a very simple and basic overview of the caching changes in HTTP 1.1. There is a long list of more complex changes; we’ll cover those in other articles.

 


Database Compatibility Levels – Introduction

Hello friends, let’s get a little idea about database compatibility levels and their impact. I’ve phrased a few common questions and answered them. Let’s have a look, and feel free to ask in case of any confusion.

What is Database Compatibility Level?

The database compatibility level is just a number which indicates which features your database still supports. It is used for backward compatibility. The compatibility level of a database dictates how certain language elements of the database function, as they relate to an earlier version of SQL Server. In a nutshell, it offers partial “backward compatibility” with an earlier version.

Suppose you have a database which was designed in SQL Server 2000. In this case, the compatibility level of your database will be 80.

Now your organization wants to upgrade to SQL Server 2005. You upgrade the SQL Server version and restore your old database to this newer version.

Perhaps you didn’t notice that there are a few features which exist in SQL Server 2000 that you are using, and SQL Server 2005 no longer has those features. You may expect that the DB objects which use the old features will not work. But you are wrong; those features still work in the newer version of SQL Server. But how?

This is because of the compatibility level. When you restore your database to the newer SQL version, the compatibility level of your database is still 80. This indicates to SQL Server 2005 that there are still a few features of SQL Server 2000 which you may be using, so those should not break.

So things will not break immediately. You have some time to find out these old features and replace those with new ones.

You can check the compatibility level of your database from Management Studio.

Right click on your database and click on Properties. Go to the Options tab and you’ll be able to see the Compatibility Level.

In addition to this, you can check the compatibility level of all databases using the query below.

SELECT name, compatibility_level FROM sys.databases;

 

Who sets Compatibility Level?

The database compatibility level is initially set by SQL Server. Following are a few scenarios to understand this.

  1. If you create a new database in SQL Server 2012, then the compatibility level of this database will be 110.
  2. If you have a database which was created in SQL 2000 and you restore its backup into SQL 2005, then the compatibility level of this database will be 80.
  3. Suppose you have a database which was created in SQL 2000 and you created scripts for all its DB objects. You create a new database in SQL Server 2005 and run all the scripts to make it a replica of your old database. This new database will have compatibility level 90. Note: in this case your old features will not work, as your database has the compatibility level of SQL Server 2005.

Below is the chart of default compatibility levels for SQL Server versions:

SQL Server 2000            → 80
SQL Server 2005            → 90
SQL Server 2008 / 2008 R2  → 100
SQL Server 2012            → 110
SQL Server 2014            → 120

What is the Impact of this?

Let’s see this with an example. You have a database which was designed in SQL Server 2000, so the compatibility level of this database would be 80. Now suppose that your database has a few queries and stored procedures which use non-ANSI joins.

select a.name,a.empno,b.loc
from tab a, tab b
where a.deptno *= b.deptno;

From the above query you can see that we are using *= in the query. This is a kind of Left Outer Join syntax which is supported in SQL Server 2000 and lower.

Now you upgrade your SQL Server to 2005 and restore your database. When you run your stored procedure which has the non-ANSI join, it will work fine because of the compatibility level. Your database compatibility level is still 80.

If you change your database compatibility level to 90, then you may break things. So make sure that you replace all old features with new ones before upgrading the compatibility level of your database.

Can I change it?

Of course, you can change it, but now you understand the impact of this. If you are using old features and you change the compatibility level to a new one, then you can break a few stored procedures or other DB objects. So before changing the compatibility level, make sure you have replaced all old features with new ones.

Here is an example of how to set the database compatibility level in T-SQL:

ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 120;

You can change it using management studio also. Just select compatibility level according to your case.


With SQL Server 2014 and newer, the database compatibility level also controls whether the new cardinality estimator or the legacy cardinality estimator is used. The new cardinality estimator gives equivalent or better query performance for many to most queries, but it sometimes has very significant performance regressions. It is very important to do as much testing as possible before you change the compatibility level to 120 or higher.

Note:

  1. Don’t assume that this setting will let you restore the database backwards to an older version. You cannot restore a new database backup to an older version. There are certain rules for downgrading, which you can check here: Sql Server Downgrades-Rules & Restrictions
  2. Don’t think that the compatibility level is the same as the database version. They are different properties with different concepts behind them. The database version is an internal versioning system that defines what version of SQL Server the database was most recently a resident of. We will cover more details about the database version in upcoming articles.

Upgrade Advisor

We have an option to help with upgrades i.e. Upgrade Advisor. SQL Server Upgrade Advisor helps you prepare for upgrades to SQL Server 2014. Upgrade Advisor analyzes installed components from earlier versions of SQL Server, and then generates a report that identifies issues to fix either before or after you upgrade.

You can get more information about upgrade advisor here Upgrade Advisor

Feel free to experiment with new things and connect with me.


Windows Data Access Components – Introduction

What is MDAC\WDAC?

MDAC stands for Microsoft Data Access Components. We also call it WDAC, i.e. Windows Data Access Components.

This is a set of components that enable your application to access almost any kind of data and make use of it. The data can be from SQL Server, Oracle or any other RDBMS. Not only RDBMS sources; the source can also be a non-RDBMS one, like XML files, documents, etc.

MDAC is not a single application or component; there are multiple libraries involved in the set. The major components which we are going to discuss in this article are:

  • ODBC
  • OLEDB
  • ADO
  • Ado.Net

 

MDAC architecture may be viewed as three layers:

  1. A programming interface layer: Consisting of ADO and ADO.NET
  2. A database access layer: Developed by database vendors such as Oracle and Microsoft (OLE DB, .NET managed providers and ODBC drivers)
  3. The database itself.

These component layers are all made available to applications through the MDAC API.

The latest version of MDAC (2.8) consists of several interacting components, all of which are Windows specific except for ODBC (which is available on several platforms).

Let’s discuss these components.

ODBC

Introduction

The ODBC interface is an industry standard and a component of Microsoft Windows Open Services Architecture (WOSA). The ODBC interface makes it possible for applications to access data from a variety of database management systems (DBMS). ODBC permits maximum interoperability, as an application can access data in diverse DBMSs through a single interface. Furthermore, the application will be independent of any DBMS from which it accesses data. Users of the application can add software components called drivers, which create an interface between an application and a specific DBMS.

Where to Use

An ODBC driver uses the Open Database Connectivity (ODBC) interface by Microsoft that allows applications to access data in database management systems (DBMS) using SQL as a standard for accessing the data.

Shortcomings

As mentioned, ODBC is an open standard for accessing SQL-based data, so it is typically used with RDBMSs only. If we want to access data from other sources which are not RDBMSs, then we need to think of some other solution.

This is the place where OLEDB comes into picture.

OLEDB

Introduction

As we know, ODBC was specifically meant for SQL databases. Due to this limitation of ODBC, OLEDB came into the picture. OLEDB providers are able to fetch data from other data sources as well.

OLE DB is an open specification designed to build on the success of ODBC by providing an open standard for accessing all kinds of data.

Where to Use

Whereas ODBC was created to access relational databases, OLE DB is designed for relational and non-relational information sources, including mainframe ISAM/VSAM and hierarchical databases; e-mail and file system stores; text, graphical, and geographical data; custom business objects; and more.

If a database supports ODBC and that database is on a server that doesn’t support OLE, then ODBC is your best choice.

Non-SQL environment: ODBC is designed to work with SQL. If you have a non-SQL environment, then OLE DB is the better choice.

Shortcomings

There are no issues with OLEDB at all if you are using it with native code, i.e. VB6, C, etc. In these cases it will work fine.

But if you are planning to use it with managed code, then there will be some underlying plumbing which converts the managed calls to unmanaged code. This plumbing is going to impact your system, but it is not something we can consider a limitation of OLEDB itself.

ADO

Introduction

ADO is the strategic application programming interface (API) to data and information. It provides consistent, high-performance access to data and supports a variety of development needs, including the creation of front-end database clients and middle-tier business objects that use applications, tools, languages, or Internet browsers.

ADO is designed to be the one data interface needed for single and multi-tier client/server and Web-based data-driven solution development. The primary benefits of ADO are ease of use, high speed, low memory overhead, and a small disk footprint.

ADO provides an easy-to-use interface to OLE DB, which provides the underlying access to data. ADO is implemented for minimal network traffic in key scenarios, and a minimal number of layers between the front end and the data source, all to provide a lightweight, high-performance interface.

Where to Use

ADO is easy to use because it uses a familiar metaphor: the COM automation interface, available from all leading Rapid Application Development (RAD) tools, database tools, and languages on the market today. It’s a nice wrapper for OLEDB.

ADO Performance Advantages: As with OLE DB, ADO is designed for high performance. To achieve this, it reduces the amount of solution code developers must write by “flattening” the coding model.

The programmer can create a recordset in code and be ready to retrieve results by setting two properties, then execute a single method to run the query and populate the recordset with results. The ADO approach dramatically decreases the amount and complexity of code that needs to be written by the programmer.

Shortcomings

ADO is based on COM technology and it uses OLEDB data providers for accessing data. It has a limited number of data types, which are defined by the COM standard.

ADO works with a connected data architecture. That means that while you access the data from the data source, such as when viewing or updating data, the ADO recordset keeps the connection with the data source open.

ADO can’t be integrated well with XML, because ADO has only limited support for XML.

In ADO, you can create only client-side cursors.

Using a single connection instance, ADO cannot handle multiple transactions.

ADO.Net

Introduction

ADO.NET is a data access technology from the Microsoft .NET Framework that provides communication between relational and non-relational systems through a common set of components. ADO.NET is a set of computer software components that programmers can use to access data and data services from a database.

ADO.NET provides consistent access to data sources such as SQL Server and XML, and to data sources exposed through OLE DB and ODBC. Data-sharing consumer applications can use ADO.NET to connect to these data sources and retrieve, handle, and update the data that they contain.

Where to Use

In any managed code where we want to access data, we should use ADO.NET. It hides the underlying implementations and provides a cleaner model to access and manipulate data.

It is specifically for the .NET managed environment.
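As a minimal sketch (the connection strings and queries are placeholders), the same ADO.NET programming model works against the managed SQL Server provider as well as against ODBC or OLE DB sources:

using System;
using System.Data.Odbc;
using System.Data.SqlClient;

class DataAccessSketch
{
    static void Main()
    {
        // ADO.NET with the managed SQL Server provider (connection string is a placeholder).
        using (var conn = new SqlConnection("Server=.;Database=master;Integrated Security=true"))
        using (var cmd = new SqlCommand("SELECT @@VERSION", conn))
        {
            conn.Open();
            Console.WriteLine(cmd.ExecuteScalar());
        }

        // The same ADO.NET model can also sit on top of ODBC (or OLE DB) when needed.
        using (var odbcConn = new OdbcConnection("Driver={SQL Server};Server=.;Trusted_Connection=yes"))
        using (var odbcCmd = new OdbcCommand("SELECT 1", odbcConn))
        {
            odbcConn.Open();
            Console.WriteLine(odbcCmd.ExecuteScalar());
        }
    }
}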

Below are a few differences between ADO and ADO.NET:

  • ADO is COM based and works with a connected recordset; ADO.NET is part of the managed .NET Framework and supports a disconnected model (DataSet) in addition to connected readers.
  • ADO has limited XML support, while ADO.NET provides consistent access to XML data.
  • ADO relies on OLEDB providers; ADO.NET uses managed data providers (for SQL Server, OLE DB, ODBC, etc.).

Complete picture


From the architecture described above, we can see that for every data access path, ODBC and OLEDB are the core components, and each serves its specific purpose. Both components are equally important and required. ADO.NET has many benefits over ADO, but that doesn’t mean that ADO is completely eliminated; ADO is still COM based and should be used by COM-based applications.

SNAC is also part of the data access components, which we’ve covered here.


Sql Server Native Client (SNAC) – Introduction

Hello friends, today we are going to cover an important aspect of SQL Server, i.e. connectivity. In this article we’ll cover some basic details of SNAC, i.e. SQL Server Native Client.

In this article I’ve used a few terms like Windows Data Access Components (WDAC) and Microsoft Data Access Components (MDAC). Please note that these are all the same; they are synonyms.

What is Sql Server Native Client?


SQL Server Native Client is a stand-alone data access application programming interface (API), used for both OLE DB and ODBC. This was introduced in SQL Server 2005. SQL Server Native Client combines the SQL OLE DB provider and the SQL ODBC driver into one native dynamic-link library (DLL).

It also provides new functionality above and beyond that supplied by the Windows Data Access Components (Windows DAC, formerly Microsoft Data Access Components, or MDAC). SQL Server Native Client can be used to create new applications or enhance existing applications that need to take advantage of features introduced in SQL Server 2005. These new features are multiple active result sets (MARS), user-defined data types (UDT), query notifications, snapshot isolation, and XML data type support.

Note: We’ll discuss about these new features in upcoming articles.

The SQL Server Native Client ODBC driver is always used in conjunction with the ODBC Driver Manager supplied with Windows Data Access Components. The SQL Server Native Client OLE DB provider can be used in conjunction with OLE DB Core Services supplied with Windows Data Access Components, but this is not a requirement. The choice of whether or not to use Core Services depends on the requirements of the individual application (for example, whether connection pooling is required).

While SQL Server Native Client uses components in Windows DAC, it is not explicitly dependent on a particular version of Windows DAC. You can use SQL Server Native Client with the version of Windows DAC that is installed with any operating system supported by SQL Server Native Client.

The latest version of the native client is 11.
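For illustration, an ODBC connection string using the Native Client 11.0 driver and an OLE DB connection string using the SQLNCLI11 provider look roughly like this (the server and database names are placeholders):

Driver={SQL Server Native Client 11.0};Server=MyServer;Database=MyDatabase;Trusted_Connection=yes;

Provider=SQLNCLI11;Data Source=MyServer;Initial Catalog=MyDatabase;Integrated Security=SSPI;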

Why we need it?

When deciding whether to use SQL Server Native Client as the data access technology of your application, you should consider several factors.

For new applications, if you’re using a managed programming language such as Microsoft Visual C# or Visual Basic, and you need to access the new features in SQL Server, you should use the .NET Framework Data Provider for SQL Server, which is part of the .NET Framework.

If you are developing a COM-based application and need to access the new features introduced in SQL Server, you should use SQL Server Native Client. And if you don’t need access to the new features of SQL Server, you can continue to use Windows Data Access Components (WDAC).

In case you are upgrading existing, or developing new, COM-based (or native) applications that will target the new features of SQL Server 2005, you’ll need SNAC.

If you don’t need any of the new features of SQL Server 2005, then you don’t need to use SQL Native Client, your existing OLE DB and ODBC code will work just fine.

Of course, if you have moved, or are planning on moving, to a managed code base for data access, then the ADO.NET data access classes of the .NET Framework are what you should use.

How can we deploy?

When deploying an application that is dependent on SQL Native Client, you will need to redistribute SQL Native Client with your application. SQL Native Client is a component of SQL Server 2005. Therefore, it is important to install SQL Native Client in your development environment and redistribute SQL Native Client with your application.

The SQL Native Client redistributable installation program is named sqlncli.msi. It is available on the installation media and as one of the SQL Server 2005 Feature Pack components on the Microsoft Download site. Below are the links for that.

Below are the links to download SQL Server 2012 Native Client

x86 Package
x64 Package 

This was very basic introduction of SNAC. We’ll cover more aspects of connectivity in upcoming sessions.


Sql Server Licensing & Differences

Hi Friends, as we’ve already discussed the various SQL Server versions and editions, it’s time to discuss the various SQL Server licenses available and what the differences among them are.

You can check the various editions here.

There are two types of licensing options available with SQL Server 2012:

  1. Core based licensing
  2. Server+CAL licensing model

Core Based Licensing

Physical Machine:

In previous versions, SQL Server licenses were per-processor based. That means if you had one processor, and this processor had two cores, then one license was enough to meet your needs.

But this is not the case in SQL 2012. Now we have core-based licensing. That means if we have a processor with 2 cores, then we’ll need 2 core licenses.

Core-based licenses are sold in two-core packs with a minimum of 4 core licenses required for each physical processor. This means that at a minimum you need to purchase two of these two-core packs for each physical processor occupying a processor socket.

Below is the matrix of licenses required according to core count. We know that two 2-core packs is the minimum purchase, so one will have to purchase licenses for 4 cores at a minimum. As the number of cores increases, the number of SQL Server core licenses also increases accordingly.

You can notice that even if you license a single or dual core processor, you still have to buy enough two-core packs to license the minimum of four cores. With this four-core minimum license per physical processor requirement of SQL Server 2012, it makes more sense to have a single four-core processor than two dual-core processors.

The Per Core licensing option allows for unlimited users or devices to access SQL Server (both internally and externally). The major benefit of this option is that there is no requirement to try to quantify the number of users or devices accessing the SQL Server, whether directly or via indirect means using in-between layers, e.g. a business logic layer.

Only Enterprise and Standard Editions of SQL Server 2012 are available for core-based licensing. 

Virtual Machine:

Let’s discuss how licensing works for virtual machines. Each virtual machine is treated as a separate machine, and each core in these virtual machines is counted as a separate core.

Let’s take an example of one server:

There is one server. This server has 2 VMs. One VM has 2 cores and the second VM has 6 cores.

  • Server-1
    • VM-1
      • Core-1
      • Core-2
    • VM-2
      • Core-1
      • Core-2
      • Core-3
      • Core-4
      • Core-5
      • Core-6

Now, for VM-1 we need SQL Server licenses for 2 cores, but we know that we need to purchase a minimum of two 2-core packs, i.e. we need to purchase licenses for 4 cores at a minimum.

So for VM-1 we need 4 core licenses.

In the same way we need three 2-core packs for VM-2, which gives us 6 core licenses. So to cover both VMs we need 10 core licenses, i.e. five 2-core packs.

Host server or server farm core licensing:

The total number of cores available on a host server or server farm can be licensed to maximize virtualization capabilities and take advantage of full computing power. To fully license a host server or server farm, Enterprise Edition core licenses along with Software Assurance are required.

There are different types of volume licensing options available:

Open Value License Program is the recommended program if you have a small to midsize organization with five or more desktop PCs and want to simplify license management, manage software costs, and get better control over your investment.

Open License Program is a Microsoft service that allows corporate, academic, charitable, or government organizations to obtain volume licenses for Microsoft products. It is ideally suited for companies with between 2 – 250 personal computers, but can accommodate organizations with up to 750 computers.

Select License Program was designed specifically for medium and large organizations with 250 desktop PCs or more that have mixed software requirements and want a simple, flexible, and affordable way to purchase the latest Microsoft technology on a “pay as you go” basis.

EA/SA (Enterprise Agreement/Software Assurance) Program is a volume licensing package offered by Microsoft. It primarily targets large organizations that have 500 or more personal computers. 

This type of licensing allows for an unlimited number of virtual machines and allows dynamic provisioning and de-provisioning of virtual processors.

Server + CAL Licensing

Physical Machine:

As the name specifies, in this model there is one server license for the server which hosts SQL Server. A CAL is a client access license, which is required for each and every user who connects to the SQL Server.

These users may connect to SQL Server directly through an application installed on their desktops, or they may call an intermediate layer which in turn connects to the server. In both cases the same number of CALs is needed.

Now the question may arise: why do we need this licensing model?

The Server + CAL option is often the preferred licensing option when:

  1. Smaller SQL Server implementations exist and SQL is the supporting database for another application or smaller Intranet; and
  2. The number of users or devices accessing SQL Server can be quantified and Server + CAL is more cost effective than the Per Core model

Server + CAL Licensing for SQL Server 2012 is only available for the Standard and Business Intelligence Editions. Users need a CAL that is the same version or newer than the version of the licensed SQL Server they need to access. A SQL Server 2012 CAL can be used to access multiple licensed SQL Servers, including SQL Server 2012 Standard and Business Intelligence Editions. These new CALs can also be used for previous versions of SQL Server, including the Enterprise Editions of 2008 and 2008R2, for which the Server + CAL licensing model was still available.

Virtual Machine:

Now let’s discuss how Server+CAL licensing works for virtual machines. In this case also, each virtual machine is treated as a separate machine, but server licenses are required per VM, not per core.

Let’s take above example once again:

There is one server. This server has 2 VMs. One VM has 2 cores and the second VM has 6 cores. There are 20 users who access these servers.

  • Server-1
    • VM-1
      • Core-1
      • Core-2
    • VM-2
      • Core-1
      • Core-2
      • Core-3
      • Core-4
      • Core-5
      • Core-6

So now 2 SQL Server server licenses are required for the two VMs, and 20 CALs are required for the users who access these servers.

Below is the matrix which shows which license is available with which edition.

Edition                | Server + CAL | Core Based
Enterprise             | NA           | Yes
Business Intelligence  | Yes          | NA
Standard               | Yes          | Yes

Now you can think from your organization’s perspective about which one suits you.

 


SQL Server Downgrades – Rules & Restrictions

Hello Friends, today we will discuss the rules according to which SQL Server downgrades are possible. Like upgrades, these operations can be edition-specific or version-specific.

What is the need to downgrade?

But what is the need to downgrade? Well, there is one. Let’s say you have an application which is working fine with the current edition\version of SQL Server and you decide to upgrade to a new edition\version. You upgrade your database edition\version and then you see that many issues are exposed, which could relate to new features of the new edition\version around security, availability, etc., and your application is not ready to handle those features without additional work.

Now, to get things working, you’ll have to downgrade your SQL edition\version back to the previous one and then decide upon the strategy to follow in future.

Again, it is not possible to downgrade any edition or version without any rules or restrictions. There is a set of rules.

Let’s see how this works.

Edition Downgrades

If we want to downgrade the SQL Server edition, we first need to uninstall the higher edition and then install the lower edition of SQL Server. There is no standard tool which can provide this downgrade out of the box; uninstallation is the only option.

Now, we may have a few databases which were on the higher edition. What about those databases?

There are following steps to get these databases working on lower editions:

  1. To get those databases onto the lower edition, we first need to check whether these databases are using any edition-specific features or not. A database created in SQL Server 2012 Enterprise Edition that uses Enterprise Edition features, such as Table Partitioning or Transparent Data Encryption (TDE), cannot be restored or attached to SQL Server 2012 Standard Edition without first disabling these edition-specific features. To check this we can use the query below.
SELECT [feature_name] FROM [sys].[dm_db_persisted_sku_features]

The [sys].[dm_db_persisted_sku_features] dynamic management view may return rows containing any of the following Enterprise or Developer Edition features:

  • Compression
  • Partitioning
  • Transparent Data Encryption
  • Change Data Capture

If no edition-specific features have been enabled in the user database, the [sys].[dm_db_persisted_sku_features] dynamic management view does not return any rows.

  2. Once we have the list of all these edition-specific features, we need to disable them all.
  3. Once we have disabled all these features, we can back up the database.
  4. After installation of the lower edition, we can restore\attach our database to the lower edition and it should work.

Version Downgrades

It is the same as an edition downgrade: if we want to downgrade the SQL Server version, we first need to uninstall the higher version and then install the lower version of SQL Server. There is no standard tool which can provide this downgrade out of the box; uninstallation is the only option.

But there is one issue here: you cannot restore\attach backups which were created in a higher version. The only option to get your database back is to create a new database in the old SQL version and then run all the DDL scripts and data scripts in this newly created database.

Following are the steps:

  1. In the new-version database, create DDL scripts for all objects, i.e. tables, SPs, functions, etc.
  2. Create all data scripts.
  3. Uninstall current version.
  4. Install older version which is required.
  5. Create new database and run all DDL scripts, DML scripts.

By using the above flow we should be able to get our database onto the lower version.

Below are a few points to consider and understand while downgrading.

Database Compatibility Level:

Let’s discuss this with an example. Say you are running your application on SQL Server 2008 and you decide to upgrade your SQL Server to version 2012. You do the upgrade and then you come to know that your application was using some features of SQL Server 2008 which are no longer supported in SQL Server 2012.

Now in this case you need to change the application code so that it stops using these unsupported SQL Server 2008 features and starts using their SQL Server 2012 alternatives. But until you make these changes in your application, you want your application to keep working; you don’t want your application to be in a broken state.

In this case you will set the compatibility level of your migrated database to the SQL Server 2008 compatibility level. This compatibility level ensures that it will support all those old features which were there in SQL Server 2008. This way your application will not break, and you’ll get time to upgrade the application code so that it can start using the new features of the new version.

To see the compatibility level of each database, right click the database in SQL Server Management Studio and select Properties, then click the Options tab. See highlighted field below:

In the image above, I’m running a SQL Server 2008 database, which has compatibility level of 100, on a machine that is running an instance of SQL Server Express 2014 (see the Connection section on the bottom left, my instance name is SQLEXPRESS2014).

Another way to look at the compatibility level of each database on your server is to query the compatibility level column of the sys.databases table:

USE YourDatabaseName;
GO
SELECT compatibility_level
FROM sys.databases WHERE name = 'YourDatabaseName';
GO

The main reason for having compatibility levels is to allow for Backward Compatibility. Each compatibility level has its own set of features. You must program against those features when developing an application that uses the database. If you want to move your database to a newer instance of SQL Server but at the same time have it backwards compatible, you need to make sure the compatibility level stays the same. 

Below is the chart of compatibility levels of the different SQL versions:

SQL Server 2005            → 90
SQL Server 2008 / 2008 R2  → 100
SQL Server 2012            → 110

SQL Server 2012 supports only databases at compatibility levels 90, 100, and 110, meaning it supports databases going back to SQL Server 2005 only.

Database Version:

Database compatibility levels are not the same as database version numbers. The database version number is an internal number associated with a specific structure of a database’s system tables containing metadata about various objects such as tables, columns, indexes, allocations, and details about the relational and physical structure of the database.

Below is the chart of different versions of Sql server databases.

To identify the version number of a specific SQL Server database, you can query the master.sys.sysdatabases table as follows

SELECT version
FROM master.sys.sysdatabases
WHERE name='AdventureWorks2012'

All databases attached\restored to a newer version of a SQL Server instance upgrade automatically to the database version that the newer SQL Server instance supports. In the case of SQL Server 2012, all databases created, attached, or restored on a SQL Server 2012 instance convert to database version number 706.