Hi Friends, Today we are going to discuss about SQL query execution steps and all components which participates from start to end while performing this execution.
SQL Query Execution
Below is a very high level flow of information and control.
There are following components which participate:
- Query optimizer
- Query engine
- Storage engine
- Buffer manager
- Lock manager
- Resource monitor
Let’s discuss one by one about role of each of above listed components. However every component in itself is a bible but I am trying to give little detail of each.
Whenever we write and submit query for execution, then parser is the first component which starts the work. During the parsing stage SQL Server performs basic checks on the code i.e. looks for invalid SQL syntax, incorrect use of reserved words, column and table names, and so on.
If parsing detects any errors, the process stops and the errors are returned.
If parsing completes without errors, it generates a parse tree. Parser passes this tree onto the next stage of query processing i.e. Algebrizer. The parse tree is an internal representation of the query.
It does following tasks:
Step 1: Name resolution:
The algebrizer resolves all the names of the various objects, tables and columns in the query. It confirms that all objects exist and are visible in the security context of the user. This is where it checks the table and column names to ensure that they exist and that the user has access to them.
This step is important because the query may have aliases or synonyms, names that don’t exist in the database, that need to be resolved. When objects don’t exist in the database, SQL Server returns an error from this step, defining the invalid object name.
Step 2: Type derivation:
The algebrizer identifies, at the individual column level, all the data types for the objects being accessed. Determines the final type for each node in the parse tree.
Step 3: Aggregate binding:
It also determines the location of aggregates (such as GROUP BY, and MAX) within the query. This process is aggregate binding.
The algebrizer outputs a binary tree i.e. query processor tree. It passes this tree to the query optimizer. The algebrizer’s output includes a hash, a coded value representing the query. The optimizer uses the hash to determine whether there is already a plan generated and stored in the plan cache. If there is a plan there, the process stops here and that plan is used. This reduces all the overhead required by the query optimizer to generate a new plan.
The SQL Server Query Optimizer is a cost-based optimizer. It analyzes various execution plans for a given query, estimates the cost of each of these plans and then selects the plan with the lowest cost. Query Optimizer cannot consider every possible plan for every query so it actually has to do a cost-based balancing act, considering both the cost of finding potential plans and the costs of plans themselves. Therefore, it is the SQL Server component that has the biggest impact on the performance of databases.
The query engine (or query processor) takes the query plans which query optimizer finalized and, no surprise here, executes them. There are many functions which query engine performs, some of them being:
- It acts as a dispatcher for all commands in the sql query execution plan. It iterates through all the commands in the plan until the batch is complete.
- And it interacts with the storage engine to retrieve and update data from tables and indexes.
In a database engine, the Storage Manager (sometimes called a Storage Engine or Storage System) is responsible for all aspects of storage, transactional requirements, low-level concurrency, and indexing. It usually contains something like the following modules:
- The Buffer Manager/Buffer Pool
- The Lock Manager (for non-metadata locks)
- Low-level I/O Subsystem
- The Transaction and Recovery Manager
- Index Manager(s)
- The Base Table Manager
- Top-level APIs
The SQL Server Buffer manager monitors how SQL Server uses memory to store data pages. It monitors the physical Input/output as SQL Server reads and writes database pages. It also monitors the buffer cache by using fast non-volatile storage such as solid-state drives (SSD). Buffer manager pre-fetch data pages to improve sql query execution.
Lock managers implement the storage engine’s notion of locking. They are usually Table-level locks, Page-level locks (rare nowadays), or Row-level locks.
SQL Server Resource monitor was introduced in SQL Server 2008. This feature controls the consumption of the available resources, by limiting the amount of the CPU, Memory and IOPS for incoming sessions. It tries to prevent performance issues that are caused by resources high consumption.
The Resource monitor simply differentiates the incoming workload and allocates the needed CPU, Memory and IOPS resources based on the predefined limits for each workload. In this way, the SQL Server divides resources among the current workloads reducing the possibility of consuming all resources by single workload type.
SQLOS performs the following critical functions for SQL Server:
Scheduler and IO completion:
The SQLOS is responsible for scheduling threads for CPU consumption. Most threads in SQL Server run in cooperative mode. It means that the thread is responsible for yielding so that other threads can obtain CPU time. Most IO is asynchronous. The SQLOS is responsible for signaling threads when IO is completed.
SQL server is a multi-threaded application, so SQLOS is responsible for managing thread synchronizations.
Different components within SQL Server, example plan cache, CLR, lock manager etc request memory from the SQLOS. Therefore, the SQLOS can control how much memory a component within SQL Server is consuming.
Deadlock detection and management of the same.
Exception handling framework.
Hosting services for external components such as CLR and MDAC:
SQL Server will run threads that are associated with external component in preemptive mode. Preemptive mode allows the SQLOS to prevent runaway threads (threads which will not yield and allow other threads to get CPU execution). Also the SQLOS can keep track of the memory, these external components consume. For example, for CLR the SQLOS can invoke garbage collection if the CLR process is taking up too much memory.
As you can see, SQLOS seems to be a great resource management capability as it makes sure the SQL Server instance can be always up and running.
Above is very high level top view of each mentioned components involved in sql query execution. We’ll dive deep into these in upcoming articles.