SQL Server 2016 components Marcin Szeliga
[email protected]
Know SQL Server 2016 components › Introduction to the SQL Server Platform › Overview of SQL Server Architecture › SQL Server Services and Configuration Options
SQL Server Components › SQL Server includes a powerful relational database engine, but the scope of the SQL Server product is far broader than the database engine alone › Database Engine – It provides a high-performance, scalable relational database engine based on the SQL language that can be used to host online transaction processing (OLTP) databases for business applications and data warehouse solutions
› Analysis Services – It is an online analytical processing (OLAP) engine that works with analytic cubes and tables and is used to implement enterprise BI solutions for data analysis and data mining
SQL Server Components cont › Integration Services – It is an extract, transform, and load (ETL) platform tool for orchestrating the movement of data in both directions between SQL Server components and external systems
› Reporting Services – It is a reporting engine based on web services, providing a web portal and end-user reporting tools. It can be installed in native mode, or integrated with Microsoft SharePoint® Server
› Master Data Services – It provides tooling and a hub for managing master or reference data
SQL Server Components cont. › Data Quality Services – It’s a knowledge-driven data quality tool for data cleansing and matching
› StreamInsight – It provides a platform for building applications that perform complex event processing for streams of real-time data
› Full-Text Search – It is a feature of the database engine that provides a sophisticated semantic search facility for text-based data
› Replication – The SQL Server database engine includes Replication, a set of technologies for synchronizing data between servers to meet data distribution needs
SQL Server Components cont. › PolyBase – It is an extension to the database engine where you can query distributed datasets held in Hadoop or Microsoft Azure Blob Storage from Transact-SQL statements ™
› R Services – It is an extension to the database engine where you can execute scripts written in the open source R language and access their results from Transact-SQL statements
SQL Server Instances › It is sometimes useful to install more than one copy of a SQL Server component on a single server – You might want to have different administrators or security environments for sets of databases – Some of your applications might require server configurations that are inconsistent or incompatible – Your application databases might need different levels of service – You might need to support different versions or editions of SQL Server – Your applications might require different server-level collations
› Many SQL Server components can be installed more than once as separate instances
Default and Named Instances › Before SQL Server 2000, just one copy of SQL Server could be installed on a server system – SQL Server was addressed by the name of the Windows server where it was hosted – To maintain backward compatibility, this mode of connection is still supported and is known as the “default instance”
› Additional instances of SQL Server require an instance name that you can use in conjunction with the server name and are known as named instances – To access a named instance, client applications use the address ServerName\Instance-Name
SQL Server Editions › SQL Server 2016 is available in a variety of editions, with different price points and levels of capability › Enterprise - premium offering; provides the highest levels of reliability for demanding workloads › Business Intelligence - Adds advanced BI features to the offering from Standard Edition › Standard - Delivers a reliable, complete data management platform › Developer - You can build, test and demonstrate all SQL Server functionality › Express - Provides a free, entry-level edition suitable for learning and lightweight desktop or web applications › Microsoft Azure SQL Database - You can build database applications on a scalable and robust cloud platform
SQL Server Versions Release Name (SQL Server)
Version Number
Release Year
1.0
1.0
1989
1.1
1.1
1991
4.2
4.2
1992
4.21
4.21
1994
6.0
6.0
1995
6.5
6.5
1996
7.0
7.0
1998
2000
8.0
2000
2005
9.0
2005
2008
10.0
2009
2008 R2
10.5
2010
2012
11.0
2013
2014
12.0
2014
2016
13.0
2016
Know SQL Server 2016 components › Introduction to the SQL Server Platform › Overview of SQL Server Architecture › SQL Server Services and Configuration Options
SQL Server Engine Architecture › SQL Server comprises many components that work together › Three general categories of components exist within the database engine and are structured as layers – External protocols – Database engine – SQLOS
SQL Server Query Processing SQL Server stores table rows & columns
Pages read from disk - slow Pages read from cache – tres fast!
(Authors Table in pubs db No? Yes?has ~26rows and is Compile & approx 6kb total size) Execute..
All DB changes hardened in TLog
Execute..
Then, DB changes written to cache Pages can by dirtied multiple times Dirty pages later flushed to .mdf
Select * from authors where au_lname = ‘White’ update authors set au_fname = ‘Johnston’ where au_lname = ‘White’
au_id au_lname au_fname phone address city state update authors set au_fname = ‘Marj’ where au_lname = ‘Green’ 172-32-1176 White Johnson 408-496-7223 10932 Bigge Rd. Oakland CA
UPDATE UPDATE
Lookup Exec Plan in Proc Cache
Lookup Pages In Data Cache
Data Cache
Proc Cache
Buffer Manager
Execution Plan Found?
MTL
Physical Memory (RAM)
Table rows are stored on Disk in 8kb units, named “pages”. When loaded into memory pages are referred to as “buffers”
Data volume (HDD) Write ahead log (TLOG)
External protocols › Shared memory – Simple and fast, shared memory is default protocol used to connect from a client running on the same computer as SQL Server – Can only be used locally, has no configurable properties, and is always tried first when connecting from the local machine
› TCP/IP – The most commonly used access protocol for SQL Server – Enables you connect to SQL Server by specifying an IP address and a port number › Your internal name resolution system resolves the hostname part of the instance name to an IP address, and either you connect to the default TCP port number 1433 for default instances or the SQL Browser service will find the right port for a named instance using UDP port 1434
› Named Pipes – Developed for local area networks (LANs) and can be inefficient across slower networks such as wide area network – Uses TCP port 445
› VIA – It requires specialized hardware at both ends of a dedicated connection (deprecated)
Tabular Data Stream › TDS is a Microsoft proprietary protocol originally designed by Sybase that is used to interact with a database server – Once a connection has been made using a network protocol such as TCP/IP, a link is established to the relevant TDS endpoint that then acts as the communication point between the client and the server
› There is one TDS endpoint for each protocol and an additional one reserved for use by the dedicated administrator connection (DAC) – Once connectivity is established, TDS messages are used to communicate between the client and the server
› The network packet size is the size of the tabular data scheme (TDS) packets that are used to communicate between applications and the SQL Server Database Engine – The default packet size is 4 KB, and is controlled by the network packet size configuration option – When multiple active result sets is enabled, the user connection takes approximately (3 + 3 * num_logical_connections) * network_packet_size + 94 KB
SQLOS › SQL Server 2005+
› User mode layer between SQL Server and the Windows OS – It is an engine in the form of two DLLs (SQLDK.DLL and SQLOS.DLL)
› Takes care of Scheduling and Memory Management functions
CPU Usage by SQL Server › Windows uses pre-emptive scheduling of threads › One scheduler for every logical CPU created in SQL OS: – Manages the threads retrieved from Windows and assigns tasks to threads – Minimizes context switches through cooperative scheduling
› CPU availability can be configured without restart: – Schedulers can be enabled or disabled – CPU affinity mask can be set
› Tasks waiting on a resource are moved to a waiting list: – Wait type and time are recorded – Details are useful for monitoring and troubleshooting
Conections, Batches and Tasks › Connection is established when the user is successfully logged in – The user can then submit one or more Transact-SQL statements for execution – A connection is closed when the user explicitly logs out, or the connection is terminated
› Batch is a set of one or more Transact-SQL statements sent from a client to an instance of SQL Server for execution – It represents a unit of work submitted to the Database Engine by users
› Task represents a unit of work that is scheduled by SQL Server – A C++ function pointer – A batch can map to one or more tasks › For example, a parallel query will be executed by multiple tasks
– Executed by a worker from a queue (work queue)
Threads › Windows thread represents an independent execution mechanism › Fiber is a lightweight thread that requires fewer resources than a Windows thread and can switch context when in user mode – One Windows thread can be mapped to many fibers
› Worker thread represents a logical thread in SQL Server that is internally mapped (1:1) to either a Windows thread or, if lightweight pooling is turned ON, to a fiber – The mapping is maintained until the worker thread is deallocated either because of memory pressure, or if it has been idle for a long time – The association of a task to a worker thread is maintained for the life of the task
Schedulers › You can think of the SQL Server scheduler as a logical CPU used by SQL Server Workers – Provides method to allow “threads” (workers) be assigned a “CPU”
› In SQL Server each actual CPU (whether hyper threaded or physical) has a scheduler created for it when SQL Server starts – This is true even if the affinity mask option has been configured so that SQL Server is set to not use all the available physical CPUs
› In SQL Server 2008/2008R2/2012 each scheduler is set to either ONLINE or OFFLINE based on the affinity mask settings, and the default is that all schedulers are ONLINE › Aligned by NUMA nodes – Provides mechanism for affinity to NUMA and CPUs
Workers › A worker can be either a thread or a fiber that is bound to a logical scheduler, each worker is also associated with a single CPU
› Each scheduler is assigned a worker a limit based on the configured Max Worker Threads and the number of schedulers – Each scheduler is responsible for creating or destroying workers as needed – A worker cannot move from one scheduler to another, but as workers are destroyed and created, it can appear as if workers are moving between schedulers
Parallelism › Parallelism refers to multiple processors cooperating to execute a single query at the same time › SQL Server can decide to distribute queries to more than one task – – – –
Tasks can run in parallel Overall execution is faster Synchronization overhead is incurred Parallelism is only considered for expensive plans
› Max degree of parallelism defines how many CPUs can be used for execution of a parallel query – Can be overridden using the MAXDOP query hint
› Cost threshold for parallelism defines minimal cost for considering parallel plans
Overview of SQL Server Memory Management › Buffer pool is the main memory object of SQL Server: – Holds data cache – Provides memory for other SQL Server components – Is divided into 8 KB pages
› Because disk I/O operations can consume many resources and take a relatively long time to finish, SQL Server focuses on making I/O highly efficient › SQL OS automatically allocates as much memory as is needed: – Has a mechanism to prevent memory shortage on the system – Can be configured using min and max server memory options
SMP versus NUMA
Memory Clerks › Memory Clerks are the mechanism by which memory caches are used – Buffer Pool is far the largest consumer of memory SQL Server
› Whenever a memory consumer within SQL Server wants to allocate memory it needs to go through a memory clerk, rather than going straight to a memory node › There are generic memory clerks like MEMORYCLERK_SQLGENERAL, but any component that needs to allocate significant amounts will have been written to create and use its own memory clerk – For example: MEMORYCLERK_SQLBUFFERPOOL, MEMORYCLERK_SQLQUERYPLAN, MEMORYCLERK_LOCK_MANAGER, MEMORYCLERK_SQLCLR
Physical I/O and Logical I/O I/O Type
Description
Physical I/O
Physical I/O occurs when the requested page is not available in buffer cache and must be read from the data file into the data cache before the requested page can be supplied or when a changed page is written to the data file
Logical I/O
Logical I/O occurs when the requested page is available in the data cache
Dirty Pages › Lazy writing – The lazy writer is a system process that keeps free buffers available by removing infrequently used pages from the buffer cache – Dirty pages are first written to disk
› Eager writing – The eager write process writes dirty data pages associated with nonlogged operations such as BULK INSERT and SELECT INTO – This allows creating and writing new pages to take place in parallel ›
Calling operation does not have to wait until the entire operation finishes before writing the pages to disk
› Checkpoint – Periodically scans the buffer cache for buffers with pages from a specified database and writes all dirty pages to disk – Save time during a later recovery by creating a point at which all dirty pages are guaranteed to have been written to disk
› The lazy writing, eager writing, and checkpoint processes do not wait for the I/O operation to complete – They always use asynchronous (or overlapped) I/O and continue with other work, checking for I/O success later – This allows SQL Server to maximize both CPU and I/O resources for the appropriate tasks
Read-Ahead › The Database Engine supports a performance optimization mechanism called read-ahead – Read-ahead anticipates the data and index pages needed to fulfill a query execution plan and brings the pages into the buffer cache before they are actually used by the query – This allows computation and I/O to overlap, taking full advantage of both the CPU and the disk
› The read-ahead mechanism allows the Database Engine to read up to 64 contiguous pages (512KB) from one file – The read is performed as a single scatter-gather read to the appropriate number of (probably non-contiguous) buffers in the buffer cache – If any of the pages in the range are already present in the buffer cache, the corresponding page from the read will be discarded when the read completes – The range of pages may also be "trimmed" from either end if the corresponding pages are already present in the cache
Know SQL Server 2016 components › Introduction to the SQL Server Platform › Overview of SQL Server Architecture › SQL Server Services and Configuration Options
SQL Server Services and Configuration Options › SQL Server Services › Managing Service Accounts › Configuring Network Protocols
SQL Server Services › A SQL Server instance is made up of a number of Windows services › Only the services required to support the installed SQL Server features will be present › Many services are installed once per SQL Server instance – If a service is linked to an instance, the instance name will appear in brackets after the service name—SQL Server (MSSQLSERVER)
› Use SQL Server Configuration Manager to configure services
Managing Service Accounts › Assign service accounts during installation – After installation, configure SQL Server services by using the SQL Server Configuration Manager tool
› Choose an appropriate account type: – – – – – – –
Domain user account Local user account Local service account Local system account Network service account Managed service account Virtual service account
Configuring Network Protocols › Network protocols can be defined for both server and client components – 32-bit and 64-bit server settings and client settings are configured independently
› Aliases may be used to allow a single server instance to respond to multiple names
SQL SERVER 2016 COMPONENTS Introduction to the SQL Server Platform Overview of SQL Server Architecture SQL Server Services and Configuration Options