Home SQL Overview
Post
Cancel

SQL Overview

Overview

Structured Query Language (SQL) is designed for managing data held in a relational database management system (RDBMS). It is particularly useful in handling structured data, i.e. data incorporating relations among entities and variables. SQL Server is a product developed by Microsoft to manage relational databases, storing and retrieving data requested by other applications. There are various topics regarding SQL and cyber security ranging from the protection of databases, SQL Injection attacks and database manipulation. This post however, serves as a fundamental overview of the SQL database programming and server.

Database Concepts

A relational database is designed to organize a collection of data, typically stored and accessed electronically from a computer system or electronic device and is regularly referenced and altered by software applications. There are three generic styles of databases, with SQL focusing on the later, Relational Databases.

  • Hierarchical: Think directory structure
  • Flat-file: CSV, Excel, Delimited
  • Relational: MSSQL, MySQL, and more
    • Tables: Think an excel page
    • Columns: Data headers
    • Rows: Data values

SQL Sub-languages

Data Manipulation Language (DML)

DML is an SQL sub-language that deals with the manipulation of data records stored within the database tables. It does not deal with changes to database objects or its structure. Some commonly used examples of DML are as in the table below:

StatementDescriptionCode Example
SELECTRetrieve rows from database and enables select of one or many rows or columnsSELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
INSERTAdds one or more rows to a table or viewINSERT INTO table_name (column1, column2)
VALUES (value1, value2)

INSERT INTO table_name
VALUES (value1, value2, value3)
UPDATEChanges existing data in a table or view 
DELETERemoves one or more rows from a table or viewDELETE FROM table_name
WHERE [condition];
MERGEPerforms insert, update, or deletes on a target table based on the results of a join with a source table 
BULK INSERTImports a data file into a database table or view 

Data Definition Language (DML)

Unlike DML above, DDL defines data structures and is used to update or modify objects rather than the data contained within them. The below table lists the more common DDL statements used:

StatementDescriptionCode Example
CREATEEnables creating an item in SQL Server. For example, database, table, view, users, index and moreCREATE DATABASE database_name
ALTEREnables modification of an existing object 
DROPDrops and existing objectDROP TABLE table_name
COLLATIONDefines the collation of a database or table column, or a collation cast operation when applied toa character string expression. Assigns certain characteristics to data. 
USEChanges the database context to the specified database (Switch between databases)USE database_name

Database Objects

Data Types

Data Types are an attribute that specifies the type of data an object can hold, such as date and time, binary strings, monetary data, integer data. Data Types have several characteristics:

  • Each column, local variable, expression, and parameter has a related data type.
  • Data type of the results is determined by the applying rules of data type precedence to data types of input expressions (handled by the SQL Server engine).
  • Collation of result is determined by rules of the collation precedence when result data type is char, varchar, text, nchar, ntext.
  • Precision, scale, and length of a result depend on precision, scale, and length of input expressions.
Data TypeStatementDescriptionStorage
Approximate NumericFLOATApproximate-number data type to store the mantissa of the float number in scientific notation and dictates the precision and storage size. The default value of n = 53.FLOAT[(n)]: (n = 1 to 24) 7 digits and 4 bytes
FLOAT[(n)]: (n = 25 to 53) 15 digits and 8 bytes
Binary StringsBINARYFixed-length binary data.BINARY[(n)]: (n = 1 to 8000) n bytes
Character StringsCHARFixed-length string data.CHAR[(n)]: (n =1 to 8000 bytes) n x 2 bytes
Date and TimeDATEDefines a date in SQL Server 
Exact NumericTINYINT, SMALLINT, INT, BIGINTExact-number data types that use integer data. Always use the smallest data type for your purpose!TINYINT: 0 to 255 : 1 byte
SMALLINT: -2^15 to 2^15 : 2 bytes
INT: 2^31 to 2^32 : 4 bytes
BIGINT: -2^63 to 2^63 : 8 bytes
MiscellaneousCURSORFor variables or stored procedure OUTPUT parameters that contain a reference to a cursor. 
Unicode Character StringsNCHARFixed-length string data.NCHAR[(n)]: (n = string length in byte-pairs, n = 1 to 4000) n x 2 bytes
 DECIMAL, NUMERICFixed precision and scale. Scale is number of decimals to the right of a decimal point (1 to 38 bytes with a default of 18)-10^38 + 1 to 10^38 -1
 SMALLMONEY, MONEYAccurate to a ten-thousandth. Use a period to separate monetary units such as cents.SMALLMONEY: 4 bytes
MONEY: 8 bytes
 REAL7 digits of precision making it identical to float(24).4 bytes
 NVARCHARVariable-length string data.NVARCHAR[(n)]: (n = string length in byte-pairs, n = 1 to 4000) n x 2 bytes
 IMAGEVariable-length binary blob data (images, documents, files, etc).IMAGE[(n)]: (n = 0 to 2^31 -1 bytes) 2GB
 VARBINARYVariable-length binary data.VARBINARY[(n)]: (n = 1 to 8000)
VARBINARY[(MAX)]: (n = 1 to 2^31 -1) 2GB, storage is the actual length of the data + 2 bytes.
 DATETIMEDate with time of day with fractional seconds and based on 24-hour clock. 
 DATETIME2Expanded datetime with larder date range, default fractional precision, and user-specified precision. 
 DATETIMEOFFSETDate and time of day that has the time zone awareness and based on 24-hour clock. 
 SMALLDATETIMEDatetime without seconds or fractional seconds. 
 VARCHARVariable-length string data.VARCHAR[(n)]: (n = 1 to 8000)
VARCHAR[(MAX)]: (n = 1 to 2^31 -1) 2GB, storage is the actual length of the data + 2 bytes.
 TEXTVariable-length non-Unicode data in the code page of the server2^31 -1 (2GB)
 ROWVERSIONExposes automatically generated, unique binary numbers within a database that are generally used a mechanism for version-stamping table rows. 
 HIERARCHYIDVariable-length, system data type that represents a position in a tree hierarchy. 
 UNIQUEGUIDIDENTIFIER6-byte GUID created from NEWID or NEWSEQUENTIALID functions or converting a string constant in a certain format. 
 SQL_VARIANTStores values of various SQL Server-supported data types. 
 XMLStores XML data 
 SPATIAL GEOMETRY AND GEOGRAPHY TABLESImplemented as a .NET common language runtime (CLR) data type for storing location specific data. 
 TABLEPrimarily used for temporary storage of a set of rows returned as the result of a table-valued function. 

Tables and Views

Tables

  • Tables contain all the data in the database in a row-and-column format
  • A table can have 1024 columns, and 30000 columns if using SPARSE
  • Assign properties to a table and columns to use compression (by row or page)

Views

  • Virtual table defined by a query
  • Similar to a table with named columns and rows that are produced dynamically when referenced
  • Acts like a filter on tables and can reference multiple tables and provide a level of security (access to pre-created views can be assigned to a user instead of giving a user access to tables)

Stored Procedures and Functions

Stored Procedures

1
(%DATABASE%\Programmability\Stored Procedures)

A subroutine available to applications that access a DB System, when creating a stored procedure CTL+SHIFT+M can be used to enter a templated view.

  • A group of one or more Transact-SQL (T-SQL) statements or a reference to a Microsoft .Net Framework CLR method.
  • Can accept input parameters and return multiple values as output parameters
  • Have programming statements that perform operations within the database
  • Return a status value indicating success or failure

Using stored procedures has a few benefits.

  • Reduces server/client network traffic by saving the procedures on the server and only return the result to the client.
  • Stronger security by containing stored procedures that have been validated and tested. Avoids SQL injection attackers where a client could manifest their own query.
  • Reuse of Code – Write a stored procedure once and use it with multiple applications instead of writing the same code into multiple applications.
  • Easier maintenance for applications, no changes in the client application
  • Improved performance, speeds up network utilization and server execution.

Functions

1
(%DATABASE%\Programmability\Functions)

Functions are routines that accept parameters, perform and action, and return the result of that action as a value.

  • Allows for modular programming, faster execution, reduction of network traffic, scalar function, table-valued functions and system functions to return needed information.

Stored Procedure Example

Data Manipulation

Data Selection

Selecting data is the most common statement used by applications to query data in the database. The below table displays the common statement and query clauses.

StatementDescriptionExample
SELECTRetrieves rows from the database and enables you to select one or more columns or rows from one or many tables.SELECT * INTO copyoftableName FROM tableName
FROMSpecifies the tables, views, derived tables, and joined tables using in DELETE, SELECT, and UPDATE statements.SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
GROUP BYClause that divides the query result into groups of rows, typically for performing aggregations.SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVINGTypically used with GROUP BY in a SELECT statement as a search condition for a group or an aggregate.SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING SUM(LineTotal) > 100000.00
ORDER BYOrders result set of a query by the specific column list.SELECT SalesOrderID, SalesOrderItem
FROM Sale.SalesOrderID
ORDER BY SalesOrderID
UNIONCombines results of two or more queries into a single results set that has all rows of each query in the union. 
EXCEPTReturns distinct rows from left input query that aren’t output by right input query. 
INTERSECTReturns distinct rows that are output by both left and righty input queries. 
JOINDefines the way two tables are related in a query. 
TRUNCATERemoves all rows from a table or specified partitions of a table but doesn’t log row deletions making it much faster than DELETE if removing all rows in a table by using fewer system and transaction log resources.TRUNCATE TABLE TableName

Data Modification

Changes existing data in a table or view, covered in above section on DML.

Deleting Data

Used to delete existing records in a table, covered in above section on DML.

DELETE
Permissions are required on the target table as well as SELECT permissions if the statement contains a WHERE clause (and if no WHERE clause is used, all records will be deleted).
TRUNCATE
Is another statement used to delete rows and requires the ALTER permission on the specific table.

Inserting Data

Allows you to insert new records into a table. If adding values for all columns in a table, there is no need to specify the column names, otherwise order of values must match order of columns.

INSERT permission is required.

Data Storage

Normalization

Database design technique that makes databases more efficient. Resolves issues of data redundancy and improves storage, data integrity and scalability. Also helps reduce INSERT, UPDATE, and DELETE anomalies.

Normalization is structured under certain ‘Normal Forms’ as per the table below. Most organizations will design their database to 3NF.

StatementDescription
First Normal Form (1NF)Eliminated repeating groups. No duplicate records exist in table and no multi-valued attributes. All entries in the column are of the same data type.
Second Normal Form (2NF)Table is already 1NF, all partial dependencies are removed to another table(s).
Third Normal Form (3NF)Table is already 2NF, eliminate non-dependent columns and transient dependencies.
Fourth Normal Form (4NF)Already 3NF, no independent multiple relationships.
Fifth Normal Form (5NF)Already 4NF, isolate semantically-related relationships.

Keys

Keys are the set of attributes that used to identify the specific row in a table and to find or create the relation between two or more tables i.e keys identify the rows by combining one or more columns. The table below identifies the main three keys:

Key TypeDescription
Primary Key (PK)Uniquely identifies each record in a table.
– Cannot contain NULL values, must be UNIQUE
– Only one Primary Key per table, using single or multiple fields
Foreign Key (FK)Not just to a primary key constraint in another table, can reference columns of a UNIQUE constraint in another table.
– Can only reference tables within the same database on the same server
– Self-reference is allowed to reference another column in the same table
– Column-level constraint must list only one reference column and be the same data type
– Table-level constraint needs same number of reference columns and data types
Composite KeyA key that includes multiple columns. Can be designated as the primary key if the combination of the columns result in a unique composite value for every row in the table. Commonly used in tables designed with many to many relationships.

Database Administration

Authentication, Users, Roles, and Permissions

How to connect and login to access resources. There are multiple methods to authenticate.

1
%SQL_SERVER%\Security\Logins\
Authentication MethodDescriptionAdditional
Windows AuthenticationCommonly referred to as Integrated SecurityPreferred method as it provides a single sign-on for the OS and SQL Server. Windows Groups can be used to manage access.
SQL Server and Windows AuthenticationCommonly referred to as Mixed-mode SecuritySQL Logins allow application-specific security and mixes OS access.
– SQL Server Login: Server Role
– Database User: Role, Application Role, or Group

SQL Server Authentication

AccountDescription
SysadminSuperuser role for the Server
Database UsersWindows Logins
SQL Logins
Windows Groups (AD)
User without login
Database Owner (DBO)Database Owner, mapped to Sysadmin
GuestLogins without mapping are guest accounts. Recommended to disable.

SQL Server Accounts

RoleDescription
Db_ownerPerform all configuration and maintenance, including dropping a database
Db_securityadminModify role membership and manage permissions
Db_accessadminAdd or remove access to the database for Windows logins or groups, and SQL Server logins
Db_backupoperatorCan backup the database
Db_ddladminCan run any DDL command in a database
Db_datawriterCan add, delete, or change data in all user tables
Db_datareaderCan read all data from all user tables
Db_denydatawriterRestricts modification of data explicitly
Db_denydatareaderRestricts the reading of data from a database
PublicAssigned to all users

Database Roles

PermissionDescription
GRANTPositive privilege
DENYNegative privilege
REVOKENegates a GRANT or DENY
ALTERMake changes to database objects
BACKUPBackup Objects
CONTROLGrants full access to objects
CREATECreate Objects
DELETEDelete objects
DROPDrop databases or tables
SELECTSelect or read data from a table or view
ENDPOINTs (Instance-Level)Sets configuration for access to the SQL Server instance. TCP/IP or network restrictions.
ASSEMBLY and SERVICEs (Database-Level)ASSEMBLY references a .DLL, uploading an ASSEMBLY is the first step to allowing application roles and client access.
TABLEs, VIEWs, PROCEDUREs, and QUEUEs (Schema-Level)Allows the locking down of access to specific tables, views, etc.
(Column-Level)Restrict permissions to only specific columns

Permissions

There are many permissions that can be applied to an account in a granular access model. ~200 exposed permissions. Common permissions are as above in the Permissions table.

Maintenance Tasks

Considerations to take when maintaining a SQL Server instance are typically based in the following table:

TaskDescription
StatisticsUsed by the Query Optimize to compile an optimal execution plan. “Auto Update Statistics” is built-in and available to automatically do this.
IndexFragmentation naturally occurs due to inserts, updates, and deletes and can impact performance negatively. To control fragmentation, the following methods can be employed.
– Rebuild, reorganize, or disable-and-rebuild the index
– Database Maintenance Plans
– Custom Scripts in a SQL Agent Job
– Third party tools
Consistency ChecksCorruption can occur, typically based on IO (disk Input/output). There are built-in checks to assist in identifying database corruption.
– DBCC CHECKDB
– DBCC CHECKALLOC
– DBCC CHECKCATALOG
– DBCC CHECKFILEGROUP
– DBCC CHECKTABLE

Backups and Disaster Recovery (DR)

Backup TypeDescriptionCode Example
FullAll data in the database is backed upBACKUP DATABASE
DifferentialOnly backs up changed data from the last Full BackupBACKUP DATABASE
WITH DIFFERENTIAL
Transaction LogA log of all transactions carried out on the databaseBACKUP LOG
Copy-Only BackupAn ‘out-of-band’ backup that can be used to an immediate backup that doesn’t impact previously set backup strategies. 
This post is licensed under CC BY 4.0 by the author.