In this SQL tutorial, we will look at the most common and simplest task of a new DBA; creating your first database table. We will look at some do’s and don’ts while creating a simple table as well as adding constraints and schemas. We will also explain the steps that are required in the create table operation, what each step does and some pitfalls to avoid.
To complete the steps in this tutorial, you will need access to a SQL Server instance and SSMS. If you do not have them installed, refer to the articles "Install SQL Server 2019 Developer Edition" and "Install SSMS ver. 18" in this blog, in that specific order.
They can be found by clicking on the "Home" or "All Post" buttons on the menu bar.
Creating a Test Database
Before we can create a database table, we will first need a database to store it in; this leads to our first step in this tutorial, creating a sample database to work within. A database is an organized collection of data; it’s where we store our tables, views, data, etc. Tables are one of the most used objects in a database. There are only a few things you can do with a database without first creating a table. Consequently, we have the “create database” as our first step before diving right in to the CREATE TABLE statement.
There are several methods at your disposal for creating a database. Here, we will briefly cover the two simplest and most common ways of completing that task before moving to the main subject of creating a database table in SQL.
Hint: Option 2 is the preferred method for most DBA’s because it allows for the DBA to create default settings such as, the location of the database files, ownership, etc. that can be stored in a template and used repeatedly.
Option 1: Creating a database with SSMS.
Open SSMS and connect to your SQL Server instance. Once you are there, look in the “Object Explorer” and right click on “Databases” and select “New Database” from the drop-down menu (as shown in figure 1).
When the New Database dialogue box appears, as shown below in figure 2, type your new database name in the text box and click the “OK” button. For this tutorial, I have created a database named “MyTestDB”. You can name yours as you see fit.
Option 2: Creating a database with T-SQL.
This option is slightly more difficult, but it is the preferred method for almost all DBA’s. Open your SSMS (SQL Server Management Studio) and click on “New Query” in the tool bar (as shown in figure 3).
Now that we have a clean query window open, type the following command in the new window.
- SQL / T-SQL is not case sensitive.
Although you don’t need to use upper and lower case, it’s a good habit to get into to make your code more readable. Notice the case format in the code block above. We used uppercase letters for keywords and commands, and we used Pascal Case for the database name. You will want to follow that pattern in your DBA career.
It is highly recommended that you use either Pascal or Camel case when naming your database objects, such as tables, views, stored procedures, etc.
The Structure of a Basic Table
Tables are the basic structure of a database where data is stored. Often referred to as the heart of a database, tables are the one core element that almost all other objects are built upon. The database table itself is an object as well. So, what is an “object”? In its simplest form, an object is any SQL Server resource, such as tables, views, triggers, stored procedures, and so on. Much like Excel sheets, tables are divided into rows and columns. The intersection where a row and column meet is called a “cell” in Excel, but it’s called a “tuple” in SQL. The following image breaks down the core elements of a database table.
The red boxes are the columns.
The green boxes are the rows.
The blue boxes, where the columns and rows intersect are tuples.
Although database tables may look like an Excel sheet, there are a lot of differences between the two. One such difference is the “Normal Forms” of a database table that do not exist in an Excel sheet. Although there are basically two normal forms in Excel, referred to as “First Normal Form” and “Second Normal Form”, there are five levels of Normal Form that are used with SQL tables. Most of the time you will only be dealing with the “Third Normal Form”. We will not delve too far into Normal Forms in this SQL tutorial since we are primarily focusing on the basics of creating a database table. However, if you would like to learn more about Normal Forms, and you really need to, you can read these articles;
Data Modeling: Understanding First Normal Form in SQL Server
by K. Brian Kelley
SQL Server Logical Data Model
by Armando Prato
Create Table using T-SQL – Basic Syntax
Now that we understand that a database table is very similar to an excel sheet, on this level, let’s move on to understanding how to create a basic database table using T-SQL. In the code block below, we have the basic syntax for creating a table with 3 columns:Code Block 1
The sample provided in “Code Block 1” is just a generic structure of how to build a simple table. It’s the cornerstone on which all DBAs learn to create their first table. Notice the similarity to the code found in Code Block 2 below in lines 4 through 9.
In the following section we look at a simple CREATE TABLE script and break down its components line by line for an easy explanation of what each part does. We will use the code block in figure 5 as our training sample.
Breaking it down, line by line:
Line 1: The “USE” keyword is used to select the database we want to create the table on. In this case, it’s the “MyTestDB” database that we created earlier.
Line 2 and 9 The “GO” command is what is known as a block terminator. It simply terminates that block of code before starting the next block. SQL uses the word “GO”, not by accident, but as an indicator for SQL to “go” to the next step. You could change the word “GO” to anything you would like. For example, you could change it to “STOP” or “END”. As far as the wording goes, it’s simply a phrase to visually see the end of a code block. If you would like to change your block terminator to “Stop” or something else, there is a simple walk-through at the bottom of this tutorial in the section titled “A Quick Sidestep”.
Line 3: This line was left blank simply for readability’s sake.
Line 4: Here, we are calling the “CREATE TABLE” command followed by the “table name”. In this sample, we are naming the table “myEmployees” in order to simulate a possible real-world database table. Be very careful when naming your tables, columns, views, etc. The following FYI section provides some basic guidelines that you must follow when creating a table in SQL.
- When naming your tables, there are a few guidelines that you must adhere to.
1. Table names cannot contain blank spaces.
2. Table names cannot start with numbers.
3. Table names cannot start with any special characters such as @, #, $, %, etc.
4. Table names can contain upper and lowercase letters in any combination.
5. Table names can contain special characters and/or numbers as long as the name does not start with any special characters and/or numbers.
6. Table names should be descriptive of the type of data that will be stored in the table.
7. Table names should not be named with reserved keywords or data types.
The table name must be followed with an opening and closing parenthesis. Within those parentheses, you will list the columns that you need along with their data types. (More on data types later)
Like the table name, the column names, lines 5-7 in figure 5, should be descriptive as well; and as mentioned earlier, should not be reserved keywords or data types. For example, if you are creating a table to hold a list of employees, you would use these column names; “firstName”, or “fName”, and “lastName” or “lName”, etc.
Line 5: This column has the name of “colID”. As you grow in your DBA role, you will find that you will always want to have at least one column with a unique ID. Something that differentiates one row from another. Like in the table that would be created by the code block in figure 5, you may find that you have two employees named John Smith. The “colID” separates them by assigning a unique row number to each one.
After the “colID” name, on that same row, we have a “INT” value which is short for “integer”. This is our data type. Data types specify what type of data will be stored in that column. You really need to know your data types, at least the basic data types that you will use daily. Refer to the section “Data Types” later in this tutorial for a list of the most commonly used data types. That list in table 1 is the minimum amount of data types that you need to know explicitly.
Continuing on line 5, after the data type, we have a SQL command called “IDENTITY(1, 1)”. This is telling SQL to auto populate that tuple with a unique and sequential number for each row added to the table. The (1, 1) tells SQL to start with the number 1 and increase the next row value by 1. The first value is referred to as the “seed” and the second value is the “increment” value. You can set these values to anything you like. For example, you could set the seed at 5 and increment by 5 like this: (5, 5). This would number your rows as: 5, 10, 15, and so on. Optionally, you can set the seed to 10 and the increment to 1 (10, 1). To save yourself some time when creating database tables, you can omit the (1, 1) and call the IDENTITY command without the parentheses. This will set the default (1, 1) value to the IDENTITY command automatically.
Continuing out to the right on line 5, you see the words “NOT NULL”. NOT NULL simply means that you cannot have any NULL or empty values in that column. It is important to note that NULL is not a value of any sorts. NULL simply means that there is a placeholder for a value that may or may not come later. If we have a column called “middleName” in our employee table, you may find that some of the employees do not have a middle name. When you query that table, those employees will have NULL in place of their middle name in the result set. If you know a column is likely to contain one or more NULL values, like we have with the middle name column, do not set it as “NOT NULL”. Only set the NOT NULL on columns that must contain a value when a row is added to the table.
The last item on line 5 of the code block sample is what is known as a “constraint”. You will encounter many different types of constraints in your career as a DBA and the PRIMARY KEY constraint is one that you should use on every table you create. A PRIMARY KEY constraint defines a unique identification of each record (row) in a table. Each table should have a primary key and each table can have only one primary key. The primary key constraint uniquely identifies each record in a database table, so primary keys must contain unique values and cannot contain NULL. We will cover Primary Keys in more detail later in the section named “Creating Primary Keys”.
Line 6 and 7 “firstName VARCHAR(20)” and “lastName VARCHAR(20)”. Here we see that, like with the “colID” column, “firstName” and “lastName” will be the names of our next two columns. The most important lesson you can learn here is to not use any of the reserved keywords for SQL nor any of the command (or statement) keywords nor any data types to name your columns. For example, do not use the keyword “user”, instead name the column “userName” or “employee”. The same holds true for SQL data types like “DATE”. Instead, use something like “hireDate” for your column name. Remember, the column names should be descriptive, but they should also be short and to the point.
As a DBA, you will need to be very familiar with keywords as well as data types. The following link lists the reserved keywords provided by Microsoft Docs: Reserved Keywords (Transact-SQL)
Now, let’s look back at that VARCHAR(20) statement. From the lessons above, we now know this is our data type for that column. But what is a VARCHAR and what does the (20) represent? Let’s start by breaking it down into two sections. The first section, VARCHAR, (as mentioned earlier) is the data type. In this case it is a Variable Character shortened to VARCHAR. This simply means that this data type will hold characters. How many characters is represented by the (20) that follows it.
Since this is a “variable” character data type, the amount of storage used will vary depending on the length of value in that tuple. Although we have 20 characters reserved for this tuple (in the column firstName), we may only use 4 of those 20 if the first name is John. A CHAR datatype on the other hand, will consume the required hard drive space for all 20 characters in a CHAR(20) data type even if the name is only 4 characters long.
So, what is the reason for using a CHAR data type when I could simply use varchar for everything? It boils down to performance and drive space usually. If you know your column will only hold 2 characters in every row, like the abbreviated state name (TX, OK, CA, etc.), then for performance and drive space reasons it would be a good idea to use the data type CHAR(2). However, with a column that will hold varying character lengths, such as a person’s name, you would benefit from the VARCHAR data type.
Line 8: this line holds the closing parenthesis that opened on line 4 as well as the terminating ; semicolon. Although you don’t need the semicolon, as mentioned earlier, it’s a best practice in all your SQL commands.
As you may have noticed when creating the “myEmployees” table, there is a prefix of “dbo.” preceding the table name. You can see this in your Object Explorer, usually the left pane, in your SSMS instance that you have been working in. The “dbo” is referred to as the “schema” name. “dbo” is short for DataBase Owner. All tables in a database must be associated with a schema. To better understand how the SQL Server Instance, database, schema, and tables are related, on a simple level, refer to the diagram below.
By default, SQL Server assigns the “dbo” schema to all tables unless you specify a specific schema such as a built-in schema or a schema that you created in your database. Below is a list of default schema names that are pre-built with each database you create. (Sample provided when using SQL Server 2019)
You can easily create your own schema and assign your new tables to that schema. If you are going to create a schema for your database tables to be attached to, you really should create the schema prior to creating the tables. Below is the code for creating a schema. This will create a new schema named “hr” (Human Resources) on the database we are working with.Code Block 3
Schema names must follow the same naming convention as table names, see the section above titled “Create Table using T-SQL – Basic Syntax” and read the FYI tip.
If you created your table with the default “dbo” schema, or you created the table on the wrong schema by mistake, there is an option called “ALTER SCHEMA” to re-assign your table to a different schema after the table has been created.
Using the source code in Code Block 2, where we created the table “myEmployees” in the “dbo” schema, we can now change the table to the “hr” schema that we just created in the code block 3. This is where the “ALTER SCHEMA” statement comes in handy.
In this sample, we are “altering” the “hr” schema by transferring the “myEmployees” table from the “dbo” schema to our “hr” schema.
There are a number of constraints that we can assign to tables, views, etc. such as a foreign key constraint, unique constraint, check constraint, default constraint, etc. but for simplicity’s sake, we will only cover the primary key constraint and the foreign key constraint since this tutorial is primarily about the basics of creating a database table.
Creating Primary Keys
The primary key constraint uniquely identifies each record (row) in a table. Primary keys must contain unique values in each row and cannot contain any NULL values. A table can have only one primary key. The primary key can be assigned to one or more columns, although it is most often associated with just one column.
Okay, now think about that last statement, read it again and make sure you understand what it is saying. As a new DBA some years ago, this is one that confused me. First, I was told that a primary key is assigned to a specific column and that I can only have one primary key per table. So naturally I’m thinking (like some of you are now) that I can only have one primary key per table and that key can only be assigned to one column. So read that first section again and pay attention to exactly what it is saying. True, I cannot have more than one primary key per table, however, that primary key can be assigned to multiple columns. I just wanted to make sure you understood that primary key rule clearly, since it was not made clear to me when I first started out as a DBA. Now, moving on.
You can assign a primary key by using SSMS or a T-SQL query. Creating a primary key on a column automatically creates a corresponding unique clustered index or nonclustered index, if specified. If you are not familiar with clustered and nonclustered indexes yet, not to worry, they will be handled automatically by SSMS in these basic samples. When creating a primary key on a table, there are a several options available to you. In this tutorial, we will cover two of the most popular options.
You can allow SSMS to create a generic primary key by simply calling the PRIMARY KEY function in line with the column creation. In the sample below, we declare a primary key on the column “colID”.
Code Block 5
Although this is an easy way to acquire a unique key name for every key you create, it produces a random number generated key name that is sometimes difficult to work with; should you need to alter that key. Here is a sample of what that process will produce for a key name.
Code Block 6
You can skip the PRIMARY KEY creation option in line with the row creation completely and assign the primary key after the table has been created. We do this by using the ALTER TABLE command. Let’s recreate our sample table without creating a primary key during the table creation. To do this, we will first need to DROP the table as discussed earlier in the section “Dropping a SQL Table”Code Block 7
Before we insert any data into the table, we first want to add our constraints that we want associated with the table. In the following code block, we are creating a primary key constraint on the “colID” column.Code Block 8
Line 1: Here, we are telling SSMS that we want to “alter” the “myEmployees” table, this starts the alter table process.
Line 2: Next, we are adding a constraint and naming that constraint with a name of our choice. Breaking down the name we chose, we use “PK” to denote that this is a primary key, then we follow that with a two-part naming convention.
Part 1: (myEmployees) is the name of the table.
Part 2: (colID) is the name of the column.
This process will always give us a unique name for our constraints because you cannot have two or more constraints or tables of the same name in the same database. This naming convention is much easier to read and to understand what type of constraint we are looking at, on what table, and what column the constraint is assigned to. So, you can see why option 2 is preferred over option 1.
Line 3: we are simply assigning our constraint, in this case, a primary key, to the “colID” column on our table.
Line 4: This is our batch terminator that we learned about earlier in this tutorial. Although this is optional, you really want to get in the habbit of adding it to end of all you code blocks in SQL.
Creating Foreign Keys
The technical version:
In a relational database, a foreign key is a field (or collection of fields) in one table that references a primary key in another table. Foreign keys can have duplicate values within it and can also contain NULL values, but only if the column is defined to accept NULLs.
The non-technical version:
So, if you didn’t really understand anything in the previous paragraph, don’t worry, this paragraph will explain what all that guru jargon was trying to say. First, let’s tackle that “relational database” term. A relational database has data that is grouped or related to other data within that database. It may be in the same table or a different table. Either way, one piece of data is directly or indirectly related to another piece of data. Below are a couple of great articles that go in depth on the subject of what a relational database is. You’ll want to read the tutorial on this site called "Understanding Relational Database Management System (RDBMS)".
What is a Relational Database? by Eric Blinn
What is a Relational Database Management System by Nai Biao Zhou
A foreign key must be linked to either a primary key or a unique key whether it be on the same table or a different table. A foreign key can not contain a value that does not already exist in the primary key. Here is a diagram to help explain.
In the above diagram, we see that there is a relationship between the two tables. The “deptID” in the “myEmployees” table has a FOREIGN KEY constraint linking to the “depID” PRIMARY KEY in the “empDepartment” table. Below is the code I used to generate these two tables along with the primary and foreign key constraints.
Remember, the foreign key column cannot contain a value that is not in the primary key column that it is referencing. Trying to insert a wrong value will result in the following error message.
“Msg 547, Level 16, State 0, Line 32
The INSERT statement conflicted with the FOREIGN KEY constraint "FK__myEmploye__deptI__03F0984C". The conflict occurred in database "Mytestdb", table "dbo.empDepartment", column 'deptID'.
The statement has been terminated. “
Since this tutorial is primarily about creating tables in SQL Server, I’ll not go any further into the foreign key constraints. I just wanted to touch on the primary and foreign key constraints since they often play a critical role in table creation. You can find a great article on foreign keys here:
How to Create a SQL Server Foreign Key by Greg Robidoux
Create Table Using Another Table (Copy Table)
If you want to make a backup of a table or copy a table, you can do this by creating a new table with the SQL “SELECT” command. The following code will create a new table “myEmployeesBackup” and insert all the data and columns from the original table “myEmployees”. Remember, you cannot have two or more tables with the same name, each table name must be unique. This will be helpful if you want to make a copy of a production table for your training.
Once you have a working copy of your production table, you can test different SQL scripts against the copy without affecting the original table. If your “test copy” gets too botched up to work with, you can always drop the table and create a new one from the original. Try to keep this option as a last resort. It’s always better to try and resolve any issues that arise. It provides a better, longer lasting training / learning method for your DBA career.
In the code block below, we are going to make a copy of our “myEmployees” table for the sake of experimenting.
line 1: we are using the SELECT * statement to select all columns and rows from our original table.
line 2: we specify which table we want to copy to. In this case, we are creating a new table named “myEmployeesBackup”. Remember, you can’t have two tables with the same name in the same database with the same schema name.
line 3: we are telling SSMS which table we want to make a copy of.
- When creating a copy of a table using the method mentioned above, it will not copy any Keys, Constraints, Triggers, Indexes, etc. It will only copy the rows (data) and columns of the original table.
You can have two tables with the same name provided that the schema is different. For example, we can create a table named “myEmployees” with the schema name of “dbo” and we can also create another table named “myEmployees” with the schema “hr”. So, the two table names would look like this:
However, this is not a recommended practice as it could lead to confusion and possibly inserting, updating or deleting data on the wrong table.
Okay, earlier you said that I can’t have two tables with the same name. Now you’re telling me that I can. Confused? Don’t be.
Using our samples above, the table “myEmployees”, is what we refer to as our “table name” in our day-to-day life. In reality; our table name is one part of a three-part naming convention in SQL. Think in terms of your name. Most of us have a first name, middle name, and last name. Likewise, a database table has an “instance name”, a “schema name”, and a “table name”.
When looking at the table name of “myEmployees” on my personal computer, the full three-part table name would be: DBAWS03.dbo.myEmployees”.
Now, let's break that down. DBAWS03 is my SQL Server instance name, “dbo” is my schema name, and “myEmployees” is the specific table name.
As you would expect, your “instance name” will be different on your computer. As we mentioned earlier, you cannot have two or more tables with the same name in the same database. This is just a general statement that we often throw out to get the point across. We really should be more specific when boasting this comment during training and say that “you can not have two or more tables with the same name within the same schema”.
DROPING a SQL TABLE
To delete a table in T-SQL, you use the command “DROP TABLE”. For example:
Okay, so that’s the simple answer, but things can get a little more complicated if you have other tables referencing the table you want to drop. Let’s say we have an “empStats” (Employee Status) table that contains a foreign key reference to our “myEmployees” table. If we try to drop the “myEmployees” table, we will receive an error like the one below.
SSMS tells us that the table we are trying to drop is referenced by another table through a foreign key constraint. We have a few options here; we can drop the foreign key constraint on the “empStats” table, or we can drop the whole “empStats” table and then drop our “myEmployees” table. Likewise, any views, stored procedures, etc. that are referencing the table we want to drop will also have to be dropped prior to dropping our table.
Dropping a table will also remove all data, indexes, triggers, etc. for that table. If you re-create that table, you must re-build all the rules, defaults, triggers, etc. that were originally assigned if you want them back in play.
You can use DELETE tableName or TRUNCATE tableName as well as the DROP TABLE tableName. Using DELETE or TRUNCATE will not permanently remove the table from the database. It still exists until you DROP the table.
So, what is a data type? A data type in SQL Server is defined as the type of data that a tuple or column can store. It is the kind of data that an object such as integer, character, string, etc. will hold.
Table 1 lists some of the more common data types that you are most likely to use on a daily basis as a DBA. There are more than 35 data types associated with T-SQL. The 10 data types listed in table 1 are going to be the ones you will need to know explicitly. In most environments, they will be your daily drivers.
Here is a more exhaustive list provided by Microsoft Docs online:
Data Types (Transact-SQL)
You can find a more in-depth articles on data types at the following links:
SQL Server Data Types Quick Reference Guide by Joe Gavin
SQL Server Data Types Tips
Inserting Data into a Table
Okay, now that we know some basics of building a database table, it does us no good if we don’t know how to insert data. In this section we will take a quick look at the process of populating our test table with some data.
If you have “dropped” your test table “myEmployees”, let’s rebuild it now. In the following code block we are going to create the primary key in a different way than what was described in the above sections. Here we are putting the primary key constraint in the CREATE TABLE command, but it’s placed at the bottom after all the columns are defined. This option, like the ALTER TABLE command, allows us to give the primary key constraint a unique name of our choosing.
In the code block above, you may have noticed a new section of code that you are not familiar with. That’s the “IF OBJECT_ID…..IS NOT NULL” section. Basically, what this is doing, it’s telling SSMS to drop the table if it already exists. Remember, we can’t have two tables with the same name, else we will get an error returned.
Now that we have our database table back online, let’s add some data to it.
A couple of things to note about our INSERT INTO statement. Here we specified the schema and table name, this is a must do in the real world, but you can get away with dropping the schema name in our test environment.
However, it’s best practice to get in the habit of calling the schema name even in a test scenario.
Secondly, notice that we didn’t list the “colID” column. That’s because, as you may recall, that column is auto populated by SSMS each time we insert a new row.
The character values inserted into the table must be within single quote marks. Double quote marks will return an error. Numeric data types will not need the single quotes unless you use them as string characters. For example, if I want to insert 2021 as a string character, you will place that value inside single quotes. ‘2021’ This is fine for some numeric values, but not if you want to add, subtract, multiply, divide, etc. with that value. You cannot do math functions on a string character.
In this SQL tutorial, you have learned how to use the Transact-SQL (T-SQL) CREATE TABLE statement to create a new table in a database as well as how to create a database. You also learned what a schema is and how it is tied to the database table. This tutorial also introduced you to data types and provided a little insight as to which ones are commonly used in a DBA’s daily tasks of creating database tables. Now you have opened yourself up to a world of possibilities in the DBA community, continue practicing on database table creation and modification and never stop learning.
A Quick Sidestep
If you want to change the wording of the batch terminator to something else, such as “END” for your own personal visualization, you can do so quite simply without affecting the performance of your SQL commands (code blocks) or their order of execution. To change the “GO” terminator to something else, such as “END”, open SSMS and from the toolbar, click on “Tools” then scroll down and click on “Options” as shown in figure 6 below.
When the “Options” dialogue box appears, expand the “Query Execution” option in the left pane (see figure 7) and select “SQL Server”. In the right pane, change the word “GO” to whatever you are comfortable with.
- While it is not necessary to terminate your blocks of code with a semi-colon and the “GO” terminator, it is a good habit to get into. It not only makes your scripts easy to read, but it can also prevent possible errors when running multiple blocks of code in succession.