Part 1:- Learning T SQL for beginners- SQL, Tables and Nulls

Learning SQL is easy when you have a good understanding of the basics. Everything else is based on the basics and it’s been the same since RDBMS (relational database management systems) have been launched. In this series of posts I hope to provide you with a good understanding of the basics and help you start your journey to becoming a database professional.

This series is meant to be bare bones and covers the minimum amount of topics you need to be aware of – in order to be able to start working with relational databases. If you find words like SQL Server, RDBMS etc. difficult to digest, don’t worry. It will make sense as we progress. Let’s start from the words SQL, it stands for Structured Query Language. Often SQL is pronounced Sequel (see – quell). Time to break it down even further, starting with Language.

Let’s Talk

Language is a set of words used to define objects (like a place or person) and actions (like walking, talking). In the context of what we are discussing the object is a database, table etc. and the Action is querying.

This brings us to the next word. Query is the action of requesting information. This means we have a language with a set of words used to request information. Our next question should be from where? Where are we requesting this information from? This takes us to the final word Structured. Structured basically means anything with a definite form. Something with a dependable shape. One of the most dependable shapes we have is that of a table. A fixed set of rows and columns in which you can store data. Does this remind you of an excel sheet. It should.

So what we have learnt so far is that SQL is structured query language and it’s a set of words used to request information from a structured ( well defined) table.

Before we start looking in to the words we can use to query the information (like select, where etc.) , let’s look at the table now.

Table for two?

There are a few things that are immediately obvious when we try to define a table. Every table has to have the following components

  • Every table has to have a name
  • Every table has to have at least 1 or more columns
  • Every table has to have at least 1 or more rows
  • Every Column should store the same type of information
  • If its store the same type of information in a column the column should have a fixed DATA TYPE.

The first three are needed simply in order to let the computer know what we are searching for, think of it like the address of a house. Without this information the postman wouldn’t know where to go and even if he is picking / dropping mail at the right place. Whenever I say table I want you to imagine an excel sheet. A natural way we go about creating an excel sheet is to have the column name on the first row. Each column name defines a specific type of information, as shown below.

Let’s explore the above table created within excel a bit more closely.

The first column has an Id with some kind of incremental data i.e. the first row has an id of 1, the second and id of 2 etc. This is an important part of the table but it needs a proper introduction so let’s keep it aside for now. The second column contain the name of a person. We know this because we have named the column “Name”. Simply by looking at the column name I can take an educated guess what kind of data is stored in it. Naturally an advantage of following this approach, is I know to expect a name in the rows belonging to this column. If I find anything that doesn’t look like a name in this field (the point of intersection of a row and column- also called cell) I can assume the data is bad. However the data in this column cannot be “strongly typed”.

Wait, what the heck is strongly typed?

What kind of characters would you expect to find in a Name? Your immediate answer may be alphabets. This is often the case. For example my name is Jayanth J Kurup. In additional to alphabets you can also see 2 spaces. But what about King Richard III? Still alphabets. Unless you write it as 3rd. How about James Jr. Now we have a period in the name. What if I use a nickname like “Hurricane” Carter? Now we have quotes too. As you can see a field like name is difficult to nail down. We assume it is alphabets but it’s not mandatory. Don’t even get me started on Elon Musk’s kids Name. Such pieces of data can’t be strongly typed. Because the type of data doesn’t fit neatly into one clear definition like text, date or number.

On the other hand if I say today is 2021-30-30. You know I am talking about a date and you know almost immediately the date I just mentioned in incorrect. Information that can be well defined is strongly typed. For example if I say the price of an apple is BG4% instead of 12.43 you know right away that I have given you bad information without having to investigate if the information I gave you plausibly correct.

If you noticed row 3 of the above screenshot you will have seen the same behavior for Date of Birth. Because Jan 2021 cannot possible have 34 days. So excel automatically understood this and stopped formatting the data as a date (check row 1 and 2). In this case I have violated the fourth condition of my table. The column doesn’t contain only dates. Well?? Actually it does contain dates – I just entered a bad date here.

So far we have created a table , added rows and column , named the columns so that we can understand what kind of information is stored in it and defined the nature of the data as being either text, date or number ( there is more to this , but we will talk about it in the next post). Let’s look at the column for phone number.

What are you thinking about?

In the above screenshot you can see that row 3 doesn’t have a phone number, instead a text called NULL shows up. So what is NULL? The best way I can define null for you is when your partner asks you what you are thinking and you say nothing, but they persist that you had to be thinking of something and you insist nothing, why? Because your mind was literally blank. In database talk NULL is undefined, this means it has not specific value and therefore it’s like asking a blind person what their favorite color is. It has no clear answer simply because there is no definition for it. Often people mistake NULL for empty string. An empty string is a hidden character often implemented by inserting ” into the field. In excel this would be similar to a blanks cell. Unfortunately while a blank space makes sense to a human being because we a visually looking at the data to a computer a blank space is also a value. This is an important concept so I am going to try an explain this once again using a different approach.

Imagine I asked you the question, Do you think I am a genius? You could answer Yes, in which case I would save the value “Yes” in the Table, you could answer No, in which case I would store the value “No” in the table. Or you could choose to remain silent because the question is so outrageous you won’t dignify it with an answer. In this case I would save a “NULL” in the table to indicate you didn’t answer. Similar to how in the above screenshot Jill chose not to give out her number.

Let’s recap before we move on the next section

  • We now know what SQL is used for.
  • We know the basic stuff a table is made off
  • We understand the use of datatypes ( text , date , number) and the role it plays in validating the data
  • We have a basic idea of the purpose of NULLS

In the next section we explore the tables in a bit more detail before we start with our first SQL Keyword.

Leave a Reply