Intro To Databases

Last updated: 5/12/2021

What is a database ?

A database (commonly referred to as DB) is simply a repository for data. This essentially means that anywhere you can store data can be called a database. I know what you’re asking yourself – so is my smart phone a database because it stores data? My answer to you is NO.

With the significant advancements made in technology, we quickly see that there are variety of software’s created to fit specific needs. For instance, we have internet browser software’s such as Chrome, Firefox, IE, etc. Likewise, we also have database specific software’s such as Oracle Database, Microsoft SQL Server, PostgreSQL, etc.

So to be accurate in todays technology world we must elaborate on our previous definition of a database – a database is an organized collection of data, typically stored electronically in a computer system and is usually controlled by a database management system/software (also know as a DBMS)

Who manages a database ?

The person(s) responsible for managing database(s) is called a Database Administrator (also known as DBA). The DBA is a member within a companies IT department. The DBA is tasked with:

  • provisioning database (installation, creation, and configuration of DB)
  • capacity planning (ensuring the DB has adequate resources)
  • high availability (fault tolerance solutions to keep the DB always online)
  • proactive monitoring (close eye on the DB to resolve warnings before they become issues)
  • performance tuning (ensuring optimal DB performance)
  • security (enforcing DB security – authentication, access, audit)
  • automation (leveraging scripts to automate your DBA tasks)
  • backup + recovery (capturing backup of DB and restoring it incase of failures or as needed)

Why are databases needed ?

Databases are crucial to modern economies because it allow organizations/companies to stores critical details about the company such as employee records, transactional records, salary details, etc. A DB stores data whereby users can access it for information.

A DB stores and manages large amounts of data on a daily basis. This would not be possible using any other tool such as a spreadsheet – it cannot handle the workload. Modern DBMS are designed to not only store data, but to also address current security + technology concerns.

Let’s paint a real life picture. Your health records are stored in some hospitals database(s). Your financial records are stored in some banks database(s). What will happen to your health records or bank account if their corresponding databases were not available? The answer is simple, you would be very upset because the hospital would no longer have crucial details pertaining to your health and likewise your bank account will be empty. If data is “considered the new gold”, then the database can be considered as a vault with the DBA serving as the capable security guard.

What’s a transaction ?

It is generally accepted within the IT world that a database transaction represents some action/work performed against the database. These actions or workloads are executed by end-users via applications, which are then run against the db. See image below.

Let’s see a real world example. You log into your bank account to view a specific transaction. You’ve essentially performed some action against the database via your banking app/website because app/website will now query the database inorder to fulfill your request. This is a very simple example, however there are different types of db transactions/workloads … we’ll get into that in later chapters.

Transactional vs Analytical Databases

Databases are built for either transactional workloads or analytical workloads. A transactional database is known as Online Transaction Processing (OLTP) and an analytical database is known as Online Analytical Processing.

OLTP database workloads encompass the day-to-day business transactions that are occurring in real-time, such as purchases being made by large numbers of customers.

On the other hand, OLAP database workloads are intended for business intelligence (BI) and data mining, such as when an analyst wants to look at an aggregate of purchases over a specified time period. OLAP databases are also referred to as data warehouses.

Data is generated by end-users via OLTP systems. This data is then extracted from OLTP db, transformed as needed, and finally loaded into OLAP db so that it can be used by business decision making systems. This process is known as ETL (Extract, Transform, Load). See the image below.

What’re the different IT environments ?

You may have heard IT folks talking about the different environments they support. These environment align with the life-cycle of the system. We have three generally accepted environments – Development (Dev), Testing (Test), and Production/Operations (Prod/Ops). Please note that some companies may include other environments (such as pre-production or staging) based on the complexity/nature of their systems and life-cycles.

Here’s my simple explanation for the three general environments:

  • Dev environment is where we build stuff. We gather requirements and build the system to meet a certain specification. DBA’ supporting a dev environment should be familiar with database development practices.
  • Test environment is where we test stuff. We perform various test scenarios and test cases such as functionality test, performance test, penetration + vulnerability test. DBA’s supporting a test environment should be familiar with capturing metrics from test runs as well as providing fix/recommendations for issues identified during testing cycle.
  • Prod environment is where we “go live”. Whatever we’ve built + tested is made available for end-user consumption. DBA’s supporting this environment should be familiar daily administration tasks of the database (check out the dba duties). Also, most production systems have a 24×7 service level – meaning that most production DBA’s have an on-call rotation incase of after hours support issues.

It’s a well known fact that prod environments are the most critical to and have immediate impact on a companies business operation. Just think how catastrophic it would be if your banks production database crashed. Such a situation obviously has an immediate and greater business impact than your banks dev/test database crashing.