Hello Data

Lesson 1

As many may wonder, especially people outside tech: what do we mean by “data”? So, what is data? Data are facts or values collected through observations and measurements. A bright person would ask, “Isn’t that a long way to say information?” But an ultra-bright person would know that data are the raw facts, while information is the meaningful facts. We can therefore consider information to be data that has been put together to form a single truth.


Cool? No! We still have to introduce knowledge. (Dear reader, a good habit of a future analyst is to question everything .. including yourself .. probably.) Knowledge is defined as "sense or understanding acquired from a well-organized body of information or facts".

Example : 

*wisdom = admitting that orange cats are the best :D

Types of Data 

Data has two main types, which divide into further categories: quantitative and qualitative.

  • Quantitative Data is data that can be counted or measured in numerical values. The two main types of quantitative data are discrete data and continuous data.

    1- Discrete data is numerical data made up of whole, concrete numbers with specific, fixed values determined by counting. For example, the number of fingers.

    2- Continuous data includes fractional (decimal) numbers and can take any value within a range; it is determined by measuring rather than counting. For example, temperature.

  • Qualitative Data is non-numeric information. It can be texts, sounds, images, videos. 
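To make the categories above concrete, here is a minimal sketch in Python. The sample values and the `describe` helper are made up for illustration, and using the Python type as a label is only a rough rule of thumb (a float can still hold a count, for instance).

```python
# Hypothetical sample values, used only to illustrate the categories above.
finger_count = 10        # discrete: a whole number obtained by counting
temperature_c = 36.6     # continuous: a measurement that can take any value in a range
cat_colour = "orange"    # qualitative: non-numeric, descriptive


def describe(value):
    """Return a rough label for a value based on its Python type."""
    if isinstance(value, bool):
        return "qualitative"  # True/False act as labels, not quantities
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    return "qualitative"


labels = [describe(v) for v in (finger_count, temperature_c, cat_colour)]
print(labels)
```

In real datasets the type alone is not enough: a phone number is stored as digits but behaves like qualitative data, so the label always depends on what the value means.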

Where can we find data

How many times have you gone to the supermarket needing one specific thing and come back home with everything except that one thing :) .. so you decide to write your grocery list on a piece of paper or in your smartphone notes.


In a similar manner, we don’t keep data floating in the air, assuming that we are ultra omega superheroes who know it and remember it always .. we use Word documents and Excel sheets.


Data are stored in files. A file is a computer resource for recording data on a storage device, and it is primarily identified by its file name.


Data files come in various types: some are Word documents, PDFs or Excel sheets, and, like much of the content on the web, some are JavaScript Object Notation (JSON), a standard text-based format for representing structured data based on JavaScript object syntax. It is commonly used for transmitting data in web applications.
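As a small illustration of the JSON format described above, here is a sketch that reads and writes JSON with Python's built-in `json` module. The grocery list content is invented for the example.

```python
import json

# A made-up grocery list written as JSON text (keys and values are illustrative).
raw_text = (
    '{"store": "supermarket", '
    '"items": [{"name": "milk", "quantity": 2}, {"name": "eggs", "quantity": 12}]}'
)

# Parse the text into ordinary Python dictionaries and lists.
grocery = json.loads(raw_text)

# Once parsed, the data can be used like any other Python structure.
total_items = sum(item["quantity"] for item in grocery["items"])

# Convert the structure back into JSON text, indented for readability.
back_to_text = json.dumps(grocery, indent=2)

print(total_items)
print(back_to_text)
```

The same text could be saved in a file ending in `.json` and sent between a web page and a server, which is the typical use mentioned above.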


Each file type has its own format, which is indicated at the end of the file name by what is called the file extension.

How’s the poster? I made it myself ;D .. multi-talented, huh? .. sorry :D
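Since the file extension is just the last part of the file name, a program can read it directly. Here is a minimal sketch using Python's standard `pathlib` module; the file names are hypothetical.

```python
from pathlib import Path

# Hypothetical file names, one per file type mentioned earlier.
file_names = ["grocery_list.docx", "budget.xlsx", "report.pdf", "orders.json"]

# Path.suffix returns the extension, including the leading dot.
extensions = {name: Path(name).suffix for name in file_names}

for name, extension in extensions.items():
    print(f"{name} -> {extension}")
```

Keep in mind that the extension is only a hint about the format: renaming `report.pdf` to `report.json` does not turn its content into JSON.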

Let’s imagine a library building. It would contain:

The library > Books > Chapters > Words 


In exactly the same way, for data we can say:

Data repository > Database > Dataset > Data

Notebooks come in types like drawing notebooks, math notebooks, etc. Each was created to serve a purpose, and so were the data repository types.


Data repository types are : 

  1. Relational database

  2. Data warehouse

  3. Data lake

  4. Data mart

  5. Operational data store


Relational database: data is stored in tables of rows and columns and managed by a relational database management system (RDBMS), using normalization, primary keys, foreign keys and constraints to ensure the reliability of the data. Structured Query Language (SQL) is used to find, access and manipulate the data (create, read, update, delete).
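To show what primary keys, foreign keys and the four SQL operations look like in practice, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. SQLite is just a convenient stand-in for any RDBMS, and the `owners` and `cats` tables and their rows are invented for the example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Two related tables: every cat must point at an existing owner.
conn.execute("CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE cats (
           id INTEGER PRIMARY KEY,
           name TEXT NOT NULL,
           colour TEXT NOT NULL,
           owner_id INTEGER NOT NULL REFERENCES owners(id)
       )"""
)

# Create: insert rows.
conn.execute("INSERT INTO owners (id, name) VALUES (1, 'Sara')")
conn.execute("INSERT INTO cats (name, colour, owner_id) VALUES ('Mishmish', 'orange', 1)")
conn.execute("INSERT INTO cats (name, colour, owner_id) VALUES ('Luna', 'black', 1)")

# Read: join the two tables through the foreign key.
rows = conn.execute(
    """SELECT cats.name, owners.name
       FROM cats JOIN owners ON owners.id = cats.owner_id
       WHERE cats.colour = 'orange'"""
).fetchall()

# Update, then delete.
conn.execute("UPDATE cats SET colour = 'grey' WHERE name = 'Luna'")
conn.execute("DELETE FROM cats WHERE name = 'Luna'")
remaining = conn.execute("SELECT COUNT(*) FROM cats").fetchone()[0]

# The foreign key constraint rejects a cat whose owner does not exist.
try:
    conn.execute("INSERT INTO cats (name, colour, owner_id) VALUES ('Ghost', 'white', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

conn.close()
print(rows, remaining, rejected)
```

The rejected insert at the end is the "reliability" part: the database itself refuses data that would break the relationship between the tables.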


Data lake is a centralized repository designed to store, process, and secure large amounts of structured, semi-structured, and unstructured data (the dumping ground :D).


Data mart is a data storage system that contains information specific to one business unit of an organization. For example, a finance team's data mart might hold stock market prices for 2022.


Operational data store (ODS) is a type of database that's often used as an interim logical area for a data warehouse. ODSes are designed to integrate data from multiple sources for lightweight data processing activities such as operational reporting and real-time analysis.


Data warehouse is a decision-support database that is maintained separately from the organization’s operational database. It supports information processing by providing a solid platform of consolidated, historical data for analysis.

A great cheat sheet created by insideinfo

When we talk about data in the context of databases, we must note the way it is stored. Take your wardrobe as an example: are the clothes organized by season? Or are they just folded and put all together, so the wardrobe is semi-organized? Or is your brain fried, so you just throw your clothes in and whatever happens, happens?


We can classify data as follows:

  1. Structured data is data in a standardized format: it has a well-defined structure, conforms to a data model, follows a persistent order, and is easily accessed by humans and programs. This type of data is generally stored in a database. For example, Excel files or SQL databases.

  2. Semi-structured data doesn’t follow the tabular structure associated with relational databases or other forms of data tables but does contain tags and metadata to separate semantic elements and establish hierarchies of records and fields. For example, emails and HTML. 

  3. Unstructured data doesn’t have an information model and isn’t organized in any specific format. Some examples of unstructured data are images, text documents, PDF files and videos.
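The three categories above can be sketched side by side in Python using only the standard library. All of the sample content (the CSV table, the email-like JSON record and the sentence) is invented for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same shape (like a table).
structured_text = "name,colour,age\nMishmish,orange,3\nLuna,black,5\n"
table = list(csv.DictReader(io.StringIO(structured_text)))
ages = [int(row["age"]) for row in table]

# Semi-structured: no fixed table, but keys (tags) describe each piece of data.
semi_structured_text = (
    '{"from": "sara@example.com", "subject": "Cats", '
    '"attachments": [{"name": "cat.png"}]}'
)
email = json.loads(semi_structured_text)
attachment_names = [a["name"] for a in email.get("attachments", [])]

# Unstructured: free text with no model; we can only search or interpret it.
unstructured_text = "Mishmish is an orange cat who is three years old."
mentions_orange = "orange" in unstructured_text.lower()

print(ages, attachment_names, mentions_orange)
```

Notice how the amount of work needed to get an answer grows as the structure disappears: the age is one column lookup in the table, but in the free text it would have to be interpreted from the words.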

This lesson is all about the fundamentals of database concepts and terms, so I really recommend studying more in this field from this book, which I studied in college.

It's also necessary to learn how to code in SQL, which in my opinion is one of the easiest programming languages.



Lesson 1 ends here. If you decide to continue reading .. I warn you, the rest is my overthinker self sharing some late-night musings.