Data Structures: From Arrays to Hash Table

Erdeniz Tunç
7 min readApr 15, 2023

--

Data structures is a concept in computer science that refers to how data is stored and organized. Let’s say we have a lot of data and we want to process this data on our computer. However, instead of storing these data individually, we need to store them in a certain order. Here data structures are structures designed to store and manipulate data in an organized manner.
For example, suppose we have a set of data. Instead of storing these data individually, we can store the data as an array. Arrays are a data structure used to hold data of the same type together. In this way, we can process faster and more efficiently than storing data individually.
To give another example, let’s consider a phone book. There is a name and phone number for each contact in the phone book. Instead of storing this data individually, we can use a data structure that unifies and connects each person. The name of this structure is a structure called “linked list”. In this way, we can search for contacts in the phone book more quickly and efficiently.
In short, data structures are a structured method for storing, organizing, and manipulating data. These structures are designed to process data more quickly and efficiently.

Arrays

  • Arrays is a data structure that allows fixed size data to be stored.
  • Dynamic arrays, on the other hand, are an array type whose size can be adjusted dynamically.

Linked Lists

They are structures that allow us to keep data without having to be side by side. We do not need to open a new area in memory for the new element. In this scenario, unlike Array, the data may be distributed in the memory, but the last element has to give its address to the previous element.

Linked List and Array Comparison:

Array:

  1. Data is stored in sequential memory blocks.
  2. Elements can be accessed directly.
  3. It provides a more organized layout in memory and provides faster access. Reaching any element takes the same time (Random Access)
  4. Its dimensions are static, that is, its dimensions cannot be changed while the program is running. Suitable for static situations.
  5. Adding or removing elements may require resizing, which can be costly.
  6. Arrays take up less memory because they only hold elements.

Linked List:

  1. Data is stored in a linked structure called nodes.
  2. Nodes contain a data item and the address (pointer) of the next node.
  3. In order to go to the element we want to reach in the Linked-List, we need to visit the connected elements.
  4. Its dimensions are dynamic, that is, its dimensions can be changed while the program is running. It is better for use in cases where Add-Subtraction is excessive.
  5. Accessing the elements requires going through them, which can be a time-consuming process.
  6. Linked-Lists take up more memory because they hold addresses together with the element.

Adding and Deleting Elements in Linked List

Adding Element:

  • A new node is created and data item assignment is made.
  • For the append operation, the next pointer of the new node is replaced with the pointer of the next node, and the pointer of the new node is assigned to the next pointer of the previous node.
  • If the first element is to be added, the new node is placed at the beginning of the list and the head pointer is assigned to the new node.

Element Deletion:

  • The pointer to the previous node of the node to be deleted indicates the next node of the node to be deleted.
  • The previous pointer of the next node of the node to be deleted indicates the previous node of the node to be deleted.
  • If the first node is to be deleted, the head pointer is assigned to the next node of the node to be deleted.

Stack

Stack is a data structure that works according to last-in, first-out (LIFO) logic. That is, data is extracted starting with the most recently added item. Operations on the stack are called push (add) and pop (subtract). Data can be manipulated simply by accessing or modifying the top item. Stack is often used in situations such as function calls and process stacks.

To give a non-computer example of Stack, let’s say we pile up the dishes and form a pile. When the dishes pile up and it’s time to wash, the plate we put last becomes the plate we wash first. Push can be described as putting plates at the top of the stack, Pop can be described as taking plates to wash from the top of the stack.

Queue

Queue is a data structure that works according to the first-in, first-out (FIFO) logic. That is, the data is subtracted starting with the item added first. Operations in the queue are called enqueue (add) and dequeue (subtract). Data can be processed only by accessing or modifying the item added first. Queue is often used in situations such as transaction requests, transaction queues and online sorting.

To give an example of a queue outside the computer world, the queue analogy can be given as an example. If we see the Stack as a pile of dirty plates, we can think of the Queue, for example, as a market row. With this logic, first in, first out. For example, the person who goes to the cash register to buy a cookie in the market and then creates a line behind him will be the first person in and out of the line. The Enqueue operation can be described as the entry of a new person into the queue, and the exit of the person who came to the queue in Dequeue.

Hash Table

Arrays have a 0-based indexing. Although some programming languages have 1-based indexing, in general, 0-based indexing is used. Hash table is a data structure used for storing key-value pairs and providing fast access to data. Each key is mapped to an index using a hash function and values are assigned to these indexes. This makes searching and finding data much faster and more efficient. Hash tables generally provide high performance in search, insertion and deletion operations, and due to these features, they are used especially in large data sets.

For example, you can use a hash table to store the names and prices of the dishes on a restaurant’s menu. Each dish name is mapped to an index using a hash function and its price is assigned to that index. In this way, customers can quickly find out the price by naming the dish. For example, price information mapped to a specific index for the name “Hamburger” is stored in that index of the hash table and can be presented to the customer. This process can be done quickly thanks to the hash table.

In summary, we pass the existing data through a function and index it. We call this function hash function, and the array structure we combine with this function, Hash Table.

Hash Function

  1. If the keys we send are different but giving us the same results, it is not a hash function.
  2. If the keys sent to the function are the same but the result is different, it is not a hash function.
  3. The size of the array used for the Hash Table should be equal to the number of results given.

Hash Colliison

If the same number is generated from two different Hash Function values, this situation is called Collison.

While hash functions try to match different inputs to different hash values, sometimes different inputs can be mapped to the same hash value. In this case, the performance of the hash table may decrease and the search operations may be slowed down. As the number of collisions increases, the speed of finding what we are looking for decreases.

It should be a quality hash function to be less likely to encounter the collision problem. In this way, we obtain an efficient Hash Table.

To give a simple example, a hash function takes the names of students and maps them to a class number. However, the names of students named “John” and “James”, who are two students in the same class, can be mapped to the same hash value. In this case, the performance of the hash table degrades and it becomes difficult to match the student information correctly.

--

--

Erdeniz Tunç
Erdeniz Tunç

Written by Erdeniz Tunç

I share my notes. Especially in Product Management

No responses yet