Author: cyrus2281

Algorithms

Data Structures + Algorithms = Programs

Content

Recursion

top

Recursion: When you define something in terms of itself.
A function that refers to itself inside its own body.

Good for tasks that have repeated subtasks, e.g. sorting and searching

Recursion Risk

Recursion can cause stack overflow

3 steps

  1. Identify the base case
  2. Identify the recursive case
  3. Get closer and closer to the base case, returning when needed. Usually there are 2 return statements (one per case)

Anything that can be done recursively can be done iteratively.
Recursion can help keep the code DRY and more readable.

Recursion is best suited for searching and sorting in trees
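The three steps above can be sketched with a classic (hypothetical) factorial example:

```python
def factorial(n):
    # 1. Base case: stop recursing
    if n <= 1:
        return 1
    # 2-3. Recursive case: each call moves closer to the base case
    return n * factorial(n - 1)
```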

Tail Recursion

top

Tail recursion is a form of recursion in which each stack frame is independent and does not rely on the result of the call it makes to itself.

If the programming language engine supports tail-call optimization, tail recursion can have a space complexity of O(1).

This is because the engine can discard a stack frame once it sees it is no longer needed.

But tail recursion is harder to implement, and tail-call optimization is rarely supported.
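The shape of a tail-recursive function can be sketched with an accumulator argument (note: Python itself does not perform tail-call optimization, so this is only an illustration of the pattern):

```python
def factorial_tail(n, acc=1):
    # The recursive call is the LAST operation: nothing is pending on the
    # current frame, so a TCO-capable engine could reuse the frame.
    if n <= 1:
        return acc
    return factorial_tail(n - 1, acc * n)
```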

Sorting

top

2 Types:
Comparison sort
Non-Comparison Sort



Comparison sort

Bubble Sort

top

Compares each adjacent pair of items and swaps them if they are out of order; with each pass the largest remaining value "bubbles" toward the end of the list

Time Complexity Big O of n^2
Space Complexity Big O of 1
Usage: Rarely used, more educational aspects
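A minimal bubble sort sketch (in-place, for illustration only):

```python
def bubble_sort(items):
    n = len(items)
    for i in range(n - 1):
        # After each pass, the largest remaining value has bubbled to the end,
        # so the inner loop can shrink by one each time.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items
```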

Selection Sort

top

Goes through the list, marks the lowest value, and moves it to the beginning of the unsorted portion; then repeats

Time Complexity Big O of n^2
Space Complexity Big O of 1
Usage: Mostly used for teaching purposes
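A minimal selection sort sketch following the description above:

```python
def selection_sort(items):
    for i in range(len(items)):
        # Find the index of the smallest value in the unsorted tail
        smallest = i
        for j in range(i + 1, len(items)):
            if items[j] < items[smallest]:
                smallest = j
        # Bring it to the front of the unsorted portion
        items[i], items[smallest] = items[smallest], items[i]
    return items
```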

Insertion Sort

top

Time Complexity Big O of n^2
Space Complexity Big O of 1

best algorithm for nearly sorted lists
Usage: Small list
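A minimal insertion sort sketch; on a nearly sorted list the inner loop rarely runs, which is why it does so well in that case:

```python
def insertion_sort(items):
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger sorted values one slot right,
        # then drop `current` into its place
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items
```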

Merge Sort

top

Time Complexity Big O of n log n
Space Complexity Big O of n

uses divide and conquer method
Usage: For huge files that can't fit entirely in memory (external sorting)
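A minimal merge sort sketch showing the divide and conquer structure (split, recurse, merge):

```python
def merge_sort(items):
    # Base case: a list of 0 or 1 elements is already sorted
    if len(items) <= 1:
        return items
    # Divide: split in half and sort each half recursively
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Conquer: merge the two sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]
```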

Quick Sort

top

Time Complexity Big O of n log n (worst case O(n^2) if pivot is the smallest or the largest)
Space Complexity Big O of log n

uses divide and conquer method
Usage: Most popular algorithm, but the worst case is very expensive
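A minimal quick sort sketch; picking the middle element as pivot (one common choice) avoids the worst case on already-sorted input:

```python
def quick_sort(items):
    if len(items) <= 1:
        return items
    # Pivot choice matters: a middle pivot avoids the O(n^2)
    # worst case on already-sorted input
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)
```

This version trades the usual in-place partitioning for clarity, so its space cost is higher than the O(log n) quoted above.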

Tim Sort

top

Timsort is a hybrid stable sorting algorithm, derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data. It was implemented by Tim Peters in 2002 for use in the Python programming language. The algorithm finds subsequences of the data that are already ordered (runs) and uses them to sort the remainder more efficiently. This is done by merging runs until certain criteria are fulfilled

Time Complexity Big O of n log n
Space Complexity Big O of n

uses a hybrid of merge and insertion sort
Usage: Python's default sort algorithm

Topological Sort

top

This sort algorithm is used for Directed Acyclic Graphs (DAG)

It sorts nodes from the fewest indegrees to the most

Indegree: the number of incoming edges to a node

How this works: first create an indegree count for every node. Then start by removing nodes that have an indegree of 0. When removing a node, also reduce the indegree of every node that depends on it. Keep going until every node has reached an indegree of zero and been removed.

Warning: if the graph is not acyclic (there is a loop in it), we won't be able to remove all nodes. This can also be used to check whether a graph contains a cycle.
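The steps above (known as Kahn's algorithm) can be sketched as follows; the adjacency-dict format is an assumption for the example:

```python
from collections import deque

def topological_sort(graph):
    # graph: {node: [nodes it points to]}
    # Step 1: build an indegree count for every node
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for t in targets:
            indegree[t] += 1
    # Step 2: start from nodes with indegree 0
    queue = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        # Removing a node reduces the indegree of its dependents
        for t in graph[node]:
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)
    # If some nodes were never removed, the graph has a cycle
    if len(order) != len(graph):
        raise ValueError("graph has a cycle")
    return order
```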

Non-comparison Sort

top

uses the way data is stored in memory (binary)
It works only with integers in a small range

Examples are:

  1. Radix Sort: time complexity is O(nk), space O(n+k)
  2. Counting Sort: time complexity is O(n+k), space O(k)
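A minimal counting sort sketch, assuming non-negative integers no larger than k:

```python
def counting_sort(items, k):
    # Count how many times each value 0..k occurs;
    # no comparisons between elements are needed
    counts = [0] * (k + 1)
    for x in items:
        counts[x] += 1
    # Emit each value as many times as it was counted
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result
```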

Choosing the most appropriate sorting algorithm for the following cases:

1 - Sort 10 schools around your house by distance: insertion sort

2 - eBay sorts listings by the current Bid amount: radix or counting sort

3 - Sport scores on ESPN: quick sort

4 - Massive database (can't fit all into memory) needs to sort through past year's user data: merge sort

5 - Almost sorted Udemy review data needs to update and add 2 new reviews: insertion sort

6 - Temperature Records for the past 50 years in Canada: radix or counting sort (small integer range); quick or insertion sort also work

7 - Large user name database needs to be sorted. Data is very random.: merge or quick sort

8 - You want to teach sorting for the first time: Bubble, selection sort

Searching

top

Linear Search

top

Sequentially goes through the items and compares each item with the search value

Time Complexity of big O of n

Binary Search

top

requires a sorted list

Time Complexity of big O of log n
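A minimal binary search sketch; each step halves the remaining range, which is where the O(log n) comes from:

```python
def binary_search(items, target):
    # items must already be sorted
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # discard the lower half
        else:
            high = mid - 1  # discard the upper half
    return -1  # not found
```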

Breadth First Search

top

Time Complexity of big O of n

for graph traversal

goes down through the tree level by level

For a tree with root 9, children 4 and 20, and grandchildren 1, 6 (under 4) and 15, 170 (under 20), the order would be: 9 4 20 1 6 15 170

Pros:
Shortest path
Closer nodes
Cons:
More Memory
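Assuming the tree referenced above has root 9, children 4 and 20, and grandchildren 1, 6, 15, 170 (inferred from the stated order), a queue-based BFS sketch:

```python
from collections import deque

def bfs(tree, root):
    # tree: {node: [children]} - a queue gives level-by-level order
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order

# The tree described above: 9 -> (4, 20), 4 -> (1, 6), 20 -> (15, 170)
tree = {9: [4, 20], 4: [1, 6], 20: [15, 170]}
```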

Depth First Search in 2-D array:

order of search: UP -> RIGHT -> DOWN -> LEFT


Breadth First Search in Graphs:



Depth First Search

top

Time Complexity of big O of n

for tree traversal

goes down through one branch till there are no more children, then goes back up and visits the other children

For the same tree, the order would be: 9 4 1 6 20 15 170

Pros:
Less Memory
Does path exist?
Cons:
Can get slow


  101
  / \
33   105

3 Types:
inorder - 33 101 105 (root visited after left subtree)
preorder - 101 33 105 (root visited at the beginning)
postorder - 33 105 101 (root visited after right subtree)
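The three traversal orders can be sketched over the small tree above; the `{node: (left, right)}` format is just an assumption for the example:

```python
# The tree above: 101 with left child 33 and right child 105
tree = {101: (33, 105), 33: (None, None), 105: (None, None)}

def inorder(tree, node):
    if node is None:
        return []
    left, right = tree[node]
    return inorder(tree, left) + [node] + inorder(tree, right)

def preorder(tree, node):
    if node is None:
        return []
    left, right = tree[node]
    return [node] + preorder(tree, left) + preorder(tree, right)

def postorder(tree, node):
    if node is None:
        return []
    left, right = tree[node]
    return postorder(tree, left) + postorder(tree, right) + [node]
```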

Breadth First Search in 2-D array:

order of search: UP -> RIGHT -> DOWN -> LEFT


Depth First Search in Graphs:



Choose between BFS and DFS:

If you know a solution is not far from the root of the tree : BFS

If the tree is very deep and solutions are rare : BFS

If the tree is very wide : DFS

If solutions are frequent but located deep in the tree : DFS

Determining whether a path exists between two nodes : DFS

Finding the shortest path : BFS

Weighted Methods

top

Bellman-Ford:

for solving shortest path because it can accommodate negative weights,
but it has a high time complexity of O(N · E)

Dijkstra:

It cannot accommodate negative weights, but it is faster

Dynamic Programming

top

It is an optimization technique.
If you have something you can cache, dynamic programming is the best choice

A method of solving a problem by breaking it down into a collection of sub-problems,
solving each of them only once,
and storing their solutions in case the same sub-problem occurs again

Memoization ~ Caching

Bellman Ford

top

This method depends on dynamic programming, and it is used for optimization problems.

Find the shortest path from one node to the others. This method can handle negative weights, but will not work on a cycle with a negative weight


N: Number of the nodes, E: the number of the edges

Time: O(N . E)

Space: O(N)

Algorithm:
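A minimal Bellman-Ford sketch; the node labels and edge-list format are assumptions for the example:

```python
def bellman_ford(n, edges, source):
    # n nodes labeled 0..n-1; edges: list of (u, v, weight)
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    # Relax every edge N-1 times: O(N * E) overall
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more pass: any further improvement means a negative-weight cycle
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle detected")
    return dist
```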



Backtracking

top

Similar to dynamic programming, but backtracking is used for exploring candidate solutions and abandoning them as soon as they fail validation

In backtracking there can be multiple solutions


There are two steps to backtracking:

  1. Add: Updating the situation
  2. Decision: Validating the situation
    • True: Recursion: Going into the next step
    • False: Remove: Removing the current state
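The Add / Recurse / Remove steps above can be sketched with a (hypothetical) subset-generation example:

```python
def subsets(nums):
    result, current = [], []

    def backtrack(start):
        result.append(current[:])        # record this candidate
        for i in range(start, len(nums)):
            current.append(nums[i])      # Add: update the situation
            backtrack(i + 1)             # Recursion: go into the next step
            current.pop()                # Remove: undo the current state
    backtrack(0)
    return result
```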

Greedy

top

For optimization problems

It is used to find the path with the maximum or minimum value

Greedy method always goes for the biggest/smallest value

Dijkstra's Algorithm

top

A greedy method algorithm, used for optimization problems.


Find the shortest path from one node to the others.
This method cannot handle negative weights, because we do not go back to a vertex once we pass it


N: Number of the nodes, E: the number of the edges

Time: O(N + E Log E)

Space: O(N + E)

Algorithm:
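A minimal Dijkstra sketch using a priority queue (greedily always expanding the closest unvisited node); the adjacency-dict format is an assumption:

```python
import heapq

def dijkstra(graph, source):
    # graph: {node: [(neighbor, weight), ...]}; weights must be non-negative
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        # Greedy step: always take the currently closest node
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, already found a shorter path
        for neighbor, weight in graph.get(node, []):
            new_dist = d + weight
            if new_dist < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return dist
```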



Problem Solving Techniques

top

Caching (Hash Map)

top

We use a hash map, with O(1) lookup, to store the items we have previously seen
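A common sketch of this technique is the two-sum problem: remember each value you have seen in a hash map so the complement can be found in O(1):

```python
def two_sum(nums, target):
    # Hash map lookups are O(1) on average, so the whole scan is O(n)
    seen = {}  # value -> index where we saw it
    for i, x in enumerate(nums):
        if target - x in seen:
            return seen[target - x], i
        seen[x] = i
    return None
```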

2 Pointer Technique

top

We traverse through an array from both ends till the two pointers reach each other.
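A minimal two-pointer sketch: finding a pair with a given sum in a sorted array by moving the two ends toward each other:

```python
def pair_with_sum(sorted_nums, target):
    left, right = 0, len(sorted_nums) - 1
    while left < right:
        total = sorted_nums[left] + sorted_nums[right]
        if total == target:
            return left, right
        if total < target:
            left += 1   # need a bigger sum: advance the left pointer
        else:
            right -= 1  # need a smaller sum: retreat the right pointer
    return None
```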

Sliding Window

top

A sliding window is a sub-list that runs over an underlying collection: a window of fixed (or variable) size moves across the array one position at a time, reusing the work from the previous position instead of recomputing from scratch.
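A minimal sliding-window sketch: the maximum sum of any sub-list of size k, updated in O(1) per slide:

```python
def max_window_sum(nums, k):
    # Sum of the first window of size k
    window = sum(nums[:k])
    best = window
    for i in range(k, len(nums)):
        # Slide: add the entering element, drop the leaving one
        window += nums[i] - nums[i - k]
        best = max(best, window)
    return best
```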

Floyd's Tortoise and Hare

top

It is an efficient algorithm for cycle detection in a Linked List.

Needs 2 pointers. One (Tortoise) which moves at 1 step per iteration,
and two (Hare) which moves 2 steps per iteration

The moment these two meet on the same node, we know we have a cycle

Then we create 2 new pointers, one at the beginning and one at the node where the two met

We advance each pointer one node per step until they meet on the same node; that node is where the loop starts
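The two phases can be sketched over a linked list modeled as a `{node: next_node}` dict (an assumption to keep the example self-contained):

```python
def find_cycle_start(next_of, head):
    # next_of: {node: next_node or None}
    # Phase 1: tortoise moves 1 step, hare moves 2 steps
    tortoise = hare = head
    while True:
        if hare is None or next_of[hare] is None:
            return None  # reached the end: no cycle
        tortoise = next_of[tortoise]
        hare = next_of[next_of[hare]]
        if tortoise == hare:
            break  # they met inside the cycle
    # Phase 2: restart one pointer at the head; advancing both one
    # step at a time, they meet exactly at the start of the cycle
    pointer = head
    while pointer != tortoise:
        pointer = next_of[pointer]
        tortoise = next_of[tortoise]
    return pointer
```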

Divide & Conquer

top

Solving a problem by splitting it into smaller problems and then solving each of them.

Divide & Conquer must have 3 characteristics:

  1. Multi-branched recursion
  2. Breaks a problem into multiple smaller but same sub-problems
  3. Combines the solutions of the sub-problems into the solution for the original problem

example:
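A small (hypothetical) example showing all 3 characteristics: finding the maximum of a list by splitting it in half:

```python
def max_divide_conquer(nums):
    # Base case: a single element is its own maximum
    if len(nums) == 1:
        return nums[0]
    mid = len(nums) // 2
    # Break the problem into two smaller sub-problems of the same shape
    # (multi-branched recursion)
    left = max_divide_conquer(nums[:mid])
    right = max_divide_conquer(nums[mid:])
    # Combine the sub-solutions into the answer for the whole list
    return left if left > right else right
```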

References

top

Instructors

Additional Resources