- Many companies quiz you on these
- On the job, knowledge of algorithms beyond the basics is unnecessary
- It does tend to be a common staple of eng interviews
- Phone screen with recruiter
- Maybe two technical phone screens
- Often a coding test
- In-office interviews
- Often there's an obvious 'naive' solution
- I start by describing that, in case I crash and burn trying to think of something better
- I then start thinking about how the constraints of this particular problem might let me do better
- Talk through the problem so the interviewer sees what you're thinking
- These problems are expected to be tough; the interviewer will talk to you as you discuss the problem, so listen for hints
- Find an item in a list
- See solution
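- The solution itself isn't reproduced in these notes; a minimal Ruby sketch (hypothetical) of a straightforward linear scan might look like this:

```ruby
# Hypothetical sketch: scan the list from front to back until we find the target.
def find_item(items, target)
  items.each_with_index do |item, i|
    return i if item == target  # found it: return its index
  end
  nil  # not found
end
```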
- We'll be answering the question: How long does this solution take?
- Big-Oh expresses solution time as a function of problem complexity
- Problem complexity usually means the number of items
- Big-Oh is not about absolute solution time (5sec, 10min, 1hr)
- Big-Oh is about scalability: how the solution time scales as the problem instance gets harder
- What is the problem complexity a function of?
- The number of items in the list, which we call n
- As a function of n, how long does this algorithm take?
- In the worst case, we must consider every item in the list
- There are n items
- We say the time complexity is O(n), or linear
- We usually mostly care about the worst case time complexity
- Time complexity is not measured in wall-clock time, per se, but in the number of primitive operations
- Primitive operations are basic math (+, -, *, /), comparison, swapping
- We assume all primitive operations take constant time, which we call O(1)
- If our algorithm takes on the order of n constant-time operations, we call it O(n)
- We don't care about the exact number of operations, but about the scalability of a solution
- Not about fast or slow, so much as: will your algorithm explode when I throw big problems at it?
- For instance, O(n) time complexity means that if you double your problem size, your solution time will approximately double
- Asymptotic behavior is about behavior in the limit
- For "large enough" problem instances, an asymptotically superior solution will eventually be faster
- We throw away all constants
- If algo A always takes twice as long as algo B, they have the same asymptotic time complexity
- We throw away low order terms
- O(n^2 + n) = O(n^2)
- We care about the dominant behavior; for big enough problems, the lower order terms are negligible relative to higher order terms
- Ex: when n=100, n^2=10,000, so the n term adds only about 1% on top of the n^2 term
- What parameter expresses the problem size?
- What is the time complexity?
- Hint: time complexity normally comes from loops, since loops take longer as you iterate over more items
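- The find_min these questions refer to isn't reproduced in these notes; a hypothetical sketch of what such a function might look like:

```ruby
# Hypothetical sketch: one pass over the items, tracking the smallest seen so far.
def find_min(items)
  min = items.first
  items.each do |item|
    min = item if item < min  # constant work per item => O(n) overall
  end
  min
end
```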
- What parameter expresses the problem size?
- Still n; 2 is a fixed number
- How does this compare, by wall clock, to find_min?
- How does this compare, in Big-Oh, to find_min?
- Both loop over n items
- find_two_min does about twice as much work per item
- But more importantly, they both do a constant amount of work per item
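- Likewise, a hypothetical sketch of a find_two_min that does roughly twice the per-item work but is still O(n):

```ruby
# Hypothetical sketch: track the two smallest items in a single pass.
def find_two_min(items)
  min1, min2 = nil, nil  # invariant: min1 <= min2 (nil means "not set yet")
  items.each do |item|
    if min1.nil? || item < min1
      min2 = min1
      min1 = item
    elsif min2.nil? || item < min2
      min2 = item
    end
  end
  [min1, min2]
end
```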
- What parameters express the problem size?
- Both n, the number of elements, and k, the number of mins to find
- Outer loop: still n times
- How much work inside the loop?
- min_items.each_with_index.max hides a loop of size k
- Complexity: O(nk) time
- We would say this algorithm is linear in both n and k
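- A hypothetical sketch of the k-min version, showing where the hidden inner loop lives:

```ruby
# Hypothetical sketch: keep an array of the k smallest items seen so far.
def find_k_min(items, k)
  min_items = []
  items.each do |item|                        # outer loop: n iterations
    if min_items.length < k
      min_items << item
    else
      # min_items.each_with_index.max hides a loop of size k: it scans
      # min_items to find the largest of the current k minimums (and its index)
      max_val, max_idx = min_items.each_with_index.max
      min_items[max_idx] = item if item < max_val
    end
  end
  min_items
end
```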
- A doubly nested for loop, where each level considers all n elements
- Means O(n^2) time complexity
- As a practical matter, O(n^2) is not particularly scalable; it quickly becomes too slow as n grows
- Algorithms that, in the worst case, consider every pair of items are O(n^2)
- But there's a trick!
- Two unnested for loops
- How much work is done inside the for loops?
- Sneak peek: O(1) (amortized) work
- Total work: O(n)
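- The specific problem isn't reproduced in these notes; as an assumed illustration, a "does any pair sum to a target?" problem shows the pattern of trading a nested loop for hash lookups:

```ruby
# Hypothetical illustration only; the original problem from these notes isn't shown.

# Naive: doubly nested loop over every pair => O(n^2)
def pair_sums_to?(items, target)
  items.each_with_index do |a, i|
    items.each_with_index do |b, j|
      return true if i != j && a + b == target
    end
  end
  false
end

# The trick: two unnested loops; hash insertion/lookup is O(1) amortized => O(n)
def pair_sums_to_fast?(items, target)
  index_of = {}
  items.each_with_index { |item, i| index_of[item] = i }   # loop 1: build the hash
  items.each_with_index do |item, i|                       # loop 2: look up complements
    j = index_of[target - item]
    return true if j && j != i
  end
  false
end
```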
- Primitive operations are the most basic building blocks
- Data structures are the bigger building blocks
- Ex: vector, list, hash and tree maps, heap
- Those are all the data structures you need to know for interviews
- On the job it's even easier: heaps rarely come up
- Operations:
- index: find the _i_th item
- push_front: put an element on the front
- push_back: push an element to the back
- Vectors/lists have different time complexities for these operations
- You'll choose one over the other based on what you're going to do with it
- A list is a chain of nodes
- Each node holds an item, and possibly a reference to a next node
- You iterate through the list by starting from the head, or first node, and following the references
- index: worst-case O(n), since reaching the last element means visiting every node before it
- push_front: O(1), since you just create a new head node, and link it to the old head
- push_back: O(n), since it takes O(n) time just to traverse to the end before appending
- There's a list variant that holds a reference to both front and back of the list, in which case both push_front and push_back take O(1); index still is O(n) (why?)
- Question: how would you insert an item into the middle of a list?
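- A hypothetical sketch of a singly linked list, to make the node-chasing concrete (it also sketches the insert-in-the-middle idea):

```ruby
# Hypothetical sketch of a singly linked list.
class Node
  attr_accessor :item, :next_node

  def initialize(item, next_node = nil)
    @item = item
    @next_node = next_node
  end
end

class LinkedList
  def initialize
    @head = nil
  end

  # O(1): the new node simply points at the old head.
  def push_front(item)
    @head = Node.new(item, @head)
  end

  # O(n) worst case: follow next references from the head, one node at a time.
  def index(i)
    node = @head
    i.times { node = node && node.next_node }
    node && node.item
  end

  # Insert into the middle (assumes i is in bounds):
  # O(n) to walk to position i, then O(1) to relink.
  def insert_at(i, item)
    return push_front(item) if i == 0
    prev = @head
    (i - 1).times { prev = prev.next_node }
    prev.next_node = Node.new(item, prev.next_node)
  end
end
```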
- index was slow for lists because we had to follow every link
- A vector holds all its items in a single block of memory (store), one next to the other
- Each entry has a fixed width, or size, which is the number of bytes to store the object
- If the store starts at memory address p, then the _i_th element is at address p + (i * size(item))
- Any address can be loaded in O(1) time, so index is O(1)
- It's not vital to know the details of a vector implementation, but a little understanding will help you remember its performance
- Because the memory just past the store may already be in use, it is difficult to resize the store in place
- We have to allocate an entirely new store, then copy everything over
- Resize has what time complexity?
- If the store is at full capacity, and we want to push_front or push_back, we have to do a resize
- To keep resizes rare, it's typical to double the store size on every resize
- So if we have to copy n elements on this resize, we increase the store size to 2n, which buys us another n insertions before the next resize
- Which brings us to amortized time complexity: the O(n) copy buys us n cheap insertions, so averaged over many pushes each one costs O(1)
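- A hypothetical sketch of a vector built on a fixed-capacity store, with the doubling resize:

```ruby
# Hypothetical sketch of a vector backed by a fixed-capacity store.
class Vector
  def initialize
    @capacity = 4
    @length = 0
    @store = Array.new(@capacity)   # stand-in for a raw block of memory
  end

  # O(1): simple address arithmetic into the store.
  def index(i)
    @store[i]
  end

  # O(1) amortized: usually just a write; occasionally an O(n) resize.
  def push_back(item)
    resize! if @length == @capacity
    @store[@length] = item
    @length += 1
  end

  private

  # O(n): allocate a store twice as big and copy everything over.
  def resize!
    @capacity *= 2
    new_store = Array.new(@capacity)
    @length.times { |i| new_store[i] = @store[i] }
    @store = new_store
  end
end
```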
- A set has three basic operations:
- Insertion
- Lookup
- Iteration
- Two major kinds: tree set and hash set
- A tree is like a list, but a parent can have multiple children
- Nodes with no children are called leaves; the parent of everything is the root
- The relationships between parents, children, and siblings define the tree's properties
- In a tree set, each parent has at most two children: a left and a right child
- The left child must be less than the parent; the right child must be greater than the parent
- How would you search/insert into the tree?
- The cost of these operations is determined by the depth of the tree
- Trivial worst case: every node has one child, O(n) depth
- Best case: every node has two children; how deep?
- O(log(n))
- Algorithms that consider half the items at each step are typically logarithmic
- Self-balancing trees do some magic to ensure the tree depth is O(log(n))
- You don't need to know how these work; just that they exist
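- To make the search/insert question above concrete, a hypothetical sketch of a plain (non-self-balancing) tree set:

```ruby
# Hypothetical sketch of a tree set (no self-balancing).
class TreeNode
  attr_accessor :item, :left, :right

  def initialize(item)
    @item = item
  end
end

class TreeSet
  def initialize
    @root = nil
  end

  # Walk down the tree, going left or right at each node; O(depth).
  def include?(item)
    node = @root
    while node
      return true if item == node.item
      node = item < node.item ? node.left : node.right
    end
    false
  end

  # Same walk, but attach a new leaf where the search falls off; O(depth).
  def insert(item)
    return @root = TreeNode.new(item) if @root.nil?
    node = @root
    loop do
      return if item == node.item              # already present
      if item < node.item
        return node.left = TreeNode.new(item) if node.left.nil?
        node = node.left
      else
        return node.right = TreeNode.new(item) if node.right.nil?
        node = node.right
      end
    end
  end
end
```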
- A hash function maps an item to an effectively random number, called its hash id
- The hash id is not guaranteed unique, but hash collisions should be rare
- A hash set has a store, each element in the store is called a bucket
- Values are mapped to buckets by their hash id, modulo the size of the store
- Buckets may contain more than one item, if hashes collide
- As we add more items, buckets fill up
- It is typical to keep the hash set at a fixed load factor, where load = num_items / num_buckets
- Periodic resizes take O(n) time, but we repeat the doubling trick
- Hash calculation takes O(1), bucket lookup takes O(1)
- On average, a bucket contains load items (a constant), so scanning a bucket is O(1) on average
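- A hypothetical sketch of a hash set along these lines (resizing omitted for brevity):

```ruby
# Hypothetical sketch of a hash set with a fixed number of buckets.
class HashSet
  def initialize(num_buckets = 8)
    @buckets = Array.new(num_buckets) { [] }
  end

  def insert(item)
    bucket = bucket_for(item)
    bucket << item unless bucket.include?(item)
  end

  # On average a bucket holds ~load items, so this scan is O(1) on average.
  def include?(item)
    bucket_for(item).include?(item)
  end

  private

  # Computing the hash is O(1); modulo picks the bucket in O(1).
  def bucket_for(item)
    @buckets[item.hash % @buckets.length]
  end
end
```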
- Both a tree set and hash set can be transformed into a map
- Maps keys to values
- Store the keys as usual, and keep an extra reference to the value
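- As a sketch of the set-to-map idea, the toy hash set above could store [key, value] pairs and hash only the key (hypothetical):

```ruby
# Hypothetical sketch: a hash map stores [key, value] pairs, bucketed by the key's hash.
class HashMap
  def initialize(num_buckets = 8)
    @buckets = Array.new(num_buckets) { [] }
  end

  def set(key, value)
    bucket = @buckets[key.hash % @buckets.length]
    pair = bucket.find { |k, _| k == key }
    if pair
      pair[1] = value          # key already present: overwrite the value
    else
      bucket << [key, value]   # new key: append the pair to its bucket
    end
  end

  def get(key)
    bucket = @buckets[key.hash % @buckets.length]
    pair = bucket.find { |k, _| k == key }
    pair && pair[1]
  end
end
```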
- O(1); "constant" time; ideal; ex: lookup in a hash map
- O(log(n)); "logarithmic" time; fast; ex: search in a sorted list
- consider half the items each time
- O(n); "linear" time; good; ex: search in an unsorted list, consider every item
- O(n log(n)); "linearithmic" time; slow; ex: sorting a list
- O(n^2); "quadratic" time; bad; ex: considering every pair
- O(2^n); "exponential" time; intractable; ex: consider every subset
- O(n!); "factorial" time; the worst; ex: consider every permutation