How to solve the Two Sum problem? by Danny Reina
Brute Force Solution & Optimized Solution
Intent
It seems as if the two num sum problem is the obligatory algorithm practice starting point for any software engineer who wishes to pass a technical interview. Be that as it may, I wish to offer quick guided breakdowns of how one can solve some of these algo problems that may be presented in your next technical interview. The benefit is twofold, it helps me cement my understanding of the subject, and two, it hopefully gives you a better understanding of the solutions. Stay tuned to more articles of mine on algo practice.
The Problem/Approach
Difficulty: easy
Classically, the two sum problem asks the engineer to write a function that takes in two inputs, an array of integers and a target integer, and said function should return an array of two integers whose sum matches the target integer. I advise that you ask the interviewer clarifying questions, actually, you must ask clarifying questions for every technical challenge so you can avoid edge cases. Some clarifying questions you should ask for this specific problem are:
- does the array input have any repeating integers?
- what should the function return if there are no numbers whose sum matches the target sum?
- is there a possibility of a boundary case that I should account for?
- should I solve this with the best-optimized solution?
- does the array have to be sorted or does the solution array have to be sorted?
- what language should I code this in? (this should’ve already been figured out before you even entered the interview)
After the clarification is given, proceed to verbally I repeat verbally tell the interviewer what you will do to solve the problem. Try to explain each step in as much detail as possible. The way how I like to practice is imagining that the interviewer is my co-worker and I am guiding him or her on how to solve a problem that we are facing at work. Use that approach and you’ll be on the path towards a positive end result.
Conceptual Approach/Brute Force Solution
The hardest part of an algo problem is developing the problem-solving skills for algo problems. To help, I want to offer a transcript of my mental thoughts as I approach the two sum problem.
First, after receiving feedback from my clarifying questions, I recognize that I have an array of unique inputs that do not need to be sorted. If I have an array, I know I will need to traverse the array to find two numbers whose sum matches the target sum. Then how do we, in JavaScript, traverse array data structures. We can do so with a for loop. Great, what's next?
Second, we have to find two numbers whose sum equals the target sum. How do we do this? Well, since the for loop limits our access to only one element at a time and we need to compare it to another number in the array then how do we get that second number? We can do so by traversing the array again with another for loop but we start the loop at the index that starts immediately after the first for loop. That way we have access to two elements at the same time and now we can do the necessary math to satisfy the condition.
Third, we have to check if the sum of our first number & second number matches the target sum. If they do then we return them in an array. Let's also assume that the function should return an empty array if no two numbers summation equals the target sum. How do we do this? I would like to think of the keyword check as a sign that we have to create a conditional statement so we will use an if condition and initialize a new array that contains our solution. If there is no solution then the for loops will traverse the entire array and we terminate the function with a return statement that returns an empty array.
Time/Space Analysis
Before reading on, I highly encourage a thorough understanding on how to identify time-space complexity from an algorithm. There are many resources dedicated to this subject, I will link another great resource that I used here.
We see that we used a nested for loop in this algo to get our solution, therefore, from our asymptotic analysis, we can get this big O notation for our time complexity O(n * n) => O(n²) quadratic time. It is worth mentioning that if we had employed two separate for loops our asymptotic notation would be O(2n) => O(n) linear time. The space complexity analysis would be O(1) because we are always returning either an array of two elements or an empty array, basically fixed-size data structures. Therefore, we have a time complexity of O(n²) and a space complexity of O(1). Are we done? Not quite.
Optimized Solution
In an ideal world, you would have time to present a brute force solution and an optimized solution. A brute solution does not care for efficiency in problem-solving which can lead to issues. For example, our brute force solution can lead to increased load times for users which can result in lost revenue. Our manager would definitely want us to optimize our algorithm. How?
First, we need to have a proper in-depth understanding of data structures. These include stacks, queues, linked lists, hash tables, binary trees, etc. For an in-depth tutorial, I would highly recommend this course. For efficiency sake, I will tell you that we can use a hash table to improve our time complexity at the cost of using up more space since we will be using the hash table to store numbers. I will explain.
It’s okay to feel as if you are hitting an impassable brick wall right now trying to figure out how do you use a hash table to improve our solution. It is completely natural and very common encountering this wall when trying to solve algos, the only remedy is continued consistent practice, believe me, it’s the ONLY WAY.
Second, we use a hash table. A hash table is essentially an object in JavaScript, which means, we could use all the powers of a JS object such as bracket notation which fortunately has a time complexity of O(1) constant time which is very fast. But why/how do we use it for this problem? Well, if we think about it, eventually we will have to traverse the array with a for loop that is unavoidable, but what if we take the current number in the for loop and subtract it from the target sum and then check if that number is stored in our hash table. Sure but how does that look like?
Third, okay but the hash table is empty….yes, but we can use bracket notation to store elements inside our hash table.
Awesome! But what if we wanted to store the potentialMatch
number in the hash table instead of the num variable? Sure, but then we would have to change our condition because we need two integers to complete our formula num + potentialMatch = targetSum
.
Fourth, so the final optimized solution would return a new array containing our two integers that equal to the target sum.
Our time & space complexity for this solution is O(n) linear time & space because we are traversing the array at a growing rate of time depending on the size of the array and when we reach our solution. Space complexity is O(n) because we are creating a new data structure, our hash table, which increases in size depending on how far we traverse our input array. And since linear time is much better than quadratic time, the solution above would be our best-case scenario.
Final Thoughts
Algo practice is hard, do not be discouraged if it took you a few read-throughs to fully understand the solutions presented in this article. I was in the same position you are now and a few days later I attained a level of expertise on this problem to where I felt comfortable teaching its solution to you. I only attained this level of comfort through practice and yes at one point I did almost forget the entire solution but I blame it on the Christmas break I gave myself :)
Stay tuned for more algo breakdowns that I will be publishing for the months to come.