Easier Interview Question

Python Coding Interviews: Tips & Best Practices James Uejio 09:31

In this lesson, you’ll use what you learned in the previous sections to solve an easy real world interview question. Easy does not mean trivial, but it means something you might find in a coding challenge or internship interview or a warm up question.

Here’s the question:

Python
      
    
def majority_element_indexes(lst):
    '''
    Return a list of the indexes of the majority element.
    Majority element is the element that appears more than
    floor(n / 2) times.
    If there is no majority element, return []
    >>> majority_element_indexes([1, 1, 2])
    [0, 1]
    >>> majority_element_indexes([1, 2])
    []
    >>> majority_element_indexes([])
    []
    '''

The entire solution will be available at the end of the course. Try to find a solution to the problem yourself and then watch the video for the solution.

You also heard about Big O analysis, which is a way to analyze the speed and memory usage of a function or block of code. See Wikipedia Time complexity for more information.

00:00 In this video, you’ll use what you learned in the course so far to solve an easy interview question. What I mean by easy doesn’t mean trivial—it means something you might see in a coding challenge, or maybe a internship question, or a warm-up question.

00:16 Let’s get started. Here’s a function, majority_element_indexes(). It takes in a list and returns a list of the indexes of the majority element.

00:26 The majority element is defined as the element that appears more than floor(n / 2) times, where n is the length of the list. So for example, in a list with five elements, the majority element has to appear more than 5 / 2—floor() would be 2 times. It has to appear more than 2 times.

00:44 So, 3, 4, or 5 times. If there is no majority element, return an empty list. Here’s the test case: majority_element_indexes([1, 1, 2]). So, the majority is 1, because 1 appears twice, and the indexes that 1 appears are [0, 1].

01:00 Make sure to go through all test cases and make sure that you understand them and ask any questions. Also, it’s good to write some more test cases to make sure that you understand what the function’s supposed to do.

01:12 Writing another test case majority_element_indexes() of maybe a list [1, 2]—which has no majority element—should return the empty list, and verify with the interviewer that this is the case, and then, maybe, an empty list—returns an empty list.

01:29 So now, I recommend to pause the video and try this problem on your own. It will really make a difference when you try the problems yourself, instead of just looking straight at the solution.

01:39 I will go over the solution in five, four, three, two, one. So, let’s get started. First, I write some comments with my thought process because then I can refer back to them when I’m coding. For example, I know I’ll probably have to find the majority element, so # find majority element.

01:57 Then, # if there is no majority element, return [],

02:04 and then, # find the indexes of the majority element,

02:09 and # put them in a list. So, to find the majority element, you’ll need to find how frequent each number appears. Well, how do you do that?

02:18 Whenever you think of counting the frequencies, you should think of the Counter class.

02:25 Call it count—you could mention in the interview that you’re just going to use an abbreviated name, because it will be quicker to type—and then Counter(lst).

02:34 Now, probably print(count), run the doctest, make sure that the Counter class is doing what you want, which it looks like it is. [1, 1, 2]—1 appears twice, 2 appears once.

02:50 And then to find the majority element, you could sort the keys by their count in descending order so that the majority element appears at the beginning of the list.

03:00 So something like top_elems = sorted(count.keys(), key=lambda x: count[x]). Just print(top_elems), make sure that it’s in the right order.

03:18 I don’t think it will be. And then look here, so, [1, 1, 2]. Expected: [0, 1] Got: [2, 1].

03:27 So, it actually did it in ascending order by count, because 2 appears once and 1 appears twice. 2 is here and 1 is here.

03:34 So if you do -count, then it goes [1, 2], because the most frequent number appears first, followed by the second most frequent number, et cetera.

03:44 So now, ideally, the majority element will be top_elems[0]. So moving to step two, how do you know if there is no majority element? While scrolling at the top, the majority element is defined to be […] "the element that appears more than floor(n / 2) times." So, if count[maj_elem] <= n // 2 (floordiv 2), then you know that this condition is not True, because the count of the majority element is less than or equal to n // 2. Then, return [].

04:19 Another way of doing this that might make a little bit more sense to match this exactly verbatim is doing something like if count[maj_elem] > n // 2, then # code here. Otherwise, return [].

04:35 So, this is more explicit. This is saying, okay, let’s keep doing our code, because we know that there is a majority element. Otherwise, we don’t know that there’s a majority element.

04:43 So, it’s good to write comments when you’re coding—something like # there exists a majority element. Okay, so now that we know there is a majority element, what do we need to do? Find the indexes of the majority element, put them in a list.

04:57 Whenever you have to put something in a list and maybe loop through something and just put them in—list comprehension should come to mind. Do this all in one line. Return the indexes, so i for i, elem in lst because we need access to the element to check if

05:16 it’s equal to the majority element. So i for i, elem in lst if elem == maj_elem. Now let’s save, run it. It fails. Let’s see—list index out of range.

05:28 Actually, they all fail. 'n' is not defined. Okay. len(lst).

05:35 Okay, two failed. cannot unpack non-iterable int object. Oops. This should be enumerate(). So, actually forgot to use something that you learned a couple videos ago, but basically, if you want to get the index and the value, you need to use enumerate(). Now, one failure. list index out of range. top_elems, here. Okay, well it’s because we’re passing in a empty list. It goes down here, it goes in here, tries to access the zeroth element of an empty list.

06:06 So, you could put the empty list check anywhere. I like to put it just at the top because then it’s sort of like a, “We know what the solution is, don’t need to do anything else.” Let’s run it.

06:17 Let’s just delete this code to make it a little cleaner. Okay, so now that it seems to work, let’s do a runtime analysis of what’s going on. The Counter class should take O(n) because it just has to loop through and build a dictionary. top_elems—so, sorted().

06:33 sorted() always takes n log n to sort anything, where n is the length of the iterable you’re passing in, and the key—it doesn’t affect the runtime, so this takes O(n log n).

06:46 Here, this is constant time, this is constant time, and this is n. I guess, technically, this is not n log n because n would be the length of the lst, while here we’re iterating over the count, and so if numbers appear more frequently, this length will be smaller than lst, but let’s just leave it as n log n, as sort of an upper bound of what the length of this iterable could be.

07:13 The interviewer might ask you, “Okay, is there any way to speed this up?” I mean, n log n is not bad and n is not bad, so maybe they’ll just say, “Okay, this is fine,” and move on. But in order to speed this up, there’s a way to do this in O(n) time. Instead of sorting it, we could just loop through all the counts, find the top count, and then find the element that corresponds to that top count.

07:35 So, comment this out. Something like top_count = max(count.values()).

07:44 And then something like the majority element—instead of being top_elems, which we don’t have anymore—is just list comprehension, elem for elem, count in count.items—let’s make it cleaner, like this—if count == top_count,

08:05 and then get the zeroth index, there.

08:11 This is safe to do because this top_count will get the max of count.values(), and so that means there will always be a count here, and then looping through this guarantees that because we’re looping through the same dictionary, that this list will have at least one element.

08:26 It could have multiple elements that have this top_count and then just get one of them—it’s fine because this check right here is going to guarantee that there is only one majority element. Because if there were two majority elements, then the count of the majority element would not be greater than half of the list. Now, looking at runtime, this runs in O(n)—again—n being sort of an upper bound of what the possible length of count.values could be. And this is just a for loop, so this would be O(n).

08:56 So, O(n) plus O(n) is still O(n), and then we have O(n), here, and so this function now runs in O(n) time, instead of before, O(n log n).

09:07 So, this was just a video to show you that using Counter, using sorted(), using max(), and using enumerate() and list comprehensions all are very common to see in many interview questions.

09:18 It will show the interviewer that you know Python syntax and you know built-in modules. In the next video, you’ll go through a more medium-level problem using some of these concepts as well.

AlekS on April 24, 2020

my two solutions, second pretty similar to yours.

ls = [6, 8, 1, 1, 2, 3, 8, 5, 5, 8, 8, 8, 8, 8, 8, 8] ls = [1, 1, 2] ls = [] ls = [2, 2, 2, 1, 1, 3, 2, 2] ls = [1, 3, 2, 2, 4, 5]

def majority_element_v0(lst):

if len(lst) == 0:
    return []

create list of tuples: (list_elem, index)

lst_dic = [(val, key) for key, val in list(enumerate(lst))]

dic = defaultdict(list)

create dict with keys as unique elements of list and values as list of #indexes of unique elements

for key, val in lst_dic:
    dic[key].append(val)

find element of dict satisfying condition if len(val) > #floor(len(lst) / 2)]

res = [[key, val] for key, val in dic.items() if len(val) > floor(len(lst) / 2)]

if len(res) == 0:
    return []
else:
    return res[0]

print(majority_element_v0(ls))

def majority_element_v1(lst):

if len(lst) == 0:
    return []

m_elem = [key for key, val in Counter(lst).items() if val > floor(len(lst) / 2)]

if len(m_elem) == 0:
    return []

return [m_elem[0], [i for i in range(len(lst)) if lst[i] == m_elem[0]]]

print(majority_element_v1(ls))

James Uejio RP Team on April 26, 2020

Hi @AlexSch thank you for sharing! Great way to use list comprehensions. There are definitely many ways to do this and as long as you comment your code and the interviewer can follow your solution (and the runtime is reasonable/optimal), you’re good to go!

belushkin on April 27, 2020

Hello fellows, Here is my solution, I hope it is right:

from collections import Counter

def majority_element_indexes(lst):
    '''
    >>> majority_element_indexes([1,1,2])
    [0, 1]
    >>> majority_element_indexes([1,2])
    []
    >>> majority_element_indexes([])
    []
    >>> majority_element_indexes([1,1,2,2,2,2])
    [2, 3, 4, 5]
    >>> majority_element_indexes([6, 8, 1, 1, 2, 3, 8, 5, 5, 8, 8, 8, 8, 8, 8, 8])
    [1, 6, 9, 10, 11, 12, 13, 14, 15]
    >>> majority_element_indexes([2, 2, 2, 1, 1, 3, 2, 2])
    [0, 1, 2, 6, 7]
    >>> majority_element_indexes([1, 3, 2, 2, 4, 5])
    []
    '''

    n = len(lst) // 2
    cnt = Counter(lst)
    return [index for index, element in enumerate(lst) if cnt[element] > n]

James Uejio RP Team on April 27, 2020

@belushkin it looks right!

James Uejio RP Team on April 27, 2020

For anyone while we upload video descriptions, I talk about Big-O analysis which is a way to analyze the speed and memory usage of a function or block of code. See Wikipedia Time complexity for more information.

Pygator on April 30, 2020

Didn’t understand why you flipped - sign for the key fcn.

danielzlacky on May 2, 2020

Hey, here is my solution, which should run O(n) in worst case scenario

from collections import defaultdict


def majority_element_indexes(lst):
    majority_count = len(lst) // 2
    elm_dict = defaultdict(list)
    for i, v in enumerate(lst):
        elm_dict[v] += [i]
        if len(elm_dict[v]) > majority_count:
            return elm_dict[v]
    return []

James Uejio RP Team on May 8, 2020

Hi @Pygator I flipped the sign because by default the sort method sorts in ascending order (so [1, 2, 3]) but in my head I wanted the largest element at the front. However you are right I could have accessed it just with [-1] instead.

@danielzlacky great concise solution without using Counter!

Marcus Reaiche on May 15, 2020

Thanks for the great course! Here is my solution. I used the most_common method of the Counter instance. Hopefully the method was implemented in a smart way so that it runs in O(n) when passing the argument 1.

from collections import Counter

def majority_element_indexes(lst):
    """
    Return a list of indexes of the majority element.
    Majority element is the element that appears more than floor(n / 2) times.
    If there is no majority element, return [].
    >>> majority_element_indexes([1, 1, 2])
    [0, 1]
    >>> majority_element_indexes([1, 2])
    []
    >>> majority_element_indexes([])
    []
    """
    if lst:
        most_frequent_elem, n = Counter(lst).most_common(1)[0]
        if n > len(lst) // 2:
            return [k for k, x in enumerate(lst) if x == most_frequent_elem]
    return []

rickfencl on June 15, 2020

You said yourself that the the floor calculation constrains the majority element. Since we just want the index of where the majority elements appear in the input, I don’t see the point of finding the max number of occurrences. Or actually determinine what the majority element is. This whole exercise is just a one-liner:

print([i for i, v in enumerate(lst) if Counter(lst)[v] > len(lst)//2])

James Uejio RP Team on June 27, 2020

@Marcus Reaiche great solution and using the built in Counter methods.

@rickfencl That is a cool one line solution! I would probably not come up with that during an actual interview, and also you want to make sure that your interviewer can follow the logic.

MichaelKareev on July 12, 2020

As Pythonic as it gets. The idea is to create a defaultdict with a list as a value and append to the list the frequencies and the indices. Then filter out on frequencies and return only the indices.

def major_element(nums):
    if len(nums) == 0: return []
    if len(nums) == 1: return [0]
    thr = len(nums) // 2
    d = defaultdict(list)
    for n_ix in range(len(nums)):
        if nums[n_ix] not in d:
            d[nums[n_ix]].append(1)
            d[nums[n_ix]].append([n_ix])
        else:
            d[nums[n_ix]][0] += 1
            d[nums[n_ix]][1].extend([n_ix])
    return [v[1] if v[0] > thr else [] for k,v in d.items()][0]

a5zima on July 19, 2020

Hi, is it right solution?

return [x for x in range(len(lst)) if lst.count(lst[x]) > len(lst) / 2]

squeakyboots on May 8, 2021

Great comments! They took me down a rabbit hole =D

The one liners were fun, but when I checked them out more closely they are both at least O(n^2) right?

Marcus’s solution was the fastest when I used a list of 128000 values. It surprised me that it was all that much faster than danielzlacky’s since they’re both O(n). They seemed to be the same speed when running only 2000 values but Marcus’s pulled away pretty well once the length of the list increased significantly. I also like how concise and easy to read his solution is.

Some observations I got from diving on this exercise:

count() appears faster than Counter but if the solution calls count() more than ~5 times Counter seems to be better. If I knew my data was only 2-3 of the same values repeating count() might offer a small performance advantage, but more unique values make Counter come out on top.
count() is pretty significantly faster than trying to implement a Python loop that does the same thing. This made me abandon trying to take advantage of some kind of short circuit approach to stop counting and return [] when it’s clear there won’t be enough of any value. From Googling it the reason seems to be because count() uses C which is quite a bit faster than Python.
enumerate() and other generators are O(1) to call: it took me 0ns to make an enumerate object regardless of the data size. This was unintuitive for me at first, but made sense after I thought more about what a generator is. Iterating over the resulting generator adds the time complexity. This highlighted to me that I should probably use the tools that result in a generator to iterate across something where possible rather than make a new list or dict using a list comprehension. Or use () instead of [] or {} when doing the comprehension, which makes a generator. I’ll be paying more attention when that scenario comes up in the future.

riancvb on July 1, 2021

Hello James! Thanks for your great content.

I wrote a test that does not pass using your solution, e.g.:

Failed example:
majority_element_indexes([1, 1, 2, 3, 1, 4, 4, 4, 4])
Expected:
[5, 6, 7, 8]
Got:
[]

I did an implementation that probably is not the O(n) complexity (actually idk how to measure this), but at least it works for more use cases, here it is:

maj_indexes = []
if(math.floor((len(lst) - 1) / 2) == 0):
    return []
else:
    # var with majority number and how many times it is in lst
    maj_elem = Counter(lst).most_common(1)
    for i, x in enumerate(lst):
        if(maj_elem[0][0] == x):
            maj_indexes.append(i)
return maj_indexes

mdwikira on Sept. 14, 2021

My solution. From what I gather fairly simillar to ones already posted.

def major_element_indexes(lst):
    if not lst: return []
    counted = Counter(lst)
    threshold = len(lst) // 2
    return [i for i, v in enumerate(lst) if counted[v] > threshold]

mdwikira on Sept. 14, 2021

Also there is the question : is the majority element one that satisfies the condition of count > len(lst) // 2 or one that has the most occurences in the list ? These are two different cases, I think.

pdcpython on Feb. 16, 2022

def majority_element_indexes(lst):
    counter = Counter(lst)
    l = [i for i in range(len(lst)) if counter[lst[i]] > len(lst) // 2]
    return l

Hello, I’ve landed to this solution. It works for the three doctest lists, but I don’t really know if it works for every list. Can you have more than one majority element?

Best Regards

zulfiiaditto on June 5, 2022

Hi there. I am posting my solution here. Please let me know if it is wrong.

def majority_element_index(lst):
    newLst = []
    for i, k in enumerate(lst):
        if lst.count(k) > (len(lst)//2):
            newLst.append(i)
    return newLst

mito on Sept. 13, 2022

Hi James,

The description says:

def majority_element_indexes(lst):
    '''
    Return a list of the indexes of the majority element.
    Majority element is the element that appears more than
    floor(n / 2) times.
    If there is no majority element, return []
    >>> majority_element_indexes([1, 1, 2])
    [0, 1]

For test case majority_element_indexes([1, 1, 2]), should it return [0] instead of [0, 1] because element "2" does not appear more than floor(n/2) times? Thanks.

BlueniarB on March 9, 2023

Hi James,

Could we use:

top_elem = max(count.keys(), key=lambda x: count[x])  # O(n)

Instead of:

    top_count = max(count.values()) # O(n)
    maj = [
        elen for elem, count
        in count.items if count == top_count
    ][0] # O(n)

Full code:

from collections import Counter
def majority_elements_indexes(lst):
    """
    Return a list of the indexes of the majority element.
    Majority element is the element that appears more than
    floor(n/2) times.
    If there is no majority element, return []
    >>> majority_elements_indexes([1, 1, 2])
    [0, 1]
    >>> majority_elements_indexes([1, 2])
    []
    >>> majority_elements_indexes([])
    []
    """
    # Find majority element
    # if there is no majority element, return []
    # find the indexes fo the majority element,
    # put them in a list
    if len(lst) == 0:
        return []

    count = Counter(lst)  # O(n)
    # top_elem = sorted(count.keys(), key=lambda x: -count[x])[0]  # O(nlogn)
    top_elem = max(count.keys(), key=lambda x: count[x])  # O(n)
    if count[top_elem] > len(lst) // 2:
        return [idx for idx, el in enumerate(lst) if el == top_elem]  # O(n)
    else:
        return []

andrewodrain on March 16, 2023

Thanks bro! I finally feel like I am off the beginner-intermediate plateau and finally moving forward again. Quick question, how does one know when they can rate themselves as an intermediate programmer or an advanced programmer? Thanks again.

Become a Member to join the conversation.