Python Random Fact: Dictionaries Preserve Insertion Order Starting with Python 3.7

Quiz 1

Consider “387. First Unique Character in a String” from LeetCode.

Given a string s, find the first non-repeating character in it and
return its index. If it does not exist, return -1.

Constraints:

- 1 <= s.length <= 105
- s consists of only lowercase English letters.

Imagine you wrote the following solution (it may differ a bit from yours, but let’s use this as the starting point):

class Solution:
    def firstUniqChar(self, s: str) -> int:
        seen = set()
        unique = {}
        for i, char in enumerate(s):
            if char not in seen:
                seen.add(char)
                unique[char] = i
            else:
                if char in unique:
                    del unique[char]
        if not unique:
            return -1
        return next(iter(unique.values()))

Since Python 3.7, dictionaries preserve insertion order as a language guarantee. That means next(iter(unique.values())) returns the index of the first inserted key. But what if you cannot rely on that guarantee?

Answer

You need to preserve the order explicitly. A simple approach is to count the occurrences of each character, then scan the string from the beginning and return the first character with count 1.

class Solution:
    def firstUniqChar(self, s: str) -> int:
        char_counts = {}
        for char in s:
            if char not in char_counts:
                char_counts[char] = 1
            else:
                char_counts[char] += 1

        for idx, char in enumerate(s):
            if char_counts[char] == 1:
                return idx
        return -1

Special Quiz

Can you optimize the above solution? Its time complexity is already O(N) because we must examine each character at least once. Where else can we improve?

Answer

English letters only

According to the problem constraints, the string contains only lowercase English letters. Therefore, we can store the counts in a fixed-size array of length 26, which gives us O(1) additional space.

class Solution:
    def firstUniqChar(self, s: str) -> int:
        char_counts = [0] * 26
        base = ord("a")
        for char in s:
            char_counts[ord(char) - base] += 1

        for idx, char in enumerate(s):
            if char_counts[ord(char) - base] == 1:
                return idx
        return -1

Remove the second traversal

In the above solution, we traverse the string twice. If the string is very large, the second traversal may become a noticeable cost.

To avoid that, we must keep the insertion order ourselves. Here is an example solution that uses a doubly linked list to track the order of unique characters.

class UniqueCharNode:
    def __init__(self, char: str, idx: int):
        self.char = char
        self.idx = idx
        self.next: "UniqueCharNode | None" = None
        self.prev: "UniqueCharNode | None" = None

class UniqueCharList:
    def __init__(self):
        self._head = None
        self._tail = None
        self._nodes_by_char: dict[str, UniqueCharNode] = {}
    
    def _remove_node(self, node: UniqueCharNode):
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self._head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self._tail = node.prev
        del self._nodes_by_char[node.char]
    
    def _insert_between(
        self,
        prev_node: "UniqueCharNode | None",
        node: UniqueCharNode,
        next_node: "UniqueCharNode | None",
    ):
        if prev_node is not None:
            prev_node.next = node
            node.prev = prev_node
        else:
            self._head = node
        if next_node is not None:
            next_node.prev = node
            node.next = next_node
        else:
            self._tail = node
        self._nodes_by_char[node.char] = node
    
    def append(self, char: str, idx: int):
        new_node = UniqueCharNode(char, idx)
        self._insert_between(self._tail, new_node, None)
    
    def discard(self, char: str):
        node = self._nodes_by_char.get(char)
        if node is None:
            return
        self._remove_node(node)

    def first_index(self) -> int:
        return self._head.idx if self._head else -1


class Solution:
    def firstUniqChar(self, s: str) -> int:
        unique_chars = UniqueCharList()
        seen = set()

        for idx, char in enumerate(s):
            if char in seen:
                unique_chars.discard(char)
            else:
                unique_chars.append(char, idx)
                seen.add(char)

        return unique_chars.first_index()

Closing Thoughts

Even with these optimizations, the original dictionary-based approach stays fastest in Python because it leverages an existing guarantee with almost no extra bookkeeping. Still, the counting and linked-list techniques make it easy to port the idea to languages without ordered maps and can outperform other workarounds there. Knowing when you can trust the runtime and when you must recreate guarantees yourself is the real skill.