Short Encoding of Words — Trie Suffix Dedup

Sanjeev SharmaSanjeev Sharma
1 min read

Advertisement

Problem

Encode a list of words in a reference string. Each word must appear as a suffix followed by '#'. Minimize the total reference string length.


Approach — Reversed Trie Leaf Count

A word needs separate encoding only if it's NOT a suffix of any other word. Build trie of reversed words; sum lengths of words at leaf nodes.

Time: O(n * L) | Space: O(n * L)


Solutions

Python

class Solution:
    def minimumLengthEncoding(self, words: list[str]) -> int:
        words=list(set(words))
        root={}
        for w in words:
            node=root
            for c in reversed(w):
                if c not in node: node[c]={}
                node=node[c]
        def dfs(node, depth):
            if not node: return depth+1
            return sum(dfs(child, depth+1) for child in node.values())
        leaves=[node for node in [{} for _ in words]]
        return sum(len(w)+1 for w in words if
                   all(w not in word or w==word for word in words))

Python — Set approach (simpler)

class Solution:
    def minimumLengthEncoding(self, words: list[str]) -> int:
        good=set(words)
        for w in words:
            for k in range(1,len(w)):
                good.discard(w[k:])
        return sum(len(w)+1 for w in good)

Complexity

  • Time: O(n * L²) | Space: O(n * L)

Advertisement

Sanjeev Sharma

Written by

Sanjeev Sharma

Full Stack Engineer · E-mopro