929. Unique Email Addresses
Summary
Given an array of strings emails, where we send one email to each emails[i], return the number of different addresses that actually receive mail.
Rules:
If you add periods '.' between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name. Note that this rule does not apply to domain names. For example, "[email protected]" and "[email protected]" forward to the same email address.
If you add a plus '+' in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered. Note that this rule does not apply to domain names. For example, "[email protected]" will be forwarded to "[email protected]".
Approach
Use a state machine to normalize each email address, and store unique normalized addresses in a hash set.
Function
Function normalize(email): states = LOCAL_NAME, AFTER_PLUS, DOMAIN state = LOCAL_NAME normalized = [] for ch in email: if state == LOCAL_NAME: if ch = '.': continue elif ch = '+': state = AFTER_PLUS elif ch = '@': normalized <- append ch state = DOMAIN else: normalized <- append ch elif state == AFTER_PLUS: if ch == '@': normalized <- append ch state = DOMAIN else: # domain part normalized <- append ch return join all characters in normalized
unique = {} for e in emails: normalized = normalize(e) if normalized not in unique: unique.add(normalized)
return the number of elements in unique
Complexity
Time Complexity: O(NW), where N is the number of emails and W is the average email length. For each email, we normalize it by traversing all characters and building the normalized result, which takes O(W). Space Complexity: O(NW).