Given a chemical formula
(given as a string), return the count of each atom.
An atomic element always starts with an uppercase character, then zero or more lowercase letters, representing the name.
1 or more digits representing the count of that element may follow if the count is greater than 1. If the count is 1, no digits will follow. For example, H2O and H2O2 are possible, but H1O2 is impossible.
Two formulas concatenated together produce another formula. For example, H2O2He3Mg4 is also a formula.
A formula placed in parentheses, and a count (optionally added) is also a formula. For example, (H2O2) and (H2O2)3 are formulas.
Given a formula, output the count of all elements as a string in the following form: the first name (in sorted order), followed by its count (if that count is more than 1), followed by the second name (in sorted order), followed by its count (if that count is more than 1), and so on.
Example 1:
Input: formula = "H2O" Output: "H2O" Explanation: The count of elements are {'H': 2, 'O': 1}.
Example 2:
Input: formula = "Mg(OH)2" Output: "H2MgO2" Explanation: The count of elements are {'H': 2, 'Mg': 1, 'O': 2}.
Example 3:
Input: formula = "K4(ON(SO3)2)2" Output: "K4N2O14S4" Explanation: The count of elements are {'K': 4, 'N': 2, 'O': 14, 'S': 4}.
Note:
formula
will be in the range [1, 1000]
.formula
will only consist of letters, digits, and round parentheses, and is a valid formula as defined in the problem.Intuition and Algorithm
\nWrite a function parse
that parses the formula from index i
, returning a map count
from names to multiplicities (the number of times that name is recorded).
We will put i
in global state: our parse
function increments i
throughout any future calls to parse
.
When we see a \'(\'
, we will parse whatever is inside the brackets (up to the closing ending bracket) and add it to our count.
Otherwise, we should see an uppercase character: we will parse the rest of the letters to get the name, and add that (plus the multiplicity if there is one.)
\nAt the end, if there is a final multiplicity (representing the multiplicity of a bracketed sequence), we\'ll multiply our answer by this.
\nComplexity Analysis
\nTime Complexity: , where is the length of the formula. It is to parse through the formula, but each of multiplicities after a bracket may increment the count of each name in the formula (inside those brackets), leading to an complexity.
\nSpace Complexity: . We aren\'t recording more intermediate information than what is contained in the formula.
\nIntuition and Algorithm
\nInstead of recursion, we can simulate the call stack by using a stack of count
s directly.
Complexity Analysis
\nIntuition and Algorithm
\nWhenever parsing is involved, we can use regular expressions, a language for defining patterns in text.
\nOur regular expression will be "([A-Z][a-z]*)(\\d*)|(\\()|(\\))(\\d*)"
. Breaking this down by capture group, this is:
([A-Z][a-z]*)
Match an uppercase character followed by any number of lowercase characters, then ((\\d*)
) match any number of digits.(\\()
match a left bracket or (\\))
right bracket, then (\\d*)
match any number of digits.Now we can proceed as in Approach #2.
\nIf we parsed a name and multiplicity ([A-Z][a-z]*)(\\d*)
, we will add it to our current count.
If we parsed a left bracket, we will append a new count
to our stack, representing the nested depth of parentheses.
If we parsed a right bracket (and possibly another multiplicity), we will multiply our deepest level count
, top = stack.pop()
, and add those entries to our current count.
Complexity Analysis
\nAnalysis written by: @awice. Approaches #1 and #2 inspired by @zestypanda. Java solution for #3 by @jianchao.li.fighter.
\n