I'm trying to write a regex that removes all spaces that are not inside quotes, so that something like this:
a = 4, b = 2, c = "space here"
would return this:
a=4,b=2,c="space here"
I spent some time searching this site and I found a similar q/a ( http://stackoverflow.com/questions/79968/split-a-string-by-spaces-in-python#80449 ) that would replace all the spaces inside quotes with a token that could be re-substituted in after wiping all the other spaces...but I was hoping there was a cleaner way of doing it.
-
I consider this very clean:
mystring.scan(/((".*?")|([^ ]))/).map { |x| x[0] }.join
I doubt gsub could do any better (assuming you want a pure regex approach).
From Romulo A. Ceccon -
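As a minimal sketch of the scan-based answer above, the one-liner can be wrapped in a method and run against the input from the question. The method name strip_unquoted_spaces_scan is made up for illustration; the regex is the one from the answer, unchanged.

```ruby
# Hypothetical helper name; the pattern is taken verbatim from the answer above.
# Each match is either a whole double-quoted run or a single non-space character,
# so unquoted spaces are never matched and drop out when the pieces are joined.
def strip_unquoted_spaces_scan(str)
  str.scan(/((".*?")|([^ ]))/).map { |groups| groups[0] }.join
end

puts strip_unquoted_spaces_scan('a = 4, b = 2, c = "space here"')
```

Because the pattern has capture groups, scan returns one array of groups per match, which is why the first group is picked out before joining.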
This seems to work:
result = string.gsub(/( |(".*?"))/, "\\2")
Gene T : if you get into single- and double-quoted strings, you need to match opening and closing quote marks
From Borgar -
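A short sketch of how the gsub answer above behaves, again using a made-up method name (strip_unquoted_spaces_gsub). It also shows the caveat raised in the comment: single-quoted strings are not protected by this pattern.

```ruby
# Hypothetical helper name; the pattern and replacement come from the answer above.
# A match is either one space (group 2 is empty) or a double-quoted run (group 2
# holds it), so replacing with \2 deletes spaces and puts quoted runs back as-is.
def strip_unquoted_spaces_gsub(str)
  str.gsub(/( |(".*?"))/, "\\2")
end

puts strip_unquoted_spaces_gsub('a = 4, b = 2, c = "space here"')
# Single quotes are not recognised, so the space inside them is removed too.
puts strip_unquoted_spaces_gsub("a = 'x y'")
```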
Try this one. Strings in single or double quotes are also matched (so you need to filter them out if you only want the spaces):
/( |("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))/
From Senmiao Liu -
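One possible way to do the filtering mentioned above is to pass a block to gsub and return quoted matches untouched. This is a sketch under that assumption; the constant and method names are invented for illustration, and only the regex comes from the answer.

```ruby
# Pattern from the answer above: a space, a double-quoted string, or a
# single-quoted string, with backslash escapes allowed inside the quotes.
QUOTED_OR_SPACE = /( |("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))/

# Hypothetical helper: drop matches that are a bare space, keep quoted strings.
def strip_spaces_outside_quotes(str)
  str.gsub(QUOTED_OR_SPACE) { |match| match == " " ? "" : match }
end

puts strip_spaces_outside_quotes(%q{a = 'x y', b = "p \" q", c = 3})
```

The block form avoids having to work out which capture group to reference in the replacement string.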
It's worth noting that any regular expression solution will fail in cases like the following:
a = 4, b = 2, c = "space" here"
While it is true that you could construct a regexp to handle the three-quote case specifically, you cannot solve the problem in the general sense. This is a mathematically provable limitation of simple DFAs, of which regexps are a direct representation. To perform any serious brace/quote matching, you will need the more powerful pushdown automaton, usually in the form of a text parser library (ANTLR, Bison, Parsec).
With that said, it sounds like regular expressions should be sufficient for your needs. Just be aware of the limitations.
rjmunro : What is the 'correct' solution for this case?
From Daniel Spiewak -
Daniel,
The space between double-quote and 'here' is NOT in quotes in your example.
From Senmiao Liu
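To make the three-quote case concrete, here is a sketch that runs the earlier gsub pattern on that input. The names are hypothetical, and the odd-count check is only an illustrative guard, not something proposed in the thread.

```ruby
# Same pattern as the gsub answer earlier in the thread.
SPACE_OR_DQUOTED = /( |(".*?"))/

# Hypothetical helper name.
def strip_unquoted_spaces_dq(str)
  str.gsub(SPACE_OR_DQUOTED, "\\2")
end

# Illustrative guard (an assumption, not from the thread): an odd number of
# double quotes means at least one quote has no partner.
def unbalanced_quotes?(str)
  str.count('"').odd?
end

input = 'a = 4, b = 2, c = "space" here"'
puts strip_unquoted_spaces_dq(input)
puts unbalanced_quotes?(input)
```

The first two quotes pair up, so the space before here is treated as unquoted and removed, and the trailing quote is left in place.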