Monday, April 11, 2011

Why doesn't my variable interpolate correctly when I build up a Mysql query?

I am trying to write a regex expresstion in mysql from a Perl program. I want to have query such as this:

WHERE a.keywords REGEXP '[[:<:]]something[[:>:]]'

However, in Perl when I make this query I am getting error when concatenating:

for($i=0;$i<$count;$i++){
    $where = $where . "'[[:<:]]$andkeywords[$i][[:>:]]' ";  #errors

Where as this does not give me an error:

for($i=0;$i<$count;$i++){
    $where = $where . "'[[:<:]] $andkeywords[$i] [[:>:]]' ";  #no error

In the 'no error' code notice that there are extra spaces. But if I have extra spaces then I do not get the resuls I want because in the DB there are no 'extra spaces'.

From stackoverflow
  • I've never really trusted the autoreplacment of variables in strings like that. You may want to consider explicitly doing the concatenation you want like this:

    for($i=0;$i<$count;$i++){
        $where=$where . "'[[:<:]]" . $andkeywords[$i] . "[[:>:]]' ";
    

    EDIT: As ephemient points out the generally accepted way to do this inline is

    for($i=0;$i<$count;$i++){
        $where=$where . "'[[:<:]]${andkeywords[$i]}[[:>:]]' ";
    

    Personally I find the first way more readable but as with all things Perl, TIMTOWTDI

    ephemient : I'm getting cognitive dissonance from seeing "officially correct way" and "TMTOWTDI" in the same post ;)
    Mykroft : Is generally accepted better? :-p
  • The reason in this case is that "$andkeywords[$i][[:>:]]" is being interpreted as a multi-dimensional array, and :>: is not a valid array index.

    I personally prefer Mykroft's approach, but you could also achieve the same result by escaping the final opening bracket as so:

    $where=$where."'[[:<:]]$andkeywords[$i]\[[:>:]]' ";
    
  • Just for completeness sake, this works too:

    for ($i = 0; $i < $count; $i++) {
        $where .= "'[[:<:]]${andkeywords[$i]}[[:>:]]' ";
    }
    

    ${blah} isn't valid outside of a string, but inside of a interpolatable string, it's equivalent to $blah.

    I would have thought that this pattern is more common than the other answers, though... after all, how else do you want to type "foo${var}bar"? Obviously "foo$var\bar" doesn't work, since \b is a recognized escape sequence.

  • <Obligatory security moan>

    Please use a DBI parameter for each regex value instead of interpolating it. Why?

    1. There are no longer any constraints on what characters are allowed. Currently, if any element of @andkeywords contains a quote, backslash or special regex character, things will break. E.g. the keyword "O'Reilly" will cause a database error.
    2. People won't be able to construct malicious keywords to reveal information they shouldn't see or wreak havoc. (Imagine if a user entered "'; drop database;" as a keyword.) This is called an SQL injection attack, and the web is rife with poorly coded websites that are susceptible to them. Don't let yours be one of them.

    Even if @andkeywords is not populated from user-entered data, it takes almost no extra effort to use DBI parameters, and your code will be safe for use in future unknown environments.

    </Obligatory security moan>

    Dave Sherohman : Thank you, jrh, for going beyond answering "how do I interpolate this string" to add "but you really shouldn't be interpolating a string there in the first place".
  • It would be helpful if you would include the text of any error messages.

    Something tells me that

    for($i=0;$i<$count;$i++){
        $where=$where . "'[[:<:]]" . $andkeywords[$i] . "[[:>:]]' ";
        ...
    }
    

    Could be simplified to

    for (@andkeywords) {
        $where .= qq('[[:<:]]${_}[[:>]]' );
        ...
    }
    

    Or perhaps

    $where .= join ' ', map { qq('[[:<:]]${_}[[:>:]]') } @andkeywords;
    

0 comments:

Post a Comment