I'm trying to make a "sed-replace"-function with arbitrary input as argument, but it doesn't work well. Let me illustrate the problem, first by showing the input file (a simplified file):

$ cat /tmp/makefileTest
#(TEST CASE 1) bla bla line 1, relatively simple:
CFLAGS += -Wunused # this is a comment

#(TEST CASE 2) bla bla line 4, uses some expansion
cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere

#(TEST CASE 3) bla bla line 7, here is a complicated line ending in weird characters:
cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^

So I want to apply some custom modifications to this input file every time I "git pull": I have a pull-script that checks out a clean copy, and then a script that should apply the necessary modifications on top of the latest version. The method below is used on test cases 1 and 2 shown above. The problem is that it involves a LOT of manual work, hence I call it the "tedious method". I take the input line, modify it, and sed should do the necessary replacement(s):

$ cat /tmp/testFunctionTedious.sh 
#!/usr/bin/env bash

# The old, tedious method, first defining input-file with test-cases:
inFile='/tmp/makefileTest'    

#    ----==== TEST-CASE 1 FROM THE INPUT FILE ====----
charsFoundFromGrep=$(grep -in 'CFLAGS += -Wunused # this is a comment' "$inFile" | wc -c)
if [ "$charsFoundFromGrep" = "0" ]; then
    echo "Custom makefile modification (CFLAGS += -Wunused # this is a comment) NOT found, doing nothing!"
elif [ "$charsFoundFromGrep" = "41" ]; then
    echo "Custom makefile modification (CFLAGS += -Wunused # this is a comment) found and will be applied..."
    sed -i 's/CFLAGS += -Wunused # this is a comment/CFLAGS += -Wall # here I changed something/g' "$inFile"
else
    echo "ERROR: Unhandled custom makefile modification (CFLAGS += -Wunused # this is a comment), please fix..."
    exit 1
fi

#    ----==== TEST-CASE 2 FROM THE INPUT FILE ====----
# Notice below that I need to escape $(OBJ_DIR) and $(EXE_NAME), not to
#  mention the two forward slashes in the "sed"-line, it's definitely not just "plug-and-play":
charsFoundFromGrep=$(grep -in 'cp $(OBJ_DIR)/$(EXE_NAME)' "$inFile" | wc -c)
if [ "$charsFoundFromGrep" = "0" ]; then
    echo "Custom makefile modification (cp \$(OBJ_DIR)/\$(EXE_NAME)) NOT found, doing nothing!"
elif [ "$charsFoundFromGrep" = "43" ]; then
    echo "Custom makefile modification (cp \$(OBJ_DIR)/\$(EXE_NAME)) found and will be applied..."
    sed -i 's/cp \$(OBJ_DIR)\/\$(EXE_NAME)/cp \$(OBJ_DIR)\/\$(EXE_NAME_NEW)/g' "$inFile"
else
    echo "ERROR: Unhandled custom makefile modification (cp \$(OBJ_DIR)/\$(EXE_NAME)), please fix..."
    exit 1
fi

I'm trying to come up with a better/smarter method while learning about bash variable expansion/substitution and the handling of special characters. To make things more efficient, I've tried to create the following script, and here's where things get too complicated:

$ cat /tmp/testFunction.sh 
#!/usr/bin/env bash

# The method I struggle with and ask for help with, first defining input-file with test-cases
inFile='/tmp/makefileTest'

# *** Defining a sedReplace-function below ***
#   First arg: Search (input) string
#   Second arg: Replacement (output) string
#   Third arg: Expected number of characters from 'grep -in "$1" "$inFile" | wc -c'.
#      This is just to ensure the line I'm going to run sed on didn't change: if the
#      count from grep (for argument 1, the input string) differs from argument 3,
#      an error message naming the input string is printed.
sedReplace(){
    # sed -i 's/$1/$2/g' "$inFile"
    charsFoundFromGrep=$(grep -in "$1" "$inFile" | wc -c)
    if [ "$3" == "$charsFoundFromGrep" ]; then
        # Getting the line below right is REALLY difficult for me!
        execLine="sed -i 's/$1/$2/g' \"$inFile\""
        # Printing the line, so I can see it before executing the line:
        echo "$execLine"
        # Executing the line if ok (disabled as it doesn't work at the moment):
        #$($execLine)
    else
        echo "ERROR: Unhandled custom makefile modification (expected: $1)), please fix..."
        exit 1
    fi
}

# And below the function is used (1st arg is input, 2nd arg is sed-
#   output and 3rd arg is grep comparison word count):

#    ----==== TEST-CASE 1 FROM THE INPUT FILE ====----
sedReplace 'CFLAGS += -Wunused # this is a comment' 'CFLAGS += -Wall # here I changed something' 41

#    ----==== TEST-CASE 2 FROM THE INPUT FILE ====----
#sedReplace 'cp $(OBJ_DIR)/$(EXE_NAME)' 'cp $(OBJ_DIR)/$(EXE_NAME_NEW)' 43

#    ----==== TEST-CASE 3 FROM THE INPUT FILE ====----
# Once the above 2 cases work, here's the last test-case to try the sedReplace function on (the hardest, I imagine):
# And here grep doesn't work, due to the special characters
#sedReplace 'cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^' 'cd $(SOME_UTIL_BIN); ./someOTHERcommand $(BUILD_DIRECTORY_SOMETHING_ELSE)/$(OBJ_DIR)/\$\^'

You'll easily see that the last script doesn't work. I've googled a lot for similar problems, but couldn't find a solution. I don't know how to finish my sed-function; that's what I'm asking for help with. Qualified and interested people should be able to run the scripts and input file exactly as shown here, and I look forward to seeing if anyone can solve the problem.

Okay Dokey
  • Shouldn't "$s3" be "$3"? – schrodingerscatcuriosity Nov 19 '19 at 17:43
  • Sorry, you're right. I've changed it, thanks. – Okay Dokey Nov 19 '19 at 20:52
  • "I have many different problems with this" please try to ask about a single, specific issue at a time - what does "make the script work with all kinds of characters" mean, exactly? – steeldriver Nov 19 '19 at 22:39
  • Run your script through shellcheck and it will tell you things such as to double quote "$1" and "$2" – WinEunuuchs2Unix Nov 19 '19 at 23:06
  • @steeldriver : I've rephrased the whole question, inserted additional comments. Is it better now? I hope, thanks... – Okay Dokey Nov 20 '19 at 00:04
  • @WinEunuuchs2Unix : Thanks a lot for the reference to spellcheck.net, this is new to me. But I've reformulated the whole question and made some changes, inserted extra comments etc. It should be clearer now where I'm stuck. I don't think spellcheck.net can help me with understanding how to fix my script (the one I call /tmp/testFunction.sh)... I hope I didn't misunderstand you, otherwise please let me know. – Okay Dokey Nov 20 '19 at 00:07
  • Excellent pa4080, thanks! I've upvoted and accepted the answer. I'll just test if there's anything I've missed and possibly ask a follow-up question later. This seems very useful and I can learn a lot from studying it, exactly the reason why I put a bounty on this. Thanks! – Okay Dokey Nov 26 '19 at 11:01

2 Answers

1

Here is a modified version of your script that works, but only with the first test case:

#!/usr/bin/env bash

inFile='/tmp/makefileTest'

sedReplace(){
    charsFoundFromGrep="$(grep -in "$1" "$inFile" | wc -c)"

    if [ "$3" == "$charsFoundFromGrep" ]; then
        # 1. The single quotes inside double quotes are treated as regular characters
        # 2. During the assignment, the variables $1, $2 and $inFile will be expanded
        # 3. The variable $execLine will have the following value:
        #    sed -i 's/CFLAGS += -Wunused # this is a comment/CFLAGS += -Wall # here I changed something/g' '/tmp/makefileTest'
        execLine="sed -i 's/$1/$2/g' '$inFile'"

        # We need 'eval' to convert the variable to a command in this case,
        # because the value of the variable contains spaces, quotes, slashes, etc.
        eval "$execLine"
    else
        echo "ERROR: Unhandled custom makefile modification (expected: $1)), please fix..."
        exit 1
    fi
}

sedReplace 'CFLAGS += -Wunused # this is a comment' 'CFLAGS += -Wall # here I changed something' '41'

The above example uses the eval command; its usage, pros and cons were recently discussed in the last part of this answer and the related comments. It is a good idea to avoid eval where possible, so here is my next suggestion:

#!/usr/bin/env bash

sedReplace(){
    # 1. Note we do not need to store the output of the command substitution $()
    #    into a variable in order to use it within a test condition.
    # 2. Bash's double square brackets test [[ is used here, so
    #    we do not need to quote the variable before the condition.
    #    If the expression after == were not quoted, its (expanded) value
    #    would be treated as a pattern. As quoted here, it is treated as a string.
    if [[ $3 == "$(grep -in "$1" "$inFile" | wc -c)" ]]
    then
        # 1. Note the double quotes here.
        # 2. The sed's /g flag is removed, because, IMO, we don't need it in this case at all.
        sed -i "s/$1/$2/" "$inFile"

    else
        echo "ERROR: Unhandled custom makefile modification (expected: $1)), please fix..."
        exit 1
    fi
}

# Double quotes are used here in case the value comes from a variable in later versions
inFile="/tmp/makefileTest"

sedReplace 'CFLAGS += -Wunused # this is a comment' 'CFLAGS += -Wall # here I changed something' '41'

The above example still works only with the first test case. For the remaining test cases we need to use grep -F in order to treat the pattern as a fixed string (references). We also need to replace some characters within the searched string/pattern before using it with sed (there is probably a more elegant solution, but I couldn't find it). The third thing we need to do is to change sed's delimiter from / to a character that is not used within our strings; the final solution below uses :.
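
One possible alternative to replacing the special characters with '.' is to escape them, so that sed matches the searched string literally. The following is only a sketch under stated assumptions: the helper names escape_sed_pattern and escape_sed_replacement are hypothetical (they are not part of the scripts in this answer), the strings are single-line, and sed is used with basic regular expressions (no -r/-E). Because the / delimiter is escaped too, the delimiter does not have to change in this variant.

```bash
#!/usr/bin/env bash

# Hypothetical helper: escape the BRE metacharacters ] [ \ . * ^ $ and the
# "/" delimiter, so that sed treats the searched string literally.
escape_sed_pattern() {
    printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Hypothetical helper: escape the characters that are special in the
# replacement part of s/// (backslash, ampersand and the "/" delimiter).
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's|[\&/]|\\&|g'
}

search='cp $(OBJ_DIR)/$(EXE_NAME)'
replace='cp $(OBJ_DIR)/$(EXE_NAME_NEW)'

# Apply the substitution to one sample line taken from the question's input file.
# prints: cp $(OBJ_DIR)/$(EXE_NAME_NEW) /tmp/somewhere
printf '%s\n' 'cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere' |
    sed "s/$(escape_sed_pattern "$search")/$(escape_sed_replacement "$replace")/"
```

For the second test case the escaped pattern is the same string the question escapes by hand in the "tedious method".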

In addition, I would pass the name of the input file as a positional parameter too, and assign the positional parameters to local variables for readability.

Here is the final solution (uncomment -i to do the actual changes):

#!/usr/bin/env bash
sedReplace() {
    local the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | wc -c)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")" # replace some special characters within the searched string by any char '.'
        sed "s:$the_searched_string:$the_replacement:" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

inFile="/tmp/makefileTest" 

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(wc -c <<< '2:'"$the_string")" \
           "$inFile"
echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '5:'"$the_string")" \
            "$inFile"
echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile"

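The third argument in the test cases above, "$(wc -c <<< '2:'"$the_string")", reproduces what grep -n prints for a single matching line: the line number, a colon, the line itself and a trailing newline (the here-string adds the newline). As a rough sketch of that arithmetic, assuming plain ASCII text so that one character is one byte, and using a hypothetical helper name expected_count:

```bash
#!/usr/bin/env bash

# Hypothetical helper: number of bytes grep -n prints for one matching line,
# i.e. "<line_no>:<text>" plus the trailing newline.
# Assumes plain ASCII text (one character = one byte).
expected_count() {
    local line_no="$1" text="$2"
    echo $(( ${#line_no} + 1 + ${#text} + 1 ))
}

expected_count 2 'CFLAGS += -Wunused # this is a comment'     # prints 41
expected_count 5 'cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'   # prints 43
```

These are the values 41 and 43 that the question hard-codes for test cases 1 and 2.
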
Depending on your needs, you could use sha256sum (or some other checksum tool) instead of wc -c for a stricter line check:

#!/usr/bin/env bash
sedReplace() {
    local the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | sha256sum)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")" # replace some special characters within the searched string by any char '.'
        sed "s:$the_searched_string:$the_replacement:" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

inFile="/tmp/makefileTest"

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'; the_line='2'
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(sha256sum <<< "${the_line}:${the_string}")" \
           "$inFile"
echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'; the_line='5'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(sha256sum <<< "${the_line}:${the_string}")" \
            "$inFile"
echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'; the_line='8'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(sha256sum <<< "${the_line}:${the_string}")" \
           "$inFile"

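The checksum comparison used above can also be looked at in isolation. The following is a minimal sketch, not part of the original scripts: the helper names line_fingerprint and expected_fingerprint are hypothetical, and the demo writes to a temporary file instead of /tmp/makefileTest. It shows that the two checksums agree only when the searched string is the whole line and sits on the expected line number.

```bash
#!/usr/bin/env bash

# Hypothetical helper: checksum of what grep -inF prints for the searched string.
line_fingerprint() {
    local needle="$1" file="$2"
    grep -inF -- "$needle" "$file" | sha256sum | cut -d ' ' -f 1
}

# Hypothetical helper: checksum expected when the searched string is the
# complete content of line number line_no.
expected_fingerprint() {
    local line_no="$1" needle="$2"
    printf '%s:%s\n' "$line_no" "$needle" | sha256sum | cut -d ' ' -f 1
}

# Self-contained demo on a temporary two-line file.
demo_file="$(mktemp)"
the_string='CFLAGS += -Wunused # this is a comment'
printf '%s\n' '# some comment' "$the_string" > "$demo_file"

if [[ "$(line_fingerprint "$the_string" "$demo_file")" == "$(expected_fingerprint 2 "$the_string")" ]]
then
    echo 'The line is unchanged and still on line 2'
fi
rm -f "$demo_file"
```
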
Update: Because the searched strings are really complicated, here is an example of how to choose the delimiter dynamically (note the 4th test case):

#!/usr/bin/env bash
sedReplace() {
    local the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4" d="$5"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | wc -c)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")" # replace some special characters within the searched string by any char '.'
        the_expression="s${d}${the_searched_string}${d}${the_replacement}${d}"
        #echo "$the_expression"
        sed "$the_expression" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

get_delimiter() {
    unset delimiter

    for d in '/' ':' '#' '_' '|' '@'
    do
        if ! grep -qoF "$d" <<< "$the_string"
        then
            delimiter="$d"
            break
        fi
    done

    if [[ -z $delimiter ]]
    then
        echo 'There is no appropriate delimiter for the string:'
        echo "$the_string"
        exit 1
    fi
}

inFile="/tmp/makefileTest"

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'
get_delimiter
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(wc -c <<< '2:'"$the_string")" \
           "$inFile" "$delimiter"

echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'
get_delimiter
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '5:'"$the_string")" \
            "$inFile" "$delimiter"

echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'
get_delimiter
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile" "$delimiter"

echo -e '\n\n# --- Test case 4 -----'
the_string='/:#_|@'
get_delimiter
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile" "$delimiter"

Here is another version of the above:

#!/usr/bin/env bash
sedReplace() {
    local d the_searched_string="$1" the_replacement="$2" the_lenght="$3" the_file="$4"
    # the content of this function could be placed here, thus we will have only one function
    get_delimiter "$the_searched_string"

    if [[ $the_lenght == "$(grep -inF "$the_searched_string" "$the_file" | wc -c)" ]]
    then
        the_searched_string="$(sed -r 's/(\^|\$|\\)/./g' <<< "$the_searched_string")"
        the_expression="s${d}${the_searched_string}${d}${the_replacement}${d}"
        sed "$the_expression" "$the_file" #-i
    else
        echo "ERROR: Unhandled custom makefile modification (expected: ${the_searched_string})..."
        exit 1
    fi
}

get_delimiter() {
    # define an array of possible delimiters, it could be defined outside the function
    delimiters=('/' ':' '#' '_' '|' '@' '%')

    for delimiter in "${delimiters[@]}"
    do
        if ! grep -qoF "$delimiter" <<< "$1"
        then
            d="$delimiter"
            break
        fi
    done

    if [[ -z $d ]]
    then
        echo "ERROR: There is no appropriate delimiter for the string: ${1}"
        exit 1
    fi
}

inFile="/tmp/makefileTest"

# Test all cases:
echo -e '\n\n# --- Test case 1 -----'
the_string='CFLAGS += -Wunused # this is a comment'
sedReplace "$the_string" \
           'CFLAGS += -Wall # something is changed' \
           "$(wc -c <<< '2:'"$the_string")" \
           "$inFile"

echo -e '\n\n# --- Test case 2 -----'
the_string='cp $(OBJ_DIR)/$(EXE_NAME) /tmp/somewhere'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '5:'"$the_string")" \
            "$inFile"

echo -e '\n\n# --- Test case 3 -----'
the_string='cd $(SOME_UTIL_BIN); ./somecommand $(BUILD_DIRECTORY_PATH)/$(OBJ_DIR)/\$\^'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile"

echo -e '\n\n# --- Test case 4 -----'
the_string='/:#_|@%'
sedReplace "$the_string" \
           "${the_string} # something is changed" \
           "$(wc -c <<< '8:'"$the_string")" \
           "$inFile"
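
Both get_delimiter versions above check only the searched string, while the delimiter must not occur in the replacement either. Below is a small sketch of a variant that checks every string passed to it; the function name pick_delimiter is hypothetical, and the candidate list is simply the one used above. It prints the delimiter instead of setting a variable, so it can be used in a command substitution.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the first candidate delimiter that occurs in
# none of the given strings; return 1 if every candidate is already used.
pick_delimiter() {
    local candidate text
    for candidate in '/' ':' '#' '_' '|' '@' '%'
    do
        for text in "$@"
        do
            # Quoting "$candidate" makes this a literal substring test.
            [[ $text == *"$candidate"* ]] && continue 2
        done
        printf '%s\n' "$candidate"
        return 0
    done
    return 1
}

# Pass both the searched string and the replacement.
# prints: :
pick_delimiter 'cp $(OBJ_DIR)/$(EXE_NAME)' 'cp $(OBJ_DIR)/$(EXE_NAME_NEW) # changed'
```

A caller could then use something like d="$(pick_delimiter "$the_searched_string" "$the_replacement")" and stop with an error when the function returns a non-zero status.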
Glorfindel
pa4080
  • Accepted the answer and upvoted, excellent post. Just to answer if I really need the "grep -n" and "wc -c": I prefer to use it because it's stricter than only grep -q. It's not bulletproof however. For instance it detects if a line is moved from line 5 to line 15 (the first takes a single character, the latter 2 characters). In that case I would receive an error/warning, forcing me to do a manual check that everything is ok - which is fine for larger file modifications (but not for minor changes). I'm sure one could use even more advanced methods, but this check is sufficient for me. Thanks! – Okay Dokey Nov 26 '19 at 12:07
  • I found a minor thing regarding the line: "echo "ERROR: Unhandled custom makefile modification (expected: ${the_regexp})..."" -> I just replaced it with: "ERROR: Unhandled custom makefile modification (expected: "$1")..." - maybe you would want to update it, just for the reference (if you agree)? – Okay Dokey Nov 26 '19 at 13:23
  • Thanks, @OkayDokey! I've fixed the typo and also added an idea for more strict line check. – pa4080 Nov 26 '19 at 16:50
  • Thanks @pa4080! A very good idea about the strict sha256 in combination with grep -in (the line number). This tells exactly not only if the line changed - but also if the line has moved (due to insertion/deletion in other places)... I can also add that I had a little problem with sed and ":" as separation character. I switched to "%" percent, which also caused me problems. Finally I switched to a vertical bar "|" as sed separation character. It works for now... Not sure if a situation will ever arise where this also causes problems, not sure if there's a better sep-character? Anyway, thanks! – Okay Dokey Nov 27 '19 at 09:34
  • Hello, @OkayDokey, I've got an idea that probably can solve the problem with the delimiter - please check the last update, where the delimiter is calculated dynamically, depending on the searched string. – pa4080 Nov 28 '19 at 06:55
  • Hello @pa4080, oh, sorry I almost forgot to reply. I just ran your 2 scripts, but initially thought you forgot to add a 4th test case to the cat /tmp/makefileTest. But now I realize you probably intentionally did it this way and I've studied the new get_delimiter-function - it's very very nice... I just want to ask you: Does this mean, that in theory we can construct a line, that will prove that even version 2 isn't completely 100% "bulletproof"? I've awarded the bounty, but this was/is also something I'm very interested in knowing (but otherwise, it could be a separate question). Thanks! – Okay Dokey Nov 29 '19 at 15:37
0

This doesn't necessarily answer the question directly but you may want to look at other similar attempts to simplify sed, one such example is sd.

  • That is very interesting, thanks. However I really prefer to use common gnu/linux tools, so I don't have to download things from e.g. github and partly I also asked the question because I wanted to learn how to do it the "proper way". I can see the question is downvoted, I'll reformulate the question and post an updated version of the question shortly, thanks... – Okay Dokey Nov 19 '19 at 23:33