Tuesday, June 17, 2014

Ubuntu - GNU Parallel - It's awesome

GNU Parallel is a shell tool for executing jobs in parallel on one or more machines. If you have used xargs in shell scripting, you will find GNU Parallel easy to learn, because it is written to accept the same options as xargs. If you write loops in shell, GNU Parallel can often replace them and make them finish faster by running several jobs at once.
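
As a rough sketch of the loop replacement described above, here is the same task written both ways. The function names and the choice of cksum as the per-file job are only illustrative assumptions; ":::" passes the arguments to GNU Parallel on the command line, and -k keeps the output in input order.

```shell
# Hypothetical task: print a checksum for every file given as an argument.

# Plain shell loop: one cksum at a time.
checksum_loop() {
    for f in "$@"; do
        cksum "$f"
    done
}

# Same work with GNU Parallel: one cksum job per file, several at once.
# -k keeps the output in the same order as the arguments.
checksum_parallel() {
    parallel -k cksum ::: "$@"
}
```

Both functions should print the same lines, for example when called as checksum_parallel *.log; the parallel version only pays off when each job is heavy enough to outweigh the startup cost.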

To install the package

sudo apt-get install parallel

Here is an example of how to use GNU parallel.

Suppose you have a directory of large log files and you need to count the lines in each file to find the one with the most lines. GNU Parallel can do this efficiently by spreading the work across all the CPU cores of the server.

The heaviest operation here is counting the lines of each file. Instead of counting one file after another, we can count several files in parallel with GNU Parallel.

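Sequential way (-n 1 makes xargs run wc once per file; without it wc also prints a "total" line, which would sort to the top)

ls | xargs -n 1 wc -l | sort -n -r | head -n 1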

Parallel way

ls | parallel wc -l | sort -n -r | head -n 1
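
The one-liner above breaks on file names that contain spaces or newlines. A slightly more defensive sketch follows; largest_by_lines is a hypothetical helper name, and the fallback to xargs when GNU Parallel is not installed is my own assumption about how one might want it to behave.

```shell
# Hypothetical helper: print "<line count> <path>" for the file with the
# most lines among the regular files directly inside a directory.
largest_by_lines() {
    dir="${1:-.}"
    if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
        # -0 reads NUL-separated names, so odd file names are safe;
        # GNU Parallel runs one wc job per file, several at a time.
        find "$dir" -maxdepth 1 -type f -print0 | parallel -0 wc -l
    else
        # Fallback without GNU Parallel: one wc per file, sequentially.
        find "$dir" -maxdepth 1 -type f -print0 | xargs -0 -n 1 wc -l
    fi | sort -n -r | head -n 1
}
```

Calling largest_by_lines /var/log would then print a single line with the count and the path. The directory is only an example.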

This is only one example; you can speed up many similar operations with GNU Parallel. :)

Friday, May 30, 2014

Shell script edited on Windows - Issue when executing on Linux

I ran into errors when trying to execute a shell script on Linux after editing it on Windows. There were two issues, caused by a BOM character and carriage returns (\r) in the file.

  • BOM (byte order mark) character - A Unicode character placed at the start of a text file or stream to signal its byte order.
  • Carriage return (\r) - Windows editors end each line with the two characters '\r\n' together, but Unix expects only '\n'.

Windows editors add these characters, but the Unix shell does not expect them, so a bash script edited on Windows may fail to run. To fix this, you need to remove those characters. This is how you can do it.

BOM character issue
You might see the following error reported for the first line of the script.

": No such file or directory1: #!/bin/bash" - If you get this kind of error for the first line of your script and cannot see anything wrong with that line, run the following command.

$ head -n 1 your_script.sh | LC_ALL=C od -tc
0000000 357 273 277   #   !   /   b   i   n   /   b   a   s   h  \r  \n

If the output starts with the sequence "357 273 277", the file begins with a BOM (these are the octal values of the bytes EF BB BF, the UTF-8 encoding of the BOM), so you need to remove it.
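
The same check can be scripted instead of read by eye. This is a minimal sketch; has_bom is a hypothetical helper name, and it only looks for the UTF-8 form of the BOM discussed here.

```shell
# Hypothetical helper: succeed (exit status 0) if the file starts with
# the three-byte UTF-8 BOM, EF BB BF.
has_bom() {
    # Dump the first three bytes as hex and strip spaces and newlines.
    first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first3" = "efbbbf" ]
}
```

For example, "if has_bom your_script.sh; then echo 'BOM found'; fi" reports whether the script needs fixing.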

* Open the script using vim
* Type ":set nobomb" and press Enter - this tells vim to write the file without the BOM character.
* Save the file and close - :wq
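
If you would rather not open an editor, the BOM can also be stripped from the command line. The sketch below assumes GNU sed (both -i and the \xHH escapes are GNU features), and strip_bom is a hypothetical helper name.

```shell
# Hypothetical helper: remove a leading UTF-8 BOM from a file in place.
# Assumes GNU sed. LC_ALL=C makes sed treat the file as raw bytes, so the
# three BOM bytes are matched one by one; files without a BOM are unchanged.
strip_bom() {
    LC_ALL=C sed -i '1s/^\xEF\xBB\xBF//' "$1"
}
```

Running strip_bom your_script.sh and then repeating the od check should show the output starting directly with "#   !".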

Carriage return issue

Carriage returns in the script can cause errors like these.

"$'\r': command not found"
"syntax error near unexpected token `$'do\r''"

To fix this, remove the \r characters from your script. Any Unix tool that can replace the \r character with an empty string will do.

* String replace using sed command

$ sed -i 's/\r//g' your_script.sh

* String replace using perl

$ perl -pi -e 's/\r//g' your_script.sh
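
A small wrapper can check for carriage returns first and touch the file only when needed. This is a sketch assuming GNU sed; fix_line_endings is a hypothetical helper name, and unlike the commands above it removes \r only at the end of a line.

```shell
# Hypothetical helper: convert Windows line endings to Unix ones in place,
# but only if the file actually contains a carriage return.
fix_line_endings() {
    cr=$(printf '\r')              # a literal carriage return character
    if grep -q "$cr" "$1"; then
        # GNU sed: drop a \r that sits right before the end of a line.
        sed -i 's/\r$//' "$1"
    fi
}
```

After fix_line_endings your_script.sh, a command such as "grep -c $'\r' your_script.sh" in bash should report 0 matching lines.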

Now the script is ready to run on Unix :)