One doesn't usually think too hard about the performance of shell scripts. But if your scripts do a lot of work, chances are you can make them run several times faster!
The thing with traditional shell scripts is that they are usually made up of many calls to external utilities that do very specific things, like 'sed', 'grep', 'awk', 'cat', 'dog', 'head', 'tail', 'strings', and even 'perl'. It usually doesn't matter, because most scripts just run a couple of commands together as a convenience. So not many people think about the performance impact of calling different Unix utilities and piping output to and from them. I think a lot of people will be surprised at how expensive it is.
The following script works the traditional way, piping output to Unix utilities to do the processing.
File: pipes.sh
c=0
for f in /dev/* ; do
    group=$(ls -l $f | cut -d ' ' -f 4)
    if [[ $group == audio ]] ; then
        ((c+=1))
    fi
done
echo $c audio devices
Produced the following output:
$ time ./pipes.sh
7 audio devices

real 0m7.307s
user 0m2.736s
sys 0m4.364s
Note that for each file in the /dev directory, 2 executables were run with 1 pipe.
So, about 7 seconds. OK, well, I'm not really sure how slow that is yet, because I don't have much to compare it to. It seems fine, right? It's not really that long!
Well, here's the same functionality with only a single external command and a single pipe:
File: nopipes.sh
c=0
ls -l /dev | {
    while read -a line ; do
        group=${line[3]}
        if [[ $group == audio ]] ; then
            ((c+=1))
        fi
    done
    echo $c audio devices
}
And the results:
$ time ./nopipes.sh
7 audio devices

real 0m0.131s
user 0m0.076s
sys 0m0.044s
I would just like to point out that the first version, which launches two external utilities and a pipe for every file, took about 56 times longer (7.307s versus 0.131s)! Piping to external utilities inside the loop made the script roughly 5,500% slower!!!
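One side effect of the single-pipeline version is that its loop runs in a subshell, which is why the final echo has to sit inside the braces. A possible variation, sketched here on the assumption that the script runs under bash, feeds the loop through process substitution so the counter stays in the calling shell. The function name count_group_files and its two arguments are made up for this illustration; they are not part of the scripts above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original scripts):
# count the entries in DIR whose group column in `ls -l` equals GROUP.
count_group_files() {
    local dir=$1 group=$2 c=0
    local -a line
    # Process substitution runs ls once and keeps the loop in this shell,
    # so $c is still set after "done".
    while read -r -a line ; do
        if [[ ${line[3]} == "$group" ]] ; then
            ((c+=1))
        fi
    done < <(ls -l "$dir")
    echo "$c"
}

# Prints the number of entries in /dev that belong to the audio group.
count_group_files /dev audio
```

Like the script above, this still relies on the group being the fourth whitespace-separated column of the ls -l output, so a group name containing spaces would throw it off.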
The lesson is, doing as much work as possible in the local process, using only the scripting language itself, can drastically improve the performance of your scripts.
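As a small illustration of that lesson, a few common one-off calls to external utilities can often be replaced by expansions the shell performs itself. This is only a sketch, assuming bash; the sample values of path and line are invented for the example, and each comment names the external call the line is standing in for.

```shell
#!/usr/bin/env bash
# Sample inputs, invented for this illustration.
path=/dev/snd/controlC0
line='crw-rw---- 1 root audio 116, 0 Jan 1 00:00 controlC0'

# Instead of: name=$(basename "$path")
name=${path##*/}

# Instead of: dir=$(dirname "$path")
dir=${path%/*}

# Instead of: group=$(echo "$line" | cut -d ' ' -f 4)
read -r -a fields <<< "$line"
group=${fields[3]}

# Instead of: count=$(expr "$count" + 1)
count=0
((count+=1))

echo "$name $dir $group $count"
```

The expansions are not exact drop-in replacements: for example, ${path##*/} and ${path%/*} handle a trailing slash or a path without any slash differently from basename and dirname, so it is worth checking the inputs a script actually sees.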