Source code swear count

After some discussions in our development team, I was curious how much we swear in our source code, especially since a complete log of this is available at the Linux Swear Count.

I downloaded the script from Vidar and realized it was indeed rubbish 😉. I used a small excerpt from it to create my own script, which counts swears in every file found in a directory. It uses basic Linux commands like awk, grep, …
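
The core idea can be sketched with grep and wc alone. This is a simplified illustration, not the script itself: the `count_words` function and the sample file are hypothetical, and grep's `-w` whole-word matching stands in for the script's own matching rule.

```shell
# Minimal sketch: count occurrences of a few words in all files under a folder.
# count_words and the demo folder are hypothetical names used for illustration only.
count_words() {
    local dir=$1; shift
    local word
    for word in "$@"; do
        # -r recurse, -o one output line per match, -i ignore case,
        # -w whole words only, -h do not print file names
        printf '%s %d\n' "$word" "$(grep -r -o -i -w -h -e "$word" "$dir" | wc -l)"
    done
}

# Demo on a throwaway folder
demo=$(mktemp -d)
printf '// TODO: fix this crap\nint prefix = 1; // todo\n' > "$demo/A.java"
count_words "$demo" todo fix crap
rm -rf "$demo"
```

With `-w`, `fix` is counted once here because `prefix` is not a whole-word match, while `todo` is counted twice thanks to `-i`.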

#!/bin/bash
# P_W999 - 2013 - This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
# This script comes as-is, with no guarantees or warranties, and it is used at your own risk.
# v0.1 - 2014-04-30
# * Initial release
# v0.2 - 2014-05-01 : changed 'find' so files with spaces in their names work too + no more suffix in the awk match part
# Small script based on the one by Vidar Holen to calculate the number of swears (and some other words) that can be found in your source code.
# The script takes a single parameter: the folder where your source code is located. So for example ./ ./trunk will look up all nasty words in the trunk folder and its subfolders.
# The script looks for words preceded by a space, '//' or ',' (otherwise 'fix' would also match 'prefix').

# Check input
SRCPATH=$1
if [[ -z "$SRCPATH" ]] ; then
	echo "You must pass the source code folder as first parameter"
	exit 1
fi

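# Define paths to ignore
# Example patterns only, adjust them to your project; alternatives are separated by \| (grep syntax)
IGNORES='/test/\|/generated/'
echo "Script will ignore files matching the following patterns: $IGNORES" | sed 's/\\|/, /g'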

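# Magic
# The parentheses group the -name tests so -print0 applies to both; NUL-separated names keep paths with spaces intact (grep -z and xargs -0 -r are GNU options)
find "$SRCPATH" -type f \( -name '*.java' -o -name '*.jsp' \) -print0 | grep -z -v -e "$IGNORES" | xargs -0 -r cat | awk '
        BEGIN {
            w="fuck fucking shit love piss fire bastard crap crappy goto bullshit xxx todo fixme temporary bug fix";
            print "Looking for words: " w;
            n=split(w,t," ");
            for(i=1; i<=n; i++) {
                c[t[i]]=0;
            }
        }
        {
            lines++;
            f=tolower($0);
            for(k in c) {
                do {
                    a=index(f," " k);
                    a1=index(f,"//" k);
                    a2=index(f,"," k);
                    if(a!=0 || a1!=0 || a2!=0) {
                        # print the lines found
                        gsub(/^[ \t]+/, "", $0);
                        print k ": " $0;
                        c[k]++;
                        # continue searching after the earliest match
                        m=a;
                        if(a1!=0 && (m==0 || a1<m)) m=a1;
                        if(a2!=0 && (m==0 || a2<m)) m=a2;
                        f=substr(f,m+length(k));
                    }
                    else break;
                } while(1);
            }
        }
        # Print stats
        END {
            print "lines " w;
            printf "%d ",lines;
            for(i=1; i<=n; i++) printf "%d ",c[t[i]];
            printf "\n";
        }'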
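
The matching rule used in the awk part (a word only counts when it is preceded by a space, `//` or `,`) can also be tried in isolation. The sketch below is not the script's own code: it expresses the same rule with awk's `match()` and a single regular expression instead of three `index()` calls, and the `count_prefixed` helper is a hypothetical name.

```shell
# count_prefixed WORD: read text on stdin and count occurrences of WORD
# that are preceded by a space, "//" or ","  (hypothetical helper, for illustration)
count_prefixed() {
    awk -v k="$1" '
        {
            f = tolower($0)
            # match() sets RSTART and RLENGTH for the leftmost match
            while (match(f, "( |//|,)" k)) {
                n++
                # keep searching in the rest of the line
                f = substr(f, RSTART + RLENGTH)
            }
        }
        END { print n + 0 }'
}

printf '%s\n' 'int prefix = 1; // fix later' | count_prefixed fix    # prints 1
printf '%s\n' '//TODO a,todo b' | count_prefixed todo                # prints 2
```

As with the script, a word at the very start of a line has no prefix character in front of it and is therefore not counted.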

