Just thought I'd document this, as it's likely to be the sort of thing others might need. I was tidying up the JavaScript files on the global justice site New Internationalist and needed to know which of them were still in use. It's a recursive grep for .js files, right? Well, almost. I used grep's little-known -o option, which prints only the matched text rather than the whole line, along with -h, which suppresses the file name prefix, so that I got nothing but a list of .js file names enclosed in quotes. I didn't care which files referenced them. I then used tr to strip the quotes, which only had to be there for the grep to match. Next I piped the output through sort -u to get the unique occurrences. Had I needed to know how often each file was referenced, I could have used sort | uniq -c instead. Finally I dumped the whole lot into a file. Job done.
$ grep -rhoE '"[A-Za-z0-9_./-]+\.js"' /var/www/mysite/ | tr -d '"' | sort -u > 2011-04-05-mysite-all-js-files.txt
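For the "how often is each file referenced" case, here is a rough sketch of the counting variant. The function name count_js_refs is made up for illustration, and the character class is only a guess at what your file names and paths contain, so adjust it to suit. It assumes a grep that supports -r and -o, such as GNU grep.

```shell
# Hypothetical helper: count references to each double-quoted .js name
# found anywhere under the directory given as $1.
count_js_refs() {
  # -r recurse, -h no file name prefix, -o print only the match, -E extended regex
  grep -rhoE '"[A-Za-z0-9_./-]+\.js"' "$1" \
    | tr -d '"' \
    | sort \
    | uniq -c \
    | sort -rn   # most frequently referenced first
}

# Usage (path as in the command above):
#   count_js_refs /var/www/mysite/ > mysite-js-counts.txt
```

uniq -c only collapses adjacent duplicate lines, which is why the plain sort has to come before it. The final sort -rn is optional and just puts the busiest files at the top.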