What's in my requirements.txt

Published on May 14, 2017, 11:19 a.m.

programming python unix

It's Sunday, and tomorrow is our scheduled monthly Python meetup in Memphis, and it's one of those months where I've been busy and I haven't done a good job of finding a speaker. So, that means I've got to pull something together at the last minute. While racking my brain for a quick-and-easy topic, I thought, "I wonder what Python packages I'm using most?"

So, I ran this nifty monstrosity of a command, and here are the results. I also tweeted it out:

find . -name "requirements*txt" -exec cat {} \; | awk -F "==" '{ print $1 }' | sort | uniq -c | sort -rn | head -n 20

— Brad Montgomery (@bkmontgomery) May 14, 2017
$ find . -name "requirements*txt" -exec cat {} \; | awk -F "==" '{ print $1 }' | \
 sort | uniq -c | sort -rn | head -n 20
  23 ipython
  17 Django
  15 psycopg2
  11 requests
  11 pytz
  11 gunicorn
   9 ipdb
   9 django-extensions
   9 django-debug-toolbar
   8 six
   8 python-dateutil
   8 Pygments
   8 Jinja2
   7 tablib
   7 pandas
   7 docker-py
   7 Pillow
   7 MarkupSafe
   7 Flask
   7

It's really not that surprising, since I do a lot of work with Django, and I use a fair number of the same dependencies everywhere. Of course, this is only for projects that I still have in my home directory, which is a fair representation of the last couple of years' worth of work.

Let's break that command down a little, though:

find

I'll admit that I always forget how find works. Luckily this developerWorks article is really good. I should probably bookmark that.

I wanted to find all my requirements.txt files, even those that I may have named requirements_dev.txt or requirements_prod.txt. Usually all my projects just have a requirements.txt. This command will cat any matching files.

find . -name "requirements*txt" -exec cat {} \;
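If you want to try the find step without touching your real projects, here's a minimal sketch that runs it against a throwaway directory. The project names, file contents, and the all_requirements variable are made up for the demo; only the find invocation itself comes from the command above.

```shell
# Build a throwaway project tree (hypothetical names and contents).
demo=$(mktemp -d)
mkdir -p "$demo/blog" "$demo/api"
printf 'Django==1.9.1\nrequests==2.13.0\n' > "$demo/blog/requirements.txt"
printf 'Flask==0.12\nrequests==2.13.0\n'   > "$demo/api/requirements_dev.txt"
printf 'not a requirements file\n'         > "$demo/api/README.txt"

# Same find invocation as above, pointed at the demo tree instead of "."
all_requirements=$(find "$demo" -name "requirements*txt" -exec cat {} \;)

# Prints the four requirement lines; which file comes first depends on find.
printf '%s\n' "$all_requirements"

# Clean up the demo tree.
rm -rf "$demo"
```

Both requirements.txt and requirements_dev.txt match the pattern, while README.txt is left alone.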

awk

I then pipe that result into awk. I'm 99% sure most of my requirements files pin versions exactly, so the following awk command will take something like Django==1.9.1 and keep the Django part.

awk -F "==" '{ print $1 }'
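Here's a small sketch of what that separator does and doesn't catch. The sample lines are made up, and strip_versions is just a name I picked to wrap the awk command for the demo.

```shell
# Hypothetical wrapper around the awk step, so it can be reused below.
strip_versions() {
  awk -F "==" '{ print $1 }'
}

# Only lines pinned with "==" get split. Anything else passes through whole.
printf 'Django==1.9.1\nrequests>=2.0\n# a comment\n\n' | strip_versions
# Output:
#   Django
#   requests>=2.0
#   # a comment
#   (an empty line)
```

So a requirement written with >= keeps its version specifier, and comments and blank lines get counted like any other line.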

sort & uniq

Then, sort the results, use uniq -c to count them, then sort again numerically (-n) in reverse order (-r):

sort | uniq -c | sort -rn
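The first sort matters because uniq only collapses duplicate lines that sit next to each other. A quick sketch with made-up package names shows the difference (count_names is a hypothetical helper name for the demo):

```shell
# Hypothetical wrapper around the sort/uniq step from above.
count_names() {
  sort | uniq -c | sort -rn
}

# Without sorting first, the two "requests" lines are not adjacent,
# so uniq -c reports three separate lines, each with a count of 1.
printf 'requests\nDjango\nrequests\n' | uniq -c

# With the full step, duplicates are grouped and the biggest count comes first:
# requests (3), then Django (2), then Flask (1).
printf 'requests\nDjango\nrequests\nFlask\nrequests\nDjango\n' | count_names
```

The amount of padding in front of each count varies between uniq implementations, so don't rely on the exact column widths.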

head

... and keep the top 20 results.

head -n 20

Apparently I also just have blank lines in many of my requirements files. ¯_(ツ)_/¯
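If you'd rather not count blank lines (or comment lines), one option is to filter them out in the awk step. This is a variant, not the command I ran above, and names_only is a made-up helper name.

```shell
# Variant of the awk step (an assumption, not the original command):
# skip lines that are empty, whitespace-only, or start with "#".
names_only() {
  awk -F "==" '!/^[ \t]*(#|$)/ { print $1 }'
}

printf 'Django==1.9.1\n\n# dev tools\nFlask==0.12\n' | names_only
# Output:
#   Django
#   Flask
```

Lines such as -r other.txt or -e git+... would still slip through; this only handles blanks and comments.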

Anyway, hope you've enjoyed this little snippet. Run it on your projects and see what you get!


Update: @jackdied says you should use tail!

@bkmontgomery Drop the reverse on the sort and use tail!

— Jack Diederich (@jackdied) May 14, 2017
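Here's a sketch of how I read that suggestion, next to the original ending. The function names and sample data are made up. Both keep the same top entries, but the tail version lists them in ascending order, with the biggest count last.

```shell
# Original ending: reverse numeric sort, keep the first N lines.
top_with_head() {
  sort | uniq -c | sort -rn | head -n "$1"
}

# Suggested ending: plain numeric sort, keep the last N lines.
top_with_tail() {
  sort | uniq -c | sort -n | tail -n "$1"
}

# Made-up input: requests appears 3 times, Django twice, Flask once.
sample='requests
Django
requests
Flask
requests
Django'

printf '%s\n' "$sample" | top_with_head 2   # requests (3) first, then Django (2)
printf '%s\n' "$sample" | top_with_tail 2   # Django (2) first, then requests (3)
```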