Python Tooling for the Busy Programmer (Part 1) 🔧
Python is one of the most popular programming languages in the world thanks to its expressive syntax, rich standard library, and friendly community. These make it versatile enough for a wide variety of applications: simple scripting and automation, web development, data science and artificial intelligence, cyber-security, and much more. This means there is a large ecosystem of tools and libraries that solve many different problems for developers as well as end-users. However, the size of the ecosystem also makes it harder for novice programmers to find their way around.
To help with this issue, I decided to make my own (not so short) list of tools and packages that I find useful in my day-to-day programming with Python. I chose to focus this list on software that helps with setting up a development environment, managing and building Python projects, enhancing code quality, and boosting productivity and workflow efficiency. As a result, it doesn’t include packages that would normally be considered a direct dependency to build the core functionality of any given piece of software (like web frameworks for example), as I think those depend very much on the requirements of each individual project. Before we dive in, let me quickly introduce the language and its ecosystem to the complete beginners that might be reading this…
A quick tour of Python
Source: xkcd
Python is an open source, community-driven programming language. The main governing body that supports the development of the language itself, its community, as well as most of the infrastructure around them, is the Python Software Foundation (PSF). The development of the language is conducted through Python Enhancement Proposals (PEPs), which define Python’s features and specification, document design decisions, and collect community input on important issues. The main implementation of the language is CPython, which as the name suggests is written in the C programming language, but there are many other Python implementations serving different purposes: to cite a few, there is the MicroPython interpreter targeting micro-controllers and other constrained environments, PyPy which makes use of just-in-time (JIT) compilation for better performance, Jython which runs on top of the Java Virtual Machine (JVM), and most recently RustPython.
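If you’re curious which implementation you’re actually running, the standard library can tell you. Here is a minimal sketch; the describe_interpreter helper is a name I made up for this example:

```python
import platform
import sys


def describe_interpreter() -> str:
    """Return a one-line summary of the running Python interpreter."""
    implementation = platform.python_implementation()  # e.g. "CPython" or "PyPy"
    version = platform.python_version()  # e.g. "3.12.1"
    return f"{implementation} {version} at {sys.executable}"


print(describe_interpreter())
```

On a standard installation this should report CPython, but the exact output depends on your machine.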
Python’s open model means that it can be freely packaged and made available to the public, in a way that’s very similar to Linux distributions. The most popular of these alternative distributions is Anaconda, which is geared towards scientific computing use-cases, and comes with a lot of extra utilities. That being said, there is very little that you can’t achieve with the standard Python distribution or by downloading a third-party tool or library as required. Most of these packages are publicly available on the Python Package Index (PyPI), and you can easily install them on your machine using pip, the package manager that comes with Python.
Beyond Python’s “batteries included” design and its guiding philosophy (the “Zen of Python”), I think one of the killer features of the language is its vibrant community, which produces all sorts of resources to guide programmers of all backgrounds and skill levels through their Python journey: bite-sized articles and tutorials, books, courses, podcasts, and more. But if you’re more specifically interested in finding useful Python tools and libraries, I suggest you keep an eye on awesome-python.com (as well as awesomedjango.org and djangopackages.org for Django-specific packages).
A not so short list of Python tools
Okay. Now that you have a general idea about how huge the Python ecosystem is, let’s dive in!
Managing Python versions with pyenv
The first step to start working with Python is to install it on your system. If you’re on Windows, you need to download the installer from the official website and install it as you would any other piece of software, going over each step in the installation process. The big downside of this approach is that you have to manually keep track of which Python versions are installed. If you’re on a UNIX-like system such as Linux or macOS on the other hand, you probably already have Python installed on your system, but it’s still advised to have your own Python installation rather than relying on the version that comes bundled with your OS / distribution (otherwise, you risk messing up your environment or even your entire OS in some cases).
This is why I recommend using pyenv or its Windows port pyenv-win to install and manage Python versions. That way, you can seamlessly install / uninstall any Python version you want on your system, have different versions installed at the same time, and even set up your environment to use different versions depending on the scope (global, per project / folder, or per shell). The advantage here is the control you gain over how your environment is set up and how and when it is updated, in addition to greater compatibility with software that requires some specific Python version to work. You can also take advantage of pyenv if you want to work on a project that supports multiple Python versions, but honestly, there are better solutions (which I’ll talk about in the second part of this series) if your project requires this. In any case, there is an excellent tutorial on how to use pyenv on RealPython.com which I recommend you read if you’re interested.
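Once several versions live side by side, it helps to check which one a script is really running on. This is a minimal sketch using only the standard library; the MINIMUM_VERSION value and the is_supported helper are made up for the example, so adjust them to your project:

```python
import sys

# Hypothetical minimum version for this example; use whatever your project needs.
MINIMUM_VERSION = (3, 8)


def is_supported(version_info=sys.version_info, minimum=MINIMUM_VERSION) -> bool:
    """Return True when the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum


if is_supported():
    print("Running Python", sys.version.split()[0], "from", sys.executable)
else:
    print("This Python is too old:", sys.version.split()[0])
```

The path printed at the end is a quick way to confirm that the interpreter you get is the one you meant to select.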
Managing end-user applications with pipx
Now what if you’re not really interested in doing any Python development yourself, and just want to download an .EXE file from the STUPID F*CKING SMELLY NERDS that did the hard work? Well in that case, the package you want to install is considered an end-user application, and you probably want to install it in an isolated environment to avoid any dependency conflicts with other software installed on your computer. You also want to add it to your $PATH variable so you can call it from anywhere on your system.
To that end, it’s better to use pipx as a package manager in place of pip, since it offers pretty much the same set of commands. The difference, though, is that pipx automatically creates a virtual environment where the application and its dependencies are installed independently of the rest of your system, then adds a symbolic link to the application in a directory on your $PATH. This way, you can install, use and update your Python applications without worrying about where they’re installed or about managing dependencies. One such tool you might enjoy if you like playing with databases is harlequin, a terminal-based SQL IDE. Another one might be sherlock, a tool to hunt down accounts by username across social networks (just don’t insult the developers, please!).
Source: Reddit
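If you want to see where an installed application really lives, you can resolve it from Python. This is only a sketch: the locate_app helper is a name I made up, and the location it prints depends on how your system is configured:

```python
import os
import shutil
from typing import Optional


def locate_app(command: str) -> Optional[str]:
    """Return the real location of a command found on PATH, or None if absent."""
    found = shutil.which(command)
    if found is None:
        return None
    # Follow symbolic links to see where the executable actually lives.
    return os.path.realpath(found)


# "sherlock" is only an example; try any application you have installed.
print(locate_app("sherlock"))
```

If the application was installed in its own virtual environment, the resolved path should point inside that environment rather than into your system-wide Python.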
Managing dependencies and virtual environments with uv
So you actually want to do some Python development and you’re wondering how to quickly get started with a project. One of the easiest ways to set up a project and manage dependencies without a fuss is uv. It’s another Python package manager that’s designed to be a drop-in replacement for pip and pip-tools workflows, which means you only need to list your project’s direct dependencies in a requirements file (requirements.in traditionally, but it also works with other ways to specify dependencies), execute the uv pip compile command to resolve the complete, pinned list of dependencies for your project, and finally call uv pip sync to install these dependencies so you can import them in your code. uv can also create a virtual environment for your project in the conventional .venv/ folder (with the uv venv command), which is considered a best practice in the Python world so you don’t break dependencies for your project or other parts of your system. Last but not least, uv is heavily inspired by Cargo (the Rust package manager), and is itself written in Rust, which makes it perform blazingly fast™!
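A quick way to confirm that your code really runs inside a virtual environment is to compare two values from the sys module. This is a minimal sketch that assumes an environment created by the standard venv mechanism (or a compatible tool); the in_virtualenv helper is a made-up name:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


label = "virtual environment" if in_virtualenv() else "base Python installation"
print(label, "->", sys.prefix)
```

Run it with the interpreter from your project’s environment and the printed prefix should be the environment folder.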
Linting and formatting with ruff
One of the easiest ways to enhance the quality of your code is to use a linter, a tool that checks your code against a predefined set of rules to help you avoid common mistakes and pitfalls and follow best practices. ruff is one such tool from Astral, the people behind uv which I mentioned earlier. What is great about ruff is that it can easily replace a dozen other tools like flake8, isort, pydocstyle and many others, and comes with over 800 built-in rules. It also doubles as an auto-formatter designed as a drop-in replacement for black (the most widely used Python auto-formatter), and can apply many fixes to your code automatically. It also ships a language server that implements the Language Server Protocol (LSP) [the mechanism that enables many of the features you’re used to in an Integrated Development Environment (IDE), like auto-complete, go-to-definition, etc.], which means it has great support for all kinds of code editors, including VS Code and Neovim. Oh, did I mention that it’s written in Rust? So you can expect it to run BLAZINGLY FAST™, I’m telling you, to the point where you might sometimes wrongly think it’s not working!
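To give you an idea of what a linter catches, here is a small before / after sketch. The greet function is made up for the example, and the rule codes in the comments are the usual Pyflakes / pycodestyle ones; which rules actually fire depends on your configuration:

```python
from typing import Optional

# Before (what a linter would complain about), kept here as comments:
#   import os                      # F401: imported but unused
#   def greet(name=None):
#       if name == None:           # E711: comparison to None should use "is"
#           name = "world"
#       message = "unused"         # F841: local variable assigned but never used
#       return "Hello, " + name + "!"


# After: the same function with those findings addressed.
def greet(name: Optional[str] = None) -> str:
    """Return a greeting, falling back to a default name."""
    if name is None:
        name = "world"
    return f"Hello, {name}!"


print(greet())
print(greet("Ada"))
```

None of these findings would stop the original code from running, which is exactly why it’s nice to have a tool point them out.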
Checking data types with mypy
You probably already know that Python is a dynamic language, which means the type of your variables, expressions, etc., is determined by the values you assign them and checked at runtime, rather than at compile time like in statically typed languages. Python is also strongly typed, which means most type errors will break your application since it doesn’t perform implicit type conversions to force things to work when they shouldn’t, like JavaScript does for example. This is in fact a good thing as it reduces surprising behavior in your code and makes errors more obvious instead of hiding them, which makes bugs easier to detect. But did you know Python also supports gradual typing? That means you can add type annotations to your variables, return values, function signatures, etc., to declare what type of data you expect.
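Here is what “dynamic but strongly typed” looks like in practice. This is a tiny sketch with a made-up add_one function:

```python
def add_one(value):
    """Add 1 to value; no types are declared, so anything can be passed in."""
    return value + 1


print(add_one(41))  # an int works

try:
    add_one("41")  # a str does not: Python refuses to mix str and int
except TypeError as error:
    print("TypeError:", error)

# An explicit conversion is required instead.
print(add_one(int("41")))
```

Nothing stops you from calling the function with a string, but the mistake surfaces as an error as soon as that line runs instead of silently producing a wrong result.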
In Python, these type declarations are called “type hints” and it’s important to know they’re just that (hints), which means they will not prevent your application from running (and breaking if there’s a problem), but it doesn’t mean they’re useless, quite the opposite in fact! You just need to use a type checker like mypy to reap the benefits. mypy will scan your codebase and check if you’re manipulating data correctly, and alert you if it finds any problems in that regard, so you can detect and fix many code defects even before running your program, thus eliminating many kinds of issues that would normally surface only at runtime or, worse, in production! It also improves your code editing experience since your IDE can know which type of data you’re working with more precisely and provide some features to make your life easier based on that information. mypy is the most mature and supported type checker in the Python ecosystem with many integrations, but there are alternatives like pyright which you might prefer. For most projects the two do a similar job, so the choice is one of personal preference; what matters most is that you start incorporating type hints into your projects.
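As an illustration, here is a small annotated function and the kind of mistakes a type checker is meant to catch. The find_user_id function and its data are made up for the example, and I describe the checker’s findings in comments rather than quoting its exact messages:

```python
from typing import Dict, Optional


def find_user_id(username: str, directory: Dict[str, int]) -> Optional[int]:
    """Look up a user ID, returning None when the user is unknown."""
    return directory.get(username)


users = {"ada": 1, "grace": 2}
user_id = find_user_id("ada", users)

# A type checker would flag a bare `user_id + 1` here, since user_id may be None.
# Handling the None case first satisfies the checker and avoids a runtime error.
if user_id is not None:
    print(user_id + 1)

# A checker would also reject this call, because 42 is not a str:
#   find_user_id(42, users)
```

Python itself runs this file happily either way; the annotations only pay off once a checker or your editor reads them.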
Validating data with pydantic
Now that you see the benefits of type annotations, you might want to take things further and validate data that flows through your application at runtime. To do this, you would normally have to include a lot of checks and guard clauses, perform data conversions, and handle exceptions related to typing where appropriate in your codebase, which honestly no one likes to do as it can add a lot of clutter to your logic. But fear not, because this is where pydantic shines! It uses the standard type hinting syntax to validate data against a schema you can define in a dataclass, a TypedDict or otherwise. You can even define your own validators to customize the validation behavior to your specific needs. pydantic can also serialize data to other formats and produce a JSON Schema to easily integrate with other tools. Plus, its core validation logic is written in Rust too… so do I really need to say it?!
Source: Reddit
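Here is a minimal sketch of what that looks like, assuming pydantic is installed in your environment. I’m using its BaseModel class to define the schema, and the User model is made up for the example:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    """A hypothetical schema used only for this example."""

    id: int
    name: str


# Well-formed input; the numeric string is converted to an int by default.
user = User(id="42", name="Ada")
print(user.id, user.name)

# Malformed input raises a ValidationError describing each problem.
try:
    User(id="not a number", name="Ada")
except ValidationError as error:
    print(error)
```

Notice there isn’t a single hand-written check in there: the type hints on the class are the whole schema.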
Performing security checks with bandit
In software engineering, working code is not enough: it also needs to be well designed, readable, correct, stable, and perhaps most importantly, secure! There are many threat actors out there that would happily take advantage of your software if it benefits them. This is especially true for software that is mission-critical or widely used. This is why you should always strive to enhance the security of your code and reduce the attack surface of your applications. Luckily, there are tools that can help achieve this goal by scanning your codebase and reporting security-related problems. bandit is a package that helps you identify security issues in your projects and fix potential vulnerabilities before they get exploited. By using bandit you can eliminate the most common security issues that you might introduce in your codebase by accident. Of course, it’s in no way a guarantee against attacks, simply because there’s no such thing in cyber-security: software security is an endless struggle between good and bad actors, where you (a good actor, presumably) can only try to increase the cost (in terms of time, compute, power…) for potential attackers to target you, keep up to date with new exploit techniques and vulnerabilities, and put reasonable protection mechanisms around your projects. In that regard, bandit is certainly not enough for absolute security, but it’s a pretty good first line of defense I really advise you look into, if only to pick up basic security habits.
To be continued…
Wow! This blog post turned out to be longer than expected even with just a few tools being listed, but I really wanted to explain things as simply as I could, and include some context as to why you might want to use some of these tools in your Python projects. So I’ll just keep the rest for the next article, but in the meantime, I hope my explanations were clear enough and that you found at least something useful. If not, I’d be happy to get your feedback.
Thanks for reading!
Sincerely,
Oussama