I’ve spent the better part of way too long learning to python. I quite enjoy programming in python (as long as I can rein in my urge to make new classes) and the ready availability of advice§ is a bonus, but it’s the stuff that goes with it that irks me.
The irking starts when you try and install python. There are – I counted – twenty-seven million ways to do it. Every time you google “how to install python”, you’ll get another article or youtube video with a completely new method. At first, it doesn’t matter*: you download python – the latest version, because why wouldn’t you? – and everything works nicely.
Then you want to actually do something, which usually means installing a package. So you go back to google, back to the twenty-seven million ways to do anything. Some of them even work. You end up installing various bits of software named after animals**, then something requires you to install C++ and suddenly you are trying to compile things in programming languages you were hoping to avoid when you took up the programming languages you were hoping to avoid when you opted for python. A bit more google-fu and you realise that there is a Better Way†.
The Better Way means you have to uninstall python, or install a different version, because the package you want only works with a version of Python that, at this point, is a dried-out snake skin covered in dust. Then you realise that one of the animal-themed programs you foolishly chose has hijacked the computer’s idea of what python is.
So you uninstall everything††. And start again. Older. Not wiser.
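In hindsight, a less drastic first step might have been to ask python itself which python it thinks it is. The sketch below uses only the standard library; the function name `describe_interpreter` is my own invention, and what it prints will depend entirely on the state of your machine.

```python
import shutil
import sys


def describe_interpreter():
    """Return a few facts about the python that is actually running this code."""
    return {
        # Full path of the interpreter executing this script.
        "executable": sys.executable,
        # Its version, e.g. "3.11.4".
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Where this interpreter looks for its libraries.
        "prefix": sys.prefix,
        # True when running inside a venv-style virtual environment.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # What the shell would run if you typed "python" (None if nothing is found).
        "python_on_path": shutil.which("python"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `executable` and `python_on_path` disagree, that is a fair hint that something has quietly changed which python you get by default.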
After a few weeks of this, you find your own way of doing things that sort of works and you have to resist the urge to write a very definitive blog post about how this is definitely the way you should install python. You know that you’d only be adding to the utter chaos, and yet, the temptation is strong. Even then there’s a nagging feeling that you still have no idea how all of this works.
You tentatively try installing a few packages and it seems to be working OK, but writing code in Notepad isn’t working for you. In some of the videos, you remember fancy text editors that made all the different bits of code exciting different colours with a Matrixy black background. There are a bunch of different options, so you download a promising one and, realising it’s already installed, you fire it up. However, this one won’t talk to the version of python you have installed. It’s pining for an installation of python you removed*** a hundred years ago.
You try a few more, then you realise you have to uninstall everything and start again.
Now you have your IDE and python. You can install packages and miraculously you haven’t yet expended all your energy so you want to do something useful with it all. Being a conscientious programmer you want to use version control and you want to test your code.
Version control is a no-brainer: it’s git or nothing. You start reading about git and your spidey sense starts tingling when everyone wants to explain how git works. 90% of the time they want to offer you the “simple” explanation, or explain it in a window of time that even the busiest individual could eke out of their day. This might seem rather innocent – helpful, even – but remember: no one ever wants to tell you how windows work, or taps (say), they just show you how to do what you want to do. The spidey sense starts throbbing like a three-day migraine when they say, “but you don’t need to understand graph theory to use it”. Eventually, you realise that git’s just one of those things you have to use because everyone else uses it****.
The beauty of git is that you can claim to use version control while largely carrying on as you did before version control. You make a big change to your code and everything goes wrong, so you want to go back a step. In the days before git, you’d just look in the directory called “archive_7_new_TODAY” and copy the files you wrecked. With git all you have to do is copy the current contents of the directory to “archive_7_new_TODAY” and utter the git command you wrote on a post-it note that copies everything from the repository, overwriting your mistakes. Congratulations, you are using version control.
It should be that easy, but it’s not. First, you forget (or suppress the memory) that you downloaded git before and that you set up your credentials for a github repository that you have long since forgotten. Every time you try to get some code from github, no matter what it is and no matter what it claims to be doing, it downloads the exact same ancient repository. This time, uninstalling and reinstalling git doesn’t work, so you have to search for the pesky file somewhere on the file system. Deleting this fixes things, or, at least, allows you to get deeper into trouble…
You want to write tests. Once again, there are options – pytest and unittest chiefly. Your notes from the python course you attended recommend unittest so you go with that, but you soon find out that, in the aeon of time that has since elapsed, everyone started using pytest because it is the Better Way. Everyone, that is, except every single question and answer on stack overflow. Nevertheless, you can muddle through. That is, until you get to mocking, at which point half the answers to questions just recommend rewriting your code completely anyway, the other half don’t work and the remaining half trick you into using pytest-mock, which adds a layer of complexity to everything†††.
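For what it’s worth, mocking with nothing but the standard library can look something like the sketch below. It is a minimal example and no more: `roll_die` and `RollDieTest` are hypothetical names made up for illustration, and real code is rarely this cooperative.

```python
import random
import unittest
from unittest import mock


def roll_die():
    """Hypothetical function under test: returns an integer from 1 to 6."""
    return random.randint(1, 6)


class RollDieTest(unittest.TestCase):
    def test_roll_returns_what_randint_gives(self):
        # Temporarily replace random.randint with a fake that always returns 4.
        with mock.patch.object(random, "randint", return_value=4) as fake_randint:
            self.assertEqual(roll_die(), 4)
        # The fake records how it was called, so the arguments can be checked too.
        fake_randint.assert_called_once_with(1, 6)
```

Saved in a file whose name starts with `test`, this can be run with `python -m unittest`, and pytest will also collect `unittest.TestCase` classes. The patch only works because `roll_die` looks up `randint` on the `random` module each time it is called; had the code done `from random import randint`, you would need to patch the name where it is used instead, which is where a good share of the confusion comes from.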
By this point, you may even have some code that you want other people to use, although it’s possible that in the intervening time several civilisations have risen and fallen. Whenever in time you find yourself, though, responsible sharing means documentation. As with everything else, advice – particularly that given to beginners – is all over the place. There are numerous different ways of setting documentation up (in a number of different ad hoc markup languages) and no one ever says why you might want to do it that way, or whether you are compelled to. Even then, it’s not clear how documentation works with everything else. If you use sphinx***** it spams a whole bunch of output into your repository, which seems like it needs to be tidied up. Git starts highlighting all the automatically generated files in suspicious I-don’t-know-what-this-is red. You have an explore, and there are plenty of others. The IDE has generated a bunch, python has cached stuff everywhere, pytest has been busy. There are builds and eggs and a whole bunch of other stuff, which is when you find out about .gitignore.
.gitignore tells git to ignore things. I found some example gitignore files for python. They are hilariously long.
And finally, there’s the packaging – tying the whole project up with a little bow and a thoughtful handwritten note. As with every other aspect of this process, there’s a long queue of people who will solemnly tell you *this* is the way to do it and you will, regretfully, have to reorganise all your code once again. You also have to add a bunch of stuff. What you have to add can be pieced together only slowly in response to baffling absences. Why is none of my code accessible? Where did the data files go? No, really, where the hell did they go? What do you mean dependencies? The answer to some of these questions is to put an empty file in each directory. This feels like a rather handy stand-in for the whole process. Unlike git, which needs to be told to ignore almost everything, packages ignore everything as a default and can be grudgingly persuaded to acknowledge the existence of things like the code, and the data.
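The empty file in question is usually `__init__.py`. The sketch below shows, using only the standard library, how a directory with one gets noticed and a directory without one gets ignored. The directory names and the helper functions `discover_packages` and `demo` are made up for illustration, and `pkgutil` is standing in here for whatever your packaging tool actually does, which I am assuming (not promising) follows a similar rule.

```python
import pkgutil
import tempfile
from pathlib import Path


def discover_packages(root):
    """List the directories directly under root that pkgutil treats as packages."""
    return sorted(m.name for m in pkgutil.iter_modules([str(root)]) if m.ispkg)


def demo():
    """Build two near-identical directories; only one gets an __init__.py."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("with_init", "without_init"):
            (root / name).mkdir()
            (root / name / "stuff.py").write_text("VALUE = 1\n")
        # The magic empty file: its mere presence marks the directory as a package.
        (root / "with_init" / "__init__.py").touch()
        return discover_packages(root)


if __name__ == "__main__":
    print(demo())
```

Only `with_init` is reported. To add to the fun, modern python will often still import the other directory as a “namespace package”, so your code can work perfectly well on your machine and then vanish when packaged.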
The final product might not even have ten lines of actual functional python code, but it took you a month to piece together. Despite this, you still have the sneaking suspicion you didn’t quite do it right. You know that some things could be more neatly stowed and for some reason autodoc has gone completely berserk and seems to have opened an annex of L-space, but frankly you are exhausted and full now of regret. Was it all worth it, you ask yourself? Consolation comes from the thought that next time, it will be so much easier.
Haahahhahahah h hahaaahh hhahaha hahahaa.
Oh, you poor chump.
§ Advice that often turns out to be pitched at someone who knows slightly more than you do, or slightly less, leaving the crucial piece of information unstated because it is too trivial to mention, or too complex for you right now.
* I’d just say, avoid the ones where the person says “ignore all the other ways to do it, this is the proper way”, because you will shortly find yourself in a situation metaphorically, if not literally, like those movie scenes in which someone has to choose between cutting the yellow and blue wires before the dramatically large and visible counter reaches zero, or they’ll nonchalantly say something about hand-editing the registry. After all that, it still won’t work, and nothing else will.
** pandas I can live with, but spider? Nopenopenope. And the only thing I know about anacondas is they’re thick as tree trunks and like to hug… you to death at the bottom of a muddy river before swallowing you whole. The name may not be wholly inappropriate.
† There are many Better Ways. This was just the first. What you don’t realise at this point is that there have been so many Better Ways, but sadly, when an Even Better Way is found, no one tidies up the internet.
†† You even consider buying a new computer, rather than try and work out where the programs you installed squirrelled away their configurations.
*** and killed with fire.
**** Everyone uses it because programmers use it. Why programmers use it is something I’d best not explain because I might need their help in the future.
††† The documentation refers to it as a “lightweight wrapper”, which sounds great until you realise that’s like wrapping decorative tissue paper round a cannonball. The cannonball is still a cannonball and the tissue paper, while making everything look pretty, just makes the cannonball slightly harder to handle.
***** the animals are mythic at this point, and I note here that the sphinx was a tricky sod.