Typos and Bitwise Operations
Sometimes Typos Can Be Fun
A little while ago, I was playing around with some code in the Python REPL1 and made a typo that spawned a whole rabbit hole of reading.
|
|
Time to do some digging.
Evil bitwise hacking
Let’s start with a quick explanation of what we’re actually looking at here, because some of the terms are confusingly similar.
Bit: A one or a zero. Multiple bits come together to form a byte, which can communicate information.
Byte (Octet): A byte is eight bits. For example, the letter “a” (specifically the lowercase version) is one byte in size and looks like this: 01100001
.
Nibble: Only four bits — half of a byte and capable of representing numbers from 0 to 15. For ease of reading, we’ll be mostly using nibbles as examples throughout this post.
Binary Numbers: how computers view numbers. This can be a very deep topic, but the gist of it is that all integer numbers are made of bytes. A nibble (a four-byte number) is something like 0000 or 1010, and the way that you read it is by starting at the right and counting left in powers of two. If we look at 1010, you read it by breaking it apart into places. Since there are four places, the meanings of each place are 8-4-2-1. Taking this, we can use our binary number (1010
) and look at the values to figure out what we have: (1 * 8) + (0 * 4) + (1 * 2) + (0 * 1)
, or 8 + 2 = 10. By the same logic, 0000
is (0 * 8) + (0 * 4) + (0 * 2) + (0 * 1)
, which turns out to be a big whopping zero.
Binary Arithmetic: Any operation or modification that involves two numbers, like multiplication or addition.
Unary Arithmetic: Any operation or modification that only involves one number.
MSB: Most Significant Bit. It’s the bit farthest to the left.
~: A tilde. A very lovely symbol. Sometimes referred to as “that squiggle”.
Note that this is all kind of a pain and rather more complicated than it seems to need to be, but a certain amount of complexity is what allows computers to do a lot of math very quickly, and doing it quickly is really what’s important here.
Going back to the original question of “what in the world happened when we ran ~4
”, it turns out that ~
is a unary operator
, in that it does math on a single number. Specifically, what it does is it performs a “NOT” operation on the number. A NOT operation flips all of the bits from 0 to 1 (or vice versa). This is what that looks like (spoiler alert: I’m about to ruin this and make it more complicated):
|
|
This interaction is called a “Two’s Complement”, and is used by pretty much all modern computers. (As opposed to the One’s Complement, which you can read about separately if you’re interested.) I deliberately left some pieces off of the earlier example, so let’s look at it again. The first thing we need to do is tack on another bit to the left side, and we’ll use that to keep track of whether a number is positive or negative. 0 is positive, 1 is negative.
|
|
Here’s where it gets real weird. When reading the complement (the inverted version), you treat the Most Significant Bit (MSB) as a number… that you subtract the others from. The way you read the last number is:
|
|
Fun fact: to get the proper negative version of your starting number, it’s the same process as outlined above, just with adding 1 (0 001) to the number before ‘assembling’ it with the MSB.
What can you do with it?
So now that we’ve gone over how this works, let’s talk about silly things you can do with it.
Accessing the end of lists
In Python, lists can be accessed from the end by using negative indexing, i.e.:
|
|
Notice that the first one is the second element in, but asking for the negative indexed version gets us the last element of the list. Enter the magic tilde:
|
|
Perfection, and very fast on large lists! In Python, there are only a handful of ways to go through lists backwards, and some of them are a little silly:
|
|
However, on lists of smaller size, it comes out to about the same:
|
|
A direct ’not’ operator
In some third party libraries like django
and numpy
, the tilde operator is implemented as a way to say “the opposite of this”. For Django, you can use them with Q() queryset objects to reverse the meaning of a Q() condition.
For example:
|
|
In numpy, you can use it on an array of booleans to flip everything in the array:
|
|
You can also use it to do silly things with pandas dataframes!
Write your own functionality
The tilde calls a magic method called __invert__()
, which you can implement yourself in your own objects!
|
|
What kind of incredibly questionable things can you add to your projects?
Wrapping up
Trying to figure out why all this works sent me down a rabbit hole, and for someone without a CS degree, there was a lot to uncover! Hopefully you learned something too. Go thou and invert!
The Python read-eval-print-loop system that runs when you just type
python
in your terminal. ↩︎