With one last code modification, you’re in, and the contents of the vault are yours! Cracking Codes with Python is not quite about breaking into banks or pulling off elaborate heists, but it’s always fun to dream.
Cracking Codes with Python by Al Sweigart is a newly published (January 2018), 424-page book from No Starch Press that bills itself as “An Introduction to Building and Breaking Ciphers.” You can pick up a copy of this book (ISBN-13: 978-1-59327-822-9) from No Starch for $29.95 with a free eBook. Al Sweigart is a professional software developer who teaches programming to kids and adults. He is the author of Automate the Boring Stuff with Python, Invent Your Own Computer Games with Python, and Scratch Programming Playground, also from No Starch Press. His other books are also freely available under a Creative Commons license on his website Invent with Python.
This book is also an introduction to Python, and it covers concepts in a way that works even for people who have never programmed before. The only prerequisites are a computer that can run Python (any operating system) and the wish to learn.
Before I dive into my review, I’d like to cover who I am, what I do, and my interest in this book. Hopefully this will explain my existing strengths and weaknesses going into the book, and help you decide if this is for you. I am a Senior Penetration Testing Consultant for Secureworks and have pentested as a career or hobby for over 5 years. I was previously a developer, though I never wrote any production-level Python (plenty of scripting, though). I’ve also had plenty of CTF experience, so I at least have a vague interest (and some experience) in cryptography. That said, I am far from an expert in cryptography and have never taken any training or coursework in it.
Instead of a rote chapter by chapter summary, I’d like to organize this review a bit differently. I will break each major section up into parts, and go over what the section covers as well as anything particularly good (or bad) in the section. After I cover the book in this way, I’ll go over some of the real applications that you’ll develop over the course of the book, including a link to my versions in my GitHub. Finally, I’ll wrap up the book in its entirety and add any parting words that you might want as a potential reader.
The Introduction and Chapter 1 go over a few great topics and build up what the book is really about. The author covers a few basic cryptosystems as well as an introduction to cryptography in general. I really like that notes about former export laws and the RSA encryption scheme are included. They help readers understand how far cryptography, as well as computer law, has come since the early 1990s. The paper cryptography tools section was fun, though I wasn’t the biggest fan of the included virtual cipher wheel. It might have been nice to cover one more cipher by hand, but I understand that the book covers other ciphers later on.
Chapters 2 and 3 are purely an introduction to Python. I’ll be honest: when I first glanced at the table of contents, I was a bit skeptical that any Python could really be covered in the 28 pages of these two chapters. That said, the author does a great job of introducing basic concepts and building on them piece by piece. I would have preferred three chapters, with the IDLE/Hello World chapter being a bit longer, but that’s probably personal preference. By the end of these chapters, most people new to programming should have a basic understanding of what is to come as well as how to interact with Python. All of the following chapters build upon this knowledge, so this section is not intended as a fully fledged Python tutorial.
After the introduction to Python, it is time for some more ciphers. Chapters 4 through 8 follow a similar structure of introducing a new cipher, building on the previous Python knowledge, and then demonstrating how to break the cipher. The Caesar cipher was a great first choice, as the reader should already be familiar with it from Chapter 1. The book was clearly designed for reading the chapters in order, especially for readers who are learning Python. That said, this is a hands-on guide for beginners, so more experienced developers might find the early chapters a bit boring outside of the encryption/decryption algorithms. I also really like the Summary at the end of the chapters, as it should help reinforce the covered concepts to new developers.
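To give a feel for the encrypt-then-break pattern these chapters follow, here is a minimal Caesar cipher sketch, assuming Python 3. It is not the book's code: the SYMBOLS value and the function names caesar and brute_force are my own placeholders for illustration.

```python
import string

# Illustrative symbol set (an assumption); the book's own SYMBOLS string may differ.
SYMBOLS = string.ascii_uppercase


def caesar(message, key, decrypt=False):
    """Shift each character found in SYMBOLS by key; leave the rest untouched."""
    shift = -key if decrypt else key
    result = []
    for ch in message:
        if ch in SYMBOLS:
            result.append(SYMBOLS[(SYMBOLS.index(ch) + shift) % len(SYMBOLS)])
        else:
            result.append(ch)
    return "".join(result)


def brute_force(ciphertext):
    """Return every candidate plaintext, keyed by the shift that produced it."""
    return {key: caesar(ciphertext, key, decrypt=True) for key in range(len(SYMBOLS))}


ciphertext = caesar("ATTACK AT DAWN", 3)
candidates = brute_force(ciphertext)
print(ciphertext)      # DWWDFN DW GDZQ
print(candidates[3])   # ATTACK AT DAWN
```

With only as many keys as there are symbols, trying every shift and reading the results is all the "hack" needs.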
Chapter 9 strays from the formula slightly, but I think this chapter is great. While it doesn’t explicitly mention Unit Testing or Test Driven Development (TDD), it does say that the reader is developing automated testing. I think that this is a perfect topic to cover at this point in the Python tutorial, and it also verifies that the reader copied or coded the previous two programs correctly. I even learned about the random.SystemRandom class, which I’ve never used before.
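As a rough illustration of that kind of automated check, the sketch below (assuming Python 3) encrypts and decrypts random messages and confirms the round trip. The encrypt and decrypt functions are a deliberately trivial stand-in cipher of my own, not the book's programs, and run_round_trip_tests is a hypothetical name.

```python
import random
import string

# SystemRandom draws from the operating system's entropy source (os.urandom),
# so calling seed() on it has no effect.
rng = random.SystemRandom()


def encrypt(message, key):
    """Stand-in cipher for this sketch: rotate the message left by key places."""
    key %= max(len(message), 1)
    return message[key:] + message[:key]


def decrypt(message, key):
    """Undo encrypt() by rotating right by the same number of places."""
    key %= max(len(message), 1)
    split = len(message) - key
    return message[split:] + message[:split]


def run_round_trip_tests(trials=50):
    """Encrypt then decrypt random messages; return how many trials passed."""
    passed = 0
    for _ in range(trials):
        length = rng.randint(1, 40)
        message = "".join(rng.choice(string.ascii_uppercase) for _ in range(length))
        key = rng.randint(1, length)
        assert decrypt(encrypt(message, key), key) == message, (message, key)
        passed += 1
    return passed


print(run_round_trip_tests())
```

Swapping the stand-in functions for a real cipher module gives a quick sanity check that the encryption and decryption code agree with each other.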
Chapters 10 through 12 build on the Python knowledge with a slightly different formula. Chapter 10 covers file I/O, which is very important for new developers to understand. Additionally, it allows readers to see the usefulness of the encryption programs that they’ve built, since they can now encrypt an entire file. Chapter 11 builds on that knowledge in another Python-centric chapter. It walks the reader through a module that detects whether or not a string is English based on some minimum word and letter requirements. The detectEnglish module is nice to have, and I’ll probably start using it (or something similar) during some of my CTFs. Finally, Chapter 12 is a great one, as it introduces the first real cryptanalysis technique. Note that I was unable to get the script from Chapter 12 to work initially: isEnglish was returning False even with the original plaintext, because each of my dictionary keys ended in a carriage return. Once I fixed the file on my Mac (cat dictionary.txt | sed "s/$(printf '\r')\$//" > dictionary-fixed.txt), I was good to go. This chapter did not introduce a lot of new Python, so it really allows the reader to see how they can perform an attack against the Transposition Cipher using the detectEnglish module.
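The dictionary problem can also be avoided in Python itself. The sketch below, assuming Python 3, strips all trailing whitespace (including a stray carriage return) while loading the word list, then scores a message by the share of its words found in that list. The names load_dictionary and english_word_ratio are hypothetical and are not the book's detectEnglish API.

```python
import string


def load_dictionary(lines):
    """Build a set of upper-case words, dropping trailing '\\r' and '\\n'."""
    return {line.strip().upper() for line in lines if line.strip()}


def english_word_ratio(message, words):
    """Return the fraction of whitespace-separated words found in the word set."""
    candidates = [w.strip(string.punctuation) for w in message.upper().split()]
    candidates = [w for w in candidates if w]
    if not candidates:
        return 0.0
    matches = sum(1 for w in candidates if w in words)
    return matches / len(candidates)


# Lines as they would appear when a Windows-style file is read without
# newline translation: each entry ends in "\r\n".
raw_lines = ["HELLO\r\n", "WORLD\r\n"]

naive = {line.rstrip("\n") for line in raw_lines}  # keeps the "\r"
fixed = load_dictionary(raw_lines)

print("HELLO" in naive)                            # False
print("HELLO" in fixed)                            # True
print(english_word_ratio("Hello, world!", fixed))  # 1.0
```

The naive set contains "HELLO\r" rather than "HELLO", which is exactly why every lookup fails until the carriage returns are removed.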
Chapters 13 through 18 are similar to the earlier chapters: they introduce a new cipher, build on the crypto and Python knowledge, and then break the cipher. I appreciated that the author only briefly covered the Euclidean algorithms, leaving the reader free to do more research on their own. I was also able to slightly increase my affine cipher algorithm’s security by using the 95 printable ASCII characters (codes 32 through 126) for my SYMBOLS. This increases the key possibilities from 1320 to 6840, though this is still fairly trivial to brute-force. While the affine cipher isn’t much more secure (cryptographically) than the Caesar cipher, it was still fun to learn about it. Chapter 17 was another crypto attack, and the first that didn’t use any manner of brute forcing. I knew that substitution ciphers could easily be solved using online tools, but it was nice to walk through it step-by-step and cover why a brute-force attack is infeasible. Finally, the Vigenère chapter introduces a cipher that cannot be defeated by brute-force or word pattern analysis! Before I go any further on the Vigenère cipher, I also wanted to mention the string concatenation notes. Not only do the results depend on your Python version, but there is an even faster method. In Python 2.7.13, using stringTest.py, I got a time of 20.535 seconds with concatenation, 20.188 seconds with list concatenation, and 17.540 seconds with an inline list comprehension. For more information, you can visit the following article. The Vigenère cipher was a fun one, and it was great to see how such a “simple” change to the Caesar cipher made it “unbreakable” for so long.
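The affine key counts quoted above can be checked with a few lines of Python. This is a sketch under two assumptions: Python 3.5 or later (for math.gcd), and that the original symbol set has 66 characters, which is what the 1320 figure implies. The function name count_affine_keys is my own. Key A must share no factor with the symbol count so that it has a modular inverse, while Key B can be any offset.

```python
from math import gcd


def count_affine_keys(symbol_count):
    """Count (Key A, Key B) pairs where Key A is coprime to the symbol count."""
    valid_a = sum(1 for a in range(1, symbol_count) if gcd(a, symbol_count) == 1)
    return valid_a * symbol_count  # Key B ranges over every possible offset


# The 95 printable ASCII characters, codes 32 through 126.
PRINTABLE = "".join(chr(code) for code in range(32, 127))

print(count_affine_keys(66))              # 1320
print(count_affine_keys(len(PRINTABLE)))  # 6840
```

Note that this count includes trivial keys such as A = 1 and B = 0, and a few thousand keys remain well within brute-force range.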
Chapters 19 and 20 cover frequency analysis, and applying this technique to break the Vigenère cipher. Chapter 19 was a great walk-through for the technique, and it presented a much cleaner solution than I would have tried for my first attempt. I probably would have gone through the message letter by letter, counted the character counts, and compared them to an English frequency order manually. The dictionary attack was similar to the others, and fairly straightforward. Beyond a dictionary attack, I never actually knew how the Vigenère cipher could be broken. The author does a great job walking the reader through Kasiski examination, which really is a cool technique. Note that this is a much longer script than any of the previous chapters, so be sure to follow along with the source code, as well as the text, if you’d like to fully understand it. I also really liked how the author covered the technique in-depth before showing any Python code, as it definitely helped me to have a basic understanding of the attack beforehand. It might also help to have the script open while going through the chapter, as it is pretty easy to get lost while the author walks through it.
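The core idea of the key-length step can be sketched briefly: repeated sequences in the ciphertext tend to sit a multiple of the key length apart, so the divisors of those spacings hint at the key length. The code below is a simplified illustration assuming Python 3, using only three-letter sequences. The function names are hypothetical, and the book's script is considerably more thorough.

```python
from collections import Counter


def repeated_sequence_spacings(ciphertext, seq_len=3):
    """Map each repeated sequence to the distances between its occurrences."""
    positions = {}
    for i in range(len(ciphertext) - seq_len + 1):
        positions.setdefault(ciphertext[i:i + seq_len], []).append(i)
    return {
        seq: [later - earlier for earlier, later in zip(idx, idx[1:])]
        for seq, idx in positions.items()
        if len(idx) > 1
    }


def likely_key_lengths(ciphertext, max_key_len=16):
    """Rank candidate key lengths by how many spacings each one divides."""
    counts = Counter()
    for spacings in repeated_sequence_spacings(ciphertext).values():
        for spacing in spacings:
            for length in range(2, max_key_len + 1):
                if spacing % length == 0:
                    counts[length] += 1
    return [length for length, _ in counts.most_common()]


# "ABC" appears at positions 0 and 8, so the only spacing is 8.
print(repeated_sequence_spacings("ABCDEFGHABC"))  # {'ABC': [8]}
print(sorted(likely_key_lengths("ABCDEFGHABC")))  # [2, 4, 8]
```

Once a candidate key length is chosen, every nth letter was encrypted with the same shift, so each of those subsets can be attacked with ordinary frequency analysis.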
Chapter 21 covers the One-Time Pad Cipher, which is a slight modification of the Vigenère cipher that makes it unbreakable. While the one-time pad wasn’t covered as a script (since it was basically just the Vigenère), I have included it along with my scripts as a basic example. While this was a shorter chapter, it was definitely worth covering, especially if the reader is hoping to actually use any of the encryption methods in this book. While I am not sure if I will use the one-time pad for anything personally, I can definitely think of a few CTF challenges that could revolve around it!
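For readers curious what that looks like, here is a minimal one-time pad sketch, assuming Python 3.6 or later (for the secrets module) and messages made up only of upper-case letters. It is not the script from my repository, and the names generate_pad and one_time_pad are illustrative. It is simply a Vigenère shift where the key is random, at least as long as the message, and never reused.

```python
import secrets
import string

LETTERS = string.ascii_uppercase


def generate_pad(length):
    """Return a random pad of upper-case letters from a secure source."""
    return "".join(secrets.choice(LETTERS) for _ in range(length))


def one_time_pad(message, pad, decrypt=False):
    """Vigenère-style shift of each letter by the matching pad letter."""
    if len(pad) < len(message):
        raise ValueError("pad must be at least as long as the message")
    if any(ch not in LETTERS for ch in message + pad):
        raise ValueError("this sketch only handles upper-case letters")
    sign = -1 if decrypt else 1
    return "".join(
        LETTERS[(LETTERS.index(m) + sign * LETTERS.index(k)) % len(LETTERS)]
        for m, k in zip(message, pad)
    )


print(one_time_pad("HELLO", "XMCKL"))                # EQNVZ
print(one_time_pad("EQNVZ", "XMCKL", decrypt=True))  # HELLO

pad = generate_pad(5)
assert one_time_pad(one_time_pad("HELLO", pad), pad, decrypt=True) == "HELLO"
```

Reusing a pad, or using one shorter than the message, turns this back into an ordinary Vigenère cipher with all of its weaknesses.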
Chapters 22 and 23 build up to the RSA algorithm. The prime number testing and generation was a good start, and I had personally never seen the Rabin-Miller algorithm before. I have used the Sieve of Eratosthenes for the Project Euler problem #3 though.
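For anyone else who hasn’t seen it, a Rabin-Miller test can be sketched as follows, assuming Python 3. This is a generic textbook-style version rather than the book’s implementation; the function name is_probable_prime, the list of small primes, and the default of 20 rounds are my own choices.

```python
import random


def is_probable_prime(n, rounds=20):
    """Rabin-Miller test: False means composite, True means probably prime."""
    if n < 2:
        return False
    # Quick screen against a few small primes before the expensive part.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n == p:
            return True
        if n % p == 0:
            return False
    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base for this round
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this base proves n is composite
    return True


print(is_probable_prime(97))           # True
print(is_probable_prime(1009 * 1013))  # False
```

Each round that fails to expose a composite cuts the chance of a false "prime" answer by at least a factor of four, which is why a modest number of rounds is enough in practice.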
Chapter 24 covers “textbook” RSA, and it does so in a fairly straightforward way. That said, I had a few questions that weren’t answered until the middle of the chapter, but that may just be because of my thought process/personal preference. I’ve used a few tools to encrypt/decrypt or “attack” messages encrypted with RSA before, but I definitely have a better understanding of the algorithm now. I am glad that the author mentions not rolling your own cryptography at the end, but mentioning it more than once might be nice. There were a few things that I wish the author had covered in this chapter though, such as:
- How does the algorithm weaken if either p or q isn’t actually prime?
- What attacks could be performed if an attacker discovers either p or q (common in CTFs)?
- How would you perform a brute-force attack against a much smaller key size?
- What are some of the “advanced techniques” that cryptographers use to break the algorithm demonstrated?
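To make a couple of those questions concrete, here is a toy “textbook” RSA sketch, assuming Python 3.8 or later (for pow with a negative exponent). It uses the tiny primes 61 and 53 purely for illustration and is not the book’s code; make_keys, smallest_factor, and the exponent 17 are my own choices. It shows that once an attacker factors n, which is trivial at this size, the private exponent falls out immediately.

```python
def make_keys(p, q, e=17):
    """Build a textbook RSA key pair from two primes (no padding, toy sizes)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; requires Python 3.8+
    return (n, e), (n, d)


def apply_key(value, key):
    """Encrypt or decrypt an integer smaller than n with the given key."""
    n, exponent = key
    return pow(value, exponent, n)


def smallest_factor(n):
    """Trial division; only feasible because n is tiny in this sketch."""
    if n % 2 == 0:
        return 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f
        f += 2
    return n


public, private = make_keys(61, 53)
ciphertext = apply_key(65, public)
print(public, private)  # (3233, 17) (3233, 2753)
print(ciphertext)       # 2790

# Attacker's view: only the public key and the ciphertext are known.
n, e = public
p = smallest_factor(n)
q = n // p
recovered_d = pow(e, -1, (p - 1) * (q - 1))
print(apply_key(ciphertext, (n, recovered_d)))  # 65
```

The same recovery works whenever one prime leaks, since the other is just n divided by it. Real key sizes make the trial-division step infeasible, not the arithmetic that follows.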
My Humble Contributions
I’ve released my versions of the scripts, so that you can see what you will actually be creating during the book. You can find them in my GitHub repository, so please feel free to take a look, copy, change, or share! Note that I am using Python 2.7, and have some experience in Python, so the scripts will not match the book exactly. Additionally, some of the examples from the book won’t work exactly in my scripts, due to requiring more input or allowing more symbols. Note that I have modified some of the files that begin with a number, and I’ve accounted for this in the import statements as well. I tried not to mess with the “library” modules such as cryptoMath, so those are exactly the same.
Note that a number of the scripts are mostly duplicates, so it might be worthwhile to just add logic to select the algorithm as well. For example, the transpositionCipherHacker and affineCipherHacker are almost exactly the same, with the only real changes being the imported classes and references. I may do this in the future for my versions of the scripts, but, for now, I leave this as an exercise for the reader.
Final Thoughts on Cracking Codes with Python
If you are new to Python, or development in general, then I highly recommend you follow the author’s suggestion of reading through the book beginning to end. That said, if you already have a fair bit of Python experience, then I’d suggest a slightly different approach to the book. In that case, I would start by reading the beginning of each chapter. Once you get the hang of what the algorithm does, and how it works, I’d move on to the code. If you can understand how the script works, and what it does, then feel free to move on to the next chapter. If not, you can follow along with the author as he steps through it.
I also have a funny note about the following quote from the author: “The ciphers in this book (except for the public key cipher in Chapters 23 and 24) are all centuries old, but any laptop has the computational power to hack them. No modern organizations or individuals use these ciphers anymore, but by learning them, you’ll learn the foundations cryptography was built on and how hackers can break weak encryption.” This isn’t necessarily true, as I’ve seen a Caesar cipher used to encrypt sensitive data in a production environment.
In conclusion, this book was definitely worth the read even as an experienced Python developer. I learned more about cryptography, and even a few new Python tricks. It was also really cool to see scripts that I wrote actually encrypt/decrypt something, as opposed to some random crypto tool/website. I even found a few errors in the book that I submitted to the author! I would also love to see a sequel to this book, only covering more “advanced” algorithms such as RC4, DES, AES, etc. I understand that it might not fit the author’s theme though, as that will likely be more about math and cryptography than an introduction to Python.
The man, the myth, the legend: Ray Doyle, OSCP, GXPN, aka @doylersec, is an avid pentester and security enthusiast. He works as a Senior Penetration Testing Consultant at Secureworks, and has been there for over a year.
When he’s not hacking for work, he’s, well, hacking for fun as well… Ray has competed in many hacking competitions and CTFs over the years, most recently with Team Eversec, and managed to place 7th in the DEF CON Open CTF, 1st in the Raleigh BSides CTF, 2nd in the DerbyCon CTF, 1st in the DEF CON 24 SOHOpelessly Broken CTF (winning a DEF CON ‘black badge’), and 1st in the DEF CON 25 Wireless CTF (helping to win another black badge).
Ray enjoys reviewing new books and courses for knowledge building, as well as information sharing with the community. He has run an infosec focused blog for the last few years, which you can find at https://www.doyler.net/. If you have any questions, comments, or suggestions, then feel free to reach out!
Other than security, you can always hit him up for a game of Overwatch (doyler#1799 in the Diamond rank) or a Super Smash Brothers Melee money match.