This project is due at 11:59pm on Friday, September 27, 2024.

Description and Deliverables

In this project, you will gain hands-on experience cracking passwords, and you will hopefully adopt better password practices that you carry with you into your career. As such, this project has two distinct parts. I highly recommend that you start part one immediately: the necessary computations can take a long time to complete, and you will need several runs to finish this project.

To receive full credit for this project, you will complete the following three things:

A file named cracked.txt that contains the usernames and cracked passwords for at least 50 users contained in your individual /etc/shadow file.
A file named password_policy.txt which contains a few sentences explaining which password manager you are using.
You will need to set up 2FA for your GitHub account.

Each of these deliverables is described in greater detail below.

Part 1: Password Cracking

Linux systems typically store cryptographically hashed user passwords in crypt format in the /etc/shadow file. If you have sudo access to a Linux system, you can view this file on your own system (don’t try to look at this file on systems you don’t own, like the Khoury College Linux machines). The file format for the /etc/shadow file is described here.

In this part of the project, you will crack the hashed passwords contained in a leaked /etc/shadow file. There are more than 50 usernames and passwords in the file to give you more opportunities to earn full credit; however, it will take several days of compute power to crack 50 passwords, so start this process early. Each student in the class has their own individual shadow file to crack based on their Northeastern username. You can access it at https://shelat.khoury.northeastern.edu/dl/24f-2550/shadow/<username>.shadow, so for example, my file is located here. Make sure you use your individualized file to get credit. If your file does not exist, then you were not registered in Gradescope at the time we generated these files, so please create a Piazza post. You can also use curl to download your file:

$ curl https://shelat.khoury.northeastern.edu/dl/24f-2550/shadow/<neu username>.shadow

Cracking Tools

We recommend that students use well-known, heavily optimized cracking tools like John the Ripper or HashCat for this part of the project. Both tools are available for multiple platforms, and they are trivial to install on Debian-based Linux systems:

sudo apt install john
sudo apt install hashcat

Both tools have built-in support for the /etc/shadow file format, have the ability to pause and resume cracking sessions (a useful feature, since cracking can take hours/days), and support multiple different strategies for guessing passwords (e.g. brute force, word lists, etc.). We leave it to you to determine which tool you prefer and learn its command line syntax. Students are welcome to use whatever password guessing approach they want; many wordlists are available for free online. We recommend starting with the rockyou list that we presented in class lecture and also a good source of English dictionary words of length up to 10.

John the Ripper and HashCat both have the ability to run in multi-threaded configurations (i.e. they try to crack multiple passwords in parallel). We highly recommend that students utilize these features; for example, on a quad-core server, running John the Ripper with the “--fork=3” option to use three CPU cores is a reasonable approach. The cloud shell machines do not have a GPU, but if you have access to one, or want to try breaking passwords on your local machine, consider using the GPU-optimized OpenCL modes available in both programs, since GPUs are several orders of magnitude faster at password cracking than CPUs.

Cracking Approach

The leaked shadow file is designed to have a sliding difficulty scale. Without doing anything fancy, roughly half of the passwords should crack in just a few minutes if you have found a good starting password dictionary. Why do you think these passwords were so easy to crack?

Next, with a reasonably comprehensive English wordlist combined with common permutation rules, another 15 passwords should crack within 24 hours. The file /usr/share/dict/words on a Mac or Linux system is a good start, but it includes words that are too long, so you can filter it to improve efficiency.
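
One possible way to trim a wordlist is sketched below. The helper name filter_words and the output file name are made up for illustration, and the cutoff of 10 characters simply follows the length suggested earlier; adjust it as you see fit.

```shell
# filter_words WORDLIST: print the unique lines of WORDLIST that are
# at most 10 characters long (hypothetical helper, not part of john/hashcat).
filter_words() {
    awk 'length($0) <= 10' "$1" | sort -u
}

# Example usage (words10.txt is an arbitrary output name):
# filter_words /usr/share/dict/words > words10.txt
```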

For example, using John the Ripper the following command will attempt to crack the passwords using a wordlist of your choice and John’s built-in permutation rules (e.g. capitalizing the first and last letters of words, adding random numbers to the end of words, etc.).

$ john --wordlist=[path to your wordlist] --rules --fork=3 [path to the shadow file]

The remaining passwords are more challenging and require you to write expansive substitution and permutation rules (hint: symbols, numbers, etc). This exercise requires creativity and tenacity to solve. A good strategy is to think about simple heuristics that people use to make passwords and then write masks and/or rules that apply those heuristics to dictionary words. You may find that a program like john is good for the first passwords, and a program like hashcat is good for the last set of passwords.

The hashcat --help command will show information about how to use it. Consider using attack mode 0 or 3. One of the first problems that you will need to solve is to pick the correct format for hashcat to use. The hashcat --example-hashes command might help. You can use grep to filter for the md5 variants and look for the $1$ string.
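
To see which identifier your own file uses before searching the example hashes, something like the following sketch may help. It assumes the usual shadow layout in which the second colon-separated field looks like $id$salt$hash; the helper name hash_ids is hypothetical.

```shell
# hash_ids SHADOW_FILE: print each distinct "$id$" prefix found in the
# password field (field 2) of a shadow-format file.
hash_ids() {
    cut -d: -f2 "$1" \
        | awk -F'[$]' 'NF >= 3 { print "$" $2 "$" }' \
        | sort -u
}

# Example usage (replace the path with your downloaded file):
# hash_ids username.shadow
```

You can then look for that prefix in the output of hashcat --example-hashes to find the matching hash mode.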

More Hints

If hashcat runs slowly and you are using a Mac, make sure that you installed hashcat with brew in the terminal. Brew will automatically optimize hashcat for your computer and give you a faster runtime.

$ brew install hashcat

If you are using JTR, make sure you are using John Jumbo. There are some attack modes missing from the original John that are necessary for this assignment. You may have already experienced this issue if you could not use mask attacks.

$ brew install john-jumbo

You will need to expand beyond just using wordlists and rules. In particular, you will need to learn about mask and hybrid attack modes.

Mask attacks are what allow us to perform a targeted brute-force attack on the hashes. The masks reduce the key space that hashcat tries for a candidate password by specifying character-sets and patterns.

Let’s say we want to crack the password “dhJ” and we only know that it is 3 characters long and consists of lowercase and uppercase letters. This password is not likely in a dictionary so we would need to use a mask attack.

In order to do this, we need to fill 3 placeholders with a custom character-set in hashcat. To create this custom character-set we use -1 and then specify that we want all uppercase letters (?u) and all lowercase letters (?l). The custom character-set now contains 52 characters. We then use this custom character-set (?1) once for each of the 3 placeholders.
```
$ hashcat ... -a 3  -1?u?l ?1?1?1
```
(Note that the ... above stands for all the basic arguments about where the shadow file is, etc. The -a 3 specifies the attack mode for hashcat, which in this case is brute force with the mask. Use the hashcat --help page or read the hashcat web page to learn more.)

Typical brute-force would use all characters, so by specifying a character-set with a smaller key space we reduce the total runtime.
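
To make the saving concrete, here is a small arithmetic sketch. The helper name keyspace is made up; the figure of 95 assumes hashcat's full printable set ?a (26 lowercase + 26 uppercase + 10 digits + 33 symbols).

```shell
# keyspace CHARSET_SIZE LENGTH: number of candidates covered by a mask with
# LENGTH positions that each draw from CHARSET_SIZE characters.
keyspace() {
    awk -v n="$1" -v k="$2" 'BEGIN { printf "%d\n", n ^ k }'
}

keyspace 52 3   # custom set ?u?l in 3 positions -> 140608
keyspace 95 3   # all printable characters in 3 positions -> 857375
```

For a 3-character password the difference is only about a factor of six, but the gap grows exponentially with each added position.
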
Make sure that you specify the correct ‘hash mode’ for either jtr or hashcat. In this case, the first few characters of the hash in the shadow file should give you a clue on how to pick this mode.
Hybrid attacks allow us to combine both masks and wordlists. Let’s say we had a dictionary with just the word “northeastern”. If the password we want to crack is “northeastern873”, it would be very hard to do this with rules. However, with this attack mode, a mask is simply appended to the word from the wordlist. In this case we want to append 3 placeholders of the digits 0-9 (?d).
```
hashcat ... -a 6 example.dict ?d?d?d
```
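To get a feel for what that hybrid attack covers, the sketch below prints the same candidate set with plain shell tools. It is only an illustration of the idea, not how hashcat works internally, and the helper name hybrid_candidates is hypothetical.

```shell
# hybrid_candidates WORDLIST: print every word in WORDLIST followed by each
# 3-digit suffix 000..999, i.e. the candidates of a wordlist + ?d?d?d hybrid.
hybrid_candidates() {
    awk '{ for (i = 0; i < 1000; i++) printf "%s%03d\n", $0, i }' "$1"
}

# Example usage: count the candidates generated from example.dict
# hybrid_candidates example.dict | wc -l
```

Each dictionary word contributes exactly 1000 candidates, so the total work grows linearly with the size of the wordlist.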
Finally, rule based generation methods using hashcat (or john) will also be helpful in this project. You can sequentially apply rules to tackle some of the harder passwords in the project.
If you don’t know where to begin, start with incrementing the length of your mask/hybrid attack in hashcat with -i in order to cover all your cases.

File Format for Part 1

To complete part 1 of this project, you will turn in a file named cracked.txt that contains the usernames and cracked passwords for at least 50 users in the leaked shadow file. Each user and corresponding password should appear on one line in cracked.txt separated by a colon. For example, the format of a valid submission might look like this:

romeo:really_strong_password6@
juliet:1337cr4ck1ngsk1llz
tybalt:weak1234
mercutio:lalala

Note: it is important to make sure that you copy the solved hashes from hashcat/jtr exactly; if you miss a character, the autograder will not give you credit.
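
Before submitting, you may want to sanity-check the file format yourself. The sketch below is not the autograder; it only flags lines that lack a username, a colon, or a password, and the helper name check_cracked is made up.

```shell
# check_cracked FILE: print any line that is not of the form user:password
# and return a non-zero status if at least one such line was found.
# Passwords may themselves contain colons, so extra fields are allowed.
check_cracked() {
    awk -F: '
        NF < 2 || $1 == "" || (NF == 2 && $2 == "") {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example usage:
# check_cracked cracked.txt && wc -l < cracked.txt
```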

Part 2: Good Password Habits

In this part of the project, you will practice good password habits by (a) learning how to pick a passphrase, (b) installing and using a password manager, and (c) setting up 2FA for your GitHub account.

Good passwords

One big reason why people choose weak passwords that are easily cracked is because they have been taught that only confusing passwords are secure. People either reject this advice and leave themselves vulnerable, or adopt password creation heuristics that are not resilient to cracking in practice (e.g. English word plus one capital letter, one random number, and one random symbol).

Since the early 90s, security researchers have advocated various password strategies to avoid those pitfalls; a folklore strategy is to pick a passphrase consisting of easy-to-type words. Several websites explore this concept. For example, usepassphrase provides a slick interface to this idea.

However, why should we trust the random number generator in our browser to select a password? And why should we trust that the website above isn’t logging the result? To prepare a truly offline password, you will use the diceware approach explained. That author has produced a list of 7776 short words in this file. The idea is that you roll dice 5 times to select a word in this list (because $6^5 = 7776$). If you want a 5-word password, repeat this 5 times. The main question is whether this passphrase is “memorable.”
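
The dice-to-word mapping and the resulting passphrase strength can be sketched as follows. This assumes the word list is ordered by roll (11111 first, 66666 last) with one word per line; the helper names and the file name wordlist.txt are hypothetical. The dice should still be rolled physically, and the script only does the lookup arithmetic.

```shell
# dice_to_index D1 D2 D3 D4 D5: map five rolls (each 1-6) to a line
# number between 1 and 7776 by reading the rolls as a base-6 number.
dice_to_index() {
    index=0
    for roll in "$@"; do
        index=$(( index * 6 + roll - 1 ))
    done
    echo $(( index + 1 ))
}

# passphrase_bits WORDS: entropy in bits of a passphrase of WORDS words
# drawn uniformly from a 7776-word list, i.e. WORDS * log2(7776).
passphrase_bits() {
    awk -v w="$1" 'BEGIN { printf "%.1f\n", w * log(7776) / log(2) }'
}

# Example usage (wordlist.txt is a placeholder for the downloaded list):
# sed -n "$(dice_to_index 3 1 4 6 2)p" wordlist.txt
passphrase_bits 5   # -> 64.6
```

A 5-word passphrase therefore carries roughly 64.6 bits of entropy, which comes entirely from the dice rather than from how obscure the words look.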

Your first step is to use the diceware approach to pick a passphrase you will use to create an account on a test server.

This is a robust method for picking unguessable passwords which you can train yourself to remember. Some people store large amounts of money in Bitcoin and do not want to trust a hardware device to store their Bitcoin wallet key; these people often use a method similar to this. In fact, it has been formalized as Bitcoin Improvement Proposal 39 (BIP39). I know people who have stored substantial sums using this method, and have challenged attackers to guess their passphrase with nobody succeeding.

Password manager

Install a password manager on your main computer and your phone. Firefox, Chrome, and Apple all contain good password managers, but you can investigate several offerings such as Lockwise, 1Password, Dashlane, or LastPass.

Two Factor authentication

Set up 2FA for your GitHub account. Under Settings, select Password and Authentication on the left side of the screen, and then follow the instructions to enable Two-Factor Authentication or Passkeys.
After the P0 due date, we will invite you to join the Neu2550 organization. The organization has a policy that all members have two-factor authentication enabled. Thus, you won’t be able to accept the invitation until you have set up 2FA as described above. Note: we invited everyone who finished project0 (when you saved a file containing your GitHub account name). We use this name to send the invitation, so if you do not see an invitation, please reach out to us as soon as possible! It is important that you join this group because future projects in this course may rely on your membership in this GitHub group.
Once we verify you have joined our org (and thus set up 2FA), you will receive the extra 20 points for the assignment.

This component of the assignment is self-driven and you can get out of it what you want. I encourage you to add second factor authentication to your other accounts such as Google, etc.

Submission Details

Please follow these directions exactly.

Create a directory project1 under your git repo.
Add the file cracked.txt to this directory.
Add the file password_policy.txt containing which password managers you studied, and an explanation of why you picked the one you chose.
Submit your project in Gradescope via GitHub as you did in project0. Feel free to resubmit the project as many times as you like. It is OK to have extra files in the repository as well, just make sure you have those two for us to find.

Grading

This project is worth 10% of your final grade, broken down as follows (out of 100):

1 point each - for cracking the first 30 passwords
2 points each - for cracking passwords 31 to 40
3 points each - for cracking passwords 41 to 50
20 pts - joining our GitHub group
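
As a quick way to see how the cracking tiers add up, here is a small sketch of the arithmetic. It is only my reading of the breakdown above, not the autograder, and the helper name cracking_points is made up.

```shell
# cracking_points N: points earned for N correctly cracked passwords under
# the tiers above (1 point each for 1-30, 2 each for 31-40, 3 each for 41-50).
cracking_points() {
    awk -v n="$1" 'BEGIN {
        if (n > 50) n = 50                       # no credit beyond 50
        p = (n < 30 ? n : 30)                    # first tier
        if (n > 30) p += 2 * ((n < 40 ? n : 40) - 30)
        if (n > 40) p += 3 * (n - 40)
        print p
    }'
}

cracking_points 35   # -> 40
cracking_points 50   # -> 80, plus 20 for the GitHub group makes 100
```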

Points can be lost for turning in files in incorrect formats (e.g. not ASCII), failing to follow specified formatting, naming, or length conventions, failing to compile, failing to follow specified command line syntax, insufficient or incorrect randomization, etc.

Tips

Cracking passwords can take days so start part 1 of the project as soon as possible! Most of this time will be spent experimenting with different rules. No single run of hashcat should take more than a few hours on a laptop; the last brute force password should take roughly XX hours on a Macbook Pro 13 laptop. However, you will need a few runs with different masks to crack 50.

This page contains all of the hints that we plan to give everyone in the class. You can ask the TAs for help on how to run programs or to help debug your mask rules, but please do not ask the TAs for extra hints on what to try. We want everyone to have the same footing for this assignment.

Be creative. It is OK to discuss the rules you have tried with your fellow classmates, but discuss them in general terms using English words; do not copy and paste commands to share with fellow students.
