caesar - CS50 Walkthroughs 2019

CS50
6 Sept 201914:31

Summary

TLDRThe script outlines a Caesar cipher program that encrypts text by shifting letters using a provided key. It explains the process of taking command-line arguments for the key, ensuring its validity, and converting it to an integer. The program prompts for plain text, then shifts each alphabetic character by the key while preserving case and non-alphabetic characters. ASCII values and modulo operations are utilized to achieve the wrap-around effect for the alphabet. The script provides a step-by-step guide on implementing the cipher in C, including handling strings, character checks, and encryption logic.

Takeaways

  • ๐Ÿ”‘ The task is to create a program that encrypts text by shifting each letter by a specified key.
  • ๐Ÿ–ฅ๏ธ The program will prompt the user for a key, receive plaintext, and then produce the corresponding ciphertext.
  • ๐Ÿ”  The encryption shifts letters forward in the alphabet by the key value, wrapping around if necessary (e.g., 'Z' shifted by 2 becomes 'B').
  • ๐Ÿ”„ The program must preserve the case of each letter and leave non-alphabetic characters unchanged.
  • ๐Ÿงฎ The key is provided as a command-line argument, which must be validated to ensure it is numeric and that only one argument is given.
  • ๐Ÿ” If the user inputs invalid arguments, the program should display a usage message guiding the correct input format.
  • ๐Ÿ”ก Alphabetic characters are identified using functions like isalpha, isupper, and islower, and are then shifted based on their ASCII values.
  • ๐Ÿ” The formula to wrap around the alphabet after shifting involves using the modulo operation (mod 26) to ensure continuity.
  • ๐Ÿ“ The program calculates the alphabetical index for each character, applies the shift, and then converts it back to the correct case.
  • ๐Ÿš€ The final step involves processing the entire plaintext string character by character, applying the shift, and then printing the resulting ciphertext.

Q & A

  • What is the main task of the Caesar program described in the script?

    -The main task of the Caesar program is to encipher or encrypt text by shifting all the letters in the text by a certain key amount, creating a ciphertext.

  • How does the Caesar cipher work with the alphabet?

    -The Caesar cipher works by shifting each letter in the plaintext by a fixed number of positions (the key) in the alphabet. If the shift goes beyond 'Z', it wraps around to the beginning of the alphabet.

  • What is the significance of preserving case when using the Caesar cipher?

    -Preserving case is important because it ensures that uppercase letters remain uppercase and lowercase letters remain lowercase in the ciphertext, maintaining the original style of the plaintext.

  • How does the script handle non-alphabetic characters in the plaintext?

    -Non-alphabetic characters such as spaces, numbers, or punctuation marks are not changed during the encryption process; they remain the same in the ciphertext.

  • What is the purpose of the command line argument in the Caesar program?

    -The command line argument is used to provide the key for the Caesar cipher, which determines the number of positions each letter in the plaintext will be shifted.

  • How can the program ensure that only a single, numeric command line argument is provided?

    -The program checks the number of arguments (argc) and verifies that the argument contains only digit characters. It uses functions like 'isalpha' to check for non-numeric characters.

  • What is the role of the 'argv' array in the main function of a C program?

    -The 'argv' array holds the command line arguments provided by the user when running the program. Each element of the array represents a separate argument.

  • How does the script explain the conversion of a string to an integer in C?

    -The script suggests using a function like 'atoi' (declared in stdlib.h) to convert a string containing a number into an actual integer value.

  • What functions are mentioned in the script to check if a character is alphabetic, uppercase, or lowercase?

    -The script mentions 'isalpha', 'isupper', and 'islower' as functions to check if a character is alphabetic, uppercase, or lowercase, respectively.

  • How can the ASCII values of characters be manipulated to implement the Caesar cipher?

    -By treating characters as their corresponding ASCII values, you can perform arithmetic operations to shift the characters. Using modulo 26 ensures the wrap-around effect when shifting past 'Z'.

  • What is the formula used to convert plaintext into ciphertext in the Caesar cipher?

    -The formula to convert plaintext into ciphertext is to take the ASCII value of the character, add the key, and then take the result modulo 26, which ensures the wrap-around within the alphabet.

  • How can you determine the length of the plaintext string in C?

    -The 'strlen' function can be used to determine the length of a string in C, which is useful for iterating through all characters in the plaintext for encryption.

Outlines

00:00

๐Ÿ” Introduction to Caesar Cipher

Brian introduces the Caesar Cipher, a method of encryption that involves shifting the letters of the alphabet by a fixed number, known as the key. The program's objective is to get the key from the user, prompt for plain text, shift the letters to create cipher text, and print the result. The process is demonstrated with examples, such as shifting 'A' by 2 to get 'C', and wrapping around the alphabet for letters like 'Y' and 'Z'. The importance of preserving case and leaving non-alphabetic characters unchanged is emphasized.

05:01

๐Ÿ“ Handling Command Line Arguments

The script explains how to access command line arguments in a C program using 'argc' for the count of arguments and 'argv' for the arguments themselves. It details the process of validating that a single numeric argument is provided, suggesting usage messages for incorrect inputs. The need to convert the string argument into an integer using a function like A2I from standard lib.h is also discussed.

10:03

๐Ÿ”„ Enciphering Characters and ASCII Values

This section delves into the technical aspects of enciphering individual characters. It covers the use of functions like 'isalpha', 'isupper', and 'islower' to determine character types and the ASCII values associated with each letter. The process of shifting characters while preserving case and wrapping around the alphabet is explained using a formula that involves adding the key and taking the result modulo 26. The concept of converting between ASCII and an alphabetical index is introduced to facilitate the shift and wrap-around.

๐Ÿ”  Implementing the Caesar Cipher Program

Brian concludes the script by outlining the steps to implement the Caesar Cipher in a program. This includes getting the key, prompting for plaintext, enciphering each alphabetic character using the established formula, and handling non-alphabetic characters by leaving them unchanged. The use of string functions like 'get_string' for input and 'strlen' for determining string length is mentioned. The summary emphasizes the iterative process of applying the cipher to each character in the plaintext to produce the final ciphertext.

Mindmap

Keywords

๐Ÿ’กCaesar Cipher

The Caesar Cipher is a type of substitution cipher where each letter in the plaintext is shifted a certain number of places down or up the alphabet. It is named after Julius Caesar, who reportedly used it to communicate with his generals. In the video, the Caesar Cipher is the central theme, with the script explaining how to implement a program that performs this encryption technique.

๐Ÿ’กShift

In the context of the Caesar Cipher, a 'shift' refers to the number of positions each letter in the plaintext is moved along the alphabet to create the ciphertext. The script describes how shifting letters by a key amount results in the encrypted message, with examples such as shifting 'A' by 2 to get 'C'.

๐Ÿ’กPlaintext

Plaintext is the original message or text before encryption. The script mentions that the program will prompt the user to type in some plaintext, which will then be enciphered using the Caesar Cipher method to produce the ciphertext.

๐Ÿ’กCiphertext

Ciphertext is the result of encrypting the plaintext using a specific encryption algorithm. In the script, after the plaintext is shifted by the key, the output is referred to as the ciphertext, such as 'CDE' being the ciphertext for the plaintext 'ABC' with a shift key of 2.

๐Ÿ’กKey

The 'key' in the Caesar Cipher is the number of positions each letter is shifted in the alphabet. The script explains that the program will first get the key as a command-line argument and then use this key to shift the letters of the plaintext to create the ciphertext.

๐Ÿ’กCommand Line Arguments

Command line arguments are the parameters provided to a program when it is executed from the command line. The script details how the Caesar Cipher program uses command line arguments to receive the key for the cipher, with examples of how to access these arguments using 'argc' and 'argv' in a C program.

๐Ÿ’กASCII

ASCII is a character encoding standard used to represent text in computers, with each character assigned a unique number. The script discusses ASCII values in the context of shifting letters while preserving the wrap-around effect of the alphabet, such as converting 'Z' to 'A' after a shift.

๐Ÿ’กModulo Operation

The modulo operation finds the remainder of a division between two numbers. In the script, the modulo 26 operation is used to implement the wrap-around of the alphabet in the Caesar Cipher, ensuring that shifting 'Z' by 1 results in 'A', due to 26 % 26 being 0.

๐Ÿ’กAlphabetical Index

An alphabetical index is a mapping of letters to numbers, where 'A' is 0, 'B' is 1, and so on. The script uses the concept of an alphabetical index to explain how to convert ASCII characters to their corresponding index before applying the shift in the Caesar Cipher.

๐Ÿ’กPreservation of Case

The script mentions that the Caesar Cipher program preserves the case of the letters in the plaintext. This means that if a letter is uppercase in the original message, it remains uppercase in the ciphertext, and the same applies to lowercase letters.

๐Ÿ’กNon-Alphabetic Characters

The script specifies that non-alphabetic characters such as spaces, numbers, and punctuation marks do not change during the encryption process. They remain the same in both the plaintext and the ciphertext, ensuring that only alphabetic characters are shifted.

Highlights

Introduction to the Caesar cipher encryption technique involving a shift of alphabetic characters.

The program prompts the user for a key to determine the shift amount for the cipher.

User is asked to input plain text which will be encrypted using the Caesar cipher method.

Explanation of how to handle the wrap-around of letters beyond 'Z' or 'z' using modulo operation.

Demonstration of how non-alphabetic characters remain unchanged in the cipher text.

Preservation of character case (upper or lower) in the cipher text.

Use of command line arguments to input the key for the Caesar cipher.

Validation of command line arguments to ensure they contain only numeric characters.

Conversion of the command line argument string to an integer for the cipher key.

Prompting the user for plain text input to be encrypted.

Description of how to encipher a single character in the plain text.

Utilization of ASCII values for shifting alphabetic characters while preserving case.

Application of the modulo operation to achieve alphabetic wrap-around in the cipher.

Conversion of ASCII characters to an alphabetical index for cipher shifting.

Reversion from the alphabetical index back to ASCII after cipher shifting.

Iterating through each character of the plain text to apply the cipher shift.

Use of string functions like 'strlen' to determine the length of the plain text.

Final encryption of the entire plain text and printing of the resulting cipher text.

Summary of the Caesar cipher program's process from key input to cipher text output.

Transcripts

play00:00

BRIAN: In Caesar, your task is going to be to encipher or encrypt

play00:04

some text by shifting all of the letters in that text by a certain amount.

play00:08

What your program is ultimately going to do is it's first going to get the key--

play00:13

the amount that we should shift each character by.

play00:15

Then we're going to prompt the user to type in some plain text.

play00:19

Then we're going to encipher that plain text

play00:21

by shifting all the letters by the key.

play00:24

And finally, we're going to print out the resulting cipher text.

play00:28

What does this look like?

play00:29

Well, if the key is 2, then for any plaintext character,

play00:33

we're going to shift that character by 2 letters in the alphabet.

play00:37

So if we had a plain text capital A, that

play00:40

would become the ciphertext capital C, because we're shifting it

play00:43

by 2 letters from the alphabet.

play00:44

B would become D. C would become E. So on and so forth up,

play00:48

until X would become Z.

play00:50

And then when we get to Y, if we were to shift Y by 2 characters,

play00:54

we would go past the boundary of the alphabet.

play00:56

So what we'll instead want to do is wrap around to the beginning of the alphabet

play01:00

so that Y becomes A when we shift it by 2,

play01:04

and Z becomes B when we shift it by 2.

play01:08

What does this mean for how your program should actually work?

play01:10

Well, if we were to run ./caesar and then provide, a command line argument,

play01:15

the number 2--

play01:16

meaning we'd want to use 2 as the key, shifting everything by 2--

play01:20

and we were to type in the plain text ABC,

play01:23

then your program should print out the ciphertext CDE--

play01:27

the same as the plain text, but with every character shifted by 2 letters.

play01:32

If we were to instead run ./caesar2 and provide the plain text of "hello,"

play01:37

for example, --

play01:38

then the ciphertext should be this J-G-N-N-Q,

play01:42

which is each of the letters in "hello" shifted by 2 letters.

play01:46

Let's take a look at one last example.

play01:48

If I were to run ./caesar with a key of 2 ans input a plaintext of "This is

play01:54

CS50," then the ciphertext should look like this.

play01:57

And there are a couple of things to note here.

play01:59

One is that we're preserving case.

play02:02

If a character is an uppercase letter in the plaintext,

play02:05

then it's also an uppercase letter in the ciphertext.

play02:08

And likewise, a character that's a lowercase letter in the plaintext

play02:11

is also a lowercase letter in the ciphertext.

play02:14

And any character that's not an alphabetical character at all--

play02:18

so things like spaces or numbers or punctuation marks--

play02:22

those don't change.

play02:23

The spaces stay spaces.

play02:24

The 50 stays 50.

play02:26

The exclamation point stays an exclamation point.

play02:28

Only the alphabetic characters are changing.

play02:32

So let's go through each of the steps of the caesar program

play02:35

to figure out how you can actually write the code to do this.

play02:38

The first thing you'll want your program to do is to get the key--

play02:41

get the amount that we should shift each of the individual letters by.

play02:45

How is your program going to do that?

play02:47

Well, remember that your program is going to take the key as a command line

play02:51

argument, meaning that when the user is running your program,

play02:54

they're going to run your program as something like ./caesar,

play02:57

followed by a number representing the key of how many characters to shift

play03:02

each of the letters by.

play03:04

How do you access those command line arguments?

play03:06

Well, recall that in a C program your main function can take arguments.

play03:11

It can take an argument called argc, which

play03:13

is going to represent the argument count-- the number of command line

play03:17

arguments-- and also an array of strings,

play03:20

called argv, representing each of those command line arguments.

play03:24

So for example, if I were to run ./caesar2 and then the number 3

play03:28

as the key, argc would be 2.

play03:31

I typed in two things--

play03:33

./caesar2 and 3.

play03:35

And argv is going to be the actual array of strings

play03:38

representing those arguments.

play03:40

And recall that when you have an array, you

play03:42

can access an individual element of that array using square bracket notation.

play03:47

So argv[0] is going to be the first string in the argv array--

play03:51

which, in this case, is ./caesar2--

play03:54

and argv[1] is going to be the second thing in that argv array--

play03:58

which, in this case, is the string 3.

play04:01

So that's how we might access the key.

play04:03

Now, when we get the key, you'll want to ensure that only a single command line

play04:08

argument is provided, and that that argument contains only digit

play04:11

characters before you actually take that argument and convert it to an integer.

play04:16

What I mean by that is that if the user runs ./caesar, and then 2 and then 8,

play04:21

providing more command line arguments that's expected, your program,

play04:25

rather than doing anything else, should just print out a usage message

play04:28

reminding the user that in order to run the Caesar program,

play04:31

they should run ./caesar2, followed by a key.

play04:34

And you should do this anytime they don't use the correct number of command

play04:38

line arguments.

play04:39

But even if they have the correct number of command line arguments,

play04:42

that argument might not be entirely numeric.

play04:45

If the user runs something like ./caesar20x,

play04:49

where the command line argument is not entirely numeric,

play04:52

you should also display a usage message reminding the user that they need

play04:55

to run ./caesar, followed by a valid key.

play04:59

How do you check if the key is valid?

play05:01

Well, you'll probably want to take that command line argument

play05:04

and check each of the individual characters

play05:06

in that string to make sure the character is a digit.

play05:09

And you might want to think about what functions

play05:11

might be helpful for checking if an individual character is or is not

play05:15

a digit.

play05:16

Once you've checked to make sure that the command line arguments are correct,

play05:19

the last thing you'll need to do here is actually take the command line argument

play05:23

and convert it into an integer.

play05:25

Notice that argv[1] right now, representing the key,

play05:28

is not the number 3, but the string "three."

play05:32

Recall that in C, every variable has a type.

play05:35

Some variables are ints, and other variables are strings, for example.

play05:39

And here, argv[1] is the string "three," rather than the integer 3--

play05:44

which we'll probably want to use in case we need to do some math with that

play05:47

number, for example.

play05:48

So what we'll need to do is convert the string into an into an integer.

play05:52

To do this, you can use the A2I function,

play05:55

declared in standard lib.h, that takes a string, like the string 3,

play05:59

and converts it into the number 3, for example.

play06:02

So that function might be helpful for making sure

play06:04

that your key is in fact an integer.

play06:07

After you've gotten the key, the next step

play06:09

is to prompt the user to type in the plaintext.

play06:12

To do that, you can use a call to get string,

play06:15

asking the user to type in what plaintext they want to encrypt.

play06:19

Once you've gotten that plaintext, the next step

play06:22

is going to be to encipher that plaintext, shifting all of the letters

play06:26

by the key to get the ciphertext result.

play06:29

But before we talk about how to encipher the entire plaintext,

play06:32

let's just talk about how to encipher one particular character.

play06:37

How are we going to do that?

play06:38

Well, there are a couple of cases that we might want to consider.

play06:41

If the character is an alphabetic character, like an uppercase letter

play06:45

or a lowercase letter, for example, then we'll

play06:47

want to shift that plaintext character by the key,

play06:50

making sure to preserve the case.

play06:52

If it was originally an uppercase letter,

play06:54

it should stay in uppercase letter, for example.

play06:57

Of course, if the character is not alphabetic-- for example,

play07:00

if it's a space or a punctuation mark or a digit--

play07:03

then you can leave the character as is.

play07:05

Nothing actually needs to change there.

play07:07

But how do you know if a character is an alphabetic character

play07:10

or an uppercase character or a lowercase character?

play07:12

Well, it turns out, there are a number of functions that you

play07:15

can use that might be helpful here.

play07:17

Functions like is alpha, is upper, and is lower.

play07:21

Is alpha is a function that takes a character

play07:23

and returns a Boolean value, true or false,

play07:26

depending on whether or not that character is alphabetic.

play07:29

And likewise, is upper checks to see if the character is uppercase

play07:32

and is lower checks to see if a character is lowercase.

play07:35

If we call is alpha, is upper, and is lower on capital A,

play07:39

for example, is alpha of capital A is true.

play07:42

It is an alphabetic character.

play07:43

Is upper of capital A is also true because capital A

play07:47

is an uppercase letter.

play07:48

And is lower of capital A is false because capital A is not

play07:53

a lowercase letter.

play07:54

And so that can help you to figure out, given a particular character,

play07:57

is it an uppercase character or is it a lowercase character?

play08:01

And every character, uppercase and lowercase, has an ASCII value.

play08:06

Recall that ASCII is just a mapping from characters to numbers

play08:09

that represent them, where we've decided that capital A is represented

play08:14

by the number 65, capital B is represented by the number 66,

play08:18

so on and so forth.

play08:19

And lowercase letters have a mapping as well.

play08:21

Lowercase a is represented by the number 97, lowercase b by the number

play08:25

98, lowercase c by 99, so on and so forth.

play08:29

And you can treat any individual character

play08:31

as equivalent to the number that represents it.

play08:34

So what does that mean in code?

play08:36

Well, it means that if I have a character in a variable called

play08:39

c that I set equal to the character A--

play08:42

capital A-- and I print out that character using %c to mean I want

play08:47

to substitute a character into this string, then what gets printed out,

play08:51

as you might expect, is the character A. But this character A is really just

play08:55

the number 65 inside of my computer, and I can treat this character like

play09:00

the number 65.

play09:01

And like any number, I can do math with it, adding or subtracting

play09:05

numbers from it, for example.

play09:06

So I can take capital A and add 1 to it, for example.

play09:10

Capital A is represented by 65.

play09:12

65 plus 1 is 66.

play09:15

So if I print out the character corresponding

play09:17

to the number 66, what's ultimately going to be printed out

play09:20

is the character B because B in ASCII is represented by the number 66.

play09:26

But what happens if I try to take a character like the character capital Z

play09:30

and shift that by one?

play09:32

If I take the ASCII value corresponding to the letter Z and add 1 to it,

play09:37

I don't get the ASCII value corresponding

play09:39

to capital A. I get something that's outside of the range of all

play09:43

of the capital letters.

play09:45

What I'd really like to happen here is that once I

play09:47

try to shift past the letter Z, I want to wrap around back

play09:52

to the beginning of the alphabet, back to the letter A. But how do I do that?

play09:57

Well, to do this, we can take advantage of a formula.

play10:00

And here's the formula that we're going to use to convert

play10:02

the plaintext into the ciphertext.

play10:05

It looks a little bit complicated, but we'll break it down piece by piece.

play10:08

c sub i here represents the ith character of the ciphertext.

play10:13

How do we determine what the ith character of the ciphertext is?

play10:17

Well, we're going to start by taking the ith character of the plaintext

play10:21

and we're going to add k to it, where k is the key.

play10:25

And we're going to take that whole result and take it modulo 26.

play10:30

This %26 means we're going to take the remainder when we take this whole value

play10:35

and divide it by 26.

play10:37

Why does this work?

play10:39

Well, for sake of example, let's say that a is represented by the number 0,

play10:43

b is represented by the number 1, c is represented by 2, so on and so forth,

play10:47

up until z, represented by the number 25.

play10:51

So how might I go about taking the character z

play10:53

and shifting it by 1, using this formula?

play10:56

Well, here is the formula.

play10:58

And the plaintext character I want to shift

play11:00

is the character z, which is here represented by the number 25.

play11:04

And I want to shift it by the key k, which in this case is 1.

play11:08

So I have 25 plus 1, which is the number 26, and 26--

play11:13

mod 26-- is what's the remainder when I divide 26 by 26?

play11:18

Well, the remainder is just 0, so I'm left with 0, which

play11:22

corresponds to a in the above chart.

play11:24

The result is that I'm able to take a character,

play11:27

shift it by a certain amount.

play11:29

And by using modulo 26, I'm able to get the remainder when I divide it

play11:33

by 26, which allows me to loop back around

play11:36

effectively to the beginning of the alphabet so that z wraps around to a.

play11:41

And this mapping of a to 0, b to 1, c to 2, et cetera,

play11:45

you can call an alphabetical index.

play11:48

Where I might say that a's alphabetical index is 0,

play11:51

b's alphabetical index is 1, c's alphabetical index is 2, and so forth.

play11:57

But of course, in our computer, we don't use the alphabetical index

play12:00

to represent each character.

play12:01

We use ASCII, where A is 65 and B is 66 and C is 67,

play12:06

and so we can't directly apply this formula.

play12:09

So what do we need to do in order to take an ASCII character

play12:13

and shift it by a certain amount, while still preserving

play12:15

this wrap-around effect of going from the end of the alphabet back

play12:19

to the beginning?

play12:20

Well, here's what we might try to do.

play12:22

We might first convert an ASCII character to its alphabetical index

play12:27

so that capital A becomes the number 0, for example,

play12:30

and capital B becomes the number 1.

play12:32

And you might want to think about what math you could do to make that work.

play12:36

Once you have an alphabetical index, you can shift that alphabetical index

play12:40

using the formula, adding the key and taking the result mod

play12:43

26 to get the new character's alphabetical index.

play12:47

And finally, you can take the resulting alphabetical index

play12:51

and convert that back into ASCII, so that 0

play12:54

becomes A, 1 becomes B, so on and so forth,

play12:57

being sure to preserve the case--

play12:59

uppercase letters should remain uppercase,

play13:01

lowercase letters should remain lowercase.

play13:05

And that will allow you to encipher one alphabetic character.

play13:08

If it's alphabetic, you'll want to shift it to its alphabetic index,

play13:11

then shift it using the formula, then convert the result back to ASCII.

play13:15

But how, then, instead of enciphering just a single character,

play13:19

do you encipher all of the characters that are in the plaintext?

play13:22

Well, recall that the plaintext is really just a string.

play13:25

And when you have a string, you can access individual characters by using

play13:30

array-like notation, where I can say text[0] will get me the first character

play13:36

of the string text, which in this case is capital T. Likewise,

play13:39

text[1] is lowercase h, text[2] is lowercase i, so on and so forth.

play13:46

And I can use this notation to access any

play13:49

of the characters inside of the string.

play13:52

How do I know how many characters are in the string?

play13:55

Well, there's another function called strlen, short for string length,

play13:58

that will take a string and tell you how many characters are in it.

play14:02

In this case, 12.

play14:04

So by combining this, knowing the length of the string,

play14:06

knowing how to access an individual character in the string,

play14:10

and knowing how to encipher an individual character in the string,

play14:13

you should be able to implement this Caesar program by first figuring out

play14:17

what the key is, then getting the plaintext,

play14:20

then enciphering that plaintext by shifting all the alphabetic characters,

play14:24

and printing the result. My name is Brian and this was Caesar.

Rate This
โ˜…
โ˜…
โ˜…
โ˜…
โ˜…

5.0 / 5 (0 votes)

Related Tags
Caesar CipherEncryption GuideAlphabet ShiftASCII ValuesModulo OperationCommand LineProgramming TutorialCharacter ShiftCase PreservationEducational Content