Understanding ASCII and Unicode (GCSE)

The Tech Train
6 Dec 2017 · 05:59

Summary

TL;DR: This tutorial explains ASCII and Unicode in the context of the GCSE computer science course. ASCII, a 7-bit encoding system, assigns each character on the keyboard a unique number, allowing for 128 different codes (0 to 127). Unicode extends this, supporting a vast array of characters and symbols, including emojis, by using more bits (8, 16, or 32). The video demonstrates the difference by showing that a single 'a' saved in Notepad takes one byte, while a smiley face saved as Unicode takes four bytes, and it cites 32 bits as providing over two billion possible characters.

Takeaways

  • 💻 ASCII and Unicode are character encoding systems used in computing.
  • 🔢 ASCII represents characters as numbers, with the letter 'A' being represented by the number 65.
  • 🔠 ASCII uses 7 bits to represent characters, allowing for 128 different codes (0 to 127), including letters, digits, and symbols.
  • ⌨️ Every key on a keyboard has a corresponding number, which can be converted to binary.
  • ⏭ ASCII's 128-code limit is insufficient for many languages and symbols.
  • 🌍 Unicode extends ASCII, supporting more characters through encodings built on 8-, 16-, or 32-bit units (UTF-8, UTF-16, and UTF-32).
  • 😀 Unicode allows for a broader range of characters, including alphabets, symbols, and emojis.
  • 📝 A single 'A' character in ASCII is stored as one byte, while more complex Unicode characters require more space.
  • 🔠 Extended ASCII uses 8 bits to double the number of characters, but it's still limited compared to Unicode.
  • 🚀 With 32 bits, the video notes, over 2 billion values become available, vastly expanding the room for symbols, languages, and emojis.

Q & A

  • What is the purpose of the ASCII and Unicode tutorial in the video?

    -The tutorial aims to explain the concepts of ASCII and Unicode in the context of the GCSE computer science course, focusing on how characters and symbols are represented in binary form.

  • How are decimal numbers converted into binary code?

    -Decimal numbers are converted into binary by taking the eight place-value columns of the binary table (128, 64, 32, 16, 8, 4, 2, 1), identifying which of those values sum to the target number, placing ones under them, and filling the remaining columns with zeros.

  • What is the ASCII table and how does it relate to binary numbers?

    -The ASCII table is a standard that assigns a unique decimal number to every character on the keyboard. These decimal numbers can be converted into binary numbers, with each character having a corresponding 7-bit binary representation.

  • Why is 7 bits used for ASCII encoding?

    -ASCII uses 7 bits to represent characters because it allows for 128 different possible values (2^7), which is sufficient to represent all uppercase and lowercase letters, digits, and a range of symbols.

  • What is the significance of the number 65 in ASCII encoding?

    -The number 65 in ASCII encoding represents the capital letter 'A'. It is a standard example used to demonstrate how letters are assigned binary values in the ASCII system.

  • How does the ASCII system handle commands like backspace and delete?

    -Commands such as backspace, escape, tab, enter, and delete are also represented by binary numbers in the ASCII system, with the delete key corresponding to the maximum value of 127.

  • What is the limitation of the ASCII system when it comes to representing characters?

    -ASCII can only represent 128 different codes (0 to 127), which is insufficient for the wide variety of characters and symbols used in different languages and modern communication, including emojis.

  • How does Unicode differ from ASCII in terms of character representation?

    -Unicode is identical to ASCII for the first 128 characters (codes 0 to 127) but can use more bits (8, 16, or 32) to represent a much wider range of characters, including all alphabets, symbols, and emojis.

  • What is the maximum number of characters that Unicode can represent?

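    -The video gives a figure of 2,147,483,647 (2^31 - 1) possible characters when using 32 bits. Strictly, 32 bits allow 2^32 = 4,294,967,296 distinct values, and the Unicode standard itself defines a code space of 1,114,112 code points (U+0000 to U+10FFFF), which is still ample room for symbols and emojis.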

  • How can you demonstrate the difference between ASCII and Unicode in a text file?

    -You can demonstrate the difference by typing the letter 'A' in Notepad and saving the file, which will be one byte in size. If you type a Unicode character such as a smiley face using an Alt code and save the file as Unicode, the size will be four bytes, which the video attributes to Unicode using 32 bits per character.
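
The place-value method described in the Q & A above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the video; the names `PLACE_VALUES` and `to_binary_8bit` are made up for this example.

```python
# Place-value columns for an 8-bit binary number, largest first.
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]


def to_binary_8bit(total: int) -> str:
    """Convert a number from 0 to 255 into an 8-digit binary string."""
    if not 0 <= total <= 255:
        raise ValueError("total must fit in eight bits (0 to 255)")
    bits = []
    remaining = total
    for value in PLACE_VALUES:
        if value <= remaining:
            bits.append("1")      # this column is needed to reach the total
            remaining -= value
        else:
            bits.append("0")      # this column is not needed
    return "".join(bits)


print(to_binary_8bit(182))  # 10110110 (128 + 32 + 16 + 4 + 2)
```

Working left to right and subtracting each column that fits mirrors what is done by hand with the binary table.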

Outlines

00:00

🔤 Introduction to ASCII and Unicode

This paragraph introduces the concepts of ASCII and Unicode in the context of the GCSE computer science course. ASCII is a standard that assigns a decimal number to every key on the keyboard, allowing characters to be represented in binary form. The ASCII table uses seven bits per character, which gives 128 different codes (0 to 127) corresponding to letters, digits, symbols, and commands. The paragraph explains how the character 'A' is represented by the decimal number 65 and its binary equivalent. It also highlights the limitation of ASCII, which can only represent 128 characters, and introduces Unicode as a way to represent a broader range of characters, including those from different languages and emojis. Unicode extends beyond ASCII by using more bits, thus supporting a vast array of characters.
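
As a rough illustration of the character-to-number-to-binary chain described here, the sketch below uses Python's built-in `ord` and `format`. The helper name `ascii_bits` is hypothetical and chosen only for this example.

```python
def ascii_bits(char: str) -> str:
    """Return the 7-bit binary string for a single ASCII character."""
    code = ord(char)                # character -> decimal code, e.g. 'A' -> 65
    if code > 127:
        raise ValueError(f"{char!r} is outside the 7-bit ASCII range")
    return format(code, "07b")      # decimal -> binary, padded to 7 digits


for ch in "Aa0":
    print(ch, ord(ch), ascii_bits(ch))
# A 65 1000001
# a 97 1100001
# 0 48 0110000
```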

05:01

📝 Demonstrating ASCII and Unicode File Sizes

The second paragraph demonstrates the practical difference between ASCII and Unicode through a file size comparison. Viewers are told to open Notepad, type the character 'a', and save the file, which comes to one byte. When the same is done with a smiley face typed via Alt+1 and saved as Unicode, the file size increases to four bytes. The video attributes this to Unicode using 32 bits to store each character, which it says allows for over two billion possible characters. The video closes by inviting viewers to ask questions and to like the video.
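
The file size experiment can be approximated without Notepad. The sketch below rests on two assumptions that are not stated in the video: that Alt+1 produces the character U+263A (white smiling face), and that Notepad's "Unicode" save option writes UTF-16 with a 2-byte byte order mark (BOM). Under those assumptions, the four bytes would be a 2-byte BOM plus one 2-byte character rather than a single 32-bit character. The helper `saved_size` is a made-up name.

```python
import os
import tempfile


def saved_size(text: str, encoding: str) -> int:
    """Write text to a temporary file and return the file size in bytes."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.txt")
        with open(path, "w", encoding=encoding, newline="") as handle:
            handle.write(text)
        return os.path.getsize(path)


print(saved_size("a", "ascii"))        # 1: one byte for one ASCII character
print(saved_size("\u263a", "utf-16"))  # 4: 2-byte BOM + 2-byte character
print(saved_size("\u263a", "utf-32"))  # 8: 4-byte BOM + 4-byte character
```

Python's `utf-16` and `utf-32` codecs both prepend a BOM when writing, which is why the sizes are larger than the character alone.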

Keywords

💡ASCII

ASCII, or the American Standard Code for Information Interchange, is a character encoding standard that represents letters, digits, and symbols using seven bits. The video explains that ASCII associates each key on the keyboard with a specific number; for example, the character 'A' is represented by the number 65. It is useful for representing basic text characters but is limited to 128 codes (0 to 127).

💡Unicode

Unicode is a more advanced character encoding system that extends the capabilities of ASCII by using 8, 16, or 32 bits. This allows it to represent a much wider range of characters, including symbols, emojis, and letters from different alphabets. In the video, Unicode is described as essential for representing characters from languages not covered by ASCII, as well as supporting over two billion different possible characters.
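
To make the "8, 16, or 32 bits" point concrete, the following sketch compares how many bytes one character occupies in the three common Unicode encodings, UTF-8, UTF-16, and UTF-32. The sample characters and the function name `encoded_sizes` are chosen for this example and do not come from the video.

```python
def encoded_sizes(char: str) -> dict[str, int]:
    """Bytes needed for one character in three Unicode encodings (no BOM)."""
    return {
        "utf-8": len(char.encode("utf-8")),
        "utf-16": len(char.encode("utf-16-le")),
        "utf-32": len(char.encode("utf-32-le")),
    }


samples = [
    "A",            # U+0041, plain ASCII letter
    "\u00e9",       # U+00E9, e with acute accent
    "\u263a",       # U+263A, white smiling face
    "\U0001F600",   # U+1F600, grinning face emoji
]
for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
# U+0041 {'utf-8': 1, 'utf-16': 2, 'utf-32': 4}
# U+00E9 {'utf-8': 2, 'utf-16': 2, 'utf-32': 4}
# U+263A {'utf-8': 3, 'utf-16': 2, 'utf-32': 4}
# U+1F600 {'utf-8': 4, 'utf-16': 4, 'utf-32': 4}
```

UTF-8 and UTF-16 are variable-width, so the storage cost depends on the character; only UTF-32 always uses 32 bits.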

💡Binary Code

Binary code is the system of representing numbers or characters using only two digits: 0 and 1. The video explains how binary is used to represent both numbers and letters, such as how the letter 'A' is represented in binary as 1000001, corresponding to the decimal number 65. It's central to how computers encode information.
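
Going the other way, from binary back to a letter, might look like the sketch below. The function name `decode_bits` and the sample message are illustrative only.

```python
def decode_bits(bits: str) -> str:
    """Turn a binary string such as '1000001' back into its character."""
    code = int(bits, 2)     # binary -> decimal, e.g. 64 + 1 = 65
    return chr(code)        # decimal -> character


print(decode_bits("1000001"))                         # A
message = ["1001000", "1101001"]                      # 72 and 105
print("".join(decode_bits(bits) for bits in message)) # Hi
```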

💡Seven Bits

Seven bits refers to the number of binary digits used in the ASCII encoding system to represent characters. Seven bits allow 128 possible combinations (0 to 127; the video quotes 127, which is the largest code rather than the count), each corresponding to a letter, number, symbol, or command. This is a key limitation of ASCII, as it restricts the number of characters it can represent.
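
The count of combinations follows from doubling once per bit. A small sketch, with the helper name `value_count` invented for this example, shows why the largest code is 127 while the number of codes is 128:

```python
def value_count(bits: int) -> int:
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits


codes = range(value_count(7))           # every 7-bit code
print(len(codes), codes[0], codes[-1])  # 128 0 127
```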

💡Extended ASCII

Extended ASCII is a version of ASCII that uses an eighth bit, effectively doubling the number of possible characters to 256. The video briefly discusses this system as a way to include more symbols and characters, but it still falls short of the extensive range provided by Unicode, especially when dealing with global languages and modern symbols like emojis.
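
"Extended ASCII" is not one single standard; several 8-bit character sets reuse the extra 128 codes differently. The sketch below assumes Latin-1 (ISO 8859-1) as one representative example, which is a choice made here rather than something named in the video. The helper `fits` is hypothetical.

```python
def fits(char: str, encoding: str) -> bool:
    """Report whether a character can be stored in the given encoding."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(fits("\u00e9", "ascii"))      # False: e-acute is beyond code 127
print(fits("\u00e9", "latin-1"))    # True: it has an 8-bit code
print("\u00e9".encode("latin-1"))   # b'\xe9', a single byte with value 233
print(fits("\u263a", "latin-1"))    # False: the smiley needs Unicode
```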

💡Decimal Numbers

Decimal numbers, or base-10 numbers, are the standard numerical system used in everyday life, consisting of digits from 0 to 9. In the video, decimal numbers are used as a reference point when converting characters into binary, such as when the decimal number 65 represents the letter 'A' in ASCII.

💡Character Encoding

Character encoding is the process of assigning numeric values to letters, symbols, and control characters so they can be represented in a computer system. The video focuses on two encoding systems—ASCII and Unicode—explaining how these systems allow computers to understand and process text from different languages and symbols.

💡Emojis

Emojis are graphical symbols that represent facial expressions, objects, or concepts, widely used in digital communication. In the video, emojis are used as an example of characters that cannot be represented by ASCII, but are supported by Unicode, illustrating Unicode's ability to handle a large and growing set of symbols beyond text.
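
A quick look at one emoji's number shows why seven or even sixteen bits are not enough for it. The grinning face (U+1F600) is picked here purely as an example; the video does not name a specific emoji code point.

```python
emoji = "\U0001F600"              # grinning face

code_point = ord(emoji)
print(code_point)                 # 128512
print(hex(code_point))            # 0x1f600
print(code_point.bit_length())    # 17: too large for 7, 8, or 16 bits
print(emoji.isascii())            # False
print(emoji.encode("utf-8"))      # b'\xf0\x9f\x98\x80', four bytes
```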

💡Keyboard Commands

Keyboard commands refer to non-character keys on the keyboard, such as Enter, Tab, Backspace, and Delete. The video explains that these commands also have binary representations in ASCII, with the Delete key, for instance, being represented by the number 127, the maximum value ASCII can encode.
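
The control codes mentioned here can be listed with their decimal and 7-bit binary values. The mapping of the Enter key to line feed or carriage return varies by system, so both are shown; the dictionary name `CONTROL_CODES` is made up for this sketch.

```python
# ASCII control characters behind some common keys.
CONTROL_CODES = {
    "backspace": "\b",                  # 8
    "tab": "\t",                        # 9
    "enter (line feed)": "\n",          # 10
    "enter (carriage return)": "\r",    # 13
    "escape": "\x1b",                   # 27
    "delete": "\x7f",                   # 127, the highest ASCII code
}

for name, char in CONTROL_CODES.items():
    print(f"{name:24} {ord(char):3} {ord(char):07b}")
```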

💡Bits

A bit is the basic unit of information in computing, represented as either 0 or 1. The video discusses how different numbers of bits are used to represent characters in various encoding systems, such as seven bits in ASCII and up to 32 bits in Unicode, which allows for a significantly larger set of characters.
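
Turning the question around, one can ask how many bits a character set of a given size needs. The sketch below is a small illustration with a hypothetical helper, `bits_needed`; the figure of 1,114,112 is the size of the Unicode code space (U+0000 to U+10FFFF), which is not mentioned in the video.

```python
def bits_needed(symbol_count: int) -> int:
    """Smallest number of bits that can give every symbol its own code."""
    if symbol_count < 1:
        raise ValueError("symbol_count must be at least 1")
    # Codes run from 0 to symbol_count - 1, so size the largest code.
    return max(1, (symbol_count - 1).bit_length())


print(bits_needed(128))        # 7  (ASCII)
print(bits_needed(256))        # 8  (extended ASCII)
print(bits_needed(1_114_112))  # 21 (the full Unicode code space)
```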

Highlights

Introduction to ASCII and Unicode for the GCSE computer science course.

Explanation of converting decimal numbers to binary by using place value columns.

Every key on the keyboard has a number associated with it in the ASCII table.

Capital 'A' is represented by the decimal number 65 in ASCII.

ASCII uses 7 bits to represent characters, allowing for 128 different values (0 to 127).

Binary representation of the letter 'A' using 7 bits.

ASCII can represent all uppercase, lowercase letters, digits, and symbols.

Special keys like backspace, escape, and delete also have binary representations.

Limitation of ASCII: it can only represent 128 characters, insufficient for many languages and symbols.

Unicode is introduced as an extended system that includes ASCII and supports more characters.

Unicode can use 8, 16, or 32 bits; the video cites over 2 billion possible characters for 32 bits, including emojis.

Extended ASCII can use 8 bits, but it still falls short in representing global languages and symbols.

Unicode is identical to ASCII for the first 128 characters (codes 0 to 127), but can represent a much wider range.

Demonstration of typing and saving a character in Notepad shows ASCII file size is 1 byte.

Demonstration of saving a Unicode character shows file size is 4 bytes, reflecting the larger bit requirement.

Transcripts

00:03

[Applause]

00:09

In this tutorial I'm gonna cover the terms ASCII and Unicode for the GCSE computer science course. We already know how to convert ordinary decimal numbers into binary code: we simply take the eight place value columns from the binary table, identify which of those numbers we need in order to add up to the total, in this case 182, and then put ones under each of those numbers, filling the remaining columns with zeros. But how can we do the same thing for letters? We understand numbers can become binary numbers, but how do we turn letters into binary numbers?

00:49

Every single key on your keyboard has a number associated with it. Every single character that you can type can be translated into a decimal number. For example, the character capital A on your keyboard becomes the number 65. This is a standard and is globally recognized; it's part of the ASCII table. Every letter, every symbol and every number on your keyboard has a corresponding number associated with it, so A is always 65.

01:31

ASCII uses seven bits. In other words, the binary numbers representing these letters have seven bits to them, so we use these seven columns to create a possible 127 different numbers, each of which has a number, letter or symbol associated with it. So if we take seven bits like this, we can create a binary number using 64 and 1 to create the number 65, which represents A. So this is the binary representation of the letter A on your keyboard.

02:06

With 127 different possible numbers we can represent all of the uppercase letters, all the lowercase letters, all of the digits 0 to 9, as well as a wide range of symbols.

02:24

Additionally, there are also commands on your keyboard that are also represented by binary numbers: the backspace button, escape, tab, the Enter key and also the delete button, which actually is the number 127, the maximum that ASCII can represent.

02:48

However, there is a problem, because whilst these 127 different symbols and characters seem fine, they fall far short of the number of characters that we often need to represent. These languages, for example, are not represented in standard ASCII. So how do we represent all these different characters and symbols, as well as the wide range of emojis, or whatever you'd like to call them?

03:15

The answer is that we use Unicode. Now, Unicode is exactly the same thing as ASCII for all of the characters nought to 127, but Unicode can use more bits than ASCII, and so it can represent a wider range of different symbols. So it can include all of the alphabets, a much, much wider range of symbols, as well as a wide and growing range of emojis.

03:54

So just to recap then: ASCII is an encoding system for letters on your keyboard that uses seven bits. We can use an eighth bit occasionally, which we call extended ASCII, which doubles the number of possible characters, although that's still far short of what we need for the full set of languages and symbols that we use. Unicode is the same thing as ASCII for the first 127 symbols or characters, and Unicode can support far more characters by using 8, 16 or even 32 bits, which would allow the support for 2,147,483,647 different possible characters. That's a lot of emojis.

04:45

[Music]

04:49

Now we can try this ourselves. If you open up Notepad on your computer and type the single character A, which we remember is 65 in binary, and then save the file, we'll see that the file size is one byte, which is a full set of eight ones and zeros. Now if you do the same thing, but this time hold down Alt and press 1, you'll get a single character that looks like a little smiley face in Notepad. Save this, and it'll need to be saved as Unicode, which will result in the file size being four bytes. That's four times as large, even though it only contains a single character, and that's because Unicode is using 32 bits to store every single character, providing the support for 2 billion possible different characters.

05:41

So that's the difference between ASCII and Unicode. If you have any further questions please do leave them below, and if you like this video please give it a thumbs up. Thank you for watching.
