COS 333: Chapter 6, Part 1
Summary
TLDR: This lecture delves into data types, focusing on their importance in programming. It begins with an overview of data types, moving on to discuss primitive data types like integers, floating points, and characters. The lecture then covers character strings and user-defined ordinal types, including enumerations, and concludes with an exploration of array data types. It touches on the design issues related to data types, such as the operations defined for a type and the syntactic mechanisms for specifying it, providing a foundation for understanding data structures in programming.
Takeaways
- The lecture delves into various data types, starting with an introduction and covering primitive data types, character strings, user-defined ordinal data types, and array data types.
- Data types define a range of values, memory storage methods, and a set of predefined operations for data objects, emphasizing the importance of type in programming languages.
- A data type's memory representation can vary, such as using two's complement for integers or IEEE standards for floating points, highlighting the binary encoding of data objects.
- Character strings are sequences of characters and can be implemented as primitive types or as arrays of characters, with operations like concatenation and substring referencing.
- User-defined ordinal types, such as enumerations, map a set of named constants to integer values, providing readability and type safety in programming.
- Arrays are aggregate data types that can be static, fixed stack dynamic, stack dynamic, fixed heap dynamic, or heap dynamic, each with different allocation and binding properties.
- The dynamic nature of data types, especially arrays, allows for flexibility in programming, but also introduces complexity in memory management and type safety.
- The advantages of certain data types, like boolean for readability and integer for direct hardware mapping, are discussed, along with their impact on programming language design.
- The disadvantages, such as memory inefficiency in decimal types or limited range in fixed-length arrays, are also considered in the context of programming language evaluation.
- The lecture examines different programming languages' support for data types, such as Java's support for primitive data types and Python's support for complex numbers.
- The importance of design issues in data types, like operation definitions and syntactic mechanisms for specifying data types, is emphasized for programming language development.
Q & A
What are the three important aspects defined by a data type in relation to data objects?
- A data type defines a collection of data objects with a range of values, how the data objects are stored in memory, and a set of predefined operations on these data objects.
How does a data type's range of values differ between integers and floating-point values?
- For integers, the range of values is defined by a minimum and maximum whole number that can be represented without a fractional portion. For floating-point values, the range is defined by the minimum and maximum values allowable for the type, along with the precision associated with the type.
What is the significance of the IEEE standard in floating-point representation?
- The IEEE standard floating-point representation is significant as it provides a consistent and widely used method for encoding floating-point values on a binary level, ensuring compatibility and accuracy across different systems and platforms.
Why might a programming language provide support for unsigned integer types?
- Unsigned integer types are provided to allow for a greater range of positive integer values, effectively doubling the range of numbers that can be represented with the same amount of memory, compared to signed integers.
What is the primary advantage of using a decimal type over a floating-point type for representing monetary values?
- The primary advantage of using a decimal type is accuracy, as it stores data using a fixed number of decimal digits, providing an exact representation without the approximations inherent in floating-point types.
What is the main drawback of using a character encoding scheme like ASCII?
- The main drawback of using ASCII is that it only represents a limited set of characters, primarily English letters and some symbols, and does not support the wide range of characters needed for other languages.
Why is the support for character strings considered important in programming languages?
- Support for character strings is important because it enhances readability and writability, allowing programmers to work with sequences of characters in a more intuitive and efficient manner, which is especially useful in applications that involve text manipulation.
What are the two main design issues to consider when implementing support for character strings in a programming language?
- The two main design issues are whether character strings should be a primitive type or a special kind of array, and whether the length of the string should be static (fixed) or dynamic (able to grow and shrink during program execution).
What is an enumeration type and how does it differ from a primitive ordinal type?
- An enumeration type is a user-defined data type with a set of named constants as its possible values. It differs from a primitive ordinal type, which is built into the programming language and has values that can be mapped to a set of positive integers.
Why are heap dynamic arrays considered more flexible than other types of arrays?
- Heap dynamic arrays are considered more flexible because they allow the array to grow and shrink during program execution, with the memory allocated from the heap rather than the stack, which means the array's size can change as needed without being constrained by the scope of the variable.
Outlines
Data Types and Their Fundamentals
This paragraph introduces the concept of data types, delving into the specifics of what they define and how they are represented in memory. It explains the three key aspects of data types: the range of values they encompass, their bit-based memory representation, and the predefined operations applicable to them. The paragraph also touches on the importance of data types in programming and the distinction between data objects and objects in object-oriented programming.
Delving into Primitive Data Types
The focus shifts to primitive data types, which are foundational to most programming languages and often map directly to hardware-level representations. The paragraph discusses various primitive data types, including integer types with signed and unsigned variants, floating-point types that approximate real numbers, and the IEEE floating-point standard 754. It also covers the representation of complex numbers and the unique aspects of decimal and boolean data types, emphasizing their storage and operation efficiencies or inefficiencies.
Character Data Types and Encoding Schemes
This paragraph explores character data types, the necessity of numeric coding schemes for character representation, and the evolution of encoding standards from ASCII to Unicode. It discusses the limitations of ASCII and how Unicode, with its various standards like UCS-2, UCS-4, and UTF, addresses the need for representing a broader range of characters from different languages. The paragraph also examines character strings, their potential as primitive or array-based data types, and the implications of static versus dynamic string lengths.
Operations and Design Considerations for Character Strings
The paragraph delves into the operations that can be performed on character strings, such as assignment, copying, comparison, concatenation, and pattern matching, often facilitated by regular expressions. It also discusses design considerations for character string types in programming languages, including whether strings should be primitive, the dynamics of string length, and the decision-making process for supported operations.
User-Defined Ordinal Types and Enumerations
The paragraph introduces user-defined ordinal types, contrasting them with built-in primitive ordinal types like int, char, and boolean in Java. It provides an in-depth look at enumeration types, explaining how they are defined, their use of named constants, and the design issues surrounding their implementation in programming languages, such as coercion to and from integer values and the allowance of enumeration constants across multiple types.
Advantages and Evaluation of Enumeration Types
This paragraph evaluates the benefits of enumeration types, such as increased readability and writability, and their role in enhancing the reliability of programming languages by preventing invalid operations and values. It also discusses the varying levels of support for enumeration types across different programming languages, highlighting those that prevent coercion to integers, thus avoiding potential errors.
An In-Depth Look at Array Data Types
The paragraph provides a comprehensive overview of array data types, discussing the design considerations for their implementation in programming languages. It covers the legality of different types for subscripts, the importance of range checking for subscript expressions, and the timing of subscript range and storage binding. The paragraph also touches on the maximum number of subscripts allowed, the initialization of arrays, and support for array slices.
Array Subscripting and Storage Binding Categories
This paragraph categorizes arrays based on their subscript binding and storage binding characteristics, explaining the differences between static arrays, fixed stack dynamic arrays, stack dynamic arrays, fixed heap dynamic arrays, and heap dynamic arrays. It discusses the advantages and disadvantages of each category, such as efficiency, space utilization, flexibility, and the potential for dynamic resizing.
Modern Programming Languages and Array Support
The paragraph examines how modern programming languages, including C, C++, Java, C#, and scripting languages like Perl, JavaScript, Python, and Ruby, support different array categories. It highlights the specific features and limitations of each language in terms of array management, such as the use of 'new' and 'delete' in C++, garbage collection in Java, and the ArrayList class in C#.
Array Initialization Techniques
The final paragraph discusses array initialization, showcasing how different programming languages, such as C, C++, Java, C#, and Ada, provide structures to initialize arrays at the time of allocation. It illustrates various initialization techniques, including the use of braces to define initial values in C-like languages, Ada's elaborate support for array initialization with fine-grained control, and Python's list comprehensions.
Keywords
Data Type
Variable
Primitive Data Types
Character Encoding
Array Data Types
IEEE Floating Point Standard
Concatenation
Descriptor
User-Defined Ordinal Data Types
Static and Dynamic Length Strings
Pattern Matching
Highlights
Introduction to data types and their importance in defining a collection of data objects, storage in memory, and operations on these objects.
Explanation of how data types specify a range of values, such as the minimum and maximum for integers, and precision for floating points.
Discussion on the bit-based representation of data types, including encoding schemes like ASCII and IEEE floating-point standards.
Clarification of the difference between data objects and objects in object-oriented programming, emphasizing the programmer-defined abstract data types.
Introduction to primitive data types, which are not defined in terms of other data types and often reflect hardware-level representations.
Analysis of integer data types, their various forms in different programming languages, and the concept of signed and unsigned integers.
Overview of floating point types, their use of scientific notation in binary, and the IEEE floating point standard 754.
Complex type explanation, its limited support in programming languages, and its unique representation involving two floating point values.
Decimal type discussion, its use in business applications for monetary values, and the difference from floating point types.
Boolean data type exploration, its simplicity with only two values, and its implementation using bytes instead of bits.
Character data type examination, the necessity of numeric coding schemes like ASCII and Unicode for character representation.
Character strings as sequences of characters, the design issues of whether they should be primitive or arrays, and their operations.
Enumeration types, their definition by programmers, and the use of named constants to represent a series of values.
User-defined ordinal types, the concept of mapping values to positive integers, and the use of enumerations in programming languages.
Array data types, their role as collections of homogeneous data elements, and the various design issues related to their implementation.
Array initialization techniques, the different ways programming languages allow for setting initial values of arrays.
Conclusion of the lecture with a summary of the discussed data types and a preview of upcoming topics in the next lecture.
Transcripts
in the previous chapter we discussed
names bindings and scopes
part of this discussion dealt with
variables where we looked at a number of
attributes that are bound to each
variable and one of these attributes is
the type of the variable
in this chapter we will be looking at
these data types in more detail
these are the topics that we'll be
discussing in today's lecture we'll
begin with a quick introduction into
data types after which we'll move on to
primitive data types we'll then take a
look at character string types
user-defined ordinal data types and then
finally array data types
now we've already discussed the concept
of a data type in the previous chapter
to reiterate what we spoke about there
we said that a data type defines three
important things in relation to data
objects
so firstly a data type defines a
collection of data objects with a range
of values
so for example if we are talking about
an integer data type then we specify
that whole values can be
represented without a fractional portion
and we also have then a particular
minimum and maximum value that defines
the range of values within which those
integer values then can fall
in a similar fashion if we have floating
point values we also have a minimum and
maximum allowable value for that
floating point data type and we also
have a precision associated with the
type
then secondly a data type also defines
how the data objects are stored in
memory so what we are talking about here
is a bit based representation
in other words
what kind of representational scheme are
we using to encode these values on a
binary level
so for example we may have unsigned
integer values or we might for example
use a two's complement representation or
a sign and magnitude representation if
we are representing floating point
values we would be using the ieee
standard floating point representation
if we are representing characters we may
be using an ascii encoding or a unicode
encoding all of these are mechanisms for
representing these data objects within
memory on a binary level
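As a rough illustration of what a bit-based representation means for integers, the sketch below prints the bit patterns of a small value under two of the schemes just mentioned. The helper names `twos_complement` and `sign_magnitude` and the 8-bit default width are assumptions made for this example and do not come from the lecture.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern of value in the given width."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the requested width")
    # Masking with 2**bits - 1 maps a negative value onto its unsigned pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def sign_magnitude(value: int, bits: int = 8) -> str:
    """Return the sign-and-magnitude bit pattern of value in the given width."""
    if abs(value) >= (1 << (bits - 1)):
        raise ValueError("value does not fit in the requested width")
    sign = "1" if value < 0 else "0"
    # The remaining bits hold the absolute value.
    return sign + format(abs(value), f"0{bits - 1}b")


print(twos_complement(5))    # 00000101
print(twos_complement(-5))   # 11111011
print(sign_magnitude(-5))    # 10000101
```

The same value, -5, ends up as two different bit patterns, which is why the representation scheme has to be part of the type.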
and then in the third place a data type
specifies a set of predefined operations
on these data objects so for example
integer values can be added to one
another or subtracted from one another
they can also be multiplied or divided
however for example a concatenation
operation is not defined for an integer
value
on the other hand if we're talking about
strings concatenation or substring
operations may be defined but for
example it isn't a defined operation to
allow two strings to be multiplied by
one another for instance
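A small sketch of how a type determines which operations are defined, using Python as the example language. The helper `try_multiply` is a name invented for this illustration. Note that Python does define multiplying a string by an integer (repetition), but not multiplying one string by another.

```python
# Arithmetic is defined for integers.
a, b = 6, 7
print(a + b, a - b, a * b, a // b)   # 13 -1 42 0

# Concatenation and substring operations are defined for strings.
s, t = "data", "type"
print(s + t)     # datatype
print(s[1:3])    # at


def try_multiply(x, y):
    """Attempt x * y and report a type error instead of raising it."""
    try:
        return x * y
    except TypeError as err:
        return f"TypeError: {err}"


# Multiplying two strings by one another is not a defined operation.
print(try_multiply(s, t))
```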
now in this context it's important to
understand that when we're talking about
a data object we're not talking about an
object in the sense of object-oriented
programming
instead we're talking about objects
which represent instances of a programmer
defined abstract data type so in other
words what we are specifying then here
is a representation of a data type but
which is very importantly specified by a
programmer
now in addition to this we also have the
concept of a descriptor and this is
simply the collection of the attributes
associated with a variable as i
previously mentioned we discussed all of
these attributes in the previous chapter
and i won't go through them in any
further detail at this point
so there are two main design issues then
that arise for all data types
firstly which operations are defined for
this particular data type that we are
currently considering
and secondly how are data types
specified what kind of syntactic
mechanism do we use to specify
a data type
so now we'll move on to primitive data
types
now almost all programming languages
provide a set of what we refer to as
primitive data types and the primitive
data type is simply a data type that is
not defined in terms of other data types
now a large number of these primitive
data types are simply reflections of
what is actually happening on a hardware
level as we'll see in a moment
but some other primitive data types are
not direct representations of how these
values are represented on a hardware
level however they require only a little
non-hardware support for their actual
implementation
the first primitive data type that we'll
look at is the integer data type which
you should be very very familiar with by
now
so integer data types are almost always an
exact reflection of what's happening on
a hardware level and therefore the
mapping between the type in the
high-level programming language and what
is actually being represented is fairly
trivial
now there may be as many as eight
different integer types so for example
if we look at java then java has only
signed representations of its integer
types
and there are four integer types namely
byte which is the smallest
then short
int and long which is a long integer
representation which is the largest on a
bit level
so this of course then means that the
range of values that we can represent in
a byte is much smaller than the range of
values that we can represent in a short
int or long
now some languages also then have
unsigned integer values so for example
if we look at c and c plus plus then
there are unsigned versions of bytes
short integers integers and long
integers so what i would like you to do
at this point is pause the video and try
to answer what the advantage is of
providing support for unsigned integer
types
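One way to explore this question is to compute the ranges directly. The sketch below assumes two's complement for the signed case and uses the bit widths of Java's four integer types (8, 16, 32 and 64 bits); the function names are chosen for this example only.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Range of a two's complement signed integer with the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_range(bits: int) -> tuple[int, int]:
    """Range of an unsigned integer with the given width."""
    return 0, (1 << bits) - 1


# Bit widths of the Java integer types named in the lecture.
for name, bits in [("byte", 8), ("short", 16), ("int", 32), ("long", 64)]:
    print(name, signed_range(bits), unsigned_range(bits))
```

For 8 bits this gives -128 to 127 signed and 0 to 255 unsigned, so the same amount of memory reaches roughly twice as far in the positive direction when no bit is spent on the sign.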
next we have floating point types which
are also primitive data types and these
types model real numbers but only as
approximations
and so they use a kind of a scientific
notation but on a binary level in order
to perform this kind of representation
now
languages that are intended for
scientific use and this also then
extends to multi-purpose programming
languages that can also be used for
scientific purposes such as c and c plus
plus
will provide at least two floating point
data types
so for example in the c based languages
we have the float and double types
float represents a single precision
floating point value whereas double
represents a double precision floating
point type
now there are sometimes more floating
point data types that are supported in
certain programming languages
so for example there might be a long
double type that may be supported in a
particular programming language
some implementations of c and c plus plus do
provide support for a long double type
however on some platforms this is exactly the same as a
regular double type so it all depends on
the specific programming language and
the platform that the programming
language is intended to compile code for
now usually floating point data types
are exact representations of the
hardware counterparts on a binary level
but this isn't always necessarily the
case
now there are different ways of
representing floating point values but a
very commonly used approach
today is the ieee floating point
standard 754
which is very commonly used
so on the bottom right of the slide we
see a representation of a single
precision floating point value at the
top and at the bottom a double precision
floating point value we can see each of
these representations consists of a
number of bits where these bits are
subdivided into a single sign bit in
each case
and then a number of bits for the
exponent in the case of our single
precision floating point value we have
eight bits for the exponent for double
precision representation we have 11 bits
for our exponent and then finally the
fractional portion which consists of 23
bits in the case of the single precision
floating point value and 52 bits in the
case of the double precision floating
point value
so
we see then that there are two
attributes associated with floating
point values and these relate to the bit
based representation of these values
firstly we have precision and this is
then the accuracy of the number's
fractional part so in other words
to how many digits after the
decimal point can we accurately
represent a floating point value
and this then directly
relates to the number of bits in the
fraction portion of the representation
then we also have the range which
defines the minimum and the maximum
values that can be represented using a
particular number of bits within a
floating point representation
now this is defined by both the number
of bits in the exponent part of the
representation as well as the number of
bits in the fraction part of the
representation but most important of
these is the number of bits in the
exponent part of the representation so
the more bits we have available for the
exponent the larger the range is that
can be represented
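The field layout described above can be inspected with Python's standard `struct` module. This is a minimal sketch: the function names are made up for the example, and the exponent is returned as stored (IEEE 754 stores it with a bias of 127 for single precision and 1023 for double precision).

```python
import struct


def float32_fields(x: float) -> tuple[int, int, int]:
    """Split x, stored as IEEE 754 single precision, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 fraction bits
    return sign, exponent, fraction


def float64_fields(x: float) -> tuple[int, int, int]:
    """Split x, stored as IEEE 754 double precision, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 sign bit
    exponent = (bits >> 52) & 0x7FF     # 11 exponent bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)   # 52 fraction bits
    return sign, exponent, fraction


# -6.25 is -1.5625 * 2**2, so the stored single precision exponent is 127 + 2.
print(float32_fields(-6.25))   # (1, 129, 4718592)
print(float64_fields(1.0))     # (0, 1023, 0)
```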
the next primitive data type that we'll
look at is the complex type which is
used to represent complex numbers
now the complex type is not very widely
supported there are only a small handful
of programming languages that support
complex types the most notable of which
is fortran however python also supports
the complex type
so what i would like you to do at this
point is to pause the video and try to
explain why it makes sense for the
fortran programming language to support
the complex type
now complex types are a little bit more
complicated in terms of their
representation than integers and floats
they consist of two
floating point values however they are
treated as a single unit and this is why
complex types are still considered
primitive data types and not a compound
type such as the user-defined record
data type which we'll speak about later
on in this chapter
so the first floating point part of a
complex type is the real part and the
second part is the imaginary part now
the literal form for a complex value
needs to represent both the real and the
imaginary part
so in python we would use the notation
seven plus three j in parentheses and
this then represents a complex value in
its literal form the literal form being
the actual representation of a value
using this type so in the same way that
7 is a literal representation of an
integer and 3.48 is a literal
representation of a floating point value
7 plus 3j is the literal representation
of a complex value
so in this case the 7 then is the real
part and the 3 is the imaginary part
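The Python literal from the lecture can be tried directly. The variable names and the second operand `(1 - 2j)` are chosen only for this sketch.

```python
z = (7 + 3j)                 # complex literal: real part 7, imaginary part 3
print(type(z).__name__)      # complex
print(z.real, z.imag)        # 7.0 3.0

# The two floating point parts are handled together as a single value.
w = z * (1 - 2j)
print(w)                     # (13-11j)
```

Both parts are stored as floating point values, which is why `z.real` prints as 7.0 and not 7.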
the next primitive data type we'll look
at is the decimal type now it's very
important that you don't confuse decimal
types and floating point types which we
discussed previously both are used to
represent real values however they do so
using a different kind of storage format
now decimal types are used in the
context of business applications
particularly when they are required to
model monetary values
so they are essential and very much at
the core of the cobol programming
language but c sharp also offers support
for a decimal data type
so what i would like you to do at this
point is to pause the video and try to
explain why cobol provides support for a
decimal data type and why the concept of
a decimal type is so central to cobol
also related to this try to answer why c
sharp
would provide support for a decimal data
type and why this makes sense in the
context of c sharp and what c sharp is
intended for as a programming language
all right so instead of using an
approximate representation the way that
floating point values do
a decimal primitive data type stores
data using a fixed number of decimal
digits
so each digit is then represented
separately using a coded representation
which is referred to as a binary coded
decimal representation or alternatively
a bcd representation what this means
then is that each digit of a decimal
number is represented separately in
other words each digit is encoded as a
separate number
unto itself what this means then is we
have a very accurate representation
because each individual digit is
represented explicitly we're not working
with an approximate representation the
way that we were with floating point
values and this is the primary advantage
associated with decimal types
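To make the idea of encoding each digit separately concrete, here is a sketch of one common BCD variant, packed BCD, which stores two decimal digits per byte (four bits each). The lecture does not say which BCD variant is meant, so the packed layout and the function names are assumptions made for this example.

```python
def to_packed_bcd(number: int) -> bytes:
    """Encode a non-negative integer as packed BCD, two decimal digits per byte."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits   # pad to an even number of digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


def from_packed_bcd(data: bytes) -> int:
    """Decode packed BCD bytes back into an integer."""
    return int("".join(f"{byte >> 4}{byte & 0x0F}" for byte in data))


encoded = to_packed_bcd(1995)
print(encoded.hex())              # 1995
print(from_packed_bcd(encoded))   # 1995
print((1995).bit_length())        # 11
```

Each hexadecimal digit of the encoded bytes is one decimal digit of the number. The value 1995 occupies 16 bits in this form, while a plain binary integer needs only 11 bits for it.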
so what i would like you to do at this
point is to pause the video and try to
answer why this accuracy is important in
the context of business applications
particularly when we are working with
monetary values
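A short experiment that may help with this question. It uses Python's standard `decimal` module as a stand-in for a decimal type; that module does exact decimal arithmetic but is not necessarily BCD internally, so treat it as an illustration of the behaviour and not of the storage format.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2
print(float_total)           # 0.30000000000000004
print(float_total == 0.3)    # False

# A decimal type keeps every decimal digit exactly.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total)                      # 0.30
print(decimal_total == Decimal("0.30"))   # True
```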
so the primary advantage then associated
with decimal values is that they are
very accurate however there are two
major disadvantages associated with
decimal types
the first is that we waste a lot of
memory and the reason for this is that
the bcd representation that we use is
not very compact so we see
that with floating point values we have
a very compact notation we use a
scientific notation like representation
of course represented on a binary level
and this is a very efficient
representation scheme we have a compact
representation in terms of the number of
bits that are used in memory to
represent our floating point values but
we also have
operations that are defined for these
floating point values which have been
optimized and can be performed very
efficiently
so we also then as a disadvantage
associated with decimal values have a
fairly limited range and the reason for
this
is that we have a limited number of
decimal digits that are represented now
of course we can increase the range by
adding more decimal digits however this
will then require a larger bit based
representation within memory so we are
always going to run into a problem where
we want to represent very large values
either negative or the positive
direction then we are going to require a
lot more memory and therefore our range
is limited in one respect or another
another disadvantage associated with
decimal types which is not explicitly
mentioned in the textbook but
is also a concern to keep in mind
is that operations for decimal values
are not as efficient as operations for
floating point values and the reason for
this is that these operations typically
need to be simulated in software because
they are not directly supported on a
hardware level typically with floating
point values operations such as addition
subtraction multiplication and division
are actually implemented on a chip level
and therefore are very efficient but
this is not the case when we are talking
about decimal representations and
therefore these calculations are a lot
less efficient when it comes to decimal
values
the next primitive data type we'll look
at is the boolean data type which you
should also be very familiar with the
boolean data type is the simplest data
type of all
and it has a range of values that
consists of only two elements one
element representing a true value and
the other representing a false value
now it would be possible to implement
boolean data types using single bits
however in most modern computers these
days addressing limitations don't allow
individual bits to be addressed and
therefore we cannot retrieve individual
bits one at a time so as a result of
this usually a full byte consisting of
eight bits is used in order to represent
a boolean value
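As a rough sketch of the trade-off, the code below first asks Python's `ctypes` module how many bytes a C-style boolean occupies on the current platform (typically one), and then shows how individual bits of a byte can hold separate flags when the program does the bit manipulation itself. The helpers `pack_flags` and `read_flag` are hypothetical names for this example.

```python
import ctypes

# Size in bytes of a C-style boolean on this platform (typically 1).
print(ctypes.sizeof(ctypes.c_bool))


def pack_flags(flags: list[bool]) -> int:
    """Pack up to eight booleans into one byte-sized integer, one bit per flag."""
    if len(flags) > 8:
        raise ValueError("a single byte holds at most eight flags")
    byte = 0
    for position, flag in enumerate(flags):
        if flag:
            byte |= 1 << position   # set the bit for this flag
    return byte


def read_flag(byte: int, position: int) -> bool:
    """Read a single flag back out of a packed byte."""
    return bool((byte >> position) & 1)


packed = pack_flags([True, False, True])
print(packed)                 # 5
print(read_flag(packed, 2))   # True
print(read_flag(packed, 1))   # False
```

Packing saves space, but every access now needs a shift and a mask, because the hardware addresses whole bytes and not single bits.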
so what i would like you to do at this
point is to pause the video and try to
think of a disadvantage of this kind of
representation
so then if we look at the primary
advantage of a boolean primitive data
type
this is readability so what i would like
you to do at this point again is to
pause the video and try to explain why
boolean primitive data types contribute
to the readability of a programming
language that supports them
the last of the primitive data types
we'll look at is the character data type
now of course characters can't be
directly represented in memory
and therefore a numeric coding scheme is
required to represent any character data
what this means is that on a binary
level a sequence of bits will represent
each individual character each sequence
of bits translates to a number and this
number is a numeric code that maps to a
specific character
now there are a wide variety of
different encoding schemes that can be
used in order to represent characters
in general the more bits that are used
for an encoding the larger the range of
numeric encoding values there are which
means we have then a larger set of
characters that we can represent
now the most commonly used encoding
scheme is the ascii encoding scheme
which you should be familiar with ascii
stands for american standard code for
information interchange and here a
single byte is used for each of the
numeric codings what this means is we
can then have 256
separate numeric codings which means we
can represent
256 separate characters now the major
drawback associated with the ascii
encoding scheme is that it only represents
roman characters so it's only really
useful for representing english
what this then has led to is other
encoding schemes which use a larger
number of bits to represent each of the
individual characters allowing for a
much larger set of characters which can
then be used to represent non-english
alphabetical characters
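The mapping between characters and numeric codes, and the limits of ASCII, can be seen with a few lines of Python. The helper `fits_in_ascii` and the sample strings are assumptions made for this sketch.

```python
text = "Data"

# Each character maps to a numeric code, which is stored as a bit pattern.
codes = [ord(ch) for ch in text]
print(codes)                                # [68, 97, 116, 97]
print([format(c, "08b") for c in codes])    # first entry is 01000100

# The mapping works in both directions.
print("".join(chr(c) for c in codes))       # Data


def fits_in_ascii(s: str) -> bool:
    """Report whether every character of s has an ASCII code."""
    try:
        s.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_ascii("cafe"))        # True
print(fits_in_ascii("caf\u00e9"))   # False, the accented e has no ASCII code
```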
so alternatives then generally fall into
the unicode category
and here we have a 16 bit or two byte
encoding scheme referred to as
ucs2
so this allows for the characters from
most of the natural languages that exist
in the world to be represented and this
was originally supported in the java
programming language c-sharp and
javascript however also support unicode
and most modern programming languages
have been extended or designed from the
ground up to support unicode
now there is a larger unicode standard
which uses 32 bits to represent each of
the individual character encodings in
other words four bytes and this is
referred to as
ucs4 so this was originally supported by
fortran starting with the 2003
update
now the ucs standards have largely been
superseded by the utf standard ucs
stands for universal coded character set
and utf stands for unicode
transformation format
so these days utf is more commonly used
and the equivalent of ucs2 in utf format
is
utf-16 whereas the equivalent of ucs4 is
utf-32
so
java and c-sharp use the newer utf-16
standard in order to represent their
characters
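The storage cost of the 16 bit and 32 bit formats can be compared with Python's built-in codecs. This is a sketch: the function name `encoded_sizes` and the three sample characters are chosen for this example, and the big-endian codec variants are used only so that no byte order mark is added to the output.

```python
def encoded_sizes(ch: str) -> tuple[int, int]:
    """Return the number of bytes ch occupies in UTF-16 and in UTF-32."""
    return len(ch.encode("utf-16-be")), len(ch.encode("utf-32-be"))


# A Latin letter, an accented letter and a CJK character.
for ch in ["A", "\u00e9", "\u65e5"]:
    print(repr(ch), encoded_sizes(ch))   # each prints sizes (2, 4)
```

All three samples take two bytes in the 16 bit format and four bytes in the 32 bit format, matching the code sizes described for UCS-2 and UCS-4.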
so this brings us rather neatly to the
next data type we'll look at namely
character strings
so character strings are data types
where the values are simple sequences of
characters where one character follows
on from another in a linear sequence
now it is possible for a character
string to be primitive in nature in
which case each character string is
handled as a single self-contained
entity however it's also possible for a
character string to not be primitive in
nature in which case the character
string is usually represented as an
array of characters where each character
is then a primitive type
now there are two main design issues
that need to be decided on when
designing a character string type in a
programming language firstly will
character strings be a primitive type or
just a special kind of array that stores
characters
and secondly should the length of the
string be static in other words fixed
once at compile time or should it be
dynamic in nature in other words the
length of the string can grow and shrink
over the course of program execution
time
now of course as we've seen before a
type also specifies the operations that
are valid for objects of a particular
type
so when we are talking about character
strings there are usually a variety of
operations that might be valid for
character strings and during the design
of a character string type one must
decide on the operations that will be
supported by the programming language
for these character strings
so typical operations are firstly
assignment and copying these are two
different operations so let's assume for
example that we have two strings a and b
and we then perform an assignment a
equals b
now if we are talking about an
assignment then very often what happens
is we are simply performing a reference
assignment so in other words if we have
a equals b this means that a will then
refer to the string b
and if we change b then a will also
refer to the changed string and vice
versa
a copying operation is actually then a
literal copying of the content of one
string into another in which case then
each of the two copies are completely
independent from each other so if we
have two strings a and b and we perform
the assignment a equals b then if we
perform a copying operation then the
content of b will be copied into the
string a if we modify b then a will not
be modified and vice versa
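The difference between the two can be sketched in Python. Python's own strings cannot be modified, so a bytearray is used here as a stand-in for a mutable character string; the variable names are just for illustration.

```python
# bytearray stands in for a mutable character string.
b = bytearray(b"hello")

a_ref = b                # reference assignment: both names refer to one object
a_copy = bytearray(b)    # copy: a new, independent object with the same content

b[0] = ord("j")          # modify the original through b

print(a_ref)             # the alias sees the change
print(a_copy)            # the copy does not
```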
we also typically have comparison
operations so we have an equality
operation usually which tests to see
whether the two character strings are
equivalent to each other in other words
they contain exactly the same characters
we may also have inequality operations
such as greater than and of course here
we need to decide on the semantics of
this kind of operation
usually this involves a comparison of
characters on a numeric level
where we then actually compare the
encoded numeric values of each of the
individual characters to each other but
again this is something that needs to be
decided during language design time
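As a rough sketch of one possible semantics, the Python function below compares two strings character by character using their numeric codes. The name compare_strings and the rule that a shorter string sorts before a longer one with the same prefix are assumptions made for this example; a language designer could choose differently.

```python
def compare_strings(s, t):
    """Return -1, 0, or 1 by comparing character codes left to right."""
    for cs, ct in zip(s, t):
        if ord(cs) != ord(ct):
            return -1 if ord(cs) < ord(ct) else 1
    # All shared positions match: the shorter string sorts first.
    return (len(s) > len(t)) - (len(s) < len(t))

print(compare_strings("apple", "banana"))
print(compare_strings("Zebra", "apple"))  # uppercase codes are smaller than lowercase
print(compare_strings("cat", "cat"))
```

Note that with this kind of numeric comparison every uppercase letter sorts before every lowercase letter, which is not always what a user expects.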
then catenation or concatenation is an
operation where two character strings
are added to one another where one
string is then added to the end of the
other string
of course here the semantics must also
be decided upon so is one of the strings
modified by means of the concatenation
operation or do we generate a new string
which is the result of the concatenation
then we very often also provide
support for substring referencing in
high-level programming languages
so here we are referring to a
sub-portion of an existing larger
character string again we need to decide
how this referencing takes place are we
simply referring to a character within
an existing string or is there a
mechanism for referring to a specific
sub-portion of the string
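Here is how one language answers these two questions. In Python, concatenation generates a new string, and there is a slice notation for referring to a sub-portion as well as plain indexing for a single character. The variable names are just for illustration.

```python
first = "data"
second = "types"

# Concatenation here produces a new string; neither operand is modified.
joined = first + " " + second

# Referring to a single character versus a sub-portion (a slice).
one_char = joined[0]
portion = joined[5:10]   # characters at positions 5 through 9

print(joined)
print(one_char)
print(portion)
```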
and then finally pattern matching is
supported by a number of modern
programming languages particularly
scripting languages and here what we are
talking about is providing a mechanism
whereby a particular characteristic of a
string can be expressed
and we then want to determine whether a
string matches the
specification now normally in
the modern day high level programming
languages that we usually encounter
this specification for a
pattern matching sequence would be
performed by means of what's referred to
as a regular expression there are
competing standards that were
alternatives to regular expressions in
the past but most of those have fallen
away and these days
in almost all situations where we
perform pattern matching we will be
using regular expressions of some sort
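A minimal sketch of pattern matching with a regular expression, using Python's standard re module. The pattern itself is a made-up example (three uppercase letters, a space, then three digits), and the function name is_course_code is hypothetical.

```python
import re

# Hypothetical pattern: three uppercase letters, a space, three digits.
course_code = re.compile(r"[A-Z]{3} \d{3}")

def is_course_code(text):
    """True if the whole string matches the pattern."""
    return course_code.fullmatch(text) is not None

print(is_course_code("COS 333"))
print(is_course_code("cos 333"))
```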
now of course different programming
languages provide different kinds of
support for character strings
so we'll look at a few examples of
programming languages and how they
support character strings on this slide
and the next
c and c plus plus do not provide primitive string
data types however they do provide
support for character strings by means
of arrays that contain character types
they also provide operations such as
concatenation and the extraction of
substrings from existing strings by
means of a set of library functions that
perform operations on these character
arrays
fortran and python both provide a
primitive character string type and they
also then provide built-in support for
assignment as well as several other
operations that can be performed on
character strings
java provides a primitive character
string type by means of the string class
however it does also provide something
similar to c and c plus plus's character
arrays by means of the string buffer
class the string buffer class is in a
lot of ways much more efficient to work
with than the string class is and the
reason for this is that operations on
the string class such as concatenations
for example
will result in a new string being
created
this then results in a lot of additional
objects being created in memory and then
potentially being disposed of as well
which means that the garbage collector
needs to work overtime the string buffer
class on the other hand provides direct
access to characters within it so
therefore one can perform character
manipulation operations on string buffer
objects which are then much more
efficient because new objects are not
being created we are instead simply
modifying the existing string buffer
object that we are working with
the snobol 4 programming language
which we spoke about briefly in chapter
2
is as we've seen a string manipulation
language it was designed for the
implementation of text editors
so because of this the manipulation of
character string data is very central to
the language and therefore the string
type is primitive in snobol 4
there are also a wide variety of
operations that can perform
manipulations on these strings
including very elaborate pattern
matching
what's interesting to note here is that
the pattern matching uses a
very different mechanism to modern day
regular expressions
then perl javascript ruby and php all of
which are scripting languages all
provide built-in support for pattern
matching and they all use regular
expressions which as i've previously
mentioned have become the standard today
of course strings have a length
associated with them which is equivalent
to the number of characters stored in
the string
now there are three different ways that
the length of strings can be managed by
a programming language and we'll look at
each of these in turn now and also look
at some practical examples of
programming languages that use these
different string length options
so first of all we have static length
strings and as the word static implies
here the length of the string is
determined before runtime and doesn't
change through the course of runtime
now cobol is an example of a programming
language that supports static length
strings
and interestingly enough java's string
class is also an example of a static
length string
so what this then implies is that when
we're working with static length strings
we can then not grow or shrink the
strings through the course of run time
so what this means is any operation that
is performed that would modify a string
so for example removing characters from
the string or doing something like
concatenating two strings together
actually then needs to produce a new
static length string so this is the case
with java's string class
if we concatenate two java string class
objects together then what actually
happens behind the scenes is a new
string object is created and this then
stores the characters from the two
strings that are concatenated with one
another
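Python's built-in strings behave in a similar way in this respect, so the effect can be sketched there. This is an analogy rather than java code, and the variable names are just for illustration.

```python
a = "static"
b = "length"

c = a + b          # builds a new string object
print(c)

# The operands are untouched, and individual characters cannot be overwritten.
try:
    a[0] = "S"
except TypeError as err:
    print("cannot modify in place:", err)
```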
the next option is limited dynamic
length strings so here the word dynamic
implies that the length of a string can
change through the course of execution
however this is limited
so
two good examples of programming
languages that support limited dynamic
length strings are c and c plus plus
which you should be relatively familiar
with
so what happens here then is that a
fixed length structure is used to house
the string and in the case of c and c plus
plus this structure would be an array
so the length of the array is then fixed
it might be determined prior to runtime
if this is just a static
array
but it may also be determined at runtime
if we for example dynamically allocate
the array that the string will be
contained in
either way the length of the array is
fixed once it has been set and it cannot
change through the course of runtime
so now our array stores the characters
in the string and then a special
character is used to indicate the end of
the string and in the case of c and c
plus plus this is a null character
the length of the string is also not
maintained explicitly
so what this then means is we can then
store
a string that can be contained within
the array
and we can store as many characters as
the array allows us to store as long as
we leave a space for the special
character but we can also have strings
that are shorter than this length and in
this case we are then essentially
terminating the string early
before we reach the bounds of the
array by means of the use of this
special character
now because the length isn't maintained
explicitly we can't generally look this
length up however we can determine the
length at runtime by iterating through
the characters in our string and
counting them one by one until
eventually we reach our special
terminating character
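The idea can be simulated in Python with a fixed-size bytearray standing in for the character array. This is only a sketch of the scheme, not how c itself is written; the capacity of 8 and the helper names store and length are assumptions for this example.

```python
CAPACITY = 8                      # fixed size of the containing array
buffer = bytearray(CAPACITY)      # all bytes start as 0, the terminator

def store(buf, text):
    """Copy text into buf, leaving room for the terminating 0 byte."""
    data = text.encode("ascii")
    if len(data) > len(buf) - 1:
        raise ValueError("string does not fit in the fixed-size array")
    buf[:len(data)] = data
    buf[len(data)] = 0

def length(buf):
    """Count characters one by one until the terminator is reached."""
    n = 0
    while buf[n] != 0:
        n += 1
    return n

store(buffer, "hello")
print(length(buffer))
store(buffer, "hi")               # a shorter string terminates early
print(length(buffer))
```

The array never changes size here. Only the position of the terminator moves, and that is what makes the string length limited dynamic.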
then in the third place we have dynamic
length strings
and as the word dynamic implies here
these are strings where the length can
change through the course of run time
so these strings in other words then are
variable in length they can grow and
shrink and there typically isn't a
maximum length
examples of programming languages that
support dynamic length strings are
snobol 4
perl and javascript so we see that these
programming languages are scripting
languages
and this of course makes sense because a
dynamically growing and shrinking string is
something that you would typically want
in the context of a scripting language
where you want to very quickly knock
together a program without having to
worry about memory allocation and
de-allocation
so dynamic length strings are different to
limited dynamic length strings in that
we are no longer constrained by the
containing structure in the case of c and c
plus plus the array structure
so behind the scenes a dynamic length
string may be implemented in a variety
of different ways
we may use for example some sort of
linked structure like a linked list
where each node would then store a
character or alternatively we may be
using arrays behind the scenes and then
we have automatic growing operations
that take place where a new array would
be allocated that would be large enough
to contain the resultant string after a
string manipulation operation
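The array-based option might look roughly like the Python sketch below. The class name GrowableString, the starting capacity of 4, and the policy of at least doubling the array are all assumptions made for this example; a real implementation could choose a different growth strategy.

```python
class GrowableString:
    """Array-backed string that reallocates when it runs out of room."""

    def __init__(self, capacity=4):
        self._chars = [None] * capacity   # the fixed-size backing array
        self._length = 0

    def append(self, text):
        needed = self._length + len(text)
        if needed > len(self._chars):
            # Allocate a larger array and copy the old contents across.
            new_chars = [None] * max(needed, 2 * len(self._chars))
            new_chars[:self._length] = self._chars[:self._length]
            self._chars = new_chars
        for ch in text:
            self._chars[self._length] = ch
            self._length += 1

    def capacity(self):
        return len(self._chars)

    def __len__(self):
        return self._length

    def __str__(self):
        return "".join(self._chars[:self._length])

s = GrowableString()
s.append("abc")
s.append("defgh")     # does not fit in 4 slots, so a new array is allocated
print(str(s), len(s), s.capacity())
```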
now ada gives us a lot of flexibility
ada in fact supports all three of these
string length options so it supports
static length strings limited dynamic
length strings and also
fully dynamic length strings
so let's evaluate character strings in
terms of whether they are useful to be
included in a programming language or
not so in order to do this we need to
look at the advantages
and potentially disadvantages in
relation to our programming language
evaluation criteria which we've been
using throughout this course
well character strings provide increased
writability
so what i would like you to do at this
point is pause the video and try to
explain why character strings contribute
positively to writability
and in order to do this you'll need to
think about what the alternative would
be if you wanted to support sequences of
characters in some way
assuming that
character strings are not directly
supported by the programming language
you're considering
so if we implement character strings as
a primitive data type with static length
then this is fairly inexpensive for us
to provide
and this is of course because all of the
memory allocation
is happening prior to run time
so this then makes these kinds of
strings where the length is static and
we implement them as a primitive type
very efficient in terms of runtime
execution
so why not then support them in a
programming language
so if we then move from our primitive
static length strings through to dynamic
length strings then of course this is
very nice if our programming language
supports these kinds of strings we have
very flexible strings that can
grow and shrink and this then of course
improves the writability of our
programming language
however is it worth any additional cost
that might be associated with the
support
so what i would like you to do at this
point is pause the video
and try to explain why there is a
negative cost impact in other words why
does the cost of the programming
language increase if we provide support
for dynamic length strings
think about how the memory allocation
and deallocation needs to work for
these kinds of strings
we'll now take a look at the
user-defined ordinal types so in order
to understand what a user-defined
ordinal type is we first need to know
what an ordinal type is
and this is quite simply a type that has
a series of possible values but each
possible value can be easily associated
with the set of positive integer values
now it is of course possible for ordinal
types to be provided by a programming
language and this is the case in java
but the same also holds in c and c plus
plus as well as a wide variety of other
programming languages
so we see in java that the built-in
primitive types int char and boolean are
all three ordinal types now why is this
the case well a variable of type int has
a value that is an integer value and of
course it also allows for negative
integer values
but it is possible by simply adding an
offset to each of the integer values
that we can transpose then every value
that is legal for an integer variable
into the positive range so what we now
have is a situation where each of the
values that an int variable can take on
will then correspond to a positive
integer value
so next if we look at
the char type in java
and we see that we have a series of
characters that can be represented by a
variable of type char
however what we saw previously in this
lecture is that every character maps to
a positive integer code
because we have to use an encoding
scheme in order to represent our
characters in memory so what this then
means again is that every value that
a char variable can take on in other words
each character value that a char
variable can take on then maps to a
positive integer value
and then in the third place we have
boolean variables
so here we then of course have a
variable which is of type boolean and
can take on one of two values a true
value or a false value now of course we
can map true and false values to zero
and one values and this then again
allows us then to create a
correspondence between every value that
a boolean variable can take on and a
positive integer value
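These three mappings can be sketched in Python. The offset assumes a 32 bit two's complement int, which is the size of java's int; the name int_to_ordinal is made up for this example, and positions are counted from zero here.

```python
INT_MIN = -2**31   # smallest value of a 32-bit two's complement int

def int_to_ordinal(value):
    """Shift a 32-bit int so the smallest value maps to position 0."""
    return value - INT_MIN

print(int_to_ordinal(INT_MIN))   # the smallest int lands on position 0
print(int_to_ordinal(-1))        # negative values land in the non-negative range
print(ord("A"))                  # a character maps to its numeric code
print(int(False), int(True))     # the two boolean values map to 0 and 1
```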
so these are then all examples of
primitive ordinal types because they are
built into the java programming language
but we also have user-defined ordinal
types which are then ordinal types that
have been defined by a programmer
now there are several user-defined
ordinal types that are possible but we
are only going to be looking at one of
these namely the enumeration type which
is quite commonly used in practice
so let's spend some time looking at
enumeration types in a bit more detail
so an enumeration type is a type that
can be defined by a programmer and it
has a series of possible values that it
can take on where each of these values
will be then provided in the definition
of the enumeration type and also these
values are named constants
so let's look at an example of this in
the c-sharp programming language but
enumerations are supported in a variety
of other programming languages including
c and c plus plus
so here we have then a definition of an
enumeration type this is indicated by
the special word enum
and we provide then a name for our
enumeration type which in this case is
day
now this doesn't mean that we are
defining a variable named day we are now
creating a new type called day so
a variable then of type day can be
defined we could for example have a
statement that declares a variable
where we would specify day my day and in
this case my day would then be the name
of the variable and the type would then
be day
now following the name that we have
provided for our enumeration type we
then have a pair of braces and in the
braces we then have a list of named
constants and they are separated by
means of commas
so we can see for example that mon is a
named constant
tue is also a named constant and this holds
for all of the remaining values all the
way up to sun
so these are then basically just names
that are used to refer to values
and what then happens in c sharp as well
as in a variety of other programming
languages is that the first named
constant in this case mon will then map
to an integer value of zero
tue would then map to an integer value
of one and so on and so forth
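For comparison, a similar enumeration can be sketched in Python using the standard enum module. This is not the c-sharp syntax from the slide. The integer values are written out explicitly here to mirror the zero-based mapping just described, and the constant names mon through sun are assumed to match the slide.

```python
from enum import Enum

# Values are written out explicitly to mirror the zero-based mapping.
class Day(Enum):
    mon = 0
    tue = 1
    wed = 2
    thu = 3
    fri = 4
    sat = 5
    sun = 6

my_day = Day.tue          # a variable whose type is Day
print(my_day.name, my_day.value)
```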
so there are three main design issues
that need to be decided on if we are
going to provide support
for enumeration types in a programming
language
so firstly can enumeration constants
appear in more than one type definition
for example in more than one enumeration
so let's take this back to the example
that we just discussed let's assume that
we have this enumeration type called day
defined and then we define a second
enumeration type that we call weekend
day and weekend day then has two values
that it can take on namely sat and sun
now in this case we are then reusing the
named constants sat and sun they appear
once in the enumeration type day but
they also appear in the enumeration type
weekend day so the question is then
does the programming language allow for
this or not now some programming
languages will just simply disallow this
and this removes any chance of any
ambiguity
however it does reduce flexibility
somewhat
but if we do allow then these named
constants to appear in multiple
enumeration types then how do we
differentiate between them so we need
some sort of syntactic construct that
specifies whether we are referring to
sat or sun
in the enumeration type day or weekend
day and this is necessary so that we can
perform a type check for each of
these constants we need to know which
specific type we are referring to
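One common syntactic construct for this is to qualify each constant with the name of its type. Here is a sketch of that approach in Python; the type name WeekendDay and its values are made up for this example.

```python
from enum import Enum

class Day(Enum):
    mon = 0
    tue = 1
    wed = 2
    thu = 3
    fri = 4
    sat = 5
    sun = 6

class WeekendDay(Enum):
    sat = 0
    sun = 1

# The type name qualifies each constant, so the two "sat" values are distinct.
print(Day.sat, WeekendDay.sat)
print(Day.sat == WeekendDay.sat)
```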
then secondly are enumeration values
coerced in other words automatically
converted to integer values now we saw
in our example that each of the named
constants maps to an integer value so
this kind of coercion does make sense
however there is an extended implication
to this so if we allow enumeration
values to be automatically converted to
integers then all of the
operations that are valid for integers
will also then be valid for
instances of the enumeration type
so for example addition and subtraction
would be valid now if we take this back
to our previous example what this then
will mean is that if we allow variables
of type day to be automatically
converted into integers then this means
that we can add variables of type day
and we can subtract them from one
another does it make sense for us to add
tue to mon probably not the same also
holds for subtraction
so some programming languages will
completely disallow this kind of
coercion
however other programming languages will
treat enumeration types as integers
just simply because the mapping
trivially makes sense
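Python happens to offer both behaviours side by side, which makes the trade-off easy to see. In this sketch the names StrictDay, LooseDay and can_add are made up; Enum and IntEnum are the real classes from Python's standard enum module.

```python
from enum import Enum, IntEnum

class StrictDay(Enum):      # no implicit conversion to int
    mon = 0
    tue = 1

class LooseDay(IntEnum):    # members behave as integers
    mon = 0
    tue = 1

def can_add(x, y):
    """Report whether x + y is accepted for these two values."""
    try:
        x + y
        return True
    except TypeError:
        return False

print(can_add(LooseDay.mon, LooseDay.tue))     # integer arithmetic is accepted
print(can_add(StrictDay.mon, StrictDay.tue))   # addition is rejected
```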
then in the third place we have a fairly
similar design issue to
the second one that i just discussed
are other types coerced to enumeration
types so for example if we have integer
values
then can those be automatically
converted into an enumeration type this
will for example then allow us to create
a variable of type day so let's call it
my day and then we could assign to that
an integer value of one for example and
that assignment would then be equivalent
to assigning two
to our variable
so is this allowed by a programming
language or not well of course it does
make logical sense that this kind of
coercion should be allowed because again
there's a trivial mapping between these
named constants and integer values
however
a question then arises what happens if
we attempt to assign an integer value
that is outside of the bounds of the
enumeration so for example if we have my
day which is of type day we would then
be able to assign a value of 15 to that
how does that assignment then get
handled is it automatically converted
into the appropriate range
for our day enumeration is that
assignment not allowed does it generate
some kind of runtime error
and these are all questions that
need to be answered in terms of how the
programming language will go about
dealing with these kinds of coercions
the same of course holds for assigning a
negative integer value to an enumeration
type
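As one possible answer to these questions, Python's enum module requires the conversion to be written explicitly and raises an error at runtime for a value outside the range. The sketch below wraps that in a hypothetical helper called to_day, which returns None instead of failing.

```python
from enum import Enum

class Day(Enum):
    mon = 0
    tue = 1
    wed = 2
    thu = 3
    fri = 4
    sat = 5
    sun = 6

def to_day(code):
    """Convert an integer code to a Day, or None if it is out of range."""
    try:
        return Day(code)
    except ValueError:
        return None

print(to_day(1))
print(to_day(15))
print(to_day(-1))
```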
so let's evaluate enumeration types
well the main advantage introduced by
support for enumeration types is that
they increase readability
so let's assume that we have a program
and we want to represent a number of
colors now if we don't have support for
enumeration types in our programming
language we would have to create a
separate integer code for each one of
the colors that we want to represent
so this is of course not very
readable because we are referring to
integer codes we need to memorize these
integer codes and of course it is quite
error-prone in practice to do this
so instead of then having separate
integer codes for our individual colors
we can in our program then simply define
an enumeration type
list the colors that we want to
represent and then the encoding is
handled for us behind the scenes we
don't need to worry about that
so that then increases readability it
could also be argued that it improves
writability
now enumeration types also aid the
reliability of the programming language
in which they are supported and this is
because we can build in checks that the
compiler can perform to ensure the
validity of variables of a particular
enumeration type
so we can perform then
checks related to operations so for
example the compiler can disallow
instances of an enumeration type from
being added to each other so we can then
block the possibility of two colors
being added to one another
the compiler can also check to ensure
that values that are assigned to an
enumeration variable are all within a
valid range so for example if we have a
set of 10 colors and we try to assign an
integer value of 50
to that enumeration type variable
then the compiler can disallow that and
can generate an error
which will alert the user to the fact
that a value outside of the legal range
is being assigned to an enumeration
variable
now ada c-sharp and java 5
these all have better support for
enumeration types than c plus plus does and
the reason for this is that enumerations
are not coerced into integer types in
these programming languages so this then
disallows a situation where we for
example would add two variables of a
particular enumeration type to one
another that kind of operation is not
allowed by ada c-sharp and java 5
the last type that we'll look at in this
lecture is the array type
so you should be familiar with arrays
arrays are basically an aggregate or a
collection of homogeneous data elements
so in general we assume that all of the
data elements contained within an array
have exactly the same type so for
instance we can define an array of
integers or an array of float values
now individual elements are then
identified by means of the position
within the array and this position is
relative to the first element and we
usually refer to this then as indexing
or subscripting
so in general we will find that the
first index or subscript is zero the
next one along is one next one along is
two and so on and so forth however some
programming languages don't use a base
zero indexing approach
but instead will
start at one or may even allow the
programmer to define where they would
like the indices to start
now there are a number of design issues
that need to be considered if we are to
provide support for arrays in a
programming language
first of all what types are legal for
the subscripts so the subscripts are the
indices into the array are we only
allowed to use integers or will the
programming language allow us to use
some other ordinal type something like
an enumeration type for example
then secondly are subscripting
expressions in element references range
checked so this relates to an expression
being used for the index into a
particular array
does the runtime system of the
programming language then perform some
kind of checking to ensure that the
element that is being accessed actually
does lie within the array in other words
does it prevent the programmer from
indexing past the end of the array
then in the third place
when are subscript ranges bound
so this relates then to the size of an
array at what point is the size
determined does it happen before runtime
or can it happen during the course of
run time
then somewhat related to the third point
when does allocation take place so here
we're talking about memory allocation
for the array
in other words at what point then will
the spaces for the various data elements
be reserved in memory again does this
occur before runtime or does it occur
during runtime and can it change through
the course of execution
then what is the maximum number of
subscripts so as you know we can create
multi-dimensional structures using
arrays so for instance a matrix can be
represented using a two-dimensional
array a cubic structure can be
represented using a three-dimensional
array so is there some kind of limit to
how many subscripts we can have which
would then limit the dimensionality of
an array structure
or is there no limit
this is a design issue that needs to be
decided up front for the programming
language in question
then are ragged
or rectangular arrays allowed or are
both allowed we'll get to what ragged
and rectangular arrays are in a moment
then can array objects be initialized so
what we are talking about here is
assigning an initial value to a variable
that is an array variable
so is there a way in other words to
populate the array with initial values
before the programmer can then actually
use the array
and then finally are any kind of array
slices supported and again we'll get
into what array slices are in a moment
array indexing or subscripting is a
simple mapping from indices to elements
contained within an array
so over here we have an example
we have an array referred to by array
name
and this is the identifier of the
array that we are going to be indexing
or subscripting we then also have an
index value which is
then specified for our array and this
indicates then the position
within the array that we want to access
an element from
relative to the first position in the
array
we then have a mapping that is performed
and this mapping then maps our index
value to an element that is actually
contained within our array
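One common way to picture this mapping, assuming the elements are stored one after another in memory and all have the same size, is as a small address calculation. The function name element_address, the 4 byte element size and the base address of 1000 are all assumptions for this sketch.

```python
def element_address(base, index, lower_bound=0, element_size=4):
    """Address of the element at index, relative to the first element."""
    return base + (index - lower_bound) * element_size

print(element_address(1000, 0))                  # first element of a zero-based array
print(element_address(1000, 3))                  # three elements further along
print(element_address(1000, 1, lower_bound=1))   # first element of a one-based array
```

The lower_bound parameter covers languages where indexing starts at one or at a value chosen by the programmer.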
now of course different kinds of
syntactic notations can be used to
represent indexing and the majority of
programming languages use the notation
that you will be familiar with namely
square brackets that contain the index
that we want to access within our array
however fortran pl/i and ada all use
parentheses in other words round
brackets in order to
contain the index value that
we wish to access within our array
now in ada the reasoning behind this is
that it provides uniformity between
array references and function calls the
reason for this is that ada considers
both of these to be mappings so in the
case of an array reference we are
performing a mapping from an index to a
specific array element if we are talking
about function calls we are mapping
parameter values into the function so
because ada considers these two
operations to be similar because they
both involve mappings it then uses the
parenthesis notation
in order to indicate this similarity
next let's look at what data types are
allowable to serve as array indices or
subscripts
so what we are looking at here is what
type can a particular index or subscript
value take on
the core question here is will the
programming language only allow for
integer values to be used as array
indices
or will it allow other ordinal types to
be used
so fortran c c plus plus and java all
only allow integer values to be used as
subscripts
however in the case of c and c plus plus
this does include then any particular
value that will be automatically
converted into an integer value by means
of coercion and i've provided some
additional details in the notes for this
slide now the ada programming language
is different any ordinal type can be
used as a subscript into an array so
this includes the integers of course but
also user-defined enumeration types
boolean values as well as character
values because all of these allow for a
mapping between the variable's value
and an integer value that is positive in
nature all of these then are defined as
ordinal types and can be used as
subscripts
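Python is not ada, but the idea of subscripting with other ordinal types can be loosely sketched there, because members of an IntEnum and boolean values are accepted as list indices. The array name hours_worked and its values are made up for this example.

```python
from enum import IntEnum

class Day(IntEnum):
    mon = 0
    tue = 1
    wed = 2
    thu = 3
    fri = 4
    sat = 5
    sun = 6

hours_worked = [0] * len(Day)        # one slot per Day value
hours_worked[Day.tue] = 8            # subscript by enumeration constant
hours_worked[Day.fri] = 6

print(hours_worked)
print(hours_worked[True])            # a boolean subscript maps to position 1
```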
next let's look at support for index
range checking for arrays
so there are two main approaches that
can be used when it comes to index range
checking
firstly every access into an array could
be range checked and then the
alternative to that is to perform no
range checking
it is also possible for a programming
language to support both of these
approaches
using some kind of mechanism to
differentiate between them
so if we look at this in terms of real
world programming languages c c plus
plus perl and fortran all don't perform
any index range checking at all
so what i would like you to do at this
point is to pause the video and try to
think of an advantage and a drawback
associated with this lack of index range
checking specifically in terms of our
programming language evaluation criteria
that we've been using in this course
then java ml and c-sharp all
do support index range checking and in
fact require that every indexing or
subscripting into an array must be range
checked so again i would like you to
pause the video at this point and try to
answer
what an advantage and a drawback would
be associated with this approach to
range checking
once again in relation to the
programming language evaluation criteria
we've been using so far
then ada
again follows a slightly different route
to the other programming languages
ada does have array index range checking
by default however it can be turned off
by the programmer so what i would like
you to do at this point is again pause
the video and try to think of an
advantage associated with ada's approach
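The difference between checked and unchecked access, with a switch between the two, can be simulated in Python. This is only a sketch: the class name Array, the range_checked flag and the shared memory list are all made up, and the list simply stands in for neighbouring memory.

```python
class Array:
    """Fixed-size array over a shared 'memory' list, with optional checks."""

    def __init__(self, memory, start, size, range_checked=True):
        self.memory = memory
        self.start = start
        self.size = size
        self.range_checked = range_checked

    def get(self, index):
        if self.range_checked and not 0 <= index < self.size:
            raise IndexError(f"index {index} outside 0..{self.size - 1}")
        return self.memory[self.start + index]

memory = [10, 20, 30, 99, 98]          # the array owns only the first three slots
checked = Array(memory, 0, 3)
unchecked = Array(memory, 0, 3, range_checked=False)

print(unchecked.get(3))                # silently reads a neighbouring value
try:
    checked.get(3)
except IndexError as err:
    print("rejected:", err)
```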
we'll now delve into subscript binding
and storage binding
so subscript binding is a binding of a
size to an array variable where the size
specifies how many values are contained
within this array
storage binding on the other hand is the
binding of the actual memory space that
is allocated for each of the values
contained within the array
now there are different categories of
arrays
that can be defined based on how they
deal with subscript binding and storage
binding
so first of all we have the simplest
kind of array which is a static array
so this is then simply a static local
variable but the type of this variable
is an array so you can think of this in
the context of c or c plus plus where we
have a variable that is defined inside a
function but this variable is defined as
a static variable and then the type of
the variable is for instance an integer
array
so in this case then the subscript range
is statically bound in other words it
takes place before run time so the size
of the array is then specified before
program execution begins
what this means is that the length of
the array must then be a constant
because this length is not allowed to
change
during run time it's only set once
prior to run time
where the
subscript range is then bound to the
array
now storage allocation then is also
static in nature so the memory allocated
for the array also then
is
allocated prior to runtime
so we see then that this is the case
with a static local variable which has a
type of an array
and this is because this variable is
allocated prior to runtime and then
remains allocated for the entire
execution of the program
now the main advantage associated with
purely static arrays is that they are
very efficient in terms of execution
time and this is because all of the
allocation that takes place is static in
nature it does not occur dynamically at
run time in general any dynamic
allocation is
usually slower than a static allocation
now what i would like you to do at this
point is to pause the video once more
and try to think of a disadvantage
associated with static arrays
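A static array as described here might look like the following C++ sketch. The function `record_hit` and the constant `kSize` are hypothetical names chosen for the example. The point is only that the length is a compile-time constant and that the array keeps its storage, and therefore its contents, across calls.

```cpp
#include <cassert>
#include <cstddef>

// The length must be a constant, because the subscript range is bound before run time.
constexpr std::size_t kSize = 4;

// Hypothetical function: counts how many times each slot has been hit.
// `hits` is a static local array, so its storage exists for the whole
// execution of the program and is not re-created on each call.
int record_hit(std::size_t slot) {
    static int hits[kSize] = {0, 0, 0, 0};
    assert(slot < kSize);
    return ++hits[slot];
}
```

Because the array persists, a second call with the same slot returns a larger count than the first.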
now the next category of arrays that
we'll look at is the fixed stack dynamic
array
so here we have the subscript range
which is statically bound again this
means that the length must be a constant
however allocation for the array occurs
at declaration elaboration time and
de-allocation usually occurs at the end
of the array's scope
so an example of this kind of array
structure is in c or c plus plus where we have a
local non-static variable that has the
type of an array
now in that case a variable cannot be
used to define the length of the array
the length of the array must be
specified by means of a constant so a
literal integer or a named constant for
instance
and so this then
allows
the
size of the array in other words the
subscript range to be defined then
statically prior to runtime because this
is defined by a constant value
however we will only allocate memory for
the array once execution reaches the
declaration of that array and then
similarly once the scope of the array
variable then ends then that memory
space will be deallocated automatically
so in other words we have an array that
behaves in the same way in terms of
storage allocation
as a regular
local variable which is non-static in
nature would have
so the main advantage of fixed stack
dynamic arrays is that they are very
space efficient what i would like you to
do at this point is to pause the video
and try to explain why fixed stack
dynamic arrays are efficient in terms of
storage space in memory
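As a rough illustration of a fixed stack-dynamic array, consider this C++ sketch. The function `sum_of_squares` and the constant `kMax` are assumptions made for the example. The array length is a named constant, while the storage behaves like any other non-static local variable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function: sums the first n squares using a local scratch array.
int sum_of_squares(int n) {
    constexpr std::size_t kMax = 8;  // subscript range is fixed by a constant
    int squares[kMax];               // storage comes from the run-time stack for this call
    assert(n >= 0 && static_cast<std::size_t>(n) <= kMax);
    int total = 0;
    for (int i = 0; i < n; ++i) {
        squares[i] = (i + 1) * (i + 1);
        total += squares[i];
    }
    return total;                    // squares is released automatically when its scope ends
}
```

Each call gets its own copy of `squares`, and nothing has to be freed by hand.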
next we have stack dynamic arrays so in
stack dynamic arrays the subscript range
is dynamically bound
this means that the length then is
allowed to be a variable obviously a
constant could also be used
the storage allocation for a stack
dynamic array is also dynamically bound
so in other words both the subscript
range and the storage
binding will then be determined at
runtime
now both the subscript range and the
storage binding will be then fixed after
the initial binding takes place
now there is unfortunately no analog for
this in the c and c plus plus programming
languages but if we could define an
array variable locally inside a function
and we could use for the size of that
array a variable this then means that
the size of the array could be different
in different executions of the program
depending upon what value that variable
then receives for the size of the array
however that size cannot change once it
has been allocated so the array can't
grow and shrink over the course of
runtime
also then
storage space would be allocated for the
array once the declaration of that array
is reached and then generally that array
would be deallocated once execution
reaches the end of the array's scope
so the main advantage with a stack
dynamic array is flexibility and this is
because the array size doesn't need to
be known until the array is actually
used and of course then the array size
can be allocated from a variable which
gives us increased flexibility you could
for example have the user of a program
type in the size of the array which
would then result in that much memory
space being allocated
next we have fixed heap dynamic arrays
and here the subscript range and storage
binding are both dynamic but both of
these are fixed after allocation takes
place
now binding is done by means of an
explicit request so a good example of
this is dynamic arrays in c plus plus
so we can create then a pointer
to an array and then we would
specifically use the new special word in
order to create a memory allocation for
the array also at this stage the
subscript binding takes place because
the size of the array can be specified
by means of a variable
once we're done with the array then we
have to use a delete directive and this
will then indicate that the memory space
can be deallocated for our array so here
we have an example of this in c plus plus
we have a pointer named p which is an
integer pointer and we've then assigned
to that
a new integer array with the size
specified in the brackets over here so
that size can then be a variable and
what this then means is that the
subscripts have been bound at runtime
because we've specified the size of the
array but also storage space has been
allocated now what's important to
understand here is that storage is
allocated from the heap not the run time
stack so what this means then is that an
array that is allocated in this way
can then continue to exist
once the scope of its variable has ended
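A fixed heap-dynamic array along these lines could be sketched in C++ as follows. The helper names `make_counting_array` and `sum_and_release` are hypothetical. Note that for arrays the release is written `delete[]` rather than plain `delete`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: allocates an int array whose length is only known at run time.
int* make_counting_array(std::size_t n) {
    int* p = new int[n];  // subscript range and heap storage are both bound here
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = static_cast<int>(i);
    }
    return p;             // the array outlives the scope of the local pointer p
}

// Hypothetical helper: adds up the elements, then explicitly releases the array.
int sum_and_release(int* p, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    delete[] p;           // array form of delete; the storage returns to the heap
    return total;
}
```

The array created in one function is still valid in the other, which is exactly what allocation from the heap rather than the run-time stack makes possible. Its size, however, stays at `n` for its whole lifetime.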
the final array category that we have in
terms of subscript binding and storage
binding is the heap dynamic array so
this array is as dynamic as we can make
it the binding of both subscript ranges
and storage space will occur dynamically
at runtime and the binding can also
change any number of times through the
course of the program's execution
now it's also important to understand
that storage once again here is
allocated from the heap
rather than the runtime stack
so what this then means in summary is
that a heap dynamic array can grow and
shrink as items are added to the array
and removed from the array
so the main advantage being associated
with heap dynamic arrays is that they
are very flexible
arrays are exactly the size that they
need to be no larger no smaller and they
can continuously adapt as we change the
number of elements contained within the
array
so what i would like you to do at this
point is to pause the video and try to
think of a disadvantage associated with
heap dynamic arrays
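The lecture does not name a C++ facility for heap-dynamic arrays, so the following sketch swaps in `std::vector` from the standard library as a stand-in. The function name `grow_then_shrink` is made up for the example. It only shows the number of elements changing while the program runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function: grows a vector to five elements, removes two,
// and reports how many elements remain.
std::size_t grow_then_shrink() {
    std::vector<int> v;            // starts with no elements
    for (int i = 0; i < 5; ++i) {
        v.push_back(i * 10);       // the subscript range grows with each insertion
    }
    assert(v.size() == 5);
    v.pop_back();                  // the subscript range shrinks again
    v.pop_back();
    return v.size();
}
```

The valid subscripts go from none, to 0 through 4, and back to 0 through 2, all within a single run.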
so let's look at some modern programming
languages and what support they provide
for these array categories that i
discussed over the previous three slides
now c and c plus plus we've already looked
at largely over the course of the
previous discussion
if we have arrays that are declared
locally inside a function and they use
the static special word then they are
static arrays however if we have arrays
that are defined locally inside a
function but they don't use the static
special word then they are fixed stack
dynamic arrays
both c and c plus plus also provide
support for heap dynamic arrays
we looked at an example in c plus plus where
we used a pointer and then the new
special word
in order to allocate memory for a heap
dynamic array
and then we also use delete to
de-allocate memory space for the array
and c uses a similar approach however it
doesn't have special words for
allocation and deallocation instead it
uses library functions such as malloc
and free to allocate and deallocate
memory for an array
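The C-style approach can be sketched as below, written here as C++ using `std::malloc` and `std::free` from `<cstdlib>` so that all examples stay in one language. The helper name `make_zeroed_array` is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocates n ints with malloc and sets them all to zero.
// Returns nullptr if the allocation fails.
int* make_zeroed_array(std::size_t n) {
    int* p = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (p == nullptr) {
        return nullptr;
    }
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = 0;
    }
    return p;  // the caller must eventually pass this pointer to std::free
}
```

A caller would use the result like any other array and then call `std::free` on it exactly once when finished.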
now c sharp has relatively similar
support to c plus plus which of course
makes sense because c sharp is largely
derived from c plus plus however it does
additionally provide an arraylist class
which is a heap dynamic array
and so this means that objects of type
arraylist can then grow and shrink over
the course of execution as values are
added to or removed from this instance
in java all arrays are fixed heap
dynamic and so they use a fairly similar
notation to c plus plus
using a new special word to indicate
that memory allocation must happen for
the array however in the case of java
there's no explicit delete directive
instead java relies on the garbage
collector to clean up memory once an
array is no longer needed
then the perl javascript python and ruby
scripting languages all support heap
dynamic arrays and this of course makes
sense because all of these are scripting
languages and therefore the increased
flexibility provided by heap dynamic
arrays will come in useful for
programmers who want to quickly put a
program together without too much effort
finally for this lecture we will talk
about array initialization
so array initialization refers to
structures provided by a programming
language in order to allow for the
initialization of arrays at the time of
storage allocation
so over here we have an example in c c
plus plus java and c sharp we can see that we
are defining a variable called list and
this is an array of integer values
now over here we can see that we have
performed an assignment and this
assignment is then to a range of values
enclosed in braces
and there we have listed then the values
that must be initially contained within
this array
so at subscript or index 0 we will have
the value 4
at subscript 1 the value 5 at subscript
2 the value 7 and at subscript 3 the
value 83
notice also that we haven't provided a
size for this array and the reason for
this is that the initialization
can determine for us what the size of
the array needs to be
from the number of values contained in
the braces so in this case we would have
an array of size 4 that would be created
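The initialization just described can be written in C++ roughly as follows. The wrapper function `initialized_element` is a hypothetical name used only to make the snippet self-contained. The `static_assert` confirms at compile time that the size was taken from the number of values in the braces.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns element i of an array whose size is
// deduced from its initializer list.
int initialized_element(std::size_t i) {
    int list[] = {4, 5, 7, 83};  // no size given; the compiler counts the values
    constexpr std::size_t length = sizeof(list) / sizeof(list[0]);
    static_assert(length == 4, "size is deduced from the initializer list");
    assert(i < length);
    return list[i];
}
```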
now we can also do the same kind of
thing with character strings in c and c
plus plus
so over here we have defined a variable
called name and this is an array of
characters and we can assign to that
then the string literal fred in this
case we know that this is a string
literal because it contains the
characters for the word fred in double
quotes
now once again here we haven't specified
the size of this array and this is
because once again the size can be
determined from what is assigned to the
array so in this case sufficient space
would be allocated for all of the
characters in the literal string
so we have then space for f r e and d in
other words four character spaces but
because we know that arrays
of characters are null terminated in c
and c plus plus we would then have an
additional space for that null character
bringing the total length of the name
array to 5.
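The difference between the four visible characters and the five stored characters can be checked with a short C++ sketch. The function names `name_storage_size` and `name_visible_length` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of chars actually stored in the array,
// including the terminating null character.
std::size_t name_storage_size() {
    char name[] = "fred";
    return sizeof(name);
}

// Hypothetical helper: number of characters before the null terminator.
std::size_t name_visible_length() {
    char name[] = "fred";
    return std::strlen(name);
}
```

Here `sizeof` counts the null character and `std::strlen` does not, which accounts for the difference of one.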
we can also do things like create arrays
of strings in c and c plus plus
so here we are creating an array called
names and this array then stores
character
pointers
we can then perform an assignment so
once again we then have an assignment
which then has a series of values which
are contained inside braces
and contained within these braces we
then have string literals namely bob
jake and joe
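An array of strings of this kind might be sketched in C++ as shown below. The accessor `name_at` is a hypothetical name. The element type is written `const char*`, because C++ does not allow a string literal to be stored in a plain `char*`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the i-th name from an array of character
// pointers, each of which points at a string literal.
const char* name_at(std::size_t i) {
    static const char* const names[] = {"Bob", "Jake", "Joe"};
    constexpr std::size_t count = sizeof(names) / sizeof(names[0]);
    static_assert(count == 3, "one pointer per string literal");
    assert(i < count);
    return names[i];
}
```

The array itself holds three pointers only. The characters of each name live in the string literals that the pointers refer to.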
java has a similar kind of approach but
it would use of course string objects so
over here we are creating an array
called names that contains strings and
then we use our initialization list in
the braces in order to then
represent the string literals that will
be contained within the names array
now ada provides fairly elaborate
support for array initialization and
over here we have an example of this
so here we are declaring a variable
called list and we specify that list is
an array
where the first index or subscript of
the array is 1 and the last index or
subscript is 5. we also specify that
this array contains integer values
so now we perform an assignment on the
second line over here
and this line specifies that at index or
subscript 1 we will store a value of 17
at index or subscript 3 we store a value
of 34 and then at all other subscripts
or indices we will place values of zero
so this gives us relatively fine grained
control
over which values are stored within an
array and this notation also has the
potential for being much more compact
than an explicit initialization for each
of the individual values contained
within the array
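C++ has no direct counterpart to this Ada notation, so the following is only a rough emulation of the resulting array contents. The function name `make_list` is an assumption, and C++ subscripts start at 0, so Ada's subscripts 1 and 3 become 0 and 2 here.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: builds the five-element array the Ada example describes.
std::array<int, 5> make_list() {
    std::array<int, 5> list{};  // every element starts at zero (the "all others" part)
    list[0] = 17;               // Ada subscript 1
    list[2] = 34;               // Ada subscript 3
    return list;
}
```

The emulation needs one statement per non-zero element, whereas the Ada version states everything in a single initialization.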
finally python uses what are referred to
as list comprehensions
in order to perform array initialization
but we'll discuss this later on in this
chapter
all right so that then concludes this
lecture's discussion
on
data types in the next lecture we will
be continuing our discussion on arrays
looking at some of the more exotic kinds
of arrays and array operations that
might be supported
and then we will also look at
records
tuples and lists