Streams

IIT Madras - B.S. Degree Programme
12 Feb 2022, 28:28

Summary

TLDR: The video script delves into the concept of iterators and streams in programming, particularly in Java. It explains how iterators sequentially access elements in a collection, while streams offer a more declarative approach, allowing for parallel processing and lazy evaluation. The script illustrates stream operations like filter, map, and reduce, highlighting their utility in transforming and summarizing data efficiently. It also introduces the idea of infinite streams and the importance of handling empty streams with optional types.

Takeaways

  • 🔄 An iterator processes elements in a collection one by one in a sequence, allowing for operations like counting words longer than a certain number of letters.
  • 📚 A collection can be thought of as a stream, where elements are processed in a continuous flow, similar to watching a news ticker on TV.
  • 🛠 Stream processing is an alternative to iterative processing, allowing for a different style of handling collections through functions that take and produce streams.
  • 🌐 Stream processing can be more efficient as it supports parallel processing, making use of multiple computational resources for operations that are independent.
  • 📈 The concept of lazy evaluation in streams means that operations are performed only when necessary, potentially saving resources by not generating entire sequences.
  • 🔧 Stream operations are typically non-destructive, meaning the original collection remains unchanged after the stream has been processed.
  • 🚀 Streams can be created from various sources, including collections, arrays, or functions that generate values on demand.
  • 🌀 Stream transformations include operations like filtering, mapping, and limiting, which can be chained together in a pipeline to process data.
  • ⏸ The use of 'limit' and 'skip' allows for control over the number of elements processed in a stream, enabling more granular data handling.
  • 🔍 Operations like 'max', 'min', and 'findFirst' can be used as terminal steps in a stream pipeline to produce a single result from a stream.
  • ⚠️ Handling empty streams is important as certain operations may not be valid without elements to process, often returning an 'Optional' type to indicate the possibility of no result.

Q & A

  • What is an iterator and how does it work with collections?

    -An iterator is a mechanism that allows you to traverse through a collection, such as a list or a set, and produce the elements one by one in a sequence. It is used to run through items in a collection and perform operations on them, such as counting words longer than 10 letters in a list of strings.

  • How does stream processing differ from iterative processing using an iterator?

    -Stream processing allows you to think of the elements in a collection as a stream of values, similar to watching a TV stream. Instead of explicitly iterating through the collection one element at a time, you write functions that take these streams as input, process them, and produce a stream as output. This approach is more declarative and can potentially be parallelized, unlike iterative processing which is more imperative and sequential.

  • What is the advantage of using streams over iterators for processing collections?

    -Streams offer a declarative programming style where you describe the transformations you want to achieve rather than how to achieve them. Operations that do not depend on each other can be performed in parallel if computational resources are available. Streams also support lazy evaluation, so processing can be terminated early once the desired result is achieved, making them more efficient in some scenarios.

  • How can you convert a collection into a stream in Java?

    -In Java, you can convert a collection into a stream by calling the `stream()` method declared in the `Collection` interface. Calling it on a collection returns a stream of its elements. For arrays, you can use the `Stream.of()` static method to achieve the same.

  • What is the concept of lazy evaluation in the context of streams?

    -Lazy evaluation in streams means that the next step in processing is started as soon as the output of the previous step is available. This allows for efficient processing as you don't need to generate the entire stream before starting the next operation. It also enables operations like stopping the stream generation once a certain condition is met.

  • How can you create a stream from a function in Java?

    -You can create a stream from a function using `Stream.generate()` which takes a function that generates values on demand. This is useful when you don't have a collection to start with but need to produce a sequence of values dynamically.

  • What is the purpose of the `filter` operation in streams?

    -The `filter` operation in streams is used to select a subset of the stream's elements based on a predicate. It keeps only those elements for which the predicate returns true, effectively allowing you to extract specific items from the stream, such as words longer than 10 letters.

  • How does the `map` operation transform the elements of a stream?

    -The `map` operation applies a function to each element in the stream, transforming them into a new form. For example, it can be used to extract the first letter of each word in a stream of strings or to convert each element in a way that suits the subsequent processing needs.

  • What is the significance of `flatMap` in stream processing?

    -The `flatMap` operation is a variant of `map` that flattens the output. If the `map` operation results in a stream of lists (or other nested structures), `flatMap` collapses these into a single stream, making it easier to handle the resulting elements as a simple sequence.

  • How can you handle empty streams in Java when performing operations like `max` or `findFirst`?

    -Operations like `max`, `min`, and `findFirst` on streams return an `Optional` type to handle cases where the stream is empty. This approach ensures that the result is well-defined even when no elements are present in the stream, preventing errors like `NoSuchElementException`.

Outlines

00:00

🔄 Understanding Iterators and Streams

This paragraph introduces the concept of iterators and streams in collection processing. An iterator is used to traverse through a collection by producing one element at a time sequentially. The paragraph then contrasts this with the stream approach, where elements are processed as a continuous flow, akin to a TV news ticker. It also discusses the shift from imperative to declarative programming styles, highlighting the benefits of stream processing, such as the potential for parallel execution and lazy evaluation, which allows for efficient handling of potentially infinite data sequences.

05:04

🌐 Stream Processing and Lazy Evaluation

The second paragraph delves deeper into stream processing, emphasizing the concept of lazy evaluation where operations on the stream are performed only as needed, rather than upfront. This approach allows for the efficient handling of large or even infinite data sets by generating values on demand. The paragraph also explains how streams can be created from existing collections or generated on-the-fly using functions, and it outlines the basic operations that can be performed on streams, such as filtering and mapping.

10:08

🛠 Stream Creation and Transformation

This paragraph focuses on the methods of creating streams from various sources, including existing collections, arrays, and on-demand generation using functions. It discusses the use of 'Stream.generate()' and 'Stream.iterate()' for creating infinite or bounded streams. The paragraph also touches on the transformation of streams through operations like filtering, mapping, and flat mapping, which allow for the modification and extraction of data in a declarative manner.

15:09

📈 Stream Operations: Limit, Skip, and Termination Conditions

The fourth paragraph discusses operations that control the flow and size of a stream, such as 'limit', which caps the number of elements in a stream, and 'skip', which discards a specified number of initial elements. It introduces conditional termination with 'takeWhile' and 'dropWhile', explaining how these operations allow the stream to continue or start based on the satisfaction of a given predicate. The paragraph highlights the importance of these operations in managing stream processing, especially in scenarios involving potentially infinite sequences.

20:13

🔍 Stream Termination and Result Collection

This paragraph covers the final stages of stream processing, where after a series of transformations, the stream is collapsed into a single result, such as a count, maximum, or minimum value. It explains how operations like 'count', 'max', and 'min' are used and addresses the challenge of handling empty streams, which may result from selective filtering. The paragraph also introduces the concept of 'Optional' types as a means to manage operations that might not yield a result due to an empty stream.

25:15

📝 Stream Processing Summary and Optional Types

The final paragraph summarizes the stream processing capabilities, from creating streams to applying transformations and obtaining results. It reiterates the functional programming style enabled by streams and the advantages of this approach over traditional iteration. The paragraph also foreshadows a discussion on 'Optional' types, which are crucial for handling operations that may result in no output, providing a preview of topics to be covered in subsequent lectures.

Keywords

💡Collections

Collections refer to a group of objects that can be manipulated as a whole. In the context of the video, collections are used to represent a set of items, such as a list or a set, which can be iterated over using an iterator. The script mentions that collections can be converted into streams, allowing for a different style of processing where elements are treated as a sequence.

💡Iterator

An iterator is an object that enables a programmer to traverse a collection. The video script explains that an iterator takes objects from a collection and produces them one by one in a sequence. This is a fundamental concept in the video, as it sets the stage for discussing how collections can be processed using different methods, including iterators and streams.

💡Stream

A stream is a sequence of elements that can be processed in a declarative manner, as opposed to the imperative style of using an iterator. The script describes streams as a way to view the elements of a collection as a continuous flow, similar to a stream of data on a TV screen. Streams allow for operations like filtering and mapping to be applied in a more functional programming style.

💡Filter

Filter is a stream operation that allows for the selection of a subset of elements from a stream based on a specified condition. In the video, filtering is used to demonstrate how to extract words longer than 10 letters from a list of strings. This operation is crucial in stream processing as it helps in narrowing down the data to what is relevant for further processing.

💡Map

Map is another stream operation that applies a function to each element in the stream, transforming them into a new form. The script uses the example of mapping each long word to its first letter, illustrating how map can be used to transform data within a stream. This operation is essential for data transformation in stream processing.
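
The filter-then-map pipeline described above can be sketched in Java as follows. This is a minimal illustration, not code from the lecture: the class name, method name, and sample words are hypothetical, and `collect(Collectors.toList())` is used here only as a convenient way to inspect the result.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; names and sample data are illustrative only.
final class LongWordInitials {
    // Keep only words longer than minLength letters, then map each survivor to its first letter.
    static List<Character> initialsOfLongWords(List<String> words, int minLength) {
        return words.stream()
                .filter(w -> w.length() > minLength) // predicate: true means "keep"
                .map(w -> w.charAt(0))               // transform each kept word
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "declarative", "iterator", "transformation");
        System.out.println(initialsOfLongWords(words, 10));
    }
}
```

The input list is left unchanged; the pipeline only produces a new list of initials.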

💡Flat Map

Flat map is a variant of the map operation that flattens the resulting stream of elements. The video script explains this concept by showing how a map operation that returns a list for each word can be flattened into a single stream of characters. This is useful when the transformation results in a nested structure that needs to be simplified.
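
The difference between `map` and `flatMap` can be sketched as below. The helper `letters` and the class name are hypothetical, chosen only for this illustration; the lecture's own example may split words differently.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the helper and sample data are illustrative only.
final class FlatMapLetters {
    // Hypothetical helper: split a word into a list of one-letter strings.
    static List<String> letters(String word) {
        return List.of(word.split(""));
    }

    // map keeps the nesting: one inner list per word.
    static List<List<String>> nested(List<String> words) {
        return words.stream()
                .map(FlatMapLetters::letters)
                .collect(Collectors.toList());
    }

    // flatMap collapses the inner lists into a single stream of letters.
    static List<String> flattened(List<String> words) {
        return words.stream()
                .flatMap(w -> letters(w).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("ab", "cd");
        System.out.println(nested(words));
        System.out.println(flattened(words));
    }
}
```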

💡Limit

Limit is a stream operation that restricts the number of elements in a stream to a specified amount. The script mentions using limit to take only the first 100 random numbers from an infinite stream. This operation is useful for controlling the amount of data processed, especially when dealing with potentially infinite streams.

💡Skip

Skip is a stream operation that drops the first n elements of a stream. The video script uses skip to demonstrate how to ignore the first few elements of a stream, such as skipping the first 10 random numbers. This operation is useful for starting the processing from a specific point in the stream.
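
A small sketch of `limit` and `skip` follows. The class and method names are hypothetical. The first method uses a counting stream instead of random numbers so that the result is predictable; the second uses `Math::random`, as in the script's example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; names are illustrative only.
final class LimitAndSkip {
    // Infinite stream 0, 1, 2, ...: skip drops the first toSkip values, limit keeps the next toKeep.
    static List<Integer> window(int toSkip, int toKeep) {
        return Stream.iterate(0, n -> n + 1)
                .skip(toSkip)
                .limit(toKeep)
                .collect(Collectors.toList());
    }

    // Take only the first howMany values from an infinite stream of random numbers.
    static long countRandoms(int howMany) {
        return Stream.generate(Math::random)
                .limit(howMany)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(window(10, 3));
        System.out.println(countRandoms(100));
    }
}
```

Without `limit`, neither pipeline would terminate, because both sources are infinite.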

💡Take While

Take while is a stream operation that continues to include elements in the stream as long as a certain condition is met. The script explains this by saying that the stream will stop including elements once a random number smaller than 0.5 is encountered. This operation is useful for dynamically determining the end of the stream based on the elements' values.

💡Drop While

Drop while is a stream operation that skips elements at the start of the stream for as long as a given condition holds; once an element fails the condition, that element and all subsequent elements are included. The script contrasts this with take while, explaining that drop while will ignore elements until the condition becomes false, and then start including elements. This operation is useful for dynamically determining the start of the stream based on the elements' values.
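
The complementary behaviour of `takeWhile` and `dropWhile` can be sketched as below. Both methods are available from Java 9 onwards. The class name, the threshold of 0.5 as a fixed predicate, and the sample values are assumptions for illustration; a fixed list is used instead of random numbers so that the outcome is predictable.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; names and sample data are illustrative only.
final class TakeAndDropWhile {
    // Keep the leading run of values that are at least 0.5; stop at the first smaller value.
    static List<Double> leadingHigh(List<Double> values) {
        return values.stream()
                .takeWhile(v -> v >= 0.5)
                .collect(Collectors.toList());
    }

    // Discard that same leading run; keep everything from the first smaller value onwards.
    static List<Double> afterLeadingHigh(List<Double> values) {
        return values.stream()
                .dropWhile(v -> v >= 0.5)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> values = List.of(0.9, 0.7, 0.2, 0.8);
        System.out.println(leadingHigh(values));
        System.out.println(afterLeadingHigh(values));
    }
}
```

Note that 0.8 appears in the `dropWhile` result even though it satisfies the predicate: once the first failing element is seen, nothing further is dropped.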

💡Optional

Optional is a type in Java that is used to represent a value that may or may not be present. The script mentions optional in the context of operations like max, min, and find first, which might not always have a valid result if the stream is empty. Optional types help in handling cases where the result of a stream operation might not be defined.
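
A minimal sketch of how an empty stream surfaces as an empty `Optional` is shown below. The class and method names are hypothetical, and comparing words by length is just one possible choice of comparator for `max`.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical example class; names and sample data are illustrative only.
final class LongestWord {
    // Longest word among those longer than minLength; empty if the filter removes everything.
    static Optional<String> longest(List<String> words, int minLength) {
        return words.stream()
                .filter(w -> w.length() > minLength)
                .max(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        // A long word exists, so the Optional holds a value.
        System.out.println(longest(List.of("stream", "declarative"), 10).orElse("(none)"));
        // Nothing survives the filter, so the fallback is used instead.
        System.out.println(longest(List.of("map", "filter"), 10).orElse("(none)"));
    }
}
```

Using `orElse` (or checking `isPresent()`) forces the caller to decide what to do when no element survives the filter.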

Highlights

Introduction to collections and iterators, explaining how iterators produce elements in a collection one by one.

The concept of using an iterator to process a list of strings, such as words from a text file, to perform operations like counting words longer than 10 letters.

The alternative approach of using streams in Java to process collections, drawing an analogy with a TV stream for understanding.

Explanation of how streams allow for processing collections in a declarative programming style, as opposed to the imperative style.

The advantage of stream processing for parallel execution when operations on the stream are independent.

Introduction to lazy evaluation in streams, where operations are performed as soon as input is available, without waiting for the entire sequence.

The ability to create infinite streams using Java's Stream.generate() and Stream.iterate(), and the concept of lazy evaluation in these scenarios.

The process of creating a stream from a collection using the stream() function, and the conversion of arrays into streams using Stream.of().

Demonstration of filtering a stream using a predicate to select a subsequence, such as keeping only words longer than 10 letters.

The map operation in streams, which applies a function to each element, and the concept of flatMap for handling nested structures.

Limiting and skipping elements in a stream using limit() and skip() functions, and their use in controlling stream processing.

The takeWhile and dropWhile operations for conditional termination and initiation of stream processing.

Merging, concatenating, and sorting streams, and other natural operations that can be performed on streams.

The final operations on streams, such as count, max, min, and the handling of empty streams with optional types.

The transformation of streams through a pipeline of operations, similar to connecting pipes, and the concept of non-destructive stream processing.

The importance of handling empty streams and the use of optional types to deal with potential absence of elements in stream operations.

The overall shift from traditional iterators to streams for a more functional programming style with Java collections.

Transcripts

play00:15

So, we have seen collections. So, when we have  a collection, we have a lot of items in a single  

play00:21

object. And we can run through these using an iterator. So, what does an iterator do? An

play00:27

iterator takes the objects in the collection, and  produces them for us one by one as a sequence. So,  

play00:34

we take a collection, it could be a list, it could be a set, it could be anything.

play00:38

And what the iterator does is it produces the  elements in this collection in a sequence one  

play00:42

by one. So, for example, supposing we have taken a text file, and we have somehow split

play00:48

it up into a list. So, it is a list of strings. So, words is the list of strings.

play00:53

And now we want to go through all these words. And we want to, for example, pick out the words,  

play00:58

which are longer than 10 letters and  count how many of them there are.  

play01:02

So, then we run this iterator. So, we  run through every word in this list.  

play01:06

And then each time we see a word whose length  is bigger than 10, we increment the count. So,  

play01:10

this is a standard way of processing an iterator,  I mean, processing a collection using an iterator.  
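
The iterator-based count just described can be sketched in Java as follows. The class name, method name, and sample words are hypothetical; only the idea (walk the list, increment a counter for each word longer than 10 letters) comes from the lecture.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical example class; names and sample data are illustrative only.
final class IteratorCount {
    // Imperative style: we manage the iterator and the counter ourselves.
    static int countLongWords(List<String> words) {
        int count = 0;
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            String w = it.next();
            if (w.length() > 10) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "declarative", "iterator", "transformation");
        System.out.println(countLongWords(words));
    }
}
```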

play01:18

So, as we said, it generates all these  elements in a sequence. So, why not work  

play01:22

with the sequence itself? So, an alternative approach that Java provides us is to think of

play01:28

the elements in a collection as a stream.  So, imagine that you are watching TV,  

play01:35

and you have some scores or stock market prices or something scrolling at the bottom.

play01:40

So, it is a stream of values, which is going  across the screen. So, in the same way, rather  

play01:44

than explicitly going through the collection,  one element at a time using an iterator,  

play01:49

we assume that the iterator is sort of emitting  these values to us as a stream. So, we write some  

play01:55

functions which take these streams as input, and  process them and produce a stream as output.  

play02:01

So, this is the different style of processing  collections using streams. So, here, for instance,  

play02:07

just as an example, we take this set of words,  which is given to us as a list, we convert it into  

play02:13

a stream, then we run a function which filters  out. So, it extracts out those words from the  

play02:20

stream. So, it is like a stream of objects is  going past you on a conveyor belt, and you are  

play02:25

pulling out the words that you do not like. So,  you are only allowing the words to pass through,  

play02:30

which are longer than 10 letters. And then  eventually, the words that survived this process  

play02:34

will get counted. So, this is how stream  processing works at a high level.  
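
The same count written as a stream pipeline might look like this. It is a sketch under the assumption that the words are held in a `List<String>`; the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical example class; names and sample data are illustrative only.
final class StreamCount {
    // Declarative style: create the stream, filter it, then count what survives.
    static long countLongWords(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 10)
                .count();
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "declarative", "iterator", "transformation");
        System.out.println(countLongWords(words));
    }
}
```

Note that `count()` returns a `long`, and that no loop variable or counter appears in the code.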

play02:40

So, why do we want to do stream processing rather than iterative processing using an iterator?

play02:45

Well, remember at the beginning of this  course, we talked about the distinction  

play02:50

between declarative programming and imperative  programming. So, in imperative programming,  

play02:54

which is most of the programming that we  do in a language like Java, or even Python,  

play02:59

we explicitly describe how the computation should  be performed. So, we decide which variables we  

play03:05

will need to store the intermediate values, which  variables will need to store the final values,  

play03:09

how we will compute these values and update  them, how we will store them, and so on.  

play03:13

So, we gave a very detailed step by step process  of actual computation. Whereas in a declarative  

play03:18

style of programming, what you are really saying  is, I have a set of values, and I tell you some  

play03:24

transformation that I want to achieve. And  then assuming this transformation is achieved,  

play03:28

I have another set of values, and I tell you  what have to do with that, and so on.  

play03:31

So, you just keep describing the effects  of the transformation. And to some extent,  

play03:36

the implementation of these is not important, or  it is something that can be left as a detail to be  

play03:42

done separately. So, this is what stream processing does. So, basically it says,

play03:48

I do not want to know, in particular, how I am going through the collection.

play03:51

And I am just assuming that I get this stream  of objects. And for each of these objects,  

play03:56

I tell you which ones to keep, those whose length is greater than 10. And then finally,

play04:01

I tell you what to do with them, which is  to count them. So, this is declarative. So,  

play04:06

one of the advantages of doing this kind of stream processing is this: when I am using an iterator,

play04:11

I have to go through the elements one by  one, and I have no option because I keep  

play04:14

getting the next one. Whereas actually, if I  get a stream of elements, then in principle,  

play04:18

if the operations that I want to perform on  the stream are not dependent on each other,  

play04:23

then I can do it in parallel if I have  more computational resources at hand.  

play04:27

So, in fact, Java provides a way to create  a parallel stream, we will not be looking  

play04:31

too much at it right now. But this is one advantage of using streams, which comes from

play04:35

the fact that you can actually deal with the  elements in the collection in parallel. The  

play04:41

other thing that we can do with a stream is  we can do what is called lazy evaluation.  

play04:47

So, normally when I create a sequence  of functions. So, I take a function f,  

play04:53

and I feed the output of f as an input to  g. Then I need to get the entire output of f  

play04:58

before I can start working with g. So, if I have  to take a map, for example, in a normal sense,  

play05:03

if I map a list to another list, I  have to first compute the new list.  

play05:07

And then I have to take that new list and do  something else with that. Now, in lazy evaluation,  

play05:13

you start doing the next step as soon as the  first step is available. So, for instance,  

play05:17

as soon as some words come out of the stream,  I can start filtering them. Now, supposing I  

play05:23

was not interested in just counting them, but I  just wanted the first 10 words in the stream.  

play05:28

Then the moment I have produced 10 words, I  can cut off this process. So, I do not need  

play05:32

to generate the entire stream of words if, later on, I get a signal to stop. So,

play05:38

in some sense, I can pull values from the stream. It is like an iterator in a sense,

play05:42

but it is basically allowing you to think of the stream process

play05:45

as being lazy: it generates a new value whenever it needs to, and it can stop.
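
One way to observe this laziness is to count how many elements a filter actually examines when a later step needs only a few results. The sketch below is an illustration under assumptions: the class name, the counter, and the sample words are hypothetical, and the exact number of elements examined is an implementation detail of the stream library, so the code only reports it rather than relying on a specific value.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical example class; names and sample data are illustrative only.
final class LazyDemo {
    // Returns how many words the filter examined before limit(wanted) was satisfied.
    static int examinedBeforeStopping(List<String> words, int wanted) {
        AtomicInteger examined = new AtomicInteger();
        words.stream()
                .filter(w -> {
                    examined.incrementAndGet(); // side effect used only to observe laziness
                    return w.length() > 10;
                })
                .limit(wanted)
                .collect(Collectors.toList());
        return examined.get();
    }

    public static void main(String[] args) {
        List<String> words = List.of("declarative", "transformation", "map", "filter", "computational");
        System.out.println("examined " + examinedBeforeStopping(words, 2) + " of " + words.size() + " words");
    }
}
```

Because the pipeline stops once two long words have been found, the filter does not need to look at every word in the list.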

play05:52

And in particular, this means that if your  stream is actually producing an infinite  

play05:56

sequence of values, there is no question of  stopping that stream, because it is infinite.  

play06:01

But if you only need a finite set of  values coming out of the stream in the end,  

play06:06

then once you have got that, you can stop  producing this infinite stream. And we will see  

play06:10

examples of where you can actually produce  infinite streams in this lecture.  

play06:16

So, to work with streams, we have to do three  things, the first thing is we have to create  

play06:20

a stream, because if you do not have  a stream, you cannot work with it,  

play06:23

then we have to apply these transformations,  which takes streams as input and produce streams  

play06:27

as output. And finally, after doing all the  transformations that we are interested in.  

play06:33

We will apply some kind of final operation to  get a result. So, we wanted to get for example,  

play06:38

in the earlier example, we wanted to count the  number of words, which are more than 10 letters  

play06:43

long. So, that is a result, which comes  after doing some transformation. So, here,  

play06:48

for instance, we have this idea that we create  the stream, then this operation in the middle  

play06:54

is the one that transforms the stream, it  drops the words that are too short for us.  

play06:58

And this is the operation that actually computes the result. So, one thing to notice is that,

play07:05

here, we have only a single intermediate operation, filter, but

play07:10

we could have more than one such intermediate operation. So, we really have to think about these

play07:15

streams as passing through. I mean, it is like, it  is really passing through a set of operations.  

play07:22

So, it is like, you are applying for a passport.  So, you go to some desk, and then they say go to  

play07:27

the next desk, and then go to the next desk, and so on. But while you are doing that, the next

play07:30

person has also, joined the queue. So, it is not  that the entire system waits for you to finish  

play07:36

before it admits the next person into  that passport queue. Because there are  

play07:40

6 or 7 desks that you have to pass through. So, at any given point, there are people waiting  

play07:43

at each of the desks for their next step. So, in the same way, the stream will also work that way,

play07:49

and no single step actually stores the stream. So, the stream is kind of a transitory object. So,

play07:56

unless you need to, you will not actually keep a copy of the stream in between.

play08:02

So, the stream is generated at the start,  it is processed, and then at the end,  

play08:05

it is consumed by the last function, which  wants to prepare whatever summary you want,  

play08:10

like count or something like that. So, the  other thing to note is that the stream itself  

play08:18

does not get changed. So, it is not like a  function which updates its arguments. So,  

play08:24

the input stream is passed to the output stream.  And the output stream is a copy with some values  

play08:30

transformed or removed or whatever, but  the input stream remains as it is. So,  

play08:34

the stream processing is not, it does not affect  its values, it is not destructive, in a sense.  

play08:41

So, the first step, we said is to create  a stream. And we are typically creating  

play08:46

streams from collections, that was our motivation  that instead of iterating, through a collection,  

play08:50

we would like to emit the elements in a collection as a stream of values, a sequence of values.

play08:56

So, quite naturally, there must be a way to do that. And Collection, the interface

play09:02

where all the properties of various types of collections were defined, as we saw earlier,

play09:06

actually has a function called stream. So, we can take any collection, which is

play09:13

defined as part of that hierarchy under the Collection interface,

play09:16

and we can apply this function stream to  it and it will convert it into a stream  

play09:20

and the type is a kind of generic type. So, it is a stream of some underlying value.

play09:27

So, if you start with a collection, like a list of strings, the underlying value is String.

play09:30

So, if you convert it to a stream, it becomes a stream of strings. So, this is part of the

play09:35

Collection interface and this is a standard way that you would convert a collection into a stream.

play09:41

Now, if you had an array now technically  arrays are like collections, but they  

play09:46

are kind of primitive types in that sense.  Although they are objects they are predefined  

play09:50

and they are not really sitting in this  collections hierarchy. So, you need to  

play09:53

do something else. So, there is this function, a static function in Stream, which is called of,

play10:01

so, Stream.of will take an array and convert it into a stream of the same type.

play10:08

So, if you have an array of strings, then Stream.of() will convert it to a stream of strings.
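
The two ways of creating a stream described here can be sketched as follows. The class and method names are hypothetical. `Arrays.stream` is not mentioned in the lecture; it is included as a standard-library alternative to `Stream.of` for arrays.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example class; names and sample data are illustrative only.
final class CreateStreams {
    // A List<String> yields a Stream<String> via Collection.stream().
    static long fromList(List<String> words) {
        Stream<String> s = words.stream();
        return s.count();
    }

    // A String[] yields a Stream<String> via the static method Stream.of().
    static long fromArray(String[] words) {
        Stream<String> s = Stream.of(words);
        return s.count();
    }

    // Alternative for arrays (not from the lecture): Arrays.stream().
    static long fromArrayAlt(String[] words) {
        return Arrays.stream(words).count();
    }

    public static void main(String[] args) {
        System.out.println(fromList(List.of("a", "b")));
        System.out.println(fromArray(new String[] {"a", "b", "c"}));
        System.out.println(fromArrayAlt(new String[] {"a", "b", "c"}));
    }
}
```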

play10:15

Now, the other way to generate a stream is to keep  producing the value spontaneously from a function.  

play10:20

So, you do not have a collection to start  with, you just have a way of generating  

play10:24

values on demand. So, a function that will generate values on demand cannot take

play10:29

anything as an input normally. So, the easiest way to generate such a stream is to pass a

play10:34

function which takes no arguments and keeps emitting a value. So, here is one example.

play10:40

So, there is Stream.generate(). So, what it does is it takes as an argument a function. So,

play10:47

remember these are these lambda expressions.  So, this is the input to the lambda expression,  

play10:50

this is the output of the lambda expression. So,  this is a function which takes no arguments. And  

play10:55

if you take no arguments, you have to use  this open bracket close bracket to make the  

play11:01

lambda expression syntactically valid. Remember that if you have one argument,  

play11:05

then you can omit this if you say v goes  to something, then you can omit this.  

play11:09

But if you have no arguments, then you cannot just  leave a space and write an arrow. So, you have  

play11:13

to put this empty bracket. So, this says: take no input and always produce the string Echo.

play11:20

So, the first stream that we are producing is just an infinite stream, each element

play11:27

in the stream consisting of the string Echo. The  second one is doing something more interesting,  

play11:32

it is taking remember the different ways  in which we can pass these functions. So,  

play11:37

lambda expressions or we can pass the class  name and a function in that class.  

play11:40

So, this is a static function, which generates  a random number. So, this will now generate  

play11:45

these random numbers one after the other. So,  they will be different random numbers. But again,  

play11:49

in principle, this will go on as long as I  want because I have not specified anything.  

play11:54

So, these will produce values,  remember this lazy thing.  

play11:57

So, whenever I need a new value, in some  sense, I will get one more Echo or I will  

play12:01

get one more random number. But hypothetically we are producing first an infinite stream

play12:05

of Echoes and an infinite stream of random  numbers and then later on processing them.  
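
The two generated streams described above can be sketched as below. `Stream.generate(() -> "Echo")` and `Stream.generate(Math::random)` follow the lecture; the class name, method names, and the `limit` calls that make the infinite streams finite are additions for the purpose of a runnable illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; names are illustrative only.
final class GenerateStreams {
    // Lambda with no arguments: the empty parentheses are required.
    static List<String> echoes(int n) {
        return Stream.generate(() -> "Echo")
                .limit(n) // cut the infinite stream off after n elements
                .collect(Collectors.toList());
    }

    // Method reference to a static function: each call produces a fresh random number.
    static List<Double> randoms(int n) {
        return Stream.generate(Math::random)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(echoes(3));
        System.out.println(randoms(3));
    }
}
```

Because evaluation is lazy, only as many values are generated as `limit` asks for.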

play12:11

Now, the other way that you might want to generate  a stream is to start with a value and then keep  

play12:16

producing new values which depend on the previous  value, a simple example would be just a kind of  

play12:21

iterating sequence like 0 1 2 3. So, therefore, this is called iterate. So,  

play12:26

you start with an initial value, and you provide  a function which says, given the previous value,  

play12:31

what is a next value. So, if I start with 0, then  the next value is 0 plus 1, because n will be 0,  

play12:37

and the output will be 0 plus 1. Now, I will  say n is 1 and the out so, I could, for example,  

play12:42

if I wanted to go through only even numbers, I  could say n goes to n plus 2, then I will start  

play12:47

with 0, the next one will be 2 and so on. So, n is not the position n is the previous value  

play12:52

generated in this stream. So, what do I do with the previous value to generate the next value?

play12:58

So, this will now generate an infinite sequence of  integers 0 1 2 3 the way I have written it, what  

play13:03

if you wanted to run this up to a point and stop?  Well, you can do that by providing a different  

play13:11

format where you provide three arguments. So, you have the initial value, and you have  

play13:15

this function which generates the next value as  before, but now, you give a terminating condition,  

play13:22

you say that you want to stop  if the value that you produce  

play13:26

is bigger than 100. Now, notice that  this is a condition on the values not  

play13:30

a condition on position. Now, it so happens in this particular example, that the values are the

play13:35

positions. But for instance, if I had written n  squared, and I use the same condition, then what  

play13:43

will happen? Actually, not n squared; n squared plus 1 might be a better example.

play13:48

So, I will go from 0 to 0 squared plus 1, so the next one will be 1; then 1 will produce 1 plus 1, which will

play13:54

be 2; 2 will produce 2 squared plus 1, which will be 5; and then 5 will produce 25 plus 1, which is 26;

play14:02

and then the next value, 26 squared plus 1, will be more than 100. So, this will stop. So, this will only

play14:06

produce the first five values in that sequence: 0, 1, 2, 5 and 26. So, this is not the number of values, but it is

play14:11

a condition on the values being produced, and the moment that condition is reached it will stop.
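
Both forms of `iterate` can be sketched as below. The class and method names are illustrative assumptions. Note that in Java's three-argument form (available from Java 9) the predicate comes second and says when to keep going, so "stop when the value exceeds 100" is written as `n -> n <= bound`; the predicate is tested on every value, including the seed.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateDemo {
    // Two-argument form: infinite, so it is cut off here with limit.
    static List<Integer> firstEvens(int count) {
        return Stream.iterate(0, n -> n + 2)
                     .limit(count)
                     .collect(Collectors.toList());
    }

    // Three-argument form: seed, "keep going" predicate, next-value function.
    // The predicate constrains the values produced, not how many there are.
    static List<Integer> squaresPlusOneUpTo(int bound) {
        return Stream.iterate(0, n -> n <= bound, n -> n * n + 1)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(4));            // prints [0, 2, 4, 6]
        System.out.println(squaresPlusOneUpTo(100));  // prints [0, 1, 2, 5, 26]
    }
}
```

With a bound of 100 the next candidate would be 26 squared plus 1, which is 677, so the stream ends after 26.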

play14:18

So, that was as far as generating streams. So, we  can generate a stream from a collection or we can  

play14:23

generate a stream using a function. Now, we have  generated the stream and we want to process it.  

play14:28

So, we already saw the simplest form of processing  which is a filter. So, filter will take a  

play14:33

predicate. So, what is a predicate? A predicate is a function which takes an input and tells you true

play14:37

or false. And what filter does is it throws  away all the values for which the predicate  

play14:43

returns false; it keeps only those for which it returns true. So, in our case the predicate maps a word w to

play14:50

the condition that its length is greater than 10. So, if its length is greater than 10, the

play14:54

word is mapped to true; if it is not greater than 10, the word is mapped to false. So, all the false

play15:00

values, all the words whose length is less than or equal to 10, will be dropped.

play15:04

And this will filter out the short words from  your sequence and keep only the long words.  

play15:09

So, filtering, I mean, when you think of filtering in real life, you think of it both ways: you

play15:13

apply a filter, you want to keep something  out, and you want to let something through.  

play15:16

So, in filter, the predicate is actually telling you what to let through, not what is held back. We

play15:21

are not interested in what is held back here; what is held back are the short words, and they get thrown away.

play15:26

What is important is what goes through. So, that is the filtered output.
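
A small sketch of this filter, assuming the words are held in a `List<String>` (the class name, method name and sample words are illustrative, not from the lecture):

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterDemo {
    // Keep only the words for which the predicate "length > 10" returns true.
    static List<String> longWords(List<String> words) {
        return words.stream()
                    .filter(w -> w.length() > 10)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "declarative", "iterator", "parallelism");
        System.out.println(longWords(words));  // prints [declarative, parallelism]
        System.out.println(words.size());      // prints 4: the source list is unchanged
    }
}
```

The predicate describes what passes through; the short words are simply never emitted downstream.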

play15:32

So, filtering is a very simple process of selecting a subsequence out of the

play15:37

sequence. But more often, we want to transform the  sequence as we go along. And this is called map.  

play15:44

So, map applies a function to each element in the stream. So, here, for instance, again, we give an

play15:50

expression. So, notice that up to this point,

play15:54

we have filtered for all the long words, that is, words whose length is at least

play16:00

11, or rather greater than 10. And now we are saying what to do for each of those words. So, this dot, by

play16:06

the way, acts as the chaining operator: it says pass this stream to

play16:14

this thing, then to this thing, then to this thing. So, I start with this stream, I pass it to filter, and

play16:19

the output of filter is now being passed to map. So, map now has a function which it wants to apply

play16:25

to every element that it gets. And this function is to take the string; remember, this is now a

play16:30

stream of strings. And these are long strings, that is, strings that have at least length 11.

play16:35

So, it takes each string and maps it to its first letter: it takes a substring starting at

play16:40

position 0 of length 1, which is nothing but the first letter. So, now this has extracted the first

play16:45

letter of each long word. So, now it could be that map actually does something more than that.
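
The filter-then-map pipeline described above might look like this in Java. The names are illustrative; `substring(0, 1)` takes a begin index and an end index, which for these arguments yields exactly the first character.

```java
import java.util.List;
import java.util.stream.Collectors;

class MapDemo {
    // Chain filter and map: keep the long words, then map each to its first letter.
    static List<String> initialsOfLongWords(List<String> words) {
        return words.stream()
                    .filter(w -> w.length() > 10)   // stream of long strings
                    .map(s -> s.substring(0, 1))    // stream of one-letter strings
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "declarative", "iterator", "parallelism");
        System.out.println(initialsOfLongWords(words));  // prints [d, p]
    }
}
```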

play16:53

So, what if instead of extracting the first  letter, supposing we had a function, it is not a  

play16:59

real function in Java, you would have to write it. But let us suppose that we had a function called

play17:03

explode, which would take a string like "list" and explode it into

play17:13

a stream or a sequence of its four letters. So, now what map is doing is, it is taking one

play17:21

value and making it into a list of values. So, now if I have a sequence of words, a stream of words

play17:26

coming, each word actually becomes a nested list. So, I have a list, a list, a list, a list. So,

play17:32

I have a stream of lists. So, very often, we do not want a stream of lists, we want to actually

play17:38

think of this whole thing as just one stream. So, it is just that I have replaced every word by its

play17:43

constituent letters. And now I want a sequence of all the letters across all the long words. So, for

play17:49

this, I can now do something which is a variation of map called flatMap. So, earlier, I had said,

play17:56

map each string to explode of s. Now, I am saying flatMap each string to explode of s.

play18:03

So, all this says is that the output of this map is going to have one extra level

play18:09

of brackets, if you want to think of it that way, and flatMap dissolves those brackets. It collapses

play18:13

a list of lists into a list, or a stream of lists into a single stream. So,

play18:17

that is what flatMap does. So, these are some of the things that one can do to process streams.
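
To make the difference between map and flatMap concrete, here is a sketch. As the lecture says, `explode` is not a real Java library function; the version below is a hypothetical helper written for this example, and the class and method names are likewise assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapDemo {
    // Hypothetical helper: turn "list" into a stream of "l", "i", "s", "t".
    static Stream<String> explode(String s) {
        return s.chars().mapToObj(c -> String.valueOf((char) c));
    }

    // map keeps one result per word, so the output is nested.
    static List<List<String>> nestedLetters(List<String> words) {
        return words.stream()
                    .map(w -> explode(w).collect(Collectors.toList()))
                    .collect(Collectors.toList());
    }

    // flatMap dissolves the inner level and yields a single stream of letters.
    static List<String> allLetters(List<String> words) {
        return words.stream()
                    .flatMap(FlatMapDemo::explode)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("list", "map");
        System.out.println(nestedLetters(words));  // prints [[l, i, s, t], [m, a, p]]
        System.out.println(allLetters(words));     // prints [l, i, s, t, m, a, p]
    }
}
```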

play18:23

Now, these are transforming streams by value, dropping values and so on. So,

play18:29

one of the natural things that you might want to do to a stream is to limit it to a certain length.

play18:33

So, there is a function called limit, which takes  an integer and it limits it to that many things.  

play18:38

So, remember our earlier function generate,  which was generating this infinite sequence  

play18:44

of random numbers generated by Math.random. So, now we can say I do not want an infinite  

play18:49

sequence, I only want 100 of them. So, now I take  this stream, which is infinite, and I pass it to  

play18:54

this limit. So, it will stop generating numbers  after the first 100. So, this lazy evaluation  

play18:59

will take over. So, this will give me the first  100 random numbers produced by that stream.  

play19:05

Or I could say, I do not know if the first few  numbers are really random, because these are  

play19:09

all pseudo random, they start with some seed  and all that. So, let me wait for the thing to  

play19:12

become really random. So, skip the first few. So, there is a function called skip, which will

play19:16

take the stream and drop the first few elements  in this case 10, whatever number you tell it to  

play19:22

skip and then continue from there onward. So, this will now take an infinite stream,  

play19:26

and it will produce an infinite stream,  but having dropped some prefix of it.  
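
A sketch of `limit` and `skip` on infinite streams follows. The names are illustrative. The first method uses a counting stream so that the effect of skipping is visible; the second applies the same calls to the random stream from the lecture.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitSkipDemo {
    // Drop a prefix of the infinite stream 0, 1, 2, ... and then keep a fixed number.
    static List<Integer> skipThenLimit(int toSkip, int toKeep) {
        return Stream.iterate(0, n -> n + 1)
                     .skip(toSkip)    // still infinite, minus the first toSkip values
                     .limit(toKeep)   // now finite
                     .collect(Collectors.toList());
    }

    // Same idea on random numbers: ignore the first 10, then take 100.
    static List<Double> randomsAfterWarmUp() {
        return Stream.generate(Math::random)
                     .skip(10)
                     .limit(100)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(skipThenLimit(10, 5));         // prints [10, 11, 12, 13, 14]
        System.out.println(randomsAfterWarmUp().size());  // prints 100
    }
}
```

Without a `limit` (or some other short-circuiting step) somewhere in the pipeline, collecting an infinite stream would never finish.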

play19:31

Now, this stopping and starting might be based not  on a fixed number, but on some value. We already  

play19:38

saw that we had a way of generating with a condition on the value, saying stop generating when this happens,

play19:43

but here some stream is coming in, and we are looking for something, and

play19:45

we want to stop when we see that something. So, takeWhile says keep reading the input as long as

play19:52

some condition holds. So, in this case, it is saying check

play19:55

the number that you see. So, random will generate numbers between 0 and 1. So, what we are

play20:00

saying is, as long as I see random numbers which are at least half, I will continue.

play20:05

So, takeWhile says, so long as this predicate is true,

play20:12

keep taking values from the stream;

play20:16

it will stop when you first see a number smaller than 0.5. So,

play20:22

it says take while the predicate is true. So, like any while loop: while condition, do something.

play20:29

So, while this predicate is true, keep accepting things from the stream; the moment

play20:33

the predicate becomes false, stop. So long as you are producing numbers bigger than or equal to 0.5,

play20:39

we will continue; the moment you see a number which is smaller than 0.5 you will stop. So,

play20:43

this is takeWhile. So, this is like a limit with a condition.

play20:49

So, the limit is reached when the condition becomes false. And correspondingly, skip

play20:54

with a condition is called dropWhile. So, it says, as long as some condition is true,

play20:59

I will drop values. So long as the numbers are smaller than 0.05, I will ignore them; the

play21:08

moment I see a first number which is bigger than 0.05, I will start from there and continue.

play21:14

So, remember that this is only indicating the termination point. So, takeWhile says the moment

play21:20

I see a number which is smaller than 0.5, stop, even if I see bigger numbers after that. So,

play21:25

it is not a filter; it is different from filter. What filter will do is keep on applying

play21:30

the condition even beyond that point. takeWhile says the moment the condition becomes false, stop; dropWhile says

play21:35

the moment the condition becomes false, start. So, as long as the condition is true, skip; after

play21:41

the condition becomes false, start, and then ignore the condition from there on; the condition is not

play21:45

operating after that. So, that is the difference between takeWhile, dropWhile and filter. So,

play21:49

filter keeps on going while the condition is true and false at various points; here,

play21:53

once the condition switches you stop: either you terminate the stream or you start the stream.
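
The three behaviours can be compared side by side on a fixed list, so that the outcome does not depend on random values. This is a sketch under stated assumptions: the class name, the sample numbers and the shared predicate "at least 0.5" are chosen for illustration, and `takeWhile` and `dropWhile` need Java 9 or later.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class WhileDemo {
    static final Predicate<Double> AT_LEAST_HALF = x -> x >= 0.5;

    // Stops at the first element that fails the predicate.
    static List<Double> taken(List<Double> xs) {
        return xs.stream().takeWhile(AT_LEAST_HALF).collect(Collectors.toList());
    }

    // Starts at the first element that fails the predicate, then keeps everything.
    static List<Double> dropped(List<Double> xs) {
        return xs.stream().dropWhile(AT_LEAST_HALF).collect(Collectors.toList());
    }

    // Tests every element independently, all the way to the end.
    static List<Double> filtered(List<Double> xs) {
        return xs.stream().filter(AT_LEAST_HALF).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> xs = List.of(0.7, 0.9, 0.3, 0.8, 0.1, 0.6);
        System.out.println(taken(xs));     // prints [0.7, 0.9]
        System.out.println(dropped(xs));   // prints [0.3, 0.8, 0.1, 0.6]
        System.out.println(filtered(xs));  // prints [0.7, 0.9, 0.8, 0.6]
    }
}
```

Note that 0.8 and 0.6 satisfy the predicate but are not in the `takeWhile` result, because the stream was already cut at 0.3.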

play22:02

So, you can do other things with streams,  you can take two streams and merge them,  

play22:05

you can combine them, concatenate them, extract distinct elements, sort them and so on. So,

play22:10

you can look up the Java documentation, there are  quite a few natural things that you can do with  

play22:14

streams. And you can look up how to do that. So, now let us move to the last part of this  

play22:20

pipeline. So, we started with creating a stream,  then we had these functions, which will transform  

play22:25

streams in different ways, filter them, map  them, limit them by some condition, and so  

play22:30

on. And now, at the end, we reach something that we want to do with it. So, we want to, in some sense,

play22:36

collapse this whole sequence into  some number or some value. So,  

play22:41

the thing that we saw earlier was count, we  wanted to count the number of long words.  

play22:46

So, here, for instance, if we are looking at  the random generation thing, we might want  

play22:51

to say that I generate 100 random numbers. And then I filter for those which are bigger than 0.1.

play22:58

So, I am counting now those which survived this filter, those which are passing through this

play23:03

filter. So, how many of the 100 random numbers that I generated are larger than 0.1? So, since

play23:08

the random numbers are between 0 and 1, if they are uniformly distributed, you would expect that one

play23:14

tenth of them are in the range 0 to 0.1 and nine tenths of them are in the range 0.1 to 1, because we have a range

play23:20

from 0 to 1, and you are cutting it at 0.1. So, you are asking how many of them are above that point. So,

play23:25

if you are filtering for those numbers which are bigger than that, and you have 100 of them,

play23:29

you should expect count to be approximately 90, if this is a good [inaudible]. So, this

play23:33

is one way to maybe validate whether your random  function is working the way you expect or not.  
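
The counting check can be sketched as below. The class and method names are illustrative. Since the input is random, the exact count differs from run to run, so no fixed output is shown; for a threshold of 0.1 and 100 samples it should usually be near 90.

```java
import java.util.stream.Stream;

class CountDemo {
    // Generate sampleSize random numbers and count those above the threshold.
    static long countAbove(double threshold, int sampleSize) {
        return Stream.generate(Math::random)
                     .limit(sampleSize)
                     .filter(x -> x > threshold)
                     .count();   // terminal operation: collapses the stream to one number
    }

    public static void main(String[] args) {
        System.out.println(countAbove(0.1, 100));  // varies per run, typically close to 90
    }
}
```

Here `limit` comes before `filter`, so exactly 100 numbers are generated and only some of them are counted; swapping the two would instead keep generating until 100 numbers had passed the filter.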

play23:39

So, the other thing that you might want to do is  check something like the maximum or the minimum.  

play23:43

So, I take, say, the first 10 random numbers and I ask what is the maximum one.

play23:49

Now, in general, you can be flexible  about how you compare these things. So,  

play23:55

you have to pass a function in this case,  these numbers are coming from random. So,  

play23:59

they are doubles. So, we will pass the compareTo function which is defined in Double,

play24:05

and this will allow us to compare two doubles in the normal way, so that the maximum is the

play24:08

largest number. But the important thing is that you must pass a comparison function

play24:13

to max and to min. So, given that, what it will do is produce the max.

play24:18

Now, there is something strange happening  here. So, you are taking a stream of doubles,  

play24:25

you are finding the maximum. So, therefore,  the maximum should be one of those doubles, and  

play24:30

therefore the result should be a double. But the return value of this is something different

play24:35

from double. So, why is that? Well, the reason is  that I could actually have something in between  

play24:43

before I take the max for instance. Supposing I am asking for  

play24:47

filtering the first 100 numbers to include only  those which are very, very small. Now, it is  

play24:54

possible that none of the first 100 numbers meet  that predicate none of them are actually smaller  

play25:00

than 0.001. So, if none of them are that small,  then what will happen is that the effect of this  

play25:06

filtering will be to produce an empty stream. I start with 100 random numbers. And then I am

play25:11

filtering them through this very selective  predicate, which looks for very, very small  

play25:15

ones. But it is possible that in a random sequence  of 100, none of them meet it, and then I produce  

play25:20

something which is empty. And now I ask max to find the maximum of an empty sequence.

play25:27

So, as we all know, the maximum of an empty  sequence is not something that is well defined.  

play25:32

So, in this case, max would traditionally return something like false or null or something

play25:37

like that. And this is the reason why we have  this kind of optional thing. So, this says that  

play25:44

the outcome of this computation is likely to  be a double, but it may not be a double.  
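
A sketch of why `max` returns an `Optional<Double>` rather than a `Double` is given below. A fixed list is used instead of random numbers so that both the non-empty and the empty case are reproducible; the class name, method name and sample values are assumptions, and `Double::compareTo` is one comparison function that can be passed to `max`.

```java
import java.util.List;
import java.util.Optional;

class MaxDemo {
    // Largest value strictly below upperBound, if any value survives the filter.
    static Optional<Double> maxBelow(List<Double> values, double upperBound) {
        return values.stream()
                     .filter(x -> x < upperBound)
                     .max(Double::compareTo);   // a comparison function is required
    }

    public static void main(String[] args) {
        List<Double> values = List.of(0.2, 0.9, 0.4);
        System.out.println(maxBelow(values, 1.0));    // prints Optional[0.9]
        System.out.println(maxBelow(values, 0.001));  // prints Optional.empty
    }
}
```

The second call filters everything out, so there is no maximum to report and the result is an empty `Optional` rather than null or an exception.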

play25:49

And we will discuss these optional types in more  detail in the next week. So, another thing that  

play25:55

one can do is to find the first element. So, you  want to take the first 100 numbers, filter it and  

play26:01

find the first one. Now again, the problem is that  the filtering might produce an empty sequence. So,  

play26:06

the first element is well defined if the  sequence or the stream is not empty.  

play26:10

But I have now, again, given the reverse of this. I have said, suppose we look for a very,

play26:14

very large number, very close to 1. So, these numbers are between 0 and 1. So, if we ask for numbers bigger

play26:21

than 0.999, only a very small fraction are going to be there, and it is possible that nothing survives.

play26:25

So, again, findFirst may have nothing to find the first of. So, again, this, like our earlier

play26:32

max and min functions, will not always return the type that you expect;

play26:36

it will return an optional version of that type. So, we will deal with this later.
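
The same situation for `findFirst` can be sketched as follows, again on a fixed list so the result is predictable. The names and values are illustrative; `orElse` is shown only as one possible way of supplying a fallback when nothing is found.

```java
import java.util.List;
import java.util.Optional;

class FindFirstDemo {
    // First value above the threshold, or an empty Optional if none qualifies.
    static Optional<Double> firstAbove(List<Double> values, double threshold) {
        return values.stream()
                     .filter(x -> x > threshold)
                     .findFirst();
    }

    public static void main(String[] args) {
        List<Double> values = List.of(0.25, 0.75, 0.5, 0.875);
        System.out.println(firstAbove(values, 0.5));                // prints Optional[0.75]
        System.out.println(firstAbove(values, 0.999));              // prints Optional.empty
        System.out.println(firstAbove(values, 0.999).orElse(-1.0)); // prints -1.0
    }
}
```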

play26:42

So, what we have seen is that we can take a  collection, any type of collection, and where we  

play26:47

would normally produce a sequence of values using  an iterator, we can instead produce a stream using  

play26:53

the Collection.stream function. So, we can always think of a collection as a stream of values.

play26:58

And then we can process the stream rather than  running through it using a conventional iterator.  

play27:03

And this is nice, because we can use  these declarative kinds of things and get  

play27:08

what is actually a functional programming  style of working with collections.  

play27:13

So, collections you can actually operate on as streams. So, we can create a stream from a collection, as we said.

play27:19

So, we can also create a stream from a function, and we can write a generate function

play27:23

which generates a stream. So, we create the stream either from a collection or

play27:26

from a function, and we pass it through these stream transformations, which are not destructive.

play27:31

They take an input stream and produce an output stream, and then you concatenate them. So, it is like

play27:35

connecting a sequence of pipes. So, you pass  this flow through this pipe, it goes through some  

play27:41

mixing box, comes out as a stream of values, possibly of a different color,

play27:45

in the next pipe, and so on. And then  eventually at the end of this pipeline,  

play27:50

you take these things, and you do  something to it to produce a result.  

play27:55

And when you produce a result, one of the problems  we have to deal with is the empty streams, because  

play28:00

in this whole process of mixing and matching and  filtering, and all that we might have actually  

play28:04

generated something which produces no output. So,  if we have no output, then the result that we are  

play28:08

trying to compute may not be a valid thing. If it is a count, for instance, it will just be 0.  

play28:13

But, as we saw, if it is max, or if it is the first element or the last element or something

play28:17

like that, it does not make sense. So, therefore, we have to be careful about

play28:20

empty streams, and we will see how to deal  with them with these optional types later.
