Next we're going to look at a sample program implementing a raise with Toy, and at the same time sound a note of caution about a Neumann machine. So to implement an array, we're going to use the same basic idea that we talked about in a more abstract fashion when we introduce the rays for Java. We keep the items contiguous in memory starting at a given memory address. So this is an array of 11 elements that are stored in memory starting at location 80, and the way that we'll get to the eighth element of the array is just add i to the address of the first element of the array. So, a [i] of two is at 82, 80 plus 2 and so forth. So, it's a simple calculation to get to the point where we want to know about the contents of an indexed array element. We're going to use indirection which is built into the Toy instruction set to facilitate this. So, the idea is we're going to keep the array address in a register and then when we want to access an array value, we're going to add the index and then use the indirect load and store instructions which you use the contents of a register rather than an address within the instruction to reference a memory. So these are the three instructions 7, A and B from the TOY instruction set that we're going to use to implement arrays. Load address loads the address bits of the instruction into the specified register. Load indirect load uses the contents of a specified register as an address and loads the memory word at that address and store and direct does the same for store. So for example here's an example of an indirect store. So 7 A 80 loads the address 80 into register A, that's where our array starts. And then we'll have another register say register 9 that is the index into the array. Usually, we start that at zero and increment it till we get to the end of the array. So, later on in the code somewhere after register 9 has been incremented to some other values say, we're going to have this kind of calculation 1 C A 9 that's an add instruction, destination register is C and registers A and 9 are the two to be added. A's got the address at the beginning of the array and 9 has the index. So we add those, then we'll get in register C, the actual memory address that we want to reference. And then the store indirect instruction B, is going to take register D the contents of register D and store that in memory. But the place that it's going to be stored in memory is not in 0 C, since it's a store indirect instruction, it will be the address that's in register C and that's the one that we just computed somewhere in the array maybe 84 or 85 or something. And then maybe we increment our i and then continue on our way. So we look at this code in context on the next slide. That's the basic idea using the load indirect and store indirect instructions in computing the address of the memory word that contains the indexed array element. So, here's code that reads an array from standard input. So, again we're going to keep the items starting at memory location 80 that's burned into this code. We start with the Register 9 equals 0 and then we're going to access the eighth element in the array by adding a plus i and that's how we access a of i. Just a note, this example is simpler than the one in the book for the lecture. The ray processing in the book, includes the length of arrays and the representation more like Java so that we can pass arrays as arguments and return values from functions as we do in Java. Then you can see that in the book. For now, we're going to work with this much simpler code where we have the location of the array built into the code and we don't carry the length along but that's this code is easier to understand and quicker to describe and then you can read the more general code in the book. So here's our Java pseudo code for that. So we do read in the length from a standard input. So our array representation on standard input is an integer followed by that many array words that are to be stored. So address of A 0 is going to go in register A and the index is going to go in register nine. So, our while loop is going to compare register nine our index against register B which is our value of N, the length of the array. So, by subtracting them, if we get to 0 then we branch out of the loop with this branch of 0 instruction. Otherwise, we go ahead and compute our array index, read another word from standard input and then use an indirect store to store that in the array. Then we increment i and then stay in that loop, incrementing register nine, reading words from the tape and storing it using indirect store in the array until we've loaded up all the words and we get with register nine, register B equal. We'll do a simulation of this code on the next slide. So here's the same code and now we're just going to dynamically simulate it. So we start our register one gets one, which we need to increment. Register B, off the tape will read the length of the array, which is six and you can see over on the right the tape moved up to read the six out of the array. Register A gets 80, that's the beginning of the array and register nine gets 0. And now we're going to go through the loop where we compute an array address in register C, we read a word from the tape into register D, and we store that word into the memory address referenced by register C. So, again register 9 0 and register B is 6 so the result is not zero. So we go into the loop. C equals register A plus register 9 which is just 80 in this case and I will read the first word of data off the tape and that's a 1, and so we read that off the tape put it in register D and then we use the indirect store to store that word from register D into location 80 and that gets stored in register 80. Then we increment register nine to be one and now I go back to test if we're done with the y loop yet. And now we'll just look at what happens all the way through the loop each time. Each time we compute a new address in register C, read another word from the tape into D, store that in memory and increment register nine. In this way, with this code, we read all the words from the tape until we get to register 9 equals 6 that's equal to register B we subtract them and then in the instruction allocation in 15 we branch out of the loop. So that's a typical example of getting data from a paper tape into memory. And again, in the book you'll see a more complicated example where you can use this to create a memory representation of the array that you can pass to a function that you can process the array by completing the sum of the elements or whatever else you might want to do. Now let's imagine a typical scenario which actually held true for many scientists when the first computers came along. Let's say Alice develops a procedure for her experiment and she has a scientific instrument that's connected to a paper tape punch, like Doug's father. So what she's going to do is, take the paper tape to a computer to process her data. Now, the big advantage of the computer over the old calculating machines was that you could write a program and use that program to process the data. So she'd use code just like what we just described to go ahead and load her data and then have code the processes the array to analyze her data and produce some kind of output. Very typical scenario. Generally, in this kind of situation, you'd have a lot of data and once you've got the program going well, you wouldn't change the program much. And then maybe punch out the results on paper tape to save them and that would be a natural workflow for scientists. Well, let's suppose that Eve has some experiments and realizes that the same kind of analysis that Alice does would be useful for her, too. So, Eve says, "Hey, Alice, could you process my data?" And Alice, being a trusting soul, says, "Of course, sure." So then, Eve gives her a tape, and actually maybe the computer was slow, maybe not even Alice gets the tape. Some computer operator loads it up and walks away, or maybe Alice is standing there watching. So, let's take a look at Eve's tape. Now, of course, maybe you wouldn't do that. You just take the tape. But if you look at this tape, it looks a little fishy. So, the first thing is, that it says it's got 256 words. There's only 256 words in the whole computer. So, that should be a clue that something might be wrong. And then, there's 146 words on the tape, and they're all eights. So, that's a little peculiar. What kind of data is that? Again, if you didn't actually scrutinize the tape, maybe you wouldn't notice. And then, there's something suspicious at the end. But that's a tape that could be presented to Alice's program that reads in the data, and processes the array with no problem at all, it's the input to her program. So, let's look at what happens. It's not at all what Alice might expect. First thing that happens is, well, you're supposed to read the stuff from the tapes, starting at memory location 80. So sure enough, it just reads and fills up memory with all eights, all the way up to the end of the memory. But actually, when you get to FF, that punches something out on the output. That's a little suspicious, right? Because that's a punch on standard output. And then, the address overflows back to zero, and then, memory location 00-0F are overwritten with eights. But that's when things really start to get bad. So, let's take a look now at what happens. So, again, it's the same code for loading things from the array. So, now, we're at a position where we just read an 888 from the tape, and stored it in memory location 0F. And now, the next time, we're going to get another 888 and store it in 1O. Well, that was our code where we set register 121, and next time we put another 888 there. Now, this is looking ominous, and it should. And then now, it's not just eights, it's something else. Now, there's a 98FF. Actually, maybe it's not data, maybe it's a code. Well, actually it's both. It's data because it was read off the tape. Now, in the position where it is, let's take a look at what happens now, when we increment the PC. We increment register nine, and now we branch back to location 14. Location 14 used to be the instruction that subtracted register six from registered nine to check in register two if it's zero. But look what's there now. It's a branch instruction back up to location 12. So, what has happened is that Eve has taken over Alice's computer with a loop that just writes 888 on the output, an infinite loop. So, if Alice had gone away to lunch, she would come back to find that all her paper tape laid out on the floor with 8888 punched on it. This seems like a fantastic example, but it's actually quite a real example. By taking a look at this carefully and understanding the idea that we took code from her input data. It looked like data but it turned out to be code. That's a key concept in a Von Neumann machine. It should be able to do this. Otherwise, we couldn't write compilers and interpreters and so forth. But, there's a danger, and this is the danger. Eve could have loaded any program at all, completely taken over Alice's computer. And in fact, this still happens to us today. This idea that a hacker can own your computer just by presenting data to one of your programs, is a property of Von Neumann machines that lives with us today. See, here's a simple and actually widespread example. In language, C, C++, and Objective C has exactly this problem called the String Buffer Overflow problem. So, here's some C code. And again, we haven't talked in detail about C code but you can get an idea of the char buffer 100 statement, declares a buffer that's big enough to hold a hundred characters. And then, following that is a scanned F statement that reads a string of characters for that buffer from standard input. But this code doesn't check that the string will fit. If there is a longer string, then it'll overflow that buffer. The system doesn't check either. So, what happens is, the memory representation has the buffer stored at the beginning of the code. And then, control for the first instruction is after that. And, that's where the code for scan F and print F and the rest of the code and the routine are. So, what happens is that, when the hacker reads in a long string, the hacker can put code there, which is the hacker's own program just like Eve did. And at that point, when the system transfers control to that place which was overwritten by the hackers long string, the hacker owns your computer, and can do anything at all with it. He has control of the instructions that are run on the computer. This was, our first famous example. This was in 1988. A virus like this, called the Morris worm infected research computers all through the United States. In 2004, there was a thing called the jpeg of death. That if you were in a Windows browser, and you hit a certain image, it would overflow and code at the end would transfer the hacker's code, and that hacker would own your computer. And people figured out how to get unlicensed games the same way with a buffer overflow attack like this, where people could see that the code doesn't check for a buffer being too big. And then, they could write their own code to get loaded into the buffer, and just the program thinks it's reading data, and the hacker who owns your machine. This is still true in iOS devices and it's listed as a top five vulnerability. There's lots of code out there, and unless the programmer checks up for this, then that code is vulnerable to this kind of hack. Now, with Java, we try to write secure code. We try to check the bounds of arrays so that the compiler will tell us if we do this by mistake. And also the types of things, is another check against malicious code like this. But as we discuss in the theory lecture, there's no way to avoid this in general. It's a fundamental consequence of the idea that data can later be executed as code.