>> On our next homework assignment, we're going to implement a small programming language, and so in this segment I want to start explaining in general how one goes about implementing a programming language, and then more specifically zoom in on the particular simplified and useful setup we'll use on our homework assignment. So here's a typical workflow for an implementation of a programming language. Think of it as first taking in some string, which is the text of the program someone writes down. It would be in a file, of course, but we could read that file in and think of the contents as being in a string, and we'll call that string the concrete syntax. It's the syntax the programmer actually wrote down. The first typical stage is to pass that syntax to the parsing procedure, the parser. As we've all seen if we've programmed, we often get error messages, and it's the job of the parser to give us syntax error messages: things like a parenthesis in the wrong place, or using a keyword in the wrong position, or something like that. But if we don't have any syntax errors, the output of the parser is what is called an Abstract Syntax Tree, or AST. So it's still syntax, but it's in a much more structured form than what the programmer wrote down. It is a tree that shows what constructs in the language are being used and how. So in this example, we would say that we have a function call; a function call has, say, two children, in this case the first expression and the second. The second one here says, oh, it's a constant, and the particular constant is 4. The function has an argument x and then its body; its body is an addition expression, and the two subexpressions of the addition are both the variable x. We build this entire tree that gives us the structure of the program that we are supposed to implement. 
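To make that tree concrete, here is a hedged sketch of how the example AST might be written down as Racket struct values. The struct and field names (call, function, const, add, var) are assumptions for illustration; an actual language implementation might name its nodes differently.

```racket
; Hypothetical struct definitions for the AST node types in this example
(struct const (int)         #:transparent) ; a constant, e.g., 4
(struct var (string)        #:transparent) ; a variable use, e.g., x
(struct add (e1 e2)         #:transparent) ; addition with two subexpressions
(struct function (arg body) #:transparent) ; a one-argument function
(struct call (e1 e2)        #:transparent) ; a call: function expression, argument expression

; The tree described above: calling a function whose body adds x to x,
; with the constant 4 as the argument
(define example-ast
  (call (function "x" (add (var "x") (var "x")))
        (const 4)))
```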
At that point your language might have some sort of type checker that runs over this tree, giving additional error messages. For example, if instead of a number we tried to pass this function another function, it would say, well, that parses, I can create an abstract syntax tree from that, but I get a type-error message. And if I don't have any of those, then I pass this AST on to the rest of the implementation, which is in charge of running the program and giving an answer. So now let's talk about the rest of that implementation. I would say that there are basically two fundamental approaches to implementing some programming language, which I'll call B on this slide. If I want to implement B, one approach is to write an interpreter in another language A. Right? My interpreter has to itself be a computer program, so I'll use the language A to write an interpreter for B. Now, interpreter is not a great name. A better name would probably be something like evaluator or executor, but okay, everyone calls it an interpreter, so we will as well. The idea is to just take that syntax tree in B and come up with what the answer would be if you ran it, maybe on some input that's provided when you run the program. The second approach is to use a compiler. The idea with a compiler is, it is itself of course written in another language A, and it produces a program in a third language C, and then we just take that program and we run it. So compiler is also a terrible name; a better name would be translator, but it's been called a compiler for many decades, so we will call it that too. The key thing you have to do is take a program in B and produce a program in C that's equivalent, that has the same behavior, and then you rely on there already being an implementation for C. 
So, if C is binary code, the kind of code that your computer hardware can run directly, then you can think of the computer hardware as an implementation of C, and so your implementation of B just translates to a program in C and then you run it on the hardware. Okay? This language A that we use to implement our language B is sometimes called the metalanguage. The key thing we have to do in this section of the course, and on your homework, is keep straight in our heads what's A and what's B: what is in the language we are using to implement a language, and what is in the language that's being implemented. I should mention, by the way, that the reality of implementing languages is certainly more complicated. It's not just that you have an interpreter or you have a compiler; modern language implementations tend to do combinations of these two fundamental ideas. For example, most implementations of Java start with a compiler that translates your program, but not all the way down to binary, just to some intermediate language, often called bytecode. Then there's an interpreter for that bytecode language, but that can be a little slow, and so that interpreter includes a compiler for compiling down to hardware. And then most people would think, well, once you're in binary, it's just interpreted by the chip, but in fact modern chips, at least for the x86, say, actually, we don't want to interpret quite those instructions, so while the program is running we'll translate things internally in the chip into even smaller instructions and then interpret those. So the point is that there are two fundamental ideas that can be combined in interesting ways. And the implementation of Racket has a similar mix; this is nothing specific to Java. But the basic ideas are: you have an interpreter, or you have a compiler. 
One other thing I have to mention is that a language is defined by written rules for the semantics of the different constructs in your language. Whether it is implemented with a compiler or an interpreter or some combination thereof is an implementation detail. And so, once you understand that, it should be obvious that there is no such thing as a compiled language or an interpreted language. There are implementations of a language that use an interpreter, and implementations that use a compiler. Unfortunately, for decades people have confused this notion, and you will often hear phrases like, C is faster than LISP because C is a compiled language and LISP is an interpreted language. I wish to emphasize that this really does not make very much sense, and I for one like to politely correct people when they repeat these old sayings, which are by definition not just wrong but don't really make sense. Now, I will add that there is a way in which they make sense. At the end of the section I will talk a little bit about how languages like Racket have an eval construct. When you have an eval construct in your language, you need to have a language implementation that's still around at runtime, when the program is running, because you may need to run other programs in the language. But there's no reason why eval can't be implemented with a compiler just as well as an interpreter. Okay, so that's the end of that sermon. Let me now go back to this workflow and tell you that on your homework assignment, we're not going to do the parser, and we're not going to do the type checker. We're just going to do an interpreter, and I want to emphasize how you don't really need a parser if you set things up correctly when you implement one language inside of another one. So let me show you how we can actually skip the parsing step. Suppose we wanted to implement programming language B in programming language A. 
What we could do, if we didn't want to write a parser, is have programmers writing programs in B write the ASTs, the trees, directly in programming language A. So they don't write strings that we then convert to abstract syntax trees; they just write the abstract syntax trees. And if our programming language A is Racket, and we give B-language programmers structs, this is really not so bad. What they're doing is writing down Racket data structures that are exactly the AST that represents a program in the language we're trying to implement. So, for example, maybe you would have some struct definitions like call and function and var, and if a programmer came along and just used these Racket constructors in this way, what they're essentially writing down is this picture on the left. They are writing down exactly this tree. You can actually read it; you just have to turn it sideways. You have call with two subexpressions, two subtrees: a function that has an argument and then an addition, and on the other side... sorry, we shouldn't have a negation here, it should just be a const. But if they did want to negate and come up with negative 4, they could have that, and so on. So this is how we can get programmers to not give us strings but to give us trees directly: we just have them write down their programs inside of Racket using constructors. And this is already what we were doing with our arithmetic-expression example in the previous segments. Think of the expressions built out of const, negate, add, and multiply as forming a little programming language. Let's call it the arithmetic language, alright? So we are using Racket as our metalanguage. That's our language A. Our language B, the one we want to implement, we'll call the arithmetic language. People writing programs in the arithmetic language just use the Racket constructors to write down the abstract syntax tree. And then our language implementation is the eval-exp program. This is an interpreter. 
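As a sketch of what those arithmetic-language programs look like, assuming struct definitions along the lines of the previous segments (the field names here are chosen for illustration):

```racket
; Struct definitions for the arithmetic language's AST
(struct const (int)      #:transparent) ; a number
(struct negate (e)       #:transparent) ; negation of a subexpression
(struct add (e1 e2)      #:transparent) ; addition of two subexpressions
(struct multiply (e1 e2) #:transparent) ; multiplication of two subexpressions

; A program in the arithmetic language, written directly as its AST
; in Racket, with no parsing step: -(4) + (3 * 2)
(define a-program
  (add (negate (const 4))
       (multiply (const 3) (const 2))))
```

Notice the programmer never writes a string in the arithmetic language's concrete syntax; the constructors build the tree directly.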
This is exactly what interpreters do. They take a program, which is written down in Racket using these trees, and they produce an answer in the language. And so we have already implemented a programming language. We have, in this language, constants, negations, additions, and multiplies. The program itself is a Racket data structure, but it is a program in a language, and the implementation of that language is the eval-exp function. So now all we have to do to get ready for our homework is learn how to implement more interesting language constructs than just negations, additions, and multiplies, but we have the framework that we need.
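For reference, here is a minimal sketch of such an eval-exp interpreter, in the spirit of the previous segments; it assumes struct definitions for const, negate, add, and multiply with the field names used below.

```racket
; Struct definitions for the arithmetic language (field names assumed)
(struct const (int)      #:transparent)
(struct negate (e)       #:transparent)
(struct add (e1 e2)      #:transparent)
(struct multiply (e1 e2) #:transparent)

; The interpreter: takes an arithmetic-language program (a Racket tree)
; and produces the value it evaluates to, also represented as a const
(define (eval-exp e)
  (cond [(const? e) e] ; constants evaluate to themselves
        [(negate? e)
         (const (- (const-int (eval-exp (negate-e e)))))]
        [(add? e)
         (const (+ (const-int (eval-exp (add-e1 e)))
                   (const-int (eval-exp (add-e2 e)))))]
        [(multiply? e)
         (const (* (const-int (eval-exp (multiply-e1 e)))
                   (const-int (eval-exp (multiply-e2 e)))))]
        [#t (error "eval-exp expected an arithmetic-language expression")]))

; Example: (eval-exp (add (negate (const 4)) (const 10))) produces (const 6)
```

The recursive calls mirror the shape of the tree: each case evaluates its subtrees and combines the resulting constants, which is the framework the homework builds on with richer constructs.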