In the cloud, you typically have an application that runs on multiple servers. For example, you might deploy your application across 100 or more different servers in the cloud. One of the big concerns you have to worry about is how the incoming requests from various clients get routed to those servers. We've seen routing within an individual server, where web.xml determines which servlet handles a request. But if we have copies of the servlets running on many different machines, how do we decide which request should be routed to which machine? Once a request reaches a machine, we know how the routing is done internally. But up here, above the individual servers, how do we do the routing? How do we figure out which machine should receive a request? How do we keep track of the loads on the machines, and so on?

What this is called is HTTP load balancing. Typically we have a load balancer that looks at the requests coming in and figures out how to allocate them to the various machines. There are lots of different approaches for doing this, but one of the simplest is a round-robin scheme: when a request comes in, you allocate it to one machine; when the next request comes in, you allocate it to the next machine, and so forth. This works very well in certain situations. In particular, for round robin to work, we want each of the servlets running on these machines to be stateless, not caring which client it's talking to at that particular point in time.

So let's think about why this matters. Say a browser begins talking to machine one, sending requests that ask it to do something on the browser's behalf. The first set of requests is all routed to machine one, and in the process the user logs in, so the user is now logged into machine one. Now, if we're using round-robin routing up here, the next request gets routed to machine two. One of the challenges is that, depending on how we set up our servers, machine two may not know that the user is logged in. If the user tries to access some functionality, like viewing the balance of their bank account, machine two says they have to be logged in to view that information.

So one of the challenges with a typical round-robin scheme is that we're doing blind routing: we're balancing load across the machines without looking at who sent each request. We either have to have a way of remembering state across all of these requests, or a way of routing requests more intelligently, and those choices affect how we set up our servers and distribute the workload across them.
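To make the round-robin scheme concrete, here is a minimal sketch in Java. The backend addresses are made up for illustration, and this is not the implementation of any particular load balancer product:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin dispatcher: each incoming request is handed to
// the next server in the list, wrapping around at the end.
public class RoundRobinBalancer {
    private final List<String> servers;                      // backend addresses
    private final AtomicInteger next = new AtomicInteger(0); // rotation counter

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    // Pick a server for the current request. Note that nothing here
    // looks at WHICH client is asking -- that blindness is exactly why
    // the scheme only works cleanly when the servlets are stateless.
    public String nextServer() {
        // floorMod keeps the index valid even if the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}

// Usage (hypothetical addresses):
//   RoundRobinBalancer lb = new RoundRobinBalancer(
//       List.of("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"));
//   String target = lb.nextServer();  // forward the request to target
```

The AtomicInteger counter keeps the rotation consistent even when many requests arrive concurrently, but notice that nothing about the client influences the choice of server, which is what breaks the login example above.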
So what are some ways we could fix this problem? One thing we could do is, rather than routing each individual request to a possibly different machine, remember which machine a particular client was talking to and always send all of that client's requests to the same machine. If one of our mobile phones begins talking to the cloud through one machine, we make sure that all the requests coming from that phone always get routed to that machine. If that phone logs in, that machine keeps the login state in memory and remembers that the user is logged in there. If another mobile phone comes in, we may route its requests to machine two, and when that person logs in from their phone, machine two keeps their state. So rather than routing at the level of individual requests, we route at the higher level of individual clients.

One of the things you have to think about, then, is at what level you are going to route. Is it at the level of the individual client, always routed to the same machine? Or is it at the level of the individual request, with the application built in such a way that it doesn't matter if each request is routed to a different server? The important thing to know is whether your application is stateless. If your application is completely stateless, routing is really easy: you can route any individual request to any individual machine. If your application is stateful, it's much more difficult, and you have to think about how you do the routing.

One approach is to keep routing requests to the same machine that began the conversation with that client. When a mobile client begins talking to a machine, you keep routing it to that same machine. This is typically called sticky sessions: we keep routing each client to the same individual server (see the sticky-routing sketch below).

Another way we could do this is to allow routing to any individual machine, but do session distribution. For example, we could store all of the information about who is or isn't logged in in a central database that all of our machines talk to. When a client logs in on one server, that server stores a token or other security information in the database. Then, when another request from that client comes into a different server, that server checks the database and asks: is this client logged in? Because the state lives in a central, persistent mechanism shared across all the servers, we don't have to worry about the first request being routed here and the second request being routed there. The state is not being kept in the server layer; it's being kept at a lower layer (see the session-store sketch below).

So these are some of the concerns you have to worry about when you're doing HTTP load balancing across your servers. Is your application stateless or stateful? And if it's stateful, how are you going to manage that state? Is the state going to live in the memory of a single machine, in which case you need to route all requests from the same client to that same machine over and over? Or is the state distributed across multiple machines, either using a distributed in-memory cache like memcached, or a database that's connected to all of the machines? If the application is stateful but has a state distribution mechanism that all of your servers can see, then again you don't have to worry so much about the routing. These are important concerns to think about when you're building your HTTP load-balancing strategy.
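Here is a minimal sketch of the sticky routing idea, assuming we can extract some stable client identifier, such as a session cookie value or the client's IP address. Production load balancers usually implement stickiness with an IP hash or a cookie they manage themselves; this is just the core mapping:

```java
import java.util.List;

// Sticky routing sketch: instead of rotating through the servers,
// derive the target from a stable client identifier, so the same
// client always lands on the same machine (where its session lives).
public class StickyBalancer {
    private final List<String> servers;

    public StickyBalancer(List<String> servers) {
        this.servers = servers;
    }

    public String serverFor(String clientId) {
        // Same clientId -> same index -> same server, as long as the
        // server list doesn't change. Real systems also have to handle
        // servers joining or leaving (e.g., with consistent hashing).
        int i = Math.floorMod(clientId.hashCode(), servers.size());
        return servers.get(i);
    }
}
```

The comment hints at the weakness: if the chosen machine goes down, the client's in-memory session goes with it, which is one motivation for the session-distribution approach.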
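And here is a sketch of the session-distribution idea from a servlet's point of view. The SessionStore interface and the X-Auth-Token header are hypothetical stand-ins for whatever shared database (or memcached-backed) lookup and token scheme you actually use, and the constructor wiring is simplified for the sketch; a real servlet container constructs servlets itself and would need a different way to obtain the store:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical interface over the shared session store -- in practice
// backed by a central database or a distributed in-memory cache such
// as memcached, reachable from every one of the servers.
interface SessionStore {
    boolean isLoggedIn(String token);
}

// A servlet that keeps NO login state in its own memory: it consults
// the shared store on every request, so it doesn't matter which of
// the machines the load balancer routed this request to.
public class BalanceServlet extends HttpServlet {
    private final SessionStore store;

    public BalanceServlet(SessionStore store) {
        this.store = store; // simplified wiring, see note above
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Token handed out at login time (assumed scheme).
        String token = req.getHeader("X-Auth-Token");
        if (token == null || !store.isLoggedIn(token)) {
            // The shared store says this client isn't logged in --
            // reject, no matter which server handled the login request.
            resp.sendError(HttpServletResponse.SC_UNAUTHORIZED,
                    "Please log in first");
            return;
        }
        resp.getWriter().println("Your account balance: ...");
    }
}
```

Because every server answers the "is this client logged in?" question from the same store, any request can safely be routed to any machine.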