We begin data security by looking at database technology. Whenever we store data on a computer, we attach a label. We rely heavily on labels as we store more and more data. Labels can tell us what data collections are for and who owns them. The computer can easily find labeled information. We can't find Tina's novel by looking in the file system or rather we can find her novel by looking in the file system. But it's harder to find, say, the third chapter, unless Tina stored it as one chapter per file. If you see what I mean. The computer can access data more specifically when we label it more specific file. For example, the survey file contains a results of a survey. The format provides a label for each distinct piece of data. This is a basic feature of a database. There's an ancient saying in computing, in other words, something I heard way back when I went to college, that all the real computing problems are database problems. The database is nothing more or less than a collection of data in a predefined structure. The structure explicitly or implicitly labels every significant piece of information it stores. Since we've organized the data, we can easily collect it and then search and analyze and compare what we have. When we structure the data, we provide hints on the kinds of data we collect. This often shows a system which operations to use on individual data items. For example, we might add numerical values together, but we know it makes no sense to add a city name and a country name. We briefly look at two major types of databases in this video. I love key-value databases, but this course will focus primarily on relational database systems. Old-timers like to brag about their first computer while my first database was a contact list, a list of people's names and addresses. I had a separate list of phone numbers. Hardly anyone had email addresses back then. Key-value data is traditionally called an associative array, has various other terms applied with it. I first encountered it in the Lisp programming. Every modern language provides key data storage. Our examples here I use something akin to the objects and Taxi used in JavaScript source code. The key as a classic alphanumeric identifier, a readable string of letters and digits. The value is a text string, a number, a boolean value or other things the software is willing to handle. The syntax is different from JSON, which confusingly stands for JavaScript Object Notation. JSON syntax isn't used in JavaScript source code. It's used in the programs data. So JSON, the major differences, puts quote marks around every string including keys, I find that tedious. Everything between curly braces belongs to a single entity. So we call a series of key value pairs like that, an object. In this object, every value as a text string because a structured number like a phone number or postal code, isn't really an arithmetic value. It's not something you add to get useful results. We could generally think of a key value data storage as an array containing multiple objects. The array is delimited by brackets. Now for the really cool part, we talked about how a value may be a string or a number, or a boolean. Value can also be an object in curly braces or an array in brackets. This yields a delightful flexibility and a complex structure. We'll see an example in a moment. But first, let's look at relational data. Relational data resides in tables. A single entity stored in a table is called a record. Individual data items are stored in individual fields. A given database table contains a fixed set of uniquely named fields. Each one has a data format associated with it. The fields in this table are all text format. Each record contains one each of each field belonging in that table. They have a particular record doesn't have information for a particular field, the field remains blank. The underlying software for these tables is much simpler than for key-value arrays. This is especially true for things like phone numbers and country names and other small, well understood data items. We don't have to repeat the field names in each field. On the other hand, there's less flexibility because we can't store an object or an array in a relational database field. A relational table is perfect for a simple example from a contact list. But the contact lists just aren't simple anymore. They look a lot more like this. Yeah, we have home and work phone numbers now. We have multiple email addresses, we have cell phone numbers to save. The simplest way to expand is to add those values to each record in the object. We add fields and values in the relational database table, and we insert keys and values into each object. Kim Smithy's object now contains 10 key value pair. But this isn't how a modern contact list works, an individual isn't limited to one address, and three phone numbers, and two e-mail addresses. We can probably add as many as we need of each. Let's start with the key-value database. This is easy, since the keys value, can itself be an object, or an array. Kim Smithy's object now contains four, top-level keys. Three containing objects or arrays. The address key, contains a single object enclosed in curly braces. The phone and e-mail keys contain an array of objects, and enclosed in brackets. To do this in a relational database, we establish a group of tables. Since we search by name, we'll start with the name table. We only have one address in this example, will put the name and address together in a single table. We add a special field called, name index. It uniquely numbers each record in the name table. This type of field is called a key field. To provide Kim's three phone numbers, we store them in a separate phone table. The table has three fields, the phone number, the type of phone number, and the Name Index field. The Name Index field links the record back to Kim's name and address. Finally, we have a table to store e-mail addresses, it's a variant of the phone table. We add more people by adding to the name table, and then adding phone and e-mail records as needed. Kim and Bob's entries for phone and e-mail may look similar, but the Name Index keeps them straight. This is a relationship diagram for those three database and tables each. Some database systems will automatically produce these diagrams, it's useful, tells you a little bit about the basic structure. The key next to the Name Index, shows that it's a key field for the name table. The other two tables don't have key fields, aligned connecting the name table and phone table shows the name index value links those records together. We mark the name table connection with the one to show that each Name Index value has only one record in the name table. We mark the phone table connection with infinity because, the table could have many records with the same Name Index value, because a person could have multiple phone numbers of different kinds. We make a similar connection to the e-mail table. If we want to have multiple mailing addresses, we need another table, like the phone and e=mail tables. Most systems you encounter use relational databases. It's not because they're always best, but because they're mature. They've been around for decades and are already part of lots of older systems. For example, most popular web content management systems use relational databases. Examples in this course, we'll usually rely on relational databases, we'll see this mostly in examples using SQL, short for, the Structured Query Language. SQL is a common language and API for most relational databases. Now, let's see how database management and SQL fit into a typical cloud hosted service. Here's a generic cloud service structure we introduced in the previous course. We often use database servers for the back end. Here's how the user request turns into an SQL database query. The user asks for information, the user's client transmits a request across the internet to the web site, the application server constructs one or more SQL statements, to tell the database what to do, then the server sends the SQL statements to the database, the database formats the result into a database table, sends it back to the application server, the server formats it as a document for the client, and returns it to the user. Here are some basic SQL statements, SELECT, is the workhorse that retrieves data for queries, the others add, or modify records in a table. We can apply a few access controls to SQL databases. First, we might or might not allow particular client to connect to the server. We've seen ways of restricting connections in the previous course. Next, a typical database server may hold several databases with different access restrictions. The client needs the right access permission to get to the right database. Within an SQL database, different implementations provide different access control features. In many implementations, the database designer may restrict particular clients to only access particular tables within a permitted table. Clients might also be limited to particular SQL statements. There are a lot of applications, essentially retrieved data from tables without modifying anything. Clients running those applications wouldn't be allowed to use INSERT, UPDATE, DELETE, or other data changing statements. If we want to record a client's visit without modifying a central database tables, we could grant insert access to a separate table specifically intended to record client visits.