Hello everyone, welcome to Big Data and Language. Today we will talk about the POS parser. What is a POS parser? Parsing is syntax analysis, or syntactic analysis. So it's the process of analyzing a string of symbols, and it can be used for natural language, computer languages, or data structures. In the previous lectures, I asked you to match the POS terminology with the examples, right? You will probably remember that the right column had all the examples and the left column had all the terminology, and I asked you to match them. Based on this knowledge, we can expand further, so it's pretty exciting, right? I also gave you another task. I gave you five sentences, and you had to identify the parts of speech. Those were the previous activities. Based on that knowledge and those tasks, let's move on, okay?
So before the POS parser, I want to introduce more linguistic terminology about sentence structure: phrases and clauses. Let's take a look. A phrase is a group of words that functions as a constituent in the syntax of a sentence. So it's a single unit within the grammatical hierarchy.
Let me give you another definition. A noun is a word that names a person, place, or thing, and it can be used as a subject or an object. For example, "I have a roommate at school." In this sentence, roommate is a noun, right? In "My name is Mark," name and Mark are nouns. In "He's from Hong Kong," Hong Kong is a proper noun. And in "Mark and I like the same music," Mark is a person's name, right?
What about a noun phrase? A noun phrase is a group of words ending with a noun that belong together in meaning. For example, "He lives in that old house on the corner." Then "that old house" is a noun phrase, right? And "the corner" is also a noun phrase.
And let me give you another example: "I am reading a really good book." Then "a really good book" is a noun phrase, okay? The whole chunk is what we call a noun phrase.
And what about a clause? A clause is a group of related words that has a subject and a verb. For example, "This is my book." The whole chunk, with its subject and verb, is what we call a clause. We can also say, "I was absent because I woke up late." Then "I was absent" is one clause, and "because I woke up late" is another clause, right? So one subject-verb combination is what we call a clause, okay? "I was absent" is an independent clause, because it is the main clause. But "because I woke up late" is a dependent clause.
So let me give you the definitions of dependent and independent clauses. First, a dependent clause is a clause that cannot be a complete sentence. It cannot stand alone, right? For example, we cannot say only "because I woke up late." We call that a fragment, and it is a grammatical error. Only an independent clause can stand alone as a sentence. An independent clause is a clause that is, or could be, a complete sentence, okay? For example, "Dancing is fun." By itself, "Dancing is fun" could be one single sentence, so we say it is an independent clause, okay?
Now, let's move on to adjective clauses. Adjective clauses, like adverb clauses, are dependent clauses. They begin with the words who, whom, which, and that, among others. These words are relative pronouns, so adjective clauses are also called relative clauses, okay? The purpose of an adjective clause is to describe nouns and pronouns. For example, "Holiday is a word from Old English that originally meant a holy day." Here "that originally meant a holy day" modifies "a word from Old English," okay?
Another example: "Halloween, which is celebrated in the United States on October 31st, is a holiday with pagan origins." Here "which is celebrated in the United States on October 31st" modifies the noun Halloween, okay? So we call these adjective clauses.
If you are not familiar with clauses, or you still get confused by the definitions, let's complete a practice. I will give you two sentences. Please underline the adjective clause in each sentence and draw an arrow to the noun that the clause modifies. Write NI if the clause gives necessary information, and write EI if it gives extra information, like an appositive. An appositive means we have two noun phrases in a row, right? For example, "Tom, my son." Tom is a proper noun, so "my son" is an appositive, and it is extra information. That is an example of EI, okay? Then add commas around the unnecessary clauses or appositives.
There are two sentences. The first is: "Some of the customs of Easter which is a Christian holiday have pagan origins." The second is: "Before Christianity existed, people in northern and central Europe worshiped a goddess whom they called Eostre." So what are the answers? You might want to stop the video and complete the task.
All right, are you ready? Now let's check the answers together. The first sentence: "Some of the customs of Easter which is a Christian holiday have pagan origins." In this sentence, "which is a Christian holiday" modifies the noun Easter, okay? This is extra information, because Easter is a very specific day. So we mark it EI. And because it is extra information, we need to add a comma before and after the adjective clause. That means the correctly punctuated sentence is: "Some of the customs of Easter, which is a Christian holiday, have pagan origins," okay?
Now let's move to the second sentence: "Before Christianity existed, people in northern and central Europe worshiped a goddess whom they called Eostre." Here "whom they called Eostre" modifies the noun goddess, okay? This time goddess is a pretty general term, so we need more information. So "whom they called Eostre" is necessary information, and we mark it NI. Because it is necessary information, no comma is needed, okay?
So why have we talked about phrases, clauses, and parts of speech? Because now it's time to talk about the POS parser, okay? In 1957, Chomsky described syntactic structures. A sentence, S, can be NP plus VP. For example, "The man hits a ball." We can label each word: the is a determiner, man is a noun, hits is a verb, a is another determiner, and ball is a noun. So we can put "the man" together as a noun phrase, "a ball" is another noun phrase, and "hits a ball" is a verb phrase, so VP. Once we combine NP plus VP, we have the whole sentence, right? So we can actually draw a tree. Chomsky introduced the idea of transformational generative grammar, which influenced the theory of formal languages in mathematics, computer science, and linguistics. If you have ever taken a linguistics or syntax course, you might have drawn these tree diagrams before, but if not, you don't need to worry about it. The point you need to understand is that we can divide a sentence into phrases and into single parts of speech.
Okay, so there are several software tools that tag POS automatically. For example, there is the Stanford NLP tagger, at nlp.stanford.edu/software/tagger.shtml. However, the automatic tags are not always correct, so you might want to double-check them. TagAnt is also one of these POS taggers, okay.
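The NP plus VP analysis of "The man hits a ball" described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon, the three rules (S is NP plus VP, NP is determiner plus noun, VP is verb plus NP), and all function names are hand-written here for this one sentence. It is not how the Stanford tagger or TagAnt works.

```python
# Toy lexicon (an assumption for this example): word -> part-of-speech tag.
LEXICON = {"the": "Det", "a": "Det", "man": "N", "ball": "N", "hits": "V"}


def tag(words):
    """Look up each word's part of speech in the toy lexicon."""
    return [(LEXICON[w.lower()], w) for w in words]


def parse_np(tagged, i):
    """Rule NP -> Det N. Returns (tree, index of the next unread word)."""
    if i + 1 >= len(tagged):
        raise ValueError("not enough words for a noun phrase")
    (t1, w1), (t2, w2) = tagged[i], tagged[i + 1]
    if (t1, t2) != ("Det", "N"):
        raise ValueError("expected determiner + noun")
    return ("NP", (t1, w1), (t2, w2)), i + 2


def parse_vp(tagged, i):
    """Rule VP -> V NP. Returns (tree, index of the next unread word)."""
    t, w = tagged[i]
    if t != "V":
        raise ValueError("expected a verb")
    np, j = parse_np(tagged, i + 1)
    return ("VP", (t, w), np), j


def parse_sentence(sentence):
    """Rule S -> NP VP, applied to a whitespace-separated sentence."""
    tagged = tag(sentence.split())
    np, i = parse_np(tagged, 0)
    vp, j = parse_vp(tagged, i)
    if j != len(tagged):
        raise ValueError("extra words after the verb phrase")
    return ("S", np, vp)


def bracket(tree):
    """Render a tree as a bracketed string, e.g. (NP (Det a) (N ball))."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"({label} {children[0]})"
    return f"({label} " + " ".join(bracket(c) for c in children) + ")"


print(bracket(parse_sentence("The man hits a ball")))
# (S (NP (Det The) (N man)) (VP (V hits) (NP (Det a) (N ball))))
```

The bracketed output is the same tree you would draw by hand: the outer S splits into the NP "The man" and the VP "hits a ball," and the VP contains a second NP.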
So let me give you an example of how a POS parser actually parses sentences. Take the sentence "I shot an elephant in my pajamas." I is a pronoun, shot is a verb, an is a determiner, elephant is a noun, in is a preposition, my is a possessive pronoun, and pajamas is a noun. So I can be an NP, "in my pajamas" is a prepositional phrase, and "shot an elephant" is a verb phrase. Then "shot an elephant in my pajamas," the whole chunk, is a verb phrase, so NP plus VP gives the whole sentence.
But there is another way to analyze this sentence. It is still the same sentence, "I shot an elephant in my pajamas," but we can say that "in my pajamas" modifies the noun phrase "an elephant." In that case, the tree diagram is slightly different. So depending on how we understand and analyze the sentence, the POS parser can give different results.
So how many ways can you analyze the structure of the following sentence? "The man saw a bear in the room." There are two different ways, right? The first is: the man saw a bear. Where? In the room. But you can also analyze it as: a bear, where is it? In the room. So the man saw a bear that was in the room, and he doesn't really need to be in that room himself, because it's the bear that is in the room. So there are two different ways, and the POS parser could analyze the sentence either way.
Okay, so let's talk about parsing problems. There may be more than one syntactic structure for one sentence, as we've just seen. I will share a video clip about syntactic ambiguities, so feel free to take a look. We've talked about syntactic ambiguities before, so let me just remind you of those a little bit.
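To make the "two different ways" concrete, here is a minimal sketch that counts how many trees a small grammar allows for a sentence, using a CKY-style chart. The grammar, the lexicon, and the function name count_parses are assumptions made up for this illustration. In particular, the pronoun I is entered directly as an NP, and my is treated like a determiner so that "my pajamas" can form a noun phrase.

```python
# Binary grammar rules (parent, left child, right child). Hand-written for
# this example. The two PP rules are what create the ambiguity.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # the PP modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # the PP modifies the noun phrase
    ("PP", "P", "NP"),
]

# Toy lexicon (an assumption): each word maps to its possible labels.
LEXICON = {
    "i": ["NP"], "shot": ["V"], "saw": ["V"],
    "an": ["Det"], "a": ["Det"], "the": ["Det"], "my": ["Det"],
    "elephant": ["N"], "pajamas": ["N"], "man": ["N"],
    "bear": ["N"], "room": ["N"],
    "in": ["P"],
}


def count_parses(sentence):
    """Count the distinct trees rooted in S that cover the whole sentence."""
    words = sentence.lower().split()
    n = len(words)
    chart = {}  # (start, end, label) -> number of trees for that span

    # Fill in the single-word spans from the lexicon.
    for i, word in enumerate(words):
        for label in LEXICON[word]:
            chart[(i, i + 1, label)] = chart.get((i, i + 1, label), 0) + 1

    # Build longer spans from pairs of adjacent shorter spans.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    ways = (chart.get((start, mid, left), 0)
                            * chart.get((mid, end, right), 0))
                    if ways:
                        key = (start, end, parent)
                        chart[key] = chart.get(key, 0) + ways

    return chart.get((0, n, "S"), 0)


print(count_parses("I shot an elephant"))                # 1
print(count_parses("I shot an elephant in my pajamas"))  # 2
print(count_parses("The man saw a bear in the room"))    # 2
```

Without the prepositional phrase there is only one tree. Once "in my pajamas" or "in the room" is added, the phrase can attach to the verb phrase or to the noun phrase, so the count becomes two.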
So we have problems with prepositional phrase attachment. In the sentence "I saw the man with a telescope," there is one interpretation where I used the telescope to see the man, and another where I saw a man who was himself carrying a telescope. This is a PP attachment problem, because the prepositional phrase "with a telescope" can attach either to the verb of the sentence, saw, or to man, which is the direct object. We can also have gaps. For example, in the sentence "Mary likes physics but hates chemistry," it is clear that the subject of the second verb, hates, is also Mary. However, this is not explicit in the structure, so a successful parser should be able to infer that Mary is the subject of both verbs, not just the first one.
A dependency is an asymmetric relation that holds between a head and its dependents. The head of a sentence is usually taken to be the tensed verb. Every other word either depends on the sentence head or connects to it through a path of dependencies. So for example, "I study," right? Study is the head and I depends on it, so we draw an arrow from study to I. Let me give you another example. The labels on the relations indicate the grammatical function of the dependent, such as subject, object, or modifier. Take "I shot an elephant in my pajamas." In "I shot," I is the subject, right? In "shot an elephant," elephant is the object. And "in my pajamas" is a modifier.
All right, so today we've talked about the POS parser, and next time we will talk about analysis tools. Thank you for your attention.
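As a companion to the head and dependent relations described in this lecture, here is a minimal sketch of how the dependencies for "I shot an elephant in my pajamas" could be stored and checked in Python. The subject, object, and modifier labels follow the lecture. The arcs inside the phrases (the determiner labels, and treating in as the head of "in my pajamas") are assumptions added for illustration, and this sketch attaches the prepositional phrase to the verb, which is only one of the two possible readings.

```python
# Each arc is (dependent, head, label). The head of the sentence, the tensed
# verb "shot", has no head of its own, so its head is None.
# Words are used as keys only because no word repeats in this sentence.
ARCS = [
    ("I", "shot", "subject"),
    ("shot", None, "root"),
    ("an", "elephant", "determiner"),             # assumption
    ("elephant", "shot", "object"),
    ("in", "shot", "modifier"),
    ("my", "pajamas", "determiner"),              # assumption
    ("pajamas", "in", "object of preposition"),   # assumption
]

HEAD_OF = {dependent: head for dependent, head, _ in ARCS}


def path_to_root(word):
    """Follow head links from a word up to the head of the sentence."""
    path = [word]
    while HEAD_OF[path[-1]] is not None:
        path.append(HEAD_OF[path[-1]])
    return path


def dependents_of(head):
    """List the (dependent, label) pairs attached directly to a head."""
    return [(dep, label) for dep, h, label in ARCS if h == head]


print(dependents_of("shot"))
# [('I', 'subject'), ('elephant', 'object'), ('in', 'modifier')]
print(path_to_root("my"))
# ['my', 'pajamas', 'in', 'shot']
```

The second call shows the "path of dependencies" idea: my is not a direct dependent of shot, but it connects to the sentence head through pajamas and in.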