
In this video we'll be learning about XML schema.

Like document type descriptors (DTDs), XML Schema gives us a way to give content-specific specifications for our XML data. As you may remember, we send a validating XML parser our XML document as well as a description. We talked about DTDs in the last video; we'll talk about XSDs in this one. The validating XML parser will check that the document is well formed, and it will also check that it matches its specification. If it does, parsed XML comes out. If it doesn't, we get an error that the document is not valid.

XML Schema is an extensive language, very powerful. As with document type descriptors, we can specify the elements we want in our XML data, the attributes, the nesting of the elements, how elements need to be ordered, and the number of occurrences of elements. In addition, we can specify data types, we can specify keys, the pointers that we can specify are now typed, unlike in DTDs, and much, much more.

Now, one difference between XML Schema and DTDs is that the specifications in XML Schema, called XSDs, are actually written in the XML language itself. That can be useful, for example, if we have a browser that nicely renders XML. The language, as I said, is vast. In this video, we're going to show one, quote, easy example, but that example will give very much the flavor of XML Schema, and we'll try to highlight the differences between XML Schema and document type descriptors.

OK, here we are with our XML document on the left. On the right we have our XML Schema descriptor, or XSD, and we have a little command line that we're going to use for our validation command. Let me just say up front that we're not going to go through the XSD line by line in detail the way we did with DTDs. As you can see, it's rather long, and that would take far too long and be rather boring. So what I highly suggest is that you download the XSD file so you can look at the entire file, as well as the XML, yourself and give validation a try.
What I'm going to do in this demo is focus primarily on those aspects of the XSD that are different from, and more powerful than, what we had in document type descriptors.

First, let's take a look at the data itself. We have our bookstore data as usual, with two books and three authors. It's slightly restructured from any of the versions we've used before. It looks closest to the last one we used, because the books and authors are separate and the authors are actually exactly the same: they have an identifier and first-name and last-name sub-elements. The primary difference is in the books. Instead of using IDREFS attributes to refer from books to authors, we're now back to having an Authors sub-element with the two authors underneath, and those authors themselves have what are effectively the pointers to the identifiers for the authors. We'll see how that's going to mesh with the XML Schema descriptor that we're using for this file.

The other thing I want to mention is that right now we have the XML Schema descriptor in one file and the XML in another. You might remember that for the DTD, we simply placed the DTD specification at the top of the file with the XML. For DTDs you can do it either way, in the same file or in a separate file. For XSDs, we always put those in a separate file. Also notice that the XSD itself is in XML. It uses special tags that are part of the XSD language, but we are still expressing it in XML. So we have two XML files, the data file and the schema file. To validate the data file against the schema file, we can again use xmllint. We specify the schema file and the data file, and when we execute the command we can see that the file validates correctly.

So I'm now going to highlight four features of XML Schema that aren't present in DTDs. One of them is typed values. One of them is key declarations, similar to IDs but a little bit more powerful.
One is references, which are again similar to pointers but a little more powerful, and finally occurrence constraints.

So let's start with types. In our data we see that the price attribute is denoted with a string, and when we had DTDs, all attribute values were in fact strings. In XSDs we can say that we want to check that the values, which still look like strings, actually conform to specific types. For example, we can say that the price must be an integer. Again, I'm not going to belabor the syntactic details; rather, I'm just going to highlight the places in the XSD where we're declaring things of interest. Specifically, here's where we declare the attribute price, and we say that the type of price must be an integer. So our document validated correctly. What if we change this 100 to be foo instead? Of course, with a DTD this would be fine because all attributes are treated as strings. But if we try to validate now, we see an error: specifically, foo is not a value of the correct type. So let's change that foo back to 100 so that we validate correctly.

Next, let's talk about keys. In DTDs, we were able to specify IDs. IDs were globally unique values that could be used to identify specific elements, for example when we wanted to point to those elements using IDREFs. Keys are a little bit more powerful, or more specific, I should say. If you think about the relational model, a key is an attribute or set of attributes that must be unique for each tuple in a table. We don't have tables or tuples right now, but we do have elements, and we often have repeated elements. So, similarly, we can specify that a particular attribute or component must be unique among all elements of the same type. We have two keys in our specification, which we can see here: one for books and one for authors. Specifically, we say for books that the ISBN attribute must be a key, and we say for authors that the ident attribute must be a key. So let's go over to our data, and let's start by looking at the authors.
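As a rough sketch of the typed price attribute described above: the XSD from the demo is not reproduced in this transcript, so the spellings `Book` and `Price` and the trimmed-down content model here are assumptions made for illustration. A declaration along these lines would accept `Price="100"` and reject `Price="foo"`.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: names and structure are assumed, not taken from the demo's XSD. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <!-- The attribute value is still written as text in the document,
           but it must be a valid lexical form of xsd:integer. -->
      <xsd:attribute name="Price" type="xsd:integer" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With a DTD the closest equivalent would be a CDATA attribute, which accepts any string, so the type check is the part that XML Schema adds.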
So if we change, for example, JU to HG, then we should get a key violation, because we'll have two authors that have the same ident attribute. Let's try to validate.

In fact, we do correctly get a key violation. We also get a couple of other errors, and those have to do with the fact that we are using these idents as the destination of what are effectively pointers, or references. So let's change that back to JU and make sure everything now validates fine, and it does.

Now let's make another change. We have the ident key here for authors, and we have the ISBN being the key for books. What if we changed the ISBN to one of the values we used as a key for an author, say to HG? When we did something similar with DTDs we got an error, because in DTDs, IDs have to be globally unique. Here we should not get an error. HG should be a perfectly reasonable key for books, because we don't have another book with the same value. And in fact it does validate. Now let's undo that change.

Next, let's talk about references. References allow us to have what are basically typed pointers, using the XSD. They are called keyrefs, and here we have an example; let me just move this to the middle of the document. One of the reference types that we've defined in our XSD is a pointer to authors that we're using in our books. Specifically, we want to specify that this attribute here, the auth ident, has a value that is a key for the author elements. And we want to make sure it's author elements that it's pointing to and not other types of elements.

Now, the syntax for doing this in XML Schema is rather detailed. It's all right here, and just to give you a flavor, this middle selector here is actually using the XPath language, which we'll be learning later. What it says is that when we navigate in the document down to one of these Auth elements, then within that Auth element, the auth ident attribute is a reference to what we have already defined as author keys. We've done something similar with books. We have our Book/Remark/BookRef path that brings us down to this element here.
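To make the key and keyref declarations a little more concrete, here is a minimal sketch. It is not the XSD used in the demo: the names `Bookstore`, `Authors`, `Auth`, `authIdent`, `Ident`, `BookKey`, `AuthorKey`, and `AuthorKeyRef` are assumed spellings, and the content of `Book` and `Author` is cut down to just what the constraints need.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: names are assumed and the content models are simplified. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Bookstore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Book" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Authors">
                <xsd:complexType>
                  <xsd:sequence>
                    <xsd:element name="Auth" maxOccurs="unbounded">
                      <xsd:complexType>
                        <xsd:attribute name="authIdent" type="xsd:string" use="required"/>
                      </xsd:complexType>
                    </xsd:element>
                  </xsd:sequence>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
            <xsd:attribute name="ISBN" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="Author" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:attribute name="Ident" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>

    <!-- ISBN must be present and unique among the Book elements selected here. -->
    <xsd:key name="BookKey">
      <xsd:selector xpath="Book"/>
      <xsd:field xpath="@ISBN"/>
    </xsd:key>

    <!-- Ident must be present and unique among the Author elements selected here. -->
    <xsd:key name="AuthorKey">
      <xsd:selector xpath="Author"/>
      <xsd:field xpath="@Ident"/>
    </xsd:key>

    <!-- Every authIdent reached via Book/Authors/Auth must match some AuthorKey value. -->
    <xsd:keyref name="AuthorKeyRef" refer="AuthorKey">
      <xsd:selector xpath="Book/Authors/Auth"/>
      <xsd:field xpath="@authIdent"/>
    </xsd:keyref>
  </xsd:element>
</xsd:schema>
```

Because each key is checked only among the elements its selector picks out, an ISBN of HG does not clash with an author whose Ident is HG, which is the behavior seen in the demo. The book reference would follow the same pattern, with a second keyref that refers to `BookKey`.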

At that BookRef element, we specified that the book attribute must be a reference to a book key, and the book key was earlier defined to be the ISBN. Again, I know this is all complicated, and the syntax is very clunky, so I urge you to download the specification and spend time looking at it on your own.

Now let's make a couple of changes to our document to demonstrate how the checking of these typed pointers works. For example, let's change our first reference here to FOO. Let's validate the document, and we should get an error, and indeed we do: the author keyref is incorrect. Now let's change that FOO to JW. Originally it was JU, but now we're going to have two Auth elements, both of which refer to JW. This should not be a problem. It's simply two pointers to the same author, and we did not prohibit that in our XML Schema specification, and indeed our document validates. We'll change that one back. As a last change, we'll change our book reference here to refer to JW. This should not validate because this time, unlike with DTDs, we've actually specified typed pointers. In other words, we've specified that this reference must be to a book element and not to an author element. So we'll validate, and indeed it fails.

I've undone that change, and now let's move to the last feature that we're going to look at in this demonstration, which is occurrence constraints. Let me just bring up the first instance of it. In XML Schema, we can specify how many times an element type is allowed to occur. Specifically, we can specify the minimum number of occurrences and the maximum number of occurrences. If we don't specify minOccurs or maxOccurs for an element, the default for both of them is one. So here for books we've said that we can have zero books and we can have any number. This is the maximum flexibility: any number of elements. For authors we've also said we can have any number of authors; that's in the actual database itself.
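The top-level occurrence constraints just described might look roughly like this. The element names are assumed spellings, and `xsd:anyType` is only a placeholder standing in for the real book and author content, which this sketch does not try to reproduce.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: names are assumed; xsd:anyType is a placeholder type. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Bookstore">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Zero or more books, followed by zero or more authors. -->
        <xsd:element name="Book" type="xsd:anyType"
                     minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="Author" type="xsd:anyType"
                     minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch even an empty `Bookstore` element is valid, since both minimums are zero.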

Remember that our bookstore consists of a set of books and a set of authors. But we are going to specify something a little different for how many authors we have within a specific book. So let's continue to look at other cases where we've specified occurrence constraints. Here is the case where we're specifying how many authors we have within a book, and again, phew, boy, this is a lot of XML here, so take your time when looking at it, or for now just take my word for it. What we're specifying here is how many Auth sub-elements we have within each Authors element. Here we have no minOccurs specification, only a maxOccurs. That means by default minOccurs is one. So what this is saying, specifically, is that every book has in its Authors sub-element at least one Auth, but we can have any number of them; that's the string "unbounded".

Looking at the remaining occurrence constraints, for remarks the minimum number of occurrences is zero. In other words, we don't have to have a remark. And we haven't specified maxOccurs, so the default maxOccurs is one. So what we're saying here is that every book may have either no remark or exactly one remark, but it may not have more than that. There are a few more occurrence constraints that you can take a look at as you browse the XML Schema description on your own.

Now let's make some changes in the document to test these occurrence constraints. First, let's remove the authors from our first book. We won't remove the whole Authors sub-element, just the two Auth sub-elements of Authors. We attempt to validate, and we see that it doesn't validate. We're missing some child elements, specifically the Auth child elements, because we expected there to be at least one of them. Incidentally, if we took the entire Authors sub-element out, we would also get an error, since we've specified that books must have an Authors sub-element.
So now we're missing the entire Authors structure in that book, and again we don't validate. Let's put Authors back, and now let's look at the remark occurrence constraint. We said that every book can have zero or one remarks, so let's just add another remark to this book: "Hi." Actually, remarks are allowed to be empty; in any case, we have added a small remark. We validate, and we see that we have too many remarks, again because we specified that every book can have at most one remark.

So that concludes our demonstration of XML Schema. Again, it's been rather cursory. We've only covered a few of the constructs, but I did focus on the constructs that we have in XML Schema that are not specifiable in DTDs. Finally, one more time, I urge you to download the XSD and the document and play around with them yourself.
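For anyone who wants something to experiment with before downloading the files, the per-book occurrence constraints tested above can be sketched as follows. This is an assumption-laden illustration, not the demo's XSD: the names `Book`, `Authors`, `Auth`, and `Remark` are assumed spellings, other book content is left out, and `xsd:anyType` is a placeholder for the real content of `Auth` and `Remark`.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: names are assumed; xsd:anyType is a placeholder type. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <!-- No minOccurs or maxOccurs given, so Authors must appear exactly once. -->
        <xsd:element name="Authors">
          <xsd:complexType>
            <xsd:sequence>
              <!-- minOccurs defaults to 1: at least one Auth, with no upper limit. -->
              <xsd:element name="Auth" type="xsd:anyType" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <!-- maxOccurs defaults to 1: a book has no remark or exactly one remark. -->
        <xsd:element name="Remark" type="xsd:anyType" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Against this sketch, a `Book` with an empty `Authors` element, with no `Authors` element at all, or with two `Remark` elements would each be rejected, matching the three failures shown in the demo.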
