· 6 years ago · Mar 13, 2019, 10:52 AM
1Welcome, everyone.
2This is Dr. Soper And today we
3will be going through our first lecture for our database class.
4I need to operate under the assumption
5that not everyone in the class has had experience
6with databases in the past.
7And so this first lecture is intended to bring everyone up
8to the same basic level of understanding
9with respect to several important database concepts.
10We have several different objectives,
11which we will seek to achieve in this lecture.
12These are listed on the screen right now.
13First, we want to try to understand
14why storing data in a list is not necessarily
15a very good idea.
16It can cause many different types of problems.
17Second, we want to see if we can gain some insight into why
18an organization might want to use a database.
19Third, we will see how the notion of related tables
20of data, which forms a core part of the relational database
21model, provides a basis for eliminating
22many of the problems which are associated with storing data
23in a list.
24We'll also explore the various components and elements
25that comprise a database or a database system.
26And we will learn about the purpose of something
27that we call a DBMS, a database management system.
28And along the way, we're going to explore
29some of the functions that a robust database
30application can provide to us.
31What, then, is the purpose of a database?
32Well, it's important to remember that a database does not
33have just a single purpose.
34Instead, there are several key advantages
35that databases provide.
36First, a database provides a repository for storing data.
37That's kind of implicit in the name.
38Database implies that we have a place to store data.
39However, what might not be so obvious
40is that databases provide an organizational structure
41for data.
42That is, we don't just have a place to store data,
43but the database also provides an organized structure
44into which those data can be placed.
45Finally, a database provides us with a mechanism
46for interacting with our data.
47Now, interacting with data can generally
48be described in four different operations.
49Here they're listed as querying, creating, modifying,
50and deleting data.
51But there's another more interesting acronym, which
52may help you to remember this.
53And that acronym is CRUD, C-R-U-D.
54This stands for create, read, update, and delete.
55These are the four basic operations
56that we can use when interacting with data.
57A key point to remember here is that, in business, there
58are many natural, hierarchical relationships among data.
59For example, a customer can place many orders.
60Another way of saying that is many different orders
61can be associated with the same customer.
62Or another example is a department
63can have many different employees,
64but a given employee might work in one, and only
65one, department.
66So these are hierarchical relationships among the data.
67And a relational database allows us to model and represent
68these relationships.
69Let's take a look at a list of data.
70What I mean by a list here is a simple, two-dimensional table
71of data.
72And in this table, we store information
73that is important to us for some reason.
74An example might be, say that we have many different projects
75in our company, and we want to keep track of who the project
76manager is for each project.
77Therefore, for each project, we may
78track the project manager's name, their ID number,
79and maybe their phone extension within our company.
80Now, ordinarily this wouldn't be such a big problem.
81But imagine if the same person, the same project manager,
82is simultaneously managing multiple projects.
83In that case, using a simple list,
84the project manager's information
85would necessarily appear on the list multiple times.
86Now, what is the problem with this?
87Well, there isn't really any major problem,
88aside from the fact that it's redundant.
89Another way of saying that is we are using more space
90than is necessary to record which project manager is
91associated with which projects.
92Another problem with storing data in a list
93is that the list of data may contain more than one concept
94or more than one theme.
95I want you to remember this idea of a business concept,
96because we will see it again and again throughout our course.
97A business theme or concept refers
98to a number of different attributes or properties
99that are all naturally related to one type of business entity.
100An example might be an employee.
101An employee is a business concept.
102Employees have different attributes
103that we may want to track for each employee
104in our organization.
105For example, we may want to track the employee's ID
106number, the employee's name, the employee's salary.
107Other examples of business concepts or business themes
108include things like departments, products, customers, orders,
109locations, and so forth.
110Returning to lists, a major problem
111is with these multiple themes.
112That is, for each row of the list,
113we might be recording information about more than one
114of these business concepts.
115As an example, consider our list of project managers.
116If we also wanted to include project information-- that
117is, perhaps, a project name, a project
118ID, any additional information about the project--
119we might store that in the same row along with the project
120manager.
121Aside from redundancy, as I mentioned earlier,
122the problem here is not necessarily
123how we are storing this information,
124but what might happen to the information
125if it is stored in this way.
126When we store information in a list,
127we introduce the possibility of something
128called data anomalies.
129And specifically, there are three types of these anomalies.
130Here they're listed as a deletion problem, an update
131problem, and an insertion problem.
132Later in our course, we will refer
133to these as deletion anomalies, update anomalies,
134and insertion anomalies.
135So the structure of a list and its associated problems
136of redundancy and multiple themes
137can produce a situation in which these modification problems
138potentially could occur.
139So let's take a look at some of these modification issues.
140Here we have a list of data, which contains information
141on students.
142So we have a student's last name, first name,
143their email address.
144And then we see who that student's advisor is.
145So the last name of the student's
146advisor, the email of the student's advisor,
147and the department in which they work, along
148with some additional information.
149Let's take a look at how some of these modification anomalies
150might emerge.
151In the first case, imagine that we want
152to change a student's advisor.
153And for this example, we're going
154to change Advisor Baker to Advisor [? Tiang. ?]
155When we do this, not only are we going
156to have to update the last name of the advisor in the list,
157but to maintain the overall quality of the data, a concept
158that we call data integrity, we are
159going to have to also update the advisor's email address.
160Note, in this case, that we do not
161need to update the department or the administrator's last name,
162because those are the same for advisors [? Tiang ?] and Baker.
163However, if, for some reason, we wanted
164to change the student's advisor from, say, Baker to Valdez,
165well, now not only do we need to update
166the last name and the email address,
167but we also need to update the department
168and the admin last name attributes as well.
169Just to do something as simple as changing the student's
170advisor with this list approach requires
171that four separate pieces of information be updated.
172So this is what we might call an update anomaly or an update
173problem.
174Next, let's look at a deletion problem.
175Imagine that our student, listed here as Chip Marino,
176decides to drop out of the university.
177So we need to remove Chip from our list of students.
178Now, look what happens if we delete this row of data.
179When the row of data is deleted, not only
180do we lose the student's information,
181but we've also lost information about the advisor
182and the department in which that advisor works.
183That is, you will notice that Advisor [? Tran ?] is not
184currently assigned to advise any other students.
185So when we delete this row of data,
186we may entirely lose the knowledge
187that Advisor [? Tran ?] even exists.
188And that can be a problem.
189In this case, this is called a deletion problem or a deletion
190anomaly.
191Finally, let's look at an insertion anomaly.
192Let's say that at our university,
193we decide that we want to add a new department.
194So we create a biology department.
195Well, this is fine.
196We add information to our list.
197We now have a biology department and an administrator's
198last name.
199However, we now have all of this missing data in our table.
200There are empty cells.
201There's no information for a student.
202There's no information for an advisor.
203All we have here is information for the department.
204This means that data are missing,
205and we're not efficiently utilizing our available storage
206space.
207These are some of the problems with these lists.
208Now, remember, earlier I mentioned
209that in the business world there are complex relationships
210among different types of data.
211For example, a department may have many employees
212who work in that department.
213Or a product may be assembled from many different components.
214Or a customer might place many different orders
215with our company.
216So there are these natural complexities
217that arise in business.
218And relational databases, as we will see, not only solve
219the problems that are associated with lists,
220but also allow us to model these natural relationships
221among business data.
222Now, I've been using this term relational database.
223So it would probably be a good idea for us
224to consider what a relational database actually is.
225On a very basic level, relational databases
226store data in a table.
227Now, a table is simply a two-dimensional grid
228of data that contains columns, and it contains rows.
229The convention in the relational database world
230is that the columns represent different attributes
231of an entity.
232And each row in the table represents
233an instance of the entity.
234So for example, if I have an employee table
235with an employee ID, an employee name, and an employee telephone
236number, we would list those three attributes--
237ID, name, and telephone number-- as columns in the table.
238And then each row in the table would represent
239an individual employee.
240
241Again, we said that there are these natural relationships
242among different business concepts
243out there in the business world.
244In a relational database, data for each of these concepts
245is stored in its own table.
246So I may have an employee table.
247I may have a department table.
248And then I can create a relationship
249between those tables, which will allow
250me to figure out which employees work in which departments.
251So a good way of thinking about this
252is instead of storing all of the information in one big list,
253we break the list apart into separate pieces
254according to which information is
255associated with which business theme or business concept.
256Therefore, all of the information
257associated with an employee might
258go into an employee table.
259And all of the information associated with a department
260might go into a department table.
261Although, inside a relational database,
262information about different business concepts or business
263themes is stored in separate tables,
264it may be necessary-- say, for reporting purposes-- for us
265to reassemble all of the data from these different tables
266back into a list.
267Now, we can accomplish this in the relational database world
268by performing something called a join operation.
269That is, we join the tables together.
270Now, a very, very important concept for you to understand
271is that in the relational database world,
272we link the records in different tables
273together using matched pairs of values.
274These matched pairs of values allow
275us to associate a row in one table
276with one or more rows in another table.
277For example, if we have a project table
278and we have a customer table, and we
279use an attribute called a customer ID
280to uniquely identify each customer, then what we can do
281is we can place a customer ID attribute in the project table.
282And we can then use the ID of the customer
283in each table to link related rows together.
284Now, of course, you may be listening to this,
285and you may think, why do we want
286to do all of this extra work?
287And it is true that relational databases are more complicated
288than storing data in a list.
289However, as we will see, relational databases
290have many, many advantages over a list.
291Among these, first, a relational database
292minimizes data redundancy.
293In the examples we saw earlier, we
294saw that there exists with lists the possibility
295to have all sorts of redundant data.
296Relational databases eliminate this.
297Relational databases also allow us
298to model the natural relationships that
299exist among business concepts in the real world.
300And there are several other advantages as well,
301including providing a solid basis
302from which to do things like generate reports
303or build user interface forms.
304I mentioned a few minutes ago that there
305are four basic ways of interacting with data-- create,
306read, update, and delete.
307In the relational database world,
308we have access to something called the Structured Query
309Language, often called SQL, or SEQUEL, if you like.
310And this is the primary tool that we
311can use for performing these four
312basic operations on our data.
313In a few weeks, we will begin learning the Structured Query
314Language.
315This is an extremely valuable skill for you to develop.
316If you are able to learn the Structured Query Language well,
317then you will be able to sit down and work
318with virtually any modern database with a very, very
319short learning curve.
320It doesn't matter if you need to work with a SQL Server
321database, or an Access database, or an Oracle
322database, or a DB2, or a MySQL database, or even a database
323for, say, a mobile device.
324Most databases, including all of those that I mentioned,
325support the Structured Query Language.
326So if you can learn the Structured Query Language,
327you will be very well positioned to work
328with almost any database.
329Even though we're still early in our course,
330I want to give you an example of the Structured Query Language
331now so that you can see that it's really not
332that difficult to understand.
333This is certainly not a full scale computer programming
334language.
335The Structured Query Language was designed from the ground up
336to be simple to use and to be simple to understand.
337
338So what we have here is an example
339where we have three tables.
340There is a customer table, and a course table,
341and an enrollment table.
342So let's say that these are art courses.
343We have customers who have signed up to take different art
344courses to see, perhaps, if they can learn to paint
345or they can learn to sculpt.
346Now, although we start with three tables,
347if we need to combine all of the information
348together in order to restore the original list
349structure of the data, we can do that
350by using something called a SQL SELECT statement.
351Here we see an example of such a statement.
352I know this looks complicated.
353But hopefully, you are able to read this even right now
354at the beginning of our course and get
355a good idea for what this statement is supposed to do.
356So let's take a look.
357We are selecting the customer's last name, the customer's
358first name, the customer's phone number, and the date of the art
359course, the amount paid out of the enrollment table,
360the course itself, and the total course
361fee from our three different tables--
362customer, enrollment, and course.
363We are then linking the related records in each table together
364using these matched pairs of values
365that I mentioned earlier.
366In the first case, we are linking the customer table
367to the enrollment table using matched pairs of customer
368numbers.
369And in the second case, we're linking the course table
370to the enrollment table using matched pairs of course
371numbers.
372What this will do for us is restore
373the original list of data.
374So you can see our list.
375It contains the customer's last name and first name,
376customer's phone number, the course date, the amount
377that the customer has paid so far,
378the course they are taking, and the total course fee.
379Now, the SQL statement that we saw was a SELECT statement.
380And that is a statement which just pulls data
381from the database.
382There are many other things that we
383can do with SQL, like deleting data, inserting new data,
384updating data in the database, and so forth.
385And we will learn how to do all of those various tasks
386when we study the Structured Query Language in more
387depth in a few weeks.
388Now I would like to turn our attention
389to some conceptual information.
390Let's focus on a database system.
391It's important to remember that a database system is not
392just the database itself.
393Rather, it exists in a broader hierarchy, which
394includes users, application software programs which
395rely on the database, something called a database management
396system, a DBMS, and then, of course, the database itself.
397The broader picture of a database system
398includes these four components.
399Here we see the relationships between the four components
400of a database system.
401On the far left, we see users.
402Now, users are people, just like you
403or me, who might need to use the data in a database.
404But a user doesn't necessarily have to be a human being.
405It is possible, for example, for other software programs
406to be users of our database as well.
407Now, you will see that users interact
408with database applications.
409These are the programs that rely on the database.
410So if we are a human user, we might, for example,
411use a website, which interacts with a database.
412That website could be considered a database application.
413You will notice that the database application
414talks to the DBMS.
415The DBMS, the database management system,
416acts as a gatekeeper.
417All of the information flowing in or out of the database
418must travel through the DBMS.
419It is a critical mechanism for maintaining the quality
420of the data in the database.
421We do not allow our users or our database applications
422to directly access the data in the database.
423To work with the data, to access those data,
424database applications must go through the database management
425system.
426And then, finally, far over on the right side of our screen,
427we see the database itself.
428And this is where we actually store the raw data
429in their separate tables.
430So what do the users of a database system do?
431Well, they do many different things.
432You and I are users of database systems
433on almost a daily basis, even if we don't realize it.
434So we can look up our checking account balances online.
435We can produce reports.
436We can seek information on Wikipedia.
437All of these are examples of us as users interacting
438with database applications.
439A few interesting things to note about the database
440itself is that it is a self-describing collection
441of related records.
442What this means is that the database does not just
443contain the data, but it also contains a definition
444of its own structure.
445Databases also contain metadata.
446The simplest way to understand metadata
447is that metadata are data which describe other data.
448And in the context of the database world,
449metadata are data which describe the structure
450of the data in the database.
451So for example, I may say that an employee table contains
452an attribute called Employee ID, and that ID value is
453an integer.
454That knowledge that the Employee ID value is an integer
455is metadata.
456And of course, databases allow us to establish
457relationships between tables.
458So that means the tables within a database
459are related to each other according to the relationships
460that we establish.
461So it's important to remember that a database does not just
462contain the data that we have put into it.
463A database also contains metadata,
464which describe our data.
465It contains, or potentially can contain,
466what we might refer to as overhead data.
467These are tables of data that are
468used to improve performance or track
469the status of the database or how users are interacting
470with the database.
471And one of the major types of these overhead data
472are called indexes.
473Later in our course, we will examine database indexes.
474And of course, databases can also
475contain application metadata.
476So these are data values that are
477used by the applications which rely upon the database.
478An application, for example, might store user preferences
479in the database.
480Now, to reiterate, a DBMS, or a database management system,
481serves as a gatekeeper or an intermediary
482between the database and the applications
483which rely upon that database.
484The purpose of the DBMS is to control and manage
485the operations of the database.
486The DBMS is a special kind of software program, which
487is used to not only create databases,
488but also to process and administer those databases.
489So a DBMS is a type of application program.
490But it is not the database.
491The DBMS interacts with the database,
492but it itself is not the database.
493What, then, can a database management system do for us?
494Well, the answer is it can do many different things,
495as we see here on your screen.
496We can use the DBMS to create a new database.
497And then, within that database, we
498can create tables and establish relationships between tables.
499We can read, write, edit, and delete data using the DBMS.
500We can establish rules which regulate and control
501the data in the database.
502The DBMS provides an additional layer of security for us.
503And it also provides mechanisms through which
504we can backup our data and recover our data if anything
505ever goes wrong.
506So the DBMS is a very critical piece of software.
507Now, businesses often operate with many different types
508of business rules or constraints.
509And one of the great things about working with databases
510is that the DBMS can establish and enforce
511many different types of constraints.
512One of the most useful types of constraint
513that a DBMS can enforce is called a referential integrity
514constraint.
515The purpose of a referential integrity constraint
516is to ensure that values that we are entering in one table
517have matching values in another table.
518Now, that's a very abstract way of trying
519to convey a simple concept.
520As an example, imagine that we have a customer
521table and a project table.
522And we are trying to enter information
523about a new project in the project table.
524And we specify that the customer ID
525that is associated with this project is customer number 5.
526Referential integrity means that the database will actually
527check in the customer table to see
528if customer number 5 exists.
529If customer number 5 does not exist,
530then the database will not allow us to add that customer
531ID to our project table.
532And in this way, I hope you can see
533it helps to maintain the quality of the data
534in the database, what we call data integrity.
535
536Now, the fourth component of our database system
537were database applications.
538And database applications are simply
539those software programs that need to use or work
540with data in the database.
541So these might be data driven websites.
542They might be mobile apps.
543They might be desktop applications
544or productivity software.
545Just remember that these applications are not
546allowed to talk to or use the database directly.
547But rather, they must go through the DBMS
548to work with or use those data.
549Finally, for this lecture, I want to just talk about some
550of the different types of database systems
551that are available.
552First, we can consider something called a personal database
553system.
554These are very small databases.
555They are typically used by a single user at a time.
556They are not complicated.
557They are simple in design.
558And they're not really suitable for supporting
559medium or large businesses.
560They might be used by very small businesses just out
561of convenience.
562But as those companies grow, they will, at some point,
563certainly need to choose a more robust database solution.
564An example of a personal database system
565might be Microsoft Access.
566Convenient, easy to use, but not very powerful.
567One of the interesting characteristics
568of personal database systems is that they often
569contain the capability to build a database
570application, or at least interfaces to the database
571within the DBMS.
572So in Microsoft Access, for example,
573I can create data entry forms or generate
574reports for my Microsoft Access database.
575That is, it kind of combines the database application
576and the database management system into a single entity,
577conceptually speaking.
578On the other hand, we have a class of database systems
579that are called Enterprise-Level database systems.
580These are databases that are used by medium and large size
581companies to support their operations.
582And it is Enterprise-Level databases
583that we will learn to use and interact with in this class.
584Compared to personal database systems,
585Enterprise-Level databases are designed
586to support many different users at the same time.
587And this could be thousands or tens of thousands of users
588all at the same time.
589Enterprise-Level databases also serve data
590to more than one application.
591For example, if we are a retailer like Target
592or Walmart, we may have a database
593that provides data to our website
594and also provides data to the dashboards that
595are used by our managers.
596Enterprise-Level databases are commonly
597spread across multiple physical computing devices.
598So many different computers.
599So a single database might be running
600on many physical servers.
601Enterprise-Level databases can also
602be geographically distributed.
603So I might have part of my database
604in my corporate offices in Los Angeles,
605another part in Beijing, another part in Berlin in Germany.
606And Enterprise-Level database management systems
607support more than one database.
608So within the same database management system,
609we might, for example, have our operational database,
610which allows us to run our business
611and keep track of transactions in real time.
612And we might also create a data warehouse or data marts.
613And they can all be managed by the same Enterprise-Level
614database management system.
615These are large databases.
616As an example, consider an ERP system like SAP.
617A typical ERP implementation will
618have thousands of tables, all related
619to each other in some way.
620An Enterprise-Level database management system
621can handle this level of complexity with relative ease.
622So here we see a graphical example
623of these Enterprise-Level database systems.
624The DBMS, again, serves as an intermediary
625or a gatekeeper between the databases
626and all of the various database applications
627that want to rely on the data in those databases.
628And they can be mobile applications,
629applications that are written in Java, or C#,
630or even web applications.
631It might be ASP.NET or PHP apps.
632Just as examples of some commercial DBMS products,
633again, I mentioned that personal or desktop DBMS products
634might include Microsoft Access.
635Whereas, Enterprise-Level products
636include SQL Server, which is what
637we will be using in this class, Oracle, MySQL, and DB2.
638MySQL, if you're interested, is a completely free and open
639source database.
640Well, my friends, thus ends our first lecture
641for our database class.
642I hope you learned something.
643And come back soon, and we'll begin our exploration
644of our next topic.
645Until then, have a great day.