Database Basics

The simplest definition of a database is that it is one or more sets of persistent, related data. By that definition, text files and even a list in Excel could qualify as a database. However, generally when the term database is used, it also refers to the software used to create and manage the database. This type of software is referred to as database management systems (DBMS). Most databases created using a DBMS are relational databases. A relational database is a database that allows the database developer to create relationships between tables. For example, if you have a salesperson table and an orders table, you can define a relationship between the tables that would associate a salesperson with the orders she generated. Further, the database developer can specify certain rules that should be enforced by the relationship. For example, what would happen if you tried to add a record to the orders table that wasn't associated with a salesperson? You can create relationships that would either allow or disallow the record from being added. Benefits of a relational database include the following:

Relational integrity You can ensure that the data in a relational database conforms to certain business rules (i.e., all orders must be associated with a salesperson).

Data integrity A relational database allows you to define rules and data types specific to a particular field in a table. For example, you could specify that a particular field can only contain integers and that it can't contain a null value. Further, you can define constraints such that the field must be within a certain range of values.

Scalability Most database products are designed to handle a great deal more data than Excel can handle. For example, Excel has 65,536 rows. Have you ever used all of them? If so, what happened? The size of the spreadsheet increased dramatically and your computer's performance probably slowed to a crawl. Even basic database products such as Microsoft Access can handle hundreds of thousands of rows. Unlike with Excel, a database does not necessarily load the entire contents of the database into memory.

Performance Databases offer increased data retrieval speed, faster and more comprehensive sorting capabilities, and increased data manipulation performance.

Stability Many database products have extensive logging, backup, and transactional features that help ensure the integrity of a database in the event of software or hardware failure.

Collaboration Though Excel offers some multiuser capabilities, data in a database can be accessed by many people at once. By storing data in a central location (rather than distributing it among many spreadsheets), your organization gains the benefit of making the data available to many different kinds of applications in addition to Excel.

Once you've decided to use a database, you'll find many database products on the market from which to choose. There are not any hard and fast guidelines for choosing a database product. Many factors may weigh into the final decision including these:

♦ Expected database usage

♦ Number of concurrent users

♦ Amount of data to be stored

♦ Skill set of the technical staff

♦ The cost of the software

For learning purposes and departmental databases, Microsoft Access is a common choice. Access is an easy-to-use and widely available (it was probably installed along with Excel) database. Access and Excel are tightly integrated, which means that you have more options for moving data between the two applications. For example, you can copy/paste entire worksheets into Access and Access will create a table out of them.

Microsoft SQL Server is more of an industrial strength database. You can use SQL Server for everything from a personal database using SQL Server Personal Edition, to enterprise databases using SQL Server Standard or Enterprise Edition. SQL Server is not as easy to use as Access, but it offers much more in terms of database development flexibility, database management features, security, and scalability.

Of course, you have tons of other choices from other vendors including IBM, Oracle, and Sybase. You'll even come across open source products such as MySQL. In theory, you'll be able to write applications that will work with any of these databases. That said, you may encounter slight functional differences between products, so the actual mechanics involved may vary somewhat from product to product.

NOTE The examples in this chapter use Microsoft Access. The final section of the chapter demonstrates how to retrieve data from Microsoft Analysis Services, a special kind of database product that ships with Microsoft SQL Server.

Data in a database is manipulated using Structured Query Language (SQL). Although your efficiency will improve as you learn to craft SQL statements by hand, most database products ship with some sort of visual query tool that allows you to build SQL statements (aka queries) visually.

Though you can get by using visual query tools to build queries, I feel it is also important to learn how to write them by hand. One of the reasons it is important to learn the syntax of SQL so that you can write queries manually is that, as a developer, you'll typically need to write SQL statements that you pass to the database via an intermediate mechanism. If you can't write queries by hand, you have to build them in your visual query tool and then copy/paste them into the VBE. In the process of copying and pasting, chances are you'll need to do a little rearranging of the query once it is in the VBE. This whole process is rather inefficient.

NOTE The intermediate mechanism that will be presented in this chapter is known as ActiveX Data Objects (ADO). ADO is a set of objects that you can use programmatically to work with a database. Basically, it is an abstraction layer between your code and the database. One of the benefits of an abstraction layer is that it shields you from the necessity of becoming intimately familiar with the particular details of working with different database products. Instead you learn how to use ADO and let your ADO provider worry about handling the details. I'll go over this in more detail later in the chapter.

Another reason it is beneficial to learn how to write queries manually is that visual query tools don't always write the best SQL for the task at hand. In fact, sometimes you won't even be able to design a given query using a visual query tool. Armed with a better understanding of writing your own SQL, you can easily get around these query tool shortcomings. Without this understanding, you're at the mercy of your tool of choice.

0 0

Post a comment