Managing Files: Basic Concepts. A database is a logically organized collection of related data designed and built for a specific purpose. Data is organized in a data storage hierarchy of increasingly complex levels: bits, bytes (characters), fields, records, files, and databases. A bit is the smallest unit of data that the computer can store in a database – represented by 0 for off or 1 for on. A character is a letter, number, or special character. A field consists of one or more characters (bytes). A record is a collection of related fields. A file is a collection of related records. A database is, as mentioned, an organized collection of integrated files. Important to data organization is the key field, a field used to uniquely identify a record so that it can be easily retrieved and processed.
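The hierarchy above can be sketched in a few lines of Python. This is only an illustration, not a real storage format: the student records, the field names, and the helper `find_record` are hypothetical, chosen to show how a key field lets one record be picked out of a file.

```python
# Hypothetical example data: names and values are illustrative only.

# A record is a collection of related fields (here, field name -> value).
record_a = {"student_id": "S001", "name": "Ana", "major": "Biology"}
record_b = {"student_id": "S002", "name": "Ben", "major": "History"}

# A file is a collection of related records.
student_file = [record_a, record_b]

# The key field uniquely identifies each record, so it can serve as an index.
KEY_FIELD = "student_id"
index = {record[KEY_FIELD]: record for record in student_file}


def find_record(key):
    """Return the record whose key field equals `key`, or None if absent."""
    return index.get(key)


# At the bottom of the hierarchy: in ASCII, the character "A" is one byte,
# and that byte is a pattern of 8 bits.
bits_for_a = format(ord("A"), "08b")

print(find_record("S002")["name"])
print(bits_for_a)
```

Because every value of `student_id` is unique, a lookup by key returns at most one record, which is the property that makes a key field useful for retrieval.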
Files are given names, called filenames. Filenames also have extensions, short additions (often three letters) such as .doc and .txt. Among the types of files are the following. (1) Program files are files containing software instructions. The two most important are source program files, which contain instructions in the form written by the programmer, and executable files, which contain instructions in a form the computer can carry out directly to perform a particular task. (2) Data files are files that contain data, and are categorized into two types: master files, which contain relatively permanent records that are updated periodically; and transaction files, which hold all changes to be made to the master file. (3) Other common files are ASCII files, image files, audio files, animation/video files, web files, desktop-publishing files, driver files, and Windows operating system files.
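The relationship between a master file and a transaction file can be illustrated with a small sketch. The account numbers, balances, and the function `apply_transactions` are assumptions made up for this example; a real system would read both files from storage.

```python
# Hypothetical master file: account number (key field) -> current balance.
master_file = {"1001": 250.0, "1002": 80.0}

# Hypothetical transaction file: changes collected since the last update.
transaction_file = [
    {"account": "1001", "amount": -50.0},
    {"account": "1002", "amount": 20.0},
    {"account": "1003", "amount": 100.0},  # account not yet in the master file
]


def apply_transactions(master, transactions):
    """Return an updated copy of the master file with every change applied."""
    updated = dict(master)  # leave the original master file untouched
    for change in transactions:
        account = change["account"]
        updated[account] = updated.get(account, 0.0) + change["amount"]
    return updated


updated_master = apply_transactions(master_file, transaction_file)
print(updated_master)
```

Working on a copy keeps the old master file intact until the update succeeds, which is one simple way to avoid losing data if the update fails midway.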
Two main ways in which a storage device accesses stored data are sequential access and direct access. Sequential access storage means that data is stored and retrieved in sequence, as is the case with magnetic-tape storage: to reach a given record, the device must pass over every record before it. Direct access storage means that a computer can go directly to the information you want, as when a CD player jumps straight to a chosen track; hard disks and other types of disks are of this nature.
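The difference between the two access methods can be sketched with an in-memory byte stream standing in for a storage device. The fixed record length of 8 bytes and the function names are assumptions for illustration; the point is only that sequential access reads every earlier record, while direct access computes a position and jumps to it.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length, in bytes

# Build an in-memory "storage device" holding four fixed-length records.
names = ["ana", "ben", "cyd", "dee"]
storage = io.BytesIO(
    b"".join(name.encode("ascii").ljust(RECORD_SIZE) for name in names)
)


def read_sequential(device, target_index):
    """Start at the beginning and read every record up to the target."""
    device.seek(0)
    reads = 0
    record = b""
    for _ in range(target_index + 1):
        record = device.read(RECORD_SIZE)
        reads += 1
    return record.decode("ascii").strip(), reads


def read_direct(device, target_index):
    """Jump straight to the record's byte offset and read it once."""
    device.seek(target_index * RECORD_SIZE)
    return device.read(RECORD_SIZE).decode("ascii").strip(), 1


print(read_sequential(storage, 3))
print(read_direct(storage, 3))
```

Both calls return the same record, but the sequential version performs four reads to reach the last record while the direct version performs one.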
Whether on magnetic tape or disk, data may be stored offline or online. Offline storage means that data is not directly accessible for processing until the tape or disk has been loaded onto an input device. Online storage means that stored data is randomly (directly) accessible for processing.
Database Management Systems. A database management system (DBMS) consists of programs that control the structure of a database and access to the data. The benefits of databases are reduced data redundancy, improved data integrity, increased security, and ease of data maintenance. Databases can be classified by who uses them and where they are stored. (1) An individual database, or single-user database, is a collection of integrated files used by one person. It could be a personal information manager, which helps people keep track of information they use daily. (2) A multiuser database, or centralized database, is shared by users in one organization in one location. (3) A distributed database is shared by many users but is stored on different computers operating as equals in different locations. Large databases are managed by a database administrator, who coordinates all related activities and needs for an organization's databases.
Database Models. Databases can be organized in four ways. (1) In a hierarchical database, fields or records are arranged in related groups resembling a family tree, with child (lower-level) records subordinate to parent (higher-level) records. (2) A network database is similar to a hierarchical database, but each child record can have more than one parent record. (3) A relational database relates, or connects, data in different files through the use of a key field. Structured Query Language (SQL) is the standard language for making queries to a relational database and for retrieving selected records. Some query tools also offer query by example (QBE), which allows users to ask for information in a relational database by using a sample record to define the qualifications they want for selected records. (4) An object-oriented database uses objects, software written in small, reusable chunks, as elements within database files. An object consists of data in any form and instructions on the action to be taken on the data.
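As a minimal sketch of the relational model, the following uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample rows are hypothetical; the example shows two files (tables) connected through the key field `customer_id` and queried with SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory

# Two related tables; customer_id is the key field that connects them.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")]
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, "laptop"), (11, 2, "printer"), (12, 1, "mouse")],
)

# An SQL query that relates the two tables through the key field.
rows = conn.execute(
    "SELECT c.name, o.item "
    "FROM customers AS c "
    "JOIN orders AS o ON o.customer_id = c.customer_id "
    "WHERE c.name = ? "
    "ORDER BY o.order_id",
    ("Ana",),
).fetchall()

print(rows)
conn.close()
```

The customer's name is stored once, in `customers`, and each order refers to it only by `customer_id`, which is how the relational model reduces redundancy.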
Features of a Database Management System. A database management system may have a number of components. (1) A data dictionary, also called a repository, is a procedures document or disk file that stores the data definitions or a description of the structure of data used in the database. (2) DBMS utilities are programs that allow you to maintain the database by creating, editing, and deleting data, records, and files. (3) A report generator is a program for producing an onscreen or printed document from all or part of a database. (4) Different users are given different user access privileges, as determined by the database administrator. (5) A DBMS should have system recovery features, so the database administrator can recover the contents of the database in the event of hardware or software failure. Four approaches are: mirroring, with two copies of the database in different locations; reprocessing, in which the processing can be redone from a known past point; rollforward, a variant on reprocessing; and rollback, which is used to undo unwanted changes to the database.
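Of the four recovery approaches, rollback is the easiest to demonstrate. The sketch below uses transaction rollback in Python's built-in `sqlite3` module as one concrete instance of undoing unwanted changes; it is not a full recovery system, and the `accounts` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute(
    "CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance REAL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
conn.commit()  # the known good state of the database


def get_balance(connection):
    """Return the current balance of the hypothetical account 1."""
    row = connection.execute(
        "SELECT balance FROM accounts WHERE account_id = 1"
    ).fetchone()
    return row[0]


# An unwanted change: the withdrawal exceeds the balance.
conn.execute("UPDATE accounts SET balance = balance - 500 WHERE account_id = 1")
balance_during = get_balance(conn)

# Rollback discards every change made since the last commit.
conn.rollback()
balance_after = get_balance(conn)

print(balance_during, balance_after)
conn.close()
```

The change is visible inside the open transaction, but after the rollback the database returns to the last committed state.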
Databases & the New Economy: E-Commerce, Data Mining, & B2B Systems. Databases underpin the so-called New Economy of computer, telecommunications, and internet companies in three ways: e-commerce, data mining, and business-to-business (B2B) systems.
E-commerce, or electronic commerce, is the buying and selling of products and services through computer networks; an example is Amazon.com.
Data mining is the computer-assisted process of sifting through and analyzing vast amounts of data in order to extract meaning and discover new knowledge. Data mining begins with acquiring data and cleaning it of errors. This step yields cleaned-up data and meta-data, which describes the origins and transformations of that data. Both are then sent to a data warehouse, a special database of cleaned-up data and meta-data. Data mining is used in applications ranging from marketing to health to science.
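The preparation step that precedes data mining can be sketched as follows. This is a simplified illustration only: the sample sales records, the source name `sales_export.csv`, the cleaning rules, and the dictionary standing in for a data warehouse are all assumptions, not a description of any particular product.

```python
# Hypothetical raw data acquired from some source system.
raw_records = [
    {"customer": " Ana ", "amount": "120.50"},
    {"customer": "Ben", "amount": "not available"},  # invalid amount
    {"customer": "", "amount": "30"},  # missing customer name
]


def clean(records, source_name):
    """Return cleaned records plus meta-data describing what was done."""
    cleaned = []
    rejected = 0
    for record in records:
        name = record["customer"].strip()
        try:
            amount = float(record["amount"])
        except ValueError:
            rejected += 1  # amount is not a number
            continue
        if not name:
            rejected += 1  # customer name is empty
            continue
        cleaned.append({"customer": name, "amount": amount})
    meta_data = {
        "source": source_name,  # where the data originated
        "transformations": [
            "stripped whitespace from customer",
            "converted amount to a number",
            "dropped invalid rows",
        ],
        "rows_in": len(records),
        "rows_rejected": rejected,
    }
    return cleaned, meta_data


cleaned_data, meta_data = clean(raw_records, "sales_export.csv")

# The "data warehouse" here is just a dictionary holding both parts.
data_warehouse = {"data": cleaned_data, "meta_data": meta_data}
print(data_warehouse)
```

Keeping the meta-data next to the cleaned data lets a later analyst see where each data set came from and how much of it was discarded.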
Business-to-business systems (B2B systems) allow businesses to sell to other businesses, using the internet or a private network to cut transaction costs and increase efficiency.
The Ethics of Using Databases: Concerns about Accuracy & Privacy. In morphing, a film image is altered pixel by pixel, so that the image becomes something else. This manipulation of digitized images and sounds raises some ethical issues. Sound performances can be misrepresented, photos may be manipulated, and video and TV images may be altered in undetectable ways, and all of these altered versions can be stored in a database.
Databases are also limited in accuracy and completeness, since not all facts can be found in a database, nor are all data items true. In addition, databases raise several concerns about privacy. Finally, those who own databases may be in a position to monopolize information.