Database concepts
Database-management system (DBMS) |
A collection of interrelated data and a set of programs to access those data |
Database-system applications |
|
File-processing systems |
The typical file-processing system is supported by a conventional operating system. The system stores permanent records in various files, and it needs different application programs to extract records from, and add records to, the appropriate files. |
Data inconsistency (*drawback of file system) |
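The various copies of the same data may no longer agree. For example, a student pursuing a double degree has their information stored under both degree programs, which creates redundancy. If the student's address then changes, one copy may be updated while the other is not, which creates inconsistency. |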
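The double-degree example can be sketched in a few lines of Python. This is only an illustration: the two dictionaries stand in for separate files, and the student ID, name, and addresses are hypothetical.

```python
# Hypothetical file-processing setup: each degree program keeps its own copy
# of the student's record in a separate "file" (modelled here as a dict).
cs_file = {"S001": {"name": "Li Wei", "address": "12 Elm St"}}
math_file = {"S001": {"name": "Li Wei", "address": "12 Elm St"}}


def update_address(file, student_id, new_address):
    """Update the address in ONE file only, as a single application program would."""
    file[student_id]["address"] = new_address


def is_consistent(student_id, *files):
    """Return True when every copy of the student's record agrees."""
    copies = [f[student_id] for f in files]
    return all(copy == copies[0] for copy in copies)


before = is_consistent("S001", cs_file, math_file)
update_address(cs_file, "S001", "99 Oak Ave")  # the copy in math_file is forgotten
after = is_consistent("S001", cs_file, math_file)
print(before, after)
```

Storing the record once, as a DBMS would, removes the second copy and with it the chance for the copies to disagree.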
Consistency constraints (*drawback of file system) |
When new constraints are added, it is difficult to change the program to enforce them |
Data abstraction |
The need for efficiency has led designers to use complex data structures to represent data in the database. Since many database-system users are not computer trained, developers hide this complexity from users through several levels of abstraction. Physical level: how the data are actually stored. Logical level: what data are stored in the database and what relationships exist among those data. View level: most users need only part of the database, so views simplify their interaction with the system. |
Instance |
The collection of information stored in the database at a particular moment. An instance is analogous to the value of a variable at a particular moment. |
Schema |
The overall design of the database. A schema is analogous to a variable declaration. |
physical schema |
The database design at the physical level |
logical schema |
The database design at the logical level |
Physical data independence |
Although implementation of the simple structure at logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of that |
Data models |
Underlying the structure of a database is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. A data model provides a way to describe the design of a database at the physical, logical, and view levels. |
entity-relationship model (chpt 7) |
The E-R data model uses a collection of basic objects, called entities, and relationships among these objects. An entity is a "thing" or "object" in the real world. |
relational data model (chpt 2-6) |
The relational model uses a collection of tables to represent both data and the relationships among those data. Tables are also known as relations. |
object-based data model |
Extending the E-R model with notions of encapsulation, methods (functions), and object identity |
semistructured data model |
Permits the specification of data where individual data items of the same type may have different sets of attributes |
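As a small illustration of the semistructured idea, the sketch below holds two items of the same type whose attribute sets differ. The student records and field names are hypothetical; JSON is used only as a familiar semistructured notation.

```python
import json

# Two hypothetical "student" items of the same type with different attributes:
# one has an email, the other a list of phone numbers.
students = [
    {"id": "S001", "name": "Ana", "email": "ana@example.org"},
    {"id": "S002", "name": "Ben", "phones": ["555-0100", "555-0101"]},
]


def attribute_sets(items):
    """Return the set of attribute names carried by each item."""
    return [set(item) for item in items]


sets = attribute_sets(students)
shared = set.intersection(*sets)  # attributes that every item happens to have
print(json.dumps(students, indent=2))
print(sorted(shared))
```

A relational table would force both rows into one fixed list of columns; here each item simply carries the attributes it has.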
Database languages |
SQL is a database language |
data-definition language (DDL) |
Specifies the database schema by a set of definitions. Also used to specify additional properties of the data. |
data-manipulation language (DML) |
A language that enables users to access or manipulate data as organized by the appropriate data model |
query language |
|
Metadata |
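Metadata are "data about data": they describe the characteristics or properties of data as well as its content, and they are part of the database. Examples: data name, definition, and length; or data source, storage location, and owner. |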
Application program |
A program that is used to interact with the database |
Normalization |
To generate a set of relation schemas that allows us to store information without unnecessary redundancy, yet also allows us to retrieve information easily |
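A minimal sketch of what normalization buys, using Python's built-in `sqlite3` module. The `department` and `instructor` tables and their rows are hypothetical; the point is only that a fact stored once can be changed in one place and still retrieved with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Department facts are stored once instead of once per instructor.
    CREATE TABLE department (
        dept_name TEXT PRIMARY KEY,
        building  TEXT NOT NULL
    );
    CREATE TABLE instructor (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        dept_name TEXT NOT NULL REFERENCES department(dept_name)
    );
    INSERT INTO department VALUES ('Physics', 'Watson');
    INSERT INTO instructor VALUES (1, 'Einstein', 'Physics');
    INSERT INTO instructor VALUES (2, 'Gold', 'Physics');
""")

# A single UPDATE changes the building for every Physics instructor.
conn.execute("UPDATE department SET building = 'Taylor' WHERE dept_name = 'Physics'")

# Retrieval stays easy: a join reassembles the combined information.
rows = conn.execute("""
    SELECT i.name, d.building
    FROM instructor AS i JOIN department AS d USING (dept_name)
    ORDER BY i.id
""").fetchall()
print(rows)
```

Had the building been repeated in every instructor row, the same change would need one update per row, and a missed row would leave the data inconsistent.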
Data dictionary |
The DDL takes statements as input and generates output. That output is placed in the data dictionary, which contains metadata, that is, data about data. The data dictionary is considered to be a special type of table that can only be accessed and updated by the database system itself (not a regular user). |
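To see a data dictionary in practice, the sketch below runs one DDL statement in SQLite and then reads the metadata the system recorded. `sqlite_master` and `PRAGMA table_info` are SQLite-specific; other systems expose their catalogs differently. The `student` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A DDL statement: the system records its definition in the catalog.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's built-in catalog table (its data dictionary).
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'student'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog)
print(columns)
```

The application never inserted anything into the catalog itself; the entries appeared as a side effect of the `CREATE TABLE` statement.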
Storage manager |
Provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. Its components include the authorization and integrity manager, the transaction manager, the file manager, and the buffer manager. |
Query processor |
The query processor components include: the DDL interpreter, which interprets DDL statements and records the definitions in the data dictionary; the DML compiler, which translates DML statements into low-level instructions that the query evaluation engine understands; and the query evaluation engine, which executes the low-level instructions generated by the DML compiler. |
Transactions |
A transaction is a collection of operations that performs a single logical function in a database application |
atomicity |
All-or-none: in a funds transfer, either both the debit and the credit occur, or neither does |
failure recovery |
Detect system failures and restore the database to the state that existed prior to the occurrence of the failure |
concurrency control |
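When several transactions update the database concurrently, the consistency of data may no longer be preserved, even though each individual transaction is correct. The concurrency-control manager controls the interaction among the concurrent transactions to ensure the consistency of the database. |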
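The classic "lost update" shows why correct transactions can still break consistency when they interleave. The sketch below simulates the interleaving by hand in plain Python, so it is deterministic; the amounts are hypothetical, and a real DBMS would prevent this schedule through its concurrency-control mechanism.

```python
def run_interleaved(balance):
    """Two withdrawals (30 and 50) whose reads and writes interleave unchecked."""
    t1_read = balance        # T1 reads the balance
    t2_read = balance        # T2 reads the same balance before T1 writes
    balance = t1_read - 30   # T1 writes its result
    balance = t2_read - 50   # T2 overwrites it: T1's update is lost
    return balance


def run_serial(balance):
    """The same two withdrawals executed one after the other."""
    balance = balance - 30   # T1 runs to completion
    balance = balance - 50   # then T2 runs on T1's result
    return balance


print(run_interleaved(100))  # 50: only T2's withdrawal is reflected
print(run_serial(100))       # 20: both withdrawals are reflected
```

Each withdrawal is correct on its own; only the uncontrolled interleaving produces the wrong final balance.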
Two- and three-tier database architectures |
Two-tier: the application resides at the client machine, where it invokes database-system functionality at the server machine through query-language statements. Three-tier: the client machine acts merely as a front end and does not contain any direct database calls. Instead, the client communicates with an application server, usually through a forms interface, and the application server in turn communicates with the database system. Three-tier architectures are more suitable for large applications and for applications that run on the web. |
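A rough sketch of the three-tier split, collapsed into one Python process for illustration. In a real deployment each tier would run on a separate machine; the function names, the `takes` table, and the form fields are all hypothetical, and an in-memory SQLite database stands in for the database server.

```python
import sqlite3

# --- Tier 3: database system (in-memory SQLite stands in for a server) ---
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE takes (student_id TEXT, course_id TEXT, "
    "PRIMARY KEY (student_id, course_id))"
)


# --- Tier 2: application server, the only tier that issues database calls ---
def enroll(student_id, course_id):
    """Business logic: register a student for a course at most once."""
    try:
        with db:  # commit on success, roll back on error
            db.execute("INSERT INTO takes VALUES (?, ?)", (student_id, course_id))
        return {"ok": True}
    except sqlite3.IntegrityError:
        return {"ok": False, "error": "already enrolled"}


# --- Tier 1: client front end; it fills in a form and contains no SQL ---
def submit_enrollment_form(form):
    return enroll(form["student_id"], form["course_id"])


first = submit_enrollment_form({"student_id": "S001", "course_id": "CS-101"})
second = submit_enrollment_form({"student_id": "S001", "course_id": "CS-101"})
print(first, second)
```

In a two-tier design the SQL in `enroll` would live in the client program itself.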
Data mining |
Semiautomatically analyzing large databases to find useful patterns |
Database administrator (DBA) |
A person who has the central control of both the data and the programs that access those data |
Data |
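The data describe one particular enterprise. Data are facts about objects and events that can be recorded and stored on computer media. Alternatively, data are stored representations of objects and events that have meaning and importance in the user's environment. |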
Information |
Data presented in a form that people can understand |
Metadata |
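Metadata are "data about data": they describe the characteristics or properties of data as well as its content, and they are part of the database. Examples: data name, definition, and length; or data source, storage location, and owner. |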
Database |
A database is a collection of data, usually containing information relevant to an enterprise. It is a large, organized, shareable collection of data stored in a computer over the long term. |
Database-management system (DBMS) |
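A collection of interrelated data (the database) and a set of programs to access those data. Goal: to provide a way to store and retrieve database information that is both convenient and efficient. |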
Database system (DBS) |
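The overall system formed once a database is introduced into a computer system. It includes the users, application systems, application development tools, the operating system, the database, and so on. |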
Major disadvantages of file system |
Data redundancy and inconsistency; difficulty in accessing data; data isolation (multiple files and formats); integrity problems (integrity constraints); atomicity of updates; concurrent access by multiple users; security problems (hard to provide access to some, but not all, of the data) |
Database schema |
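The database schema, also called the intension of the database, is the description of the database. It is specified during design and does not change often. |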
Database instance |
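The data in the database at a particular moment, also called the extension of the database or the database state. The database instance changes frequently. |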
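The schema/instance distinction can be observed directly. In this sketch (SQLite through Python's `sqlite3`; the `student` table and its rows are hypothetical), inserting rows changes the instance while the stored schema text stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def schema(conn):
    """The table's definition as recorded by SQLite (the schema)."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]


def instance(conn):
    """The rows currently stored in the table (the instance)."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO student VALUES (1, 'Ana')")
conn.execute("INSERT INTO student VALUES (2, 'Ben')")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)    # the declaration is unchanged
print(instance_before, instance_after)  # the stored data has changed
```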
Relation schema |
The database schema describes how the tables (relations) connect and are built, while a relation schema is essentially the schema for a table |
Relation instance |
|
Data abstraction (3-layer architecture) |
Corresponds to page 7 of the textbook; these are essentially the three levels of data abstraction |
physical level (internal schema) |
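Internal schema, internal level, physical level. Describes the physical storage structure of the database, along with the access paths and details of data storage. |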
logical level (conceptual schema) |
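Conceptual schema, middle level, conceptual level. Describes the structure of the database for users, focusing on entities, data types, user operations, and constraints, while hiding storage details. A representational data model (the E-R model) is used to describe the conceptual schema. An implementation data model (the OO model) is used to design the conceptual schema. |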
view level (external schema) |
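External schema, external level, view level. Describes the part of the database that a user is interested in and hides the rest. It is realized with a representational data model (the E-R model). |
The main purpose of the three-schema architecture is to ensure data independence: changes to a lower level do not affect the levels above it.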
|
physical data independence |
Physical data independence: the conceptual schema is not affected by changes to the internal schema. |
logical data independence |
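Logical data independence: the external schema is not affected by changes to the conceptual schema. (X independence: level X changes, but the level above it is unaffected.) |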
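Logical data independence can be sketched with an SQL view acting as the external schema. The example assumes SQLite through Python's `sqlite3`; the `instructor` table, the `instructor_public` view, and the added `office` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO instructor VALUES (1, 'Einstein', 95000);
    -- External schema: a view that exposes only id and name, hiding salary.
    CREATE VIEW instructor_public AS SELECT id, name FROM instructor;
""")

# The query an application written against the view would run.
query = "SELECT id, name FROM instructor_public ORDER BY id"
before = conn.execute(query).fetchall()

# Change the conceptual schema by adding a column to the base table.
conn.execute("ALTER TABLE instructor ADD COLUMN office TEXT")
after = conn.execute(query).fetchall()

print(before, after)
```

The application's query and its result are unchanged even though the table underneath gained a column.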
Database languages |
|
procedural languages |
Procedural DMLs require a user to specify what data are needed and how to get those data |
declarative languages |
Declarative DMLs (non-procedural DMLs) require a user to specify what data are needed without specifying how to get those data |
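The contrast can be made concrete by answering one question in both styles. In this sketch the instructor rows and the salary threshold are hypothetical. The first function spells out how to find the data; the second only states what is wanted and leaves the access strategy to SQLite.

```python
import sqlite3

rows = [(1, "Einstein", 95000.0), (2, "Gold", 87000.0), (3, "Katz", 75000.0)]


def high_earners_procedural(records, threshold):
    """Procedural style: say what is wanted AND how to get it (scan, test, collect)."""
    result = []
    for _id, name, salary in records:
        if salary > threshold:
            result.append(name)
    return sorted(result)


def high_earners_declarative(records, threshold):
    """Declarative style: state only what is wanted; the engine decides how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instructor (id INTEGER, name TEXT, salary REAL)")
    conn.executemany("INSERT INTO instructor VALUES (?, ?, ?)", records)
    found = conn.execute(
        "SELECT name FROM instructor WHERE salary > ? ORDER BY name", (threshold,)
    ).fetchall()
    conn.close()
    return [name for (name,) in found]


print(high_earners_procedural(rows, 80000))
print(high_earners_declarative(rows, 80000))
```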
SQL: |
|
DDL |
a data-definition language (DDL) to specify the database schema |
DML |
a data-manipulation language (DML) to express database queries and updates. What a DML can do: retrieval, insertion, deletion, and modification. |
Query language (Select) |
A DML covers insertion, deletion, modification, and retrieval; a query covers only retrieval, although the two terms are often used interchangeably. A query is a statement requesting the retrieval of information. A query language is the portion of a DML that involves information retrieval. |
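The three parts of SQL can be seen side by side in a short session. The sketch below uses SQLite through Python's `sqlite3`; the `course` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including an extra property of the data (a CHECK).
conn.execute(
    "CREATE TABLE course ("
    " course_id TEXT PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " credits INTEGER CHECK (credits > 0))"
)

# DML: insertion
conn.execute("INSERT INTO course VALUES ('CS-101', 'Intro to CS', 4)")
conn.execute("INSERT INTO course VALUES ('BIO-101', 'Intro to Biology', 3)")
# DML: modification
conn.execute("UPDATE course SET credits = 3 WHERE course_id = 'CS-101'")
# DML: deletion
conn.execute("DELETE FROM course WHERE course_id = 'BIO-101'")

# Query: the retrieval portion of the DML
remaining = conn.execute("SELECT course_id, title, credits FROM course").fetchall()
print(remaining)
```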
Types of DBS users |
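Database users fall into four types, differentiated by the way they expect to interact with the system. Naïve users: unsophisticated users who interact with the system through previously written application interfaces. Application programmers: computer professionals who write application programs, often with rapid application development (RAD) tools. Sophisticated users: interact with the system without writing programs, instead forming their requests with database query languages (for example, an analyst). Specialized users: sophisticated users who write specialized database applications. Database administrators are a separate role with central control over the system. |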
Functions of DBAs |
Schema definition; storage structure and access-method definition; schema and physical-organization modification; granting of authorization for data access; routine maintenance |
DBMS architectures |
|
ACID of transactions and meaning |
Atomicity: either all operations of the transaction are reflected properly in the database, or none are. Consistency: execution of a transaction in isolation (with no other transaction executing concurrently) preserves the consistency of the database. Isolation: even though Ti and Tj execute concurrently, it appears to Ti that Tj either has not begun or has already finished. That is, each transaction is unaware of other transactions executing concurrently. Durability: after a transaction completes successfully, the changes it has made to the database persist, even if there are system failures. |
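Atomicity can be demonstrated with a funds transfer. The sketch below assumes SQLite through Python's `sqlite3`. The `account` table, its balances, and the `CHECK (balance >= 0)` rule are hypothetical. The connection's `with` block commits if the body succeeds and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds in one transaction; True if committed, False if rolled back."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


committed = transfer(conn, "A", "B", 30)     # both updates are applied
after_commit = balances(conn)
rolled_back = transfer(conn, "A", "B", 500)  # the debit violates the CHECK
after_rollback = balances(conn)
print(committed, after_commit)
print(rolled_back, after_rollback)
```

In the failed transfer the credit to B had already been executed when the debit failed, yet the rollback undoes it, so the balances match those after the first transfer.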