Journal

=Journal= A weekly summary of the database concepts we learned at class. Go to Home, Design, or Implementation instead.

Information Resource Management (IRM) - the database environment
Data and information are resources as precious as time and money. IRM as a concept guides the use of a company's information similar in fashion to physical resources like manpower and finances.
 * 1) Resources flow into and out of a company in a typical input-output manner. A student archive may take in a set of students' names, classes, and grades, and output various reports based on these data.
 * 2) //Physical resources// such as personnel becomes difficult to monitor as a business grows, so //conceptual resources// (which is the relevant data and information) are used.
 * 3) Both types of resources can be managed the same way.
 * 4) Data management consists of: data acquisition, protection, quality assurance and removal, and organizational commitment is required to perform these basic tasks.

What is a database?
Data ≠ information. Data -> raw facts, with no inherent meaning. A name is just a noun until it is connected to a person, object or place, and then it becomes... Information -> processed data suitable for human interpretation. To convert data into useful information, after acquisition, the data is stored, manipulated, retrieved, and distributed according to need. To support this functionality, a database is employed. A database is a "shared collection of logically related data, designed to meet the information needs to multiple users in an organization." Two generic database architectures are used:
 * 1) //Centralized// - all data is found in a single site, and may be accessed via networks or other communication protocol. While accessing and updating data is simplified, it is a point of failure, depending on the availability of the resource at the central location.
 * 2) //Distributed// - the database is logically just one unit, but the actual data it holds are physically spread across many computers. Physically managing this type of database entails some complexity.
 * 3) //Homogeneous Database// - a type of distributed database wherein the technology (computer OS, data models, data management systems, data definitions/formats) used on the separate systems is the same or is similar to others used.
 * 4) //Heterogeneous Database// - the technologies may vary.

Information System Architecture (ISA) - developing a database system
The Information System Architecture is a framework that is the basis for storage, planning, development and use of information a business uses.

The above table (as seen in "Database Systems") promptly summarizes the ISA framework. The columns, composed of //Data//, //Processes//, and //Network//, are the major components of the system. Data is the "what" of the information system: the entities and relationships. Processes are the steps that produce output from given data, representing the "how" of the system. The Network describes "where" data is stored and computations are performed. The rows represent the architectural layers of an organization's information system, namely the business scope (an overview), business model (definition of entities and relationships), information system model (detailed business data, process flow, and network definitions), technology model (conversion from model to design), technology definition (conversion from model to actual statements that generate the actual information system), and the information system itself (manages, operates and uses the completed system).

Information Engineering Methodology (IEM)
A framework provides necessary steps up to completing a model. Methodology then steps in, presenting a series of steps to accomplish a design goal. Each methodology supports different modelling tools (e.g., Computer-Aided Software Engineering or CASE) and disciplines. IEM is one such formal methodology; it emphasizes a top-down, data-driven approach. It is also compatible with the ISA framework. It is divided into four frameworks:
 * 1) //Planning phase// - the information technology is aligned with an organization's business strategies. This corresponds to the ISA framework's Business Scope layer.
 * 2) First, strategic planning factors must be identified; they are the business goals, critical success factors and business problem areas.
 * 3) Second, corporate planning objects must also be identified: organization units, locations, business functions and entity types.
 * 4) Lastly, an enterprise model must be developed.
 * 5) //Analysis/Requirements Engineering Phase// - the detailed specifications for the information system are developed. This maps to the Business Model and Information System Model layers of the ISA framework.
 * 6) A conceptual model is built to capture the organization's structure of data, usually using Entity-Relationship Diagram or ERD.
 * 7) A process model is constructed to provide a logical description of the processes performed by organizational functions and the flow of data between processes. Processes, which convert physical or data inputs to output, are extracted from decomposed business functions, and are modelled using Data Flow Diagrams, or DFD.
 * 8) //Design// - the information system's target technologies are transformed from the conceptual and process model. The database design (both logical and physical) and the process design (the logic to be used) is created.
 * 9) //Implementation// - here, the information system is constructed and installed according to the plans and designs. This includes coding, testing, and documentation.


 * Reference: Solamo, Ma. Rowena C. Database Systems. 2008.

Primary Keys, Foreign Keys, and Attributes of the Organic Shop Database
Source: [| 4.1 The Organic Shop Database Script.zip]



Shown here (quite messily) are the nine tables of the Organic Shop database. These are, in alphabetical order: CONSULTANT, EMPLOYEE , HOURLY_EMPLOYEE , INVENTORY , ITEM , ORDER_DETAILS , ORDERS , SALARIED_EMPLOYEE , and STORE.

Each of these tables represent an **//entity type//**, which can be a person (EMPLOYEE, HOURLY_EMPLOYEE , SALARIED_EMPLOYEE , CONSULTANT ), place (STORE ), object (ITEM ), event (ORDERS ) or concept (<span style="font-family: 'Lucida Console',Monaco,monospace;">ORDER_DETAILS , <span style="font-family: 'Lucida Console',Monaco,monospace;">INVENTORY ) in the user environment about which the organization wishes to maintain data. The reason that we must create these is that we should not overload a single table with too much information. Putting everything in a single table results to a less logical structure, and is a sure-fire way to get a low grade in CS 165. (Don't do it!)

For the next concept, here is the legend again for your convenience:



The **Attributes** of the entity types are shown in blue. Attributes are //properties or characteristics of an entity// that is of interest to the organization. If I own an organic shop (however unlikely that may be), I will be interested in knowing my employees' names, for example.

The **Primary Keys** are shown in orange (and sometimes yellow ). These are attributes that have been selected as //the unique identifier of an entity type//.

<span style="font-family: 'Lucida Console',Monaco,monospace;">Number is a primary key for five entity types, four relating to employee numbers (which may be compared to our student numbers), and one relating to a store number. The entity type <span style="font-family: 'Lucida Console',Monaco,monospace;">ITEM has <span style="font-family: 'Lucida Console',Monaco,monospace;">code as a primary key, which makes sense because we encounter barcodes on products on a daily basis. To uniquely determine an <span style="font-family: 'Lucida Console',Monaco,monospace;">ORDER, instead of the <span style="font-family: 'Lucida Console',Monaco,monospace;">customerid , who may order more than once, and instead of the <span style="font-family: 'Lucida Console',Monaco,monospace;">orderdate , which may incur more than one order on the day, we create and assign an <span style="font-family: 'Lucida Console',Monaco,monospace;">ordernbr to every element. As long as we will not change its value over time, this will not result to world destruction and your database will be fine.

The <span style="font-family: 'Lucida Console',Monaco,monospace;">ORDER_DETAILS and the <span style="font-family: 'Lucida Console',Monaco,monospace;">INVENTORY entity types show that when one does not suffice, it is possible and it is sometimes desirable to assign 2 (or more) primary keys. The idea is that an **//entity instance//** is unique if the combination of its primary key values is distinct from every other instance of the same type.

The items in green (and sometimes yellow ) are the **Foreign Keys**. A foreign key is an attribute or group of attributes that is //the primary key of another entity//.

<span style="font-family: 'Lucida Console',Monaco,monospace;">INVENTORY 's <span style="font-family: 'Lucida Console',Monaco,monospace;">item_code is a foreign key because it is the primary key of the entity type <span style="font-family: 'Lucida Console',Monaco,monospace;">ITEM (whose primary key, you can tell is <span style="font-family: 'Lucida Console',Monaco,monospace;">code ). The astute reader will see that <span style="font-family: 'Lucida Console',Monaco,monospace;">EMPLOYEE 's <span style="font-family: 'Lucida Console',Monaco,monospace;">manager is a foreign key because it is the primary key of <span style="font-family: 'Lucida Console',Monaco,monospace;">EMPLOYEE itself!

Finally, as you can tell by the legend, having foreign keys as primary keys is perfectly fine and acceptable. :)

This concludes the section on primary keys, foreign keys, and attributes. For more information, read the [|JEDI Database Systems]authored by Ma. Rowena C. Solamo.


 * Configuring JavaDB on Netbeans**

media type="file" key="Configuring JavaDB.mp4" width="780" height="780" align="center"

This video is prepared by the group.

<span style="font-size: 1.3em; margin: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 5px;">**Week 2-3 - Introduction to Database Systems**
**Entity-Relationship Modeling** is a technique for defining the information needs of an organization. The focus is on the __entities__ (important things in an organization), their __attributes__ (or properties), and their __relationships__. The ER model presents these data logically and aids objective decisions unhindered by procedural constraints (such as storage and access methods, etc). **ER Objects:** Also, attributes may be derived (these are called derivative data). They may be multi-valued (contains more than one value per entity instance). **Other ER Notes:** **Situation Analysis** Situation - a well-defined set of circumstances that can be described using a sufficiently complete natural language. The entities, attributes and relationships can be discovered by looking at the **nouns**, **verbs**, and **adjectives** (respectively) in an interview transcript. Gerunds become many-to-many relationships. Multi-valued attributes and repeating groups can be modeled by using the "has" relation with one-to-many cardinality in an ERD. As a result of generalization (concept that some things are subtypes of other things) and categorization (concept that things come in various subtypes), the __supertype-subtype__ relation is defined with an "is-a" relation. This relation is said to be //exclusive// if all the subtypes are mutually exclusive and all instances of the supertypes is categorized as one subtype; otherwise, if the subtypes overlap, the relation is //nonexclusive//. They may also be classified as //exhaustive,// if all subtypes are listed in the ERD, or //nonexhaustive//, if only some but not all subtypes are defined.
 * 1) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">__Entity__- a person, place, object, event, or concept used in a business. A singular noun, enclosed in a rectangular box.
 * <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Entity Type/Entity Class - collections of entities with common properties.
 * <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Entity Instance - an occurrence of an entity type.
 * 1) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">__Attribute__ - properties or characteristics of an entity in connection with the organization. Enclosed in ellipses and connected to its associated entity with a line.
 * <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Candidate Key - an attribute or attributes that uniquely identifies an instance of an entity
 * <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Composite Key - a candidate key with more than one attribute
 * <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Primary Key - a candidate key that uniquely identifies an entity //type.// This attribute is chosen such that its value does not change over the lifetime of the entity instance. It must never be null and must always have a valid value (or values, for composite keys). May be assigned by the system or by the user. Underlined.
 * <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Foreign Key - attribute(s) that is/are the primary key/s of another entity. Dash-underlined.
 * 1) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">__Relation__ - associations between entities. A verb (phrase) inside a diamond or along a connecting line.
 * 1) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">__Degree of a relationship__ - defines how many entities participate in a relation.
 * 2) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Unary - relationship between instances of one entity type. A "recursive" relation.
 * 3) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Binary - between instances of two entity types.
 * 4) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Ternary - between instances of three entity types.
 * 5) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">__Cardinality__ - if A and B are related entities, the cardinality is the number of instances of B that is associated with each instance of A. The minimum/maximum cardinality is the min/max number of instances of B that may be related to A. If minimum cardinality is 0, B is optional to A; otherwise, it is mandatory.
 * 6) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Mandatory One - "one and only one" B is related to A. Modeled as two vertical lines.
 * 7) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Many - one or more B is related to A. Modeled as a vertical line next to "crow-feet" lines.
 * 8) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Optional 1 or 0 - either zero or one B is related to A. Modeled as a circle beside a vertical line.
 * 9) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">Optional zero-many - zero or more B is related to A. Represented by a circle next to "crow-feet" lines.
 * 10) <span style="direction: ltr; font-family: Calibri; font-size: 11pt; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed; vertical-align: middle;">__Existence dependency__ - If A and B are related entities, and B cannot exist without A, then B has an existence dependency. B is called a weak entity . An __identifying relationship__ exists if a parent class' primary key is used as part of the dependent entity's primary key.
 * <span style="font-family: arial,helvetica,sans-serif; font-size: 13px; line-height: 19px;">Reference: Solamo, Ma. Rowena C. Database Systems. 2008.

__Group Progress__ The group had not yet acquired a suitable client for the database project by this week. A list of options included: a database for a flash game, a db software for a small healthcare company, or a db for a small loaning company. Shortly, it was decided that the loan database presented the least communication constraints and was most feasible. Please see the Design page for the group's interview transcript and ERD.


 * <span style="font-family: Verdana,Geneva,sans-serif; font-size: 120%;">Week 6-7 //Physical Database Design// **

<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt; line-height: 14.25pt; margin-bottom: 0.0001pt;">The **Physical Database Design** is the process of mapping the database structures from logical design into physical storage structures such as files and tables. Indexes are also specified as well as access methods and other physical factors. The major objective is to implement the database as a set of stored records, files, indexes and other data structures that will provide adequate performance and ensure database integrity, security and recoverability.

<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt; line-height: 14.25pt; margin-bottom: 0.0001pt;">To specify the physical design of the tables, one will need to consider the following: <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt;">**<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> 1. ****<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Business Rules or Integrity Constraints ** <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt; line-height: 14.25pt; margin-bottom: 0.0001pt; margin-left: 0.5in; margin-right: 0in; margin-top: 0in;">The term //business rules// are usually used in the context of the analysis phase while the term //integrity constraints// are used in the context of the design phase. <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; margin-left: 0.25in; margin-right: 0in; margin-top: 0in; text-indent: 0.25in;">//<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> Categorization of Integrity Constraints //<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">: <span style="font-family: Verdana,Geneva,sans-serif;"> · <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Domain Constraints <span style="font-family: Verdana,sans-serif;"> It defines the set of all //data types//, //allowable values (uniqueness, null support, e.g.)//, //ranges of values, key type (primary key, foreign key, e.g.)//, and //format and code design// that attributes may assume. <span style="font-family: Verdana,Geneva,sans-serif;"> · <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Entity Integrity <span style="font-family: Verdana,sans-serif;"> It also known as //primary key constraint// meaning that the base relation's primary key (whether single or composite) cannot be null. <span style="font-family: Verdana,Geneva,sans-serif;"> · <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Referential Integrity <span style="font-family: Verdana,sans-serif;"> It defines the constraints that address the validity of references by one table in a database to some other table or tables in a database. It has two rules: //insertion rule// and //deletion rule (restrict, nullify or cascade).// <span style="font-family: Verdana,sans-serif;"> <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;">**<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> 2. ****<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Data Volume and Usage Analysis ** <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt; line-height: 14.25pt; margin-bottom: 0.0001pt; margin-left: 0.5in; margin-right: 0in; margin-top: 0in;">When performing //data volume analysis//, estimates of the //database size// are used to select physical storage devices and estimate the cost of storage. When performing //usage analysis//, estimates of //usage paths or patterns// are used to select file organizations and access methods to plan for the use of indexes and to plan a strategy for data distribution.

<span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;">**<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> 3. ****<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Data Distribution Strategies ** <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;"> <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;">**<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> 4. ****<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">File Distribution and File Access Methods ** <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt; line-height: 14.25pt; margin-bottom: 0.0001pt; margin-left: 0.5in; margin-right: 0in; margin-top: 0in;">The //physical record// is the unit of transfer between disk and primary storage, and vice versa. It is also known as a //block// or //page//. It is a technique for physically rearranging the records of a file on a secondary storage device. <span style="font-family: Verdana,Geneva,sans-serif;"> · <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">File Organization (Sequential and Hashed) <span style="font-family: Verdana,sans-serif;"> It is the physical arrangement of data in a file into records and pages on secondary storage. <span style="font-family: Verdana,Geneva,sans-serif;"> · <span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">File Access Method (Sequential, Indexed, and Random-access or Direct-access) <span style="font-family: Verdana,sans-serif;"> Defines the steps involved in storing and retrieving records from a file.

<span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;">**<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> 5. ****<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Indexes ** <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;"> <span style="font-family: Verdana,Geneva,sans-serif; line-height: 14.25pt; margin-bottom: 0.0001pt; text-indent: -0.25in;">**<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;"> 6. ****<span style="color: black; font-family: Verdana,sans-serif; font-size: 10pt;">Denormalization **