Databases

08:53
                      Data is raw collection of matter.If it is well defined and particular order is maintained, then it is called as information.Day by day the information is getting bigger and bigger.There needs a mechanism to store this whole data and maintain it efficiently.


Initial method: File implementation.

1. The whole info is stored in a file.
2. It is advantageous as it gives us accurate results.
3. But it has many overheards like huge time updation, no flexibility, lots of manual work etc etc.

Then came the notion of relational database(RDBMS). After
DBMS is a software to manage all these insertions, updates etc

Database systems = (Data + DBMS)

DBMS can be viewed in 3 levels. They are lower level(physical level), conceptual level, and external data level.

Physical data level deals with how exactly our data is getting stored.
The conceptual schema presents data as a set of tables.
The external schema also presents data as a set of relations. An external schema specifies a view of the data in terms of the conceptual level.

It is very important to understand the difference among them.
To get more info go through the below link

http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm


The change in one level should not effect the other level. This is what we call data independence

SQL:
    Certainly a language is required to talk to database. SQL(Structured Query Language) is one such language.

SQL statements can be divided into two categories: They are Data Definition Language(DDL) and Data Manipulation Language(DML).

For more info go through this link:

http://www.tomjewett.com/dbdesign/dbdesign.php?page=ddldml.php
It tells about DDL and DML

Few more important terms are:

Data dictionary: It is matedata. Means data about data. It talks about the underlying data.
 
Database Schema: The formal definition of database schema is a set of formulas (sentences) called integrity constraints imposed on a database. These integrity constraints ensure compatibility between parts of the schema.

Keys: The entire strength of DBMS lies in its keys.

It is important to get handy with these keys, before we move forward.
Go through this link: http://www.jkinfoline.com/keys.html

For integrity constriants : http://www2.amk.fi/digma.fi/www.amk.fi/opintojaksot/0303011/1146161367915/1146161783414/1146163065754/1146163167961.html


Yep, finally the stage is ready to talk about what we actually wanted.

Basically the interface to database is directly provided to the user or through some application. So, whenever the user enters a query. It  goes through the following steps:




The below link gives detailed explanation.


http://cnx.org/content/m28213/latest/

The query optimizer is given the utmost importance. The o/p parser is given to optimizer which optimizes the query which would result in faster processing of query , lesser cost per query, high performance and lesser stress on db.

 It is because, it analyses the entire query and come up with an optimal query which would better all the mentioned above criteria.

There would be many execution plans and best one among them is selected. Then the execution engine executes those plans and gives the query result.



Query Processing Phases:

Phase 1 Annotation Phase:
          
           When the database server receives a query, it uses a parser to parse the statement and transform it into an algebraic representation of the query, also known as a parse tree. At this stage, the parse tree is used for semantic and syntactic checking (for example, validating that objects referenced in the query exist in the catalog). And other checks like permissions, whether following integrity constraints or not(like student ID should not be NULL etc).

Parse tree looks like this



Syntactic checking: 

                               SQL follows some syntaxes like CREATE keyword should be followed by table keyword, then by table_name etc. . All these checks are done by using this parse tree.



Semantic checking:

                              It is used to check whether the attribute that the user is asking for is really in the table that he mentioned in that query or not. These sort of checkings come under semantics.


 Phase 2 : Semantic transformation Phase:

   
                       Go for optimizations, such as join elimination, DISTINCT elimination, and predicate normalization. Queries with plans already cached by the database server skip this phase of query processing. JOIN and DISTINCT operations requires to retrieve all the tuples of the tables. It is always advisable to eliminate them if possible.


Phase 3 : Query Optimization:

Two main steps:
Pre-optimization phase:



                    During this phase, the query is analyzed to find all relevant indexes and materialized views that may be used in the query access plan. For example, in this phase, the View Matching algorithm determines all the materialized views that may be used to satisfy all, or part of the query.

Views: A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database. It is not really stored at low level, but constructed from the already available data. It helps in executing our query.
In addition, based on query predicate analysis, the optimizer builds alternative join methods that may be used in the enumeration phase to join the query's tables.


Enumeration phase:

The optimizer enumerates possible access plans for the query using the building blocks generated in the pre-optimization phase. The search space is very large and the optimizer uses a proprietary enumeration algorithm to generate and prune the generated access plans. For each plan, cost estimation is computed,
and the best one is selected. Cost estimation takes into account resource utilization such as disk and CPU operations, the estimated number of rows of the intermediate results, optimization goals, cache size, and so on. The output of this phase is the best access plan of the query.

Execution Plan: It says about which operation should be performed on the attributes like sorting or hashing or the both of them and if so, in which order etc. It gives the ordering to the operations that are to be executed, with effective cost.

Plan building phase:

The plan building phase takes the best access plan and builds the corresponding final representation of the query execution plan used to execute the query.


Plan execution phase:

The result of the query is computed using the query execution plan built in the plan building phase.





 

 
                     




0 comments: