Methods to Build Optimal Database Model

The paper considers the search for optimal database models and effective ways of their building. Traditional data models are considered and their limitations and shortcomings are revealed during implementation in modern information systems. The main points of relational database normalization theory...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Математичні машини і системи
Datum:	2018
1. Verfasser:	Lisetskyi Yu.M., Yu.M.
Format:	Artikel
Sprache:	English
Veröffentlicht:	Інститут проблем математичних машин і систем НАН України 2018
Schlagworte:	Інформаційні і телекомунікаційні технології
Online Zugang:	https://nasplib.isofts.kiev.ua/handle/123456789/144939
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!
Назва журналу:	Digital Library of Periodicals of National Academy of Sciences of Ukraine
Zitieren:	Methods to Build Optimal Database Model / Yu.M. Lisetskyi // Математичні машини і системи. — 2018. — № 2. — С. 83-87. — Бібліогр.: 6 назв. — англ.

Institution

Digital Library of Periodicals of National Academy of Sciences of Ukraine

id	nasplib_isofts_kiev_ua-123456789-144939
record_format	dspace
spelling	Lisetskyi Yu.M., Yu.M. 2019-01-10T16:57:48Z 2019-01-10T16:57:48Z 2018 Methods to Build Optimal Database Model / Yu.M. Lisetskyi // Математичні машини і системи. — 2018. — № 2. — С. 83-87. — Бібліогр.: 6 назв. — англ. 1028-9763 https://nasplib.isofts.kiev.ua/handle/123456789/144939 004.9:004.75 The paper considers the search for optimal database models and effective ways of their building. Traditional data models are considered and their limitations and shortcomings are revealed during implementation in modern information systems. The main points of relational database normalization theory are studied as well as essential requirements to optimal database model are defined. У статті розглядається пошук оптимальних моделей баз даних та ефективних способів їх побудови. Розглядаються традиційні моделі даних, а також виявляються їх обмеження та недоліки при впровадженні в сучасні інформаційні системи. Вивчаються основні моменти теорії нормалізації реляційних баз даних, а також визначаються основні вимоги до оптимальної моделі бази даних. В статье рассматривается поиск оптимальных моделей баз данных и эффективных способов их построения. Рассматриваются традиционные модели данных, а также выявляются их ограничения и недостатки при внедрении в современные информационные системы. Изучаются основные моменты теории нормализации реляционных баз данных, а также определяются основные требования к оптимальной модели базы данных. en Інститут проблем математичних машин і систем НАН України Математичні машини і системи Інформаційні і телекомунікаційні технології Methods to Build Optimal Database Model Методи створення оптимальної моделі баз даних Методы создания оптимальной модели баз данных Article published earlier
institution	Digital Library of Periodicals of National Academy of Sciences of Ukraine
collection	DSpace DC
title	Methods to Build Optimal Database Model
spellingShingle	Methods to Build Optimal Database Model Lisetskyi Yu.M., Yu.M. Інформаційні і телекомунікаційні технології
title_short	Methods to Build Optimal Database Model
title_full	Methods to Build Optimal Database Model
title_fullStr	Methods to Build Optimal Database Model
title_full_unstemmed	Methods to Build Optimal Database Model
title_sort	methods to build optimal database model
author	Lisetskyi Yu.M., Yu.M.
author_facet	Lisetskyi Yu.M., Yu.M.
topic	Інформаційні і телекомунікаційні технології
topic_facet	Інформаційні і телекомунікаційні технології
publishDate	2018
language	English
container_title	Математичні машини і системи
publisher	Інститут проблем математичних машин і систем НАН України
format	Article
title_alt	Методи створення оптимальної моделі баз даних Методы создания оптимальной модели баз данных
description	The paper considers the search for optimal database models and effective ways of their building. Traditional data models are considered and their limitations and shortcomings are revealed during implementation in modern information systems. The main points of relational database normalization theory are studied as well as essential requirements to optimal database model are defined. У статті розглядається пошук оптимальних моделей баз даних та ефективних способів їх побудови. Розглядаються традиційні моделі даних, а також виявляються їх обмеження та недоліки при впровадженні в сучасні інформаційні системи. Вивчаються основні моменти теорії нормалізації реляційних баз даних, а також визначаються основні вимоги до оптимальної моделі бази даних. В статье рассматривается поиск оптимальных моделей баз данных и эффективных способов их построения. Рассматриваются традиционные модели данных, а также выявляются их ограничения и недостатки при внедрении в современные информационные системы. Изучаются основные моменты теории нормализации реляционных баз данных, а также определяются основные требования к оптимальной модели базы данных.
issn	1028-9763
url	https://nasplib.isofts.kiev.ua/handle/123456789/144939
citation_txt	Methods to Build Optimal Database Model / Yu.M. Lisetskyi // Математичні машини і системи. — 2018. — № 2. — С. 83-87. — Бібліогр.: 6 назв. — англ.
work_keys_str_mv	AT lisetskyiyumyum methodstobuildoptimaldatabasemodel AT lisetskyiyumyum metodistvorennâoptimalʹnoímodelíbazdanih AT lisetskyiyumyum metodysozdaniâoptimalʹnoimodelibazdannyh
first_indexed	2025-11-25T01:47:01Z
last_indexed	2025-11-25T01:47:01Z
_version_	1850504021266137088
fulltext	© Lysetskyi Yu.M., 2018 83 ISSN 1028-9763. Математичні машини і системи, 2018, № 2 UDC 004.9:004.75 Yu.M. LISETSKYI * METHODS TO BUILD OPTIMAL DATABASE MODEL * S&T Ukraine, Kiev, Ukraine Анотація. У статті розглядається пошук оптимальних моделей баз даних та ефективних спосо- бів їх побудови. Розглядаються традиційні моделі даних, а також виявляються їх обмеження та недоліки при впровадженні в сучасні інформаційні системи. Вивчаються основні моменти теорії нормалізації реляційних баз даних, а також визначаються основні вимоги до оптимальної моделі бази даних. Ключові слова: модель, структура даних, база даних, система управління базами даних, відноси- ни, нормалізація, аномалії. Аннотация. В статье рассматривается поиск оптимальных моделей баз данных и эффективных способов их построения. Рассматриваются традиционные модели данных, а также выявляются их ограничения и недостатки при внедрении в современные информационные системы. Изучаются основные моменты теории нормализации реляционных баз данных, а также определяются основ- ные требования к оптимальной модели базы данных. Ключевые слова: модель, структура данных, база данных, система управления базами данных, отношения, нормализация, аномалии. Abstract. The paper considers the search for optimal database models and effective ways of their building. Traditional data models are considered and their limitations and shortcomings are revealed during im- plementation in modern information systems. The main points of relational database normalization theory are studied as well as essential requirements to optimal database model are defined. Keywords: model, data structure, database, database management system, relations, normalization, anomalies. 1. Introduction It has already been for quite a long time that experts are discussing the crisis of traditional data- bases which include not only relational but also the object-oriented model [1–3]. In particular, they note that it is the underlying primitive data structure which imposes significant limitations on the relational model. This structure is not effective enough for implementation in the modern information systems working with heterogeneous data and dynamically changed data structures. Other significant limitations of relational databases are their limited capacity in representing ap- plication semantics and insufficient connection between conceptual and physical layers of data representation which leads to abrupt leaps between different phases of software development pro- cess. In this connection, it appears reasonable to search for optimal database models and most effective ways of their building. 2. Main Part The optimal logical database model is the model free from anomalies caused by database (DB) modification being the issues connected with data updates, insertions or deletions. To build such a database model the relational database normalization model is applied, ir- respective of what database management system (DBMS) is used: hierarchical, network or rela- tional. Normalization of relationships of the informational model of the subject matter is the mechanism to build the relational database logical model. From the mathematical point of view the task of building both informational model of the subject matter and relational database logical model is addressed by resolution of the following combinatory tasks: 84 ISSN 1028-9763. Математичні машини і системи, 2018, № 2 Building relations as required Elimination of functional dependencies which are not full Elimination of transitive dependencies Elimination of multivalued dependencies Elimination of redundancy and anomalies of data adding 5 NF 4 NF 3 NF 2 NF 1 NF Fig. 1. Steps of Relation Normalization • attributes grouping in relation to subject matter; • attributes distribution as to database relations. Normalization of relations is the backward iterative process of initial relation decomposi- tion into several simpler and smaller relations. The reversibility suggests that composition of relations resulted from decomposition should produce the initial relation. As the result of normalization the database relations attributes have to meet the following criteria: – undesirable functional dependencies between attributes should be eliminated; – attributes grouping should be free from unnecessary data duplication; – attributes procession and restoration should be free from complications. Normalization apparatus had been developed by E.F. Codd [4]. Every normal form limits the type of permitted dependen- cies between attributes. Е. Codd described three normal forms (abbreviations: 1NF, 2NF, 3NF). Today the 4NF and 5NF are known and described as well. Relations nor- malization is completed in several steps (fig. 1). 1 st step is the transformation of rela- tions to First Normal Form (1NF). The 1NF relation should meet the following require- ments: – all relation attributes should be atomic; – all table rows should have the same structure or have the same number of attrib- utes with identical names; – column names should be different while values should be homogeneous (have the same format); – the sequence of rows is insignificant. Each database relation contains both structural and semantic information. Structural in- formation is set by relation schema while semantic information expresses functional connections of attributes. The relation keys are derived during the 2 nd step of normalization as well as correspond- ing dependencies are analyzed to eliminate functional dependencies which are not full. Definition 1. Attribute B depends on A in relation R when every moment of time not more than one value of B corresponds to the same value of A. Functional dependency corresponds to the 1:1 relationship of attributes. Definition 2. The attribute is in full functional dependency if it depends on the whole key and does not depend on its components. If relation has functional dependencies which are not full then it is decomposed into two or more other relations without functional dependencies which are not full and composition of thereof will produce initial relation. 2NF benefits: convenience of modifications. It is significantly easier to modify 2NF data- base compared to non-normalized database. ISSN 1028-9763. Математичні машини і системи, 2018, № 2 85 3 rd step of normalization provides for elimination of transitive dependencies. 3NF rela- tions should be analyzed for presence of transitive dependencies. Transitive dependency is the dependency of non-key attributes. For instance, in relation R(A*,B,C,D) where attribute D is not immediately dependent on key but is dependent on non-key attribute C which is dependent on А, the D is said to be transi- tively dependent on А. Transitive dependencies are eliminated by relation decomposition into two or more rela- tions with no transitive dependencies and composition of which will produce initial relation. The 4 th step of normalization which is also called 4NF or Boyce-Codd Normal Form pro- vides for analysis for presence of independent multivalued dependencies in relation. If there are then the relation is decomposed. Multivalued dependency is the type of functional dependency. It is corresponded by 1:B relationship of attributes. Attribute B has multivalued dependency on attribute A in relation R(A,B,C) if В depends only on А in any of its combinations with other relation attributes. If the relation has А®В and А®С then it should be decomposed into two other relations R(А,В) and R(A,С). The notion of multivalued dependency is more complicated than the notion of functional dependency. Its revelation requires a significantly deeper semantic analysis of at- tributes. There exist trivial and non-trivial multivalued dependencies. Dependency of Х®Y and Y®X type is trivial while Х®Y and Y®X dependency is non- trivial. Presence of non-trivial multivalued dependencies in relation schema and independency of their right sides defines the combinatorics of the right sides of relation. Definition 3. Relation R is in 4NF when the structure of multivalued dependency defined on the multitude of attributes contains only trivial or non-trivial multivalued dependencies with left side of each of them being the key. Decomposition of initial relation into several other should guarantee its reversibility, or provide for producing initial relation through composition of relation found through decomposi- tion. However, decomposition does not always guarantee reversibility. Relations with more than three multivalued dependencies require special measures to guarantee decomposition reversibil- ity. For this purpose there exists 5NF. The 4NF decomposition produces projections containing at least one possible key and at least one non-key attribute of initial relation. The 5 th step of normalization eliminates redundancy and anomalies of updating the data- base. Anomaly is a scientific term for issues which might possibly arise out of work with non- normalized tables. This is why the whole hierarchy of normalized forms is built in the manner where every next form limits the list of possible anomalies of previous form. This process corre- sponds to the process of decreasing database entropy or presence of redundant information. The pointed dependency relationships between three attributes are a very rare occasion. Dependency relationships between four, five, and more attributes are practically impossible to identify. Certain DBMS have special mechanisms eliminating possibility of retrieval of unreliable information. However, there should be followed the general recommendation that database struc- ture be built in such a manner that 4NF and 5NF become unnecessary. We have considered five normal forms; however, they do not exhaust the list. In 1981 R. Fagin published the paper [5] introducing the notion of Domain/Key Normal Form (DKNF). He demonstrated that DKNF relation has no modification anomalies. So whatever the changes are there are no losses in DKNF if all the key and domain constraints are observed. In fact, the definition is quite general but its essence is that if all rules are followed then whatever the actions with the table can be its consistency and all necessary information is pre- served. DKNF. The relation variable is in DKNF if and only its every constraint is a logical con- sequence of domain and key constraints for relation variable. Domain constraint is the constraint 86 ISSN 1028-9763. Математичні машини і системи, 2018, № 2 requiring usage of only the values from set domain for certain attribute. The constraint in itself is only the list (or logical equivalent of the list) of permitted values of type and declaration that the attribute has this type. The key constraint is the constraint declaring that certain attribute or combination of at- tributes is the potential key. Any relation variable in DKNF is necessarily in 5NF. However, not every relation variable can be reduced to DKNF. Sometimes DKNF is called 6 th Normal Form [6]. The relation variable is in 6NF when and only when it meets all non-trivial constraints of relationship. The definition suggests that variable is in 6NF when and only when it is irreducible or cannot be further decomposed without losses. Every relation variable which is in 6NF is also in 5NF. The idea of final decomposition arouses before the studies on chronological data but found no support. However, maximal possible decomposition of chronological database enables fighting redundancy and simplifies maintaining database consistency. To summarize it we would like to note that relations normalization eliminates the follow- ing dependencies of attributes: non-full functional, transitive, non-trivial (independent) multi- valued. By eliminating these dependencies we avoid data duplication as well as anomalies during data updates, replacements and deletions. Here are the essential requirements to optimal database: 1. Adequate representation of subject matter logics in respective data model. 2. Reasonable data redundancy. The database should be the single aggregate of integrated data. In the systems which do not use databases every application will have its files. For in- stance, the application for human resource management and application for personnel training can both have their own files with information on personnel. It leads to redundancy in data stor- age. The inconsistency may arise out of data redundancy when, for instance, two records for the same employee differ. 3. Availability of effective database management tools (creation, addition, modification, deletion and search). Data creation tools are designed to upload data from external user-oriented representation into a system one. 4. Data consistency (meeting the requirement of uniqueness of all database records and their consistency during users’ operation as well as simultaneous modifications management). Consistency requires correctness and accuracy of database data. The conflict of two rec- ords representing the same fact is an example of insufficient consistency. Majority of existing databases are characterized by lack of sufficient consistency support control. 5. Data security. It is the protection from unauthorized data access and database destruc- tion (willful or occasional). Centralized nature of database system requires existence of a security system. The data access is only permitted for authorized users. 6. Database restructuring. There should be tools for data restructuring required when da- tabase queries are changed. 7. Availability of language tools for data definition and manipulation which are concise, convenient and easy to learn. These are data definition and data manipulation languages. Auton- omous data language or the language not included into universal language is also called the query language. 8. Availability of documentation. 9. Simplicity of learning. 10. Mutual independence of programs and data. ISSN 1028-9763. Математичні машини і системи, 2018, № 2 87 3. Summary Thus, the optimal database should preserve operability in case and in the course of software and hardware development. Changes to physical data organization or storage device parameters should have no impact on a user or rather on application. Changes to user representation should require no expenses on reorganization and modification of mechanism of access to physical data files. Data independency enables system functioning during changes from both sides (user and physical data) which is the most important feature and main database goal. Also, it impacts other features including data redundancy, availability of security, consistency etc. Data independency can be defined as the application immunity to changes in data storage structures and data access methods. REFERENCES 1. Codd E.F. Recent Investigations in Relational Data Base Systems / Codd E.F. – San Jose: IBM Corporation, 1974. – 42 р. 2. Когаловский М.Р. Энциклопедия технологий баз данных / Когаловский М.Р. – М.: Финансы и статистика, 2002. – 800 с. 3. Крёнке Д. Теория и практика построения баз данных / Крёнке Д. – [8-е изд.]. – Питер, 2003. – 800 с. 4. Codd E.F. Further Normalization of the Data Base Relational Model / E.F. Codd // In Data base sys- tems, Englewood Cliffs. – N.J. Prentice-Hall, 1972. – P. 33 – 64. 5. Fagin R. A Normal Form for Relational Databases That Is Based on Domians and Keys / R. Fagin // TODS. – 1981. – Vol. 6, N 3. – P. 387 – 415. 6. Дейт К.Дж. Введение в системы баз данных / Дейт К.Дж.; пер. с англ. М.Л. Степановой. – [8-е изд.]. – М.: Вильямс, 2005. – 1328 с. Стаття надійшла до редакції 29.01.2018

Methods to Build Optimal Database Model

Institution

Ähnliche Einträge