Heterogeneous Databases

Distributed databases can be classified as homogeneous or heterogeneous. A heterogeneous database system is an automated (or semi-automated) system that integrates heterogeneous, disparate database management systems and presents the user with a single, unified interface. Heterogeneous database systems are the computational models and software implementations that provide this integration.

Properties of a Heterogeneous Distributed Database

In a heterogeneous distributed database, different sites may run different operating systems, data models, and DBMS products. Such databases have the following properties:

  • Different sites use different software and schemas.
  • The system may comprise various kinds of DBMS, such as relational, network, hierarchical, or object-oriented.
  • Query processing is complex because the sites use different schemas.
  • Transaction processing is complex because the sites run different software.
  • A site may be unaware of the other sites, which limits cooperation in handling user requests.

Types of Heterogeneous Distributed Database

Heterogeneous distributed databases are divided into two types: federated and un-federated. In both, the component databases are independent systems that are integrated so that they work as a single database system.

Architecture of a Heterogeneous Database

The underlying databases of a heterogeneous system differ in nature and may offer different interfaces to the outside world. One of the first challenges in integrating heterogeneous databases is to hide the differences among the interfaces these systems expose. Wrappers are used almost universally in the industry for this purpose. A wrapper is a software module that uses the native interface of an underlying database and presents a uniform interface to the outside world, to the extent that the capabilities of that database allow. Because SQL is the de facto standard for query processing in heterogeneous systems, a wrapper typically exposes a relational model and an SQL interface for the system it wraps. Depending on the capabilities of the underlying database, wrappers offer different sets of functionality.

A wrapper for a relational database system, for example, can support all the capabilities that a relational database system provides. A wrapper for the BigBook database, by contrast, is constrained by the limitations of that source: it can only offer selection on limited information, such as the business type and the business location. Because the BigBook wrapper cannot enumerate rows, it cannot support joins.
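The contrast between a full-capability wrapper and a restricted one can be sketched in Python. This is only an illustration under stated assumptions: the class names, the `can_enumerate` flag, the `supports_joins` method, and the in-memory stand-in for a directory-style source are all hypothetical and are not taken from any real wrapper product.

```python
import sqlite3
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Uniform interface seen by the integration layer (hypothetical design)."""

    can_enumerate = False  # capability advertised to the query planner

    @abstractmethod
    def select(self, **filters):
        """Return rows as dicts matching the given equality filters."""

    def supports_joins(self):
        # Rule from the text: a source that cannot enumerate rows cannot join.
        return self.can_enumerate


class RelationalWrapper(Wrapper):
    """Wraps a full SQL engine, so every row can be enumerated."""

    can_enumerate = True

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table  # assumed to be a trusted identifier

    def select(self, **filters):
        sql = f"SELECT * FROM {self.table}"
        if filters:
            # Column names are assumed trusted; values are bound as parameters.
            sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
        cur = self.conn.execute(sql, tuple(filters.values()))
        names = [d[0] for d in cur.description]
        return [dict(zip(names, row)) for row in cur.fetchall()]


class DirectoryWrapper(Wrapper):
    """Wraps a lookup-only source: both search keys are mandatory."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable(business_type, location) -> list of dicts

    def select(self, **filters):
        if set(filters) != {"business_type", "location"}:
            raise ValueError("only lookup by business_type and location is supported")
        return self._lookup(**filters)


# Demo data: an in-memory relational source and a fake directory source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (name TEXT, stars INTEGER)")
conn.executemany("INSERT INTO reviews VALUES (?, ?)",
                 [("Cafe Uno", 4), ("Dos Tacos", 5)])
reviews = RelationalWrapper(conn, "reviews")


def fake_directory(business_type, location):
    listings = [{"name": "Dos Tacos", "business_type": "restaurant",
                 "location": "Austin"}]
    return [r for r in listings
            if r["business_type"] == business_type and r["location"] == location]


directory = DirectoryWrapper(fake_directory)
```

Calling `reviews.select()` with no filters returns every row, whereas `directory.select()` raises `ValueError`. A planner could consult `supports_joins()` before deciding where a join may be evaluated.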

Types of Heterogeneity

Database heterogeneity occurs in several forms. Technical heterogeneity covers differences in file formats, query languages, access protocols, and the like.

Data model heterogeneity refers to different approaches to representing and storing the same data. Tables may be decomposed differently, and data labels (column names) may differ even though they carry the same semantics. Data encoding schemes can also vary, for example in whether a measurement unit is stated explicitly in a field or implied elsewhere. This is also known as schematic heterogeneity.
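As a minimal sketch of resolving schematic heterogeneity, the Python below maps rows from two sites into one shared schema. The site layouts, column names (`id`, `weight_kg`, `pid`, `wt`, `wt_unit`), and the choice of kilograms as the common unit are all assumptions made for illustration.

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound in kilograms


def from_site_a(row):
    """Site A (hypothetical) implies the unit in the column name."""
    return {"patient_id": row["id"], "weight_kg": row["weight_kg"]}


def from_site_b(row):
    """Site B (hypothetical) stores the value and its unit in separate fields."""
    value = row["wt"]
    if row["wt_unit"] == "lb":
        value = value * LB_TO_KG
    elif row["wt_unit"] != "kg":
        raise ValueError(f"unknown unit: {row['wt_unit']}")
    return {"patient_id": row["pid"], "weight_kg": round(value, 2)}


# Rows with different labels and encodings end up in one common shape.
unified = [
    from_site_a({"id": 1, "weight_kg": 70.5}),
    from_site_b({"pid": 7, "wt": 100, "wt_unit": "lb"}),
]
```

Keeping one small mapping function per site isolates each source's labels and encoding choices from the queries that run over the unified rows.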

Semantic heterogeneity is another type, in which data across the constituent databases are related but different. For example, a database system may need to integrate genomic and proteomic data. The two are related, since a gene can have several protein products, but the data differ: nucleotide sequences versus amino acid sequences, hydrophilic versus hydrophobic amino acid sequences, or positively versus negatively charged amino acids. There may be many ways of looking at semantically similar but distinct datasets. The system may also be required to present new knowledge to its users, in which case relationships between data may be inferred according to rules specified in domain ontologies.
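A toy version of such rule-based inference might look like the Python below. The identifiers (`geneX`, `protA`, and so on) are placeholders rather than real biological entities, and the single rule used here ("a gene inherits the properties of the proteins it encodes") is an assumed stand-in for what a domain ontology would specify.

```python
# Facts as a genomic source might expose them: (gene, protein) pairs.
encodes = [("geneX", "protA"), ("geneX", "protB"), ("geneY", "protC")]

# Facts as a proteomic source might expose them: protein -> set of properties.
protein_properties = {
    "protA": {"hydrophobic"},
    "protB": {"positively_charged"},
    "protC": {"hydrophobic"},
}


def infer_gene_product_properties(encodes, protein_properties):
    """Apply the assumed rule: gene encodes protein, protein has property,
    therefore the gene has a product with that property."""
    inferred = {}
    for gene, protein in encodes:
        props = protein_properties.get(protein, set())
        inferred.setdefault(gene, set()).update(props)
    return inferred


gene_view = infer_gene_product_properties(encodes, protein_properties)
```

Neither source alone links `geneX` to a property such as `hydrophobic`; the relationship appears only when facts from both are combined under the rule.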

Heterogeneous database management systems are generally characterized along three dimensions: distribution, autonomy, and heterogeneity. Distribution describes how the data are physically spread across the sites. Autonomy describes how control of the database system is distributed and the degree to which each constituent database can operate independently. Heterogeneity refers to the degree of similarity or dissimilarity among the data models, databases, and system components. Common architectural models for heterogeneous databases include client-server, peer-to-peer, and multi-database management system architectures.


 