Data is crucial to organizations because it is the main reference for decision making. Two elements of data go hand in hand: data integration and data quality. Data integration involves merging information from various sources into a single body of information that users can understand. It matters because a user who has to extract information from several different sources may find the result confusing and therefore unhelpful. Data quality, on the other hand, concerns the accuracy and appropriateness of the information obtained from databases. This paper focuses on the differences between these two aspects of data and reviews their importance to companies. Many people think that data quality and data integration are one and the same thing, but that is a layman's view. Integration and quality are different, yet both are needed, and neither can be ignored because they are equally important to companies and individuals.
In this age of electronic transactions, we need to minimize errors and make it simple for consumers to understand the information they receive while interacting with our systems in their day-to-day operations. Imhoff (2005) argues that failure to emphasize quality and integration can lead to a decline in business because customers will revert to old manual techniques. Most people emphasize quality more than integration. In the end, the latest developments in technology will lose their importance if the systems are not user friendly.
The systems that most companies and other institutions have rolled out were meant to make work easier for both consumers and company staff. They were introduced to improve company performance and productivity, but those improvements cannot be realized if the systems do not work as designed, that is, if they do not correctly process the information that users enter. When system errors occur, it is the software developers who bear most of the burden.
Erroneous transactions and information often occur when users do not provide the correct details required by the system. For instance, a customer making a cash withdrawal from an automated teller machine (ATM) is responsible for the response the machine gives. If the customer does not enter the correct PIN, the transaction will not take place and the machine will not dispense any cash.
This is one of the best examples of how systems can be designed to deal with data quality. Data quality is a protective measure that most companies value highly: the records stored in databases are used to trace events, and for that referencing to be useful the data has to be free from errors. The data must also be encoded appropriately, because some values, such as customer names, can be shared by several people. According to META Group (2004), the main areas of concern in data quality are correctness, accuracy, completeness, and relevance. Correctness has been covered in the previous paragraphs. For data to be of good quality it must also be complete, because if some pieces are missing the system may encounter errors and fail to process the data provided. For instance, if a customer's date of birth is missing from its field, the data is not complete, because the system cannot generate that information on its own; it has to be provided by the user.
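The completeness check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names in REQUIRED_FIELDS and the sample records are hypothetical and are not taken from any particular system.

```python
from typing import Any

# Hypothetical required fields for a customer record (assumed for illustration).
REQUIRED_FIELDS = ("name", "date_of_birth", "account_number")


def missing_fields(record: dict[str, Any]) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None and blank strings as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Made-up sample records: the second one has an empty date of birth.
complete = {"name": "A. Customer", "date_of_birth": "1980-05-17", "account_number": "0012345"}
incomplete = {"name": "B. Customer", "date_of_birth": "", "account_number": "0067890"}

print(missing_fields(complete))    # []
print(missing_fields(incomplete))  # ['date_of_birth']
```

A system could refuse to save a record, or prompt the user again, whenever the returned list is not empty.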
Most of the time the missing pieces were omitted by mistake when the database was being developed. At other times some data is erased by mistake, so what is left does not make sense. In such cases the lost or missing data has to be recovered using the relevant technologies. Companies cannot make important decisions when the data they rely on is incomplete, and a company that goes ahead without complete data could incur losses owing to poor strategies based on misleading information (Barney, 2004). Records are used when preparing budgets and can be referenced in future decisions, so companies today do not compromise on data quality. In fact, there are software programs that can be very useful in tracking the events that lead to data loss. Moreover, data should be backed up and stored in a different location so that if a fire destroys the company premises, its data will remain intact.
Data should also be updated regularly for it to retain its quality, because outdated data is not useful when making decisions. It is therefore recommended that data be updated whenever circumstances change.
For instance, health records should incorporate current events in a patient's history, such as medical procedures and health developments reported by the patient, for example the number of births. In addition, new items introduced by the company should be added to the database. When this information is not added to the existing record for a particular patient, medical personnel may find it difficult to make decisions about that patient, and problems can occur because certain information is lacking. Besides, the information in the database needs to be correct to be useful, because information based on estimation could be misleading.
This implies that figures and other values should be exact. For instance, the age of a customer should be accurate, and the same accuracy should apply to company information such as location. Customers sometimes want to visit the company premises, only to find that the map of the company's location does not reflect what is on the ground (Pipino, Lee, & Wang, 2002).
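One way to keep a value such as age exact and current is to derive it from the stored date of birth instead of storing an estimate, and to flag records that have not been updated for a long time. The sketch below illustrates both ideas; the function names and the one-year threshold are assumptions chosen for the example, not recommendations from the sources cited here.

```python
from datetime import date


def exact_age(date_of_birth: date, today: date) -> int:
    """Age in whole years on `today`, computed from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def is_stale(last_updated: date, today: date, max_age_days: int = 365) -> bool:
    """Flag a record whose last update is older than the allowed window."""
    return (today - last_updated).days > max_age_days


# The day before the 34th birthday the customer is still 33.
print(exact_age(date(1990, 6, 15), date(2024, 6, 14)))  # 33
print(exact_age(date(1990, 6, 15), date(2024, 6, 15)))  # 34
print(is_stale(date(2023, 1, 1), date(2024, 6, 1)))     # True
```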
Data integration combines data from separate records into a comprehensive set, shaped by who will need the data and how it will be used. Integrating data saves both the time and the money that would otherwise be spent viewing individual records, which is convenient for the company and its clients. The information can be arranged so that only data within a specific range is available to the user, rather than exposing all the data when most of it is irrelevant to that user (Berson & Dubov, 2007). Before data can be integrated, it is important to examine how the data is laid out in its respective sources. Experts have to analyze the structure of the data and design a new structure based on the compatibility of data from the various sources. This is necessary because some fields do not match, and mismatched fields may cause the data to lose its relevance to the user.
For instance, in institutional records the field holding a lecturer's date of birth may not match the corresponding field in student records, so the person designing the integration should combine data based on the relationships between the fields. Data integration is useful in decision making because it allows users to compare the various departments of their company. For instance, to make decisions about profit and loss, a company has to compare its purchase records with its sales records. Data integration also fosters reliable presentation because repetition is avoided, so the system is able to deliver the required results and thus achieve customer satisfaction (META Group, 2004).
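The field-matching step in the institutional example can be illustrated with a small sketch. All record layouts, field names, and identifiers below are hypothetical; the point is only that each source's fields are mapped onto one shared structure before the records are merged, and that a record whose identifier has already been seen is skipped so that repetition is avoided.

```python
# Hypothetical source records; field names differ between the two sources.
lecturers = [{"staff_id": "L01", "full_name": "Dr. Example", "dob": "1975-03-02"}]
students = [{"student_no": "S01", "name": "Sam Example", "birth_date": "2001-11-23"}]

# Assumed mappings from each source's field names to one shared structure.
LECTURER_MAP = {"staff_id": "person_id", "full_name": "name", "dob": "date_of_birth"}
STUDENT_MAP = {"student_no": "person_id", "name": "name", "birth_date": "date_of_birth"}


def to_common(record, field_map, role):
    """Rename a record's fields to the shared names and tag its origin."""
    common = {target: record[source] for source, target in field_map.items()}
    common["role"] = role
    return common


def integrate(sources):
    """Merge mapped records, skipping any person_id that was already seen."""
    merged, seen = [], set()
    for records, field_map, role in sources:
        for record in records:
            row = to_common(record, field_map, role)
            if row["person_id"] not in seen:
                seen.add(row["person_id"])
                merged.append(row)
    return merged


people = integrate([
    (lecturers, LECTURER_MAP, "lecturer"),
    (students, STUDENT_MAP, "student"),
])
for person in people:
    print(person)
```

After this step both kinds of record expose the same date_of_birth field, so they can be queried and compared together.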
For instance, when a customer opening a new account on a company's website enters a username that already exists, the system will notify him or her and will probably suggest alternatives. When data from various databases is merged, it is easier for a company's administrators to foresee events before they take place. These events include a decline in sales and other developments linked to company performance. For instance, if the performance of a particular department starts to fall, management will notice in good time and can take the best action to protect the company. When data is not integrated, risk can be difficult to analyze because people access only the records that are relevant to them and ignore the rest. Data integration also promotes simplicity, because processes take less time to complete than when data is stored separately. When data is stored separately, systems are forced to look for information in every individual database, which takes a lot of time.
This is because the system has to select the required data from a huge amount of data. When data is integrated, the required information is already grouped in one place, so processing it becomes easier.
In conclusion, data quality and data integration are two different aspects, but they are related because they both concern the same thing, which is data. Both are very important and should be given appropriate weight because together they are the elements of reliable data. Organizations such as IBM have developed applications that help companies integrate their data from different sources. This fosters data quality and thus improves productivity. Companies should therefore manage their data well without compromising either aspect.
Barney (2004, June 18). Put your faith in CRM’s stewards. TechTarget. Retrieved from http://www.searchcrm.techtarget.com/news/970869/put-your-faith-in-CRMs-data-stewards
Berson, A., & Dubov, L. (2007). Master Data Management and Customer Data Integration for a Global Enterprise. New York: McGraw-Hill Professional.
Imhoff, C. (2005). Data quality or integration-which is more difficult?. BeyeNetwork. Retrieved from http://www.b-eye-network.com/blogs/imhoff/archives/2005/04/data_quality_or.php
META Group (2004). The future of data integration technologies. Sunopsis. Retrieved from https://portal.erp-link.com/C10/Whitepapers/Document%20Library/The%20Future%20of%20Data%20Integration%20Technologies.pdf
Pipino, L., Lee, Y., & Wang, R. (2002). Data Quality Assessment. Communications of the ACM, 45(4), 211-218. Retrieved from http://web.mit.edu/tdqm/www/tdqmpub/PipinoLeeWangCACMApr02.pdf