Network Computing is part of the Informa Tech Division of Informa PLC

Business Intelligence with Smarts: Page 12 of 19

1. What is the level of interest in Internet/intranet products across industries as it relates to the organization's IT budget and the part of that budget spent on infrastructure products?

2. Is there a relationship between the organization's industry and IT budget and the job title of the subscriber?

We evaluated how the product answered these questions and the ways in which the results were presented--graphs, columnar data and so on--as well as how much dynamic control the business analyst had over the data presentation.

Normalizing data is an important part of database design, and cleaning data is critical for effective analysis. Essentially, normalization is the process of removing duplicate tuples (a tuple is a collection of attributes). This procedure reduces database size and helps ensure data integrity.
Notice that several cells in the non-normalized chart below contain the same value multiple times. To normalize this data, the values are removed and placed in a separate table, commonly referred to as a lookup table, and then referenced from the original table.
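
As a rough illustration of the lookup-table idea, the following Python sketch moves a repeated column value into a separate table and leaves only a reference behind. The subscriber names, industry values and the `normalize` helper are hypothetical and are not taken from the test data.

```python
# Hypothetical non-normalized rows: the industry name repeats across rows.
rows = [
    {"subscriber": "A. Lee", "industry": "Finance"},
    {"subscriber": "B. Cruz", "industry": "Healthcare"},
    {"subscriber": "C. Diaz", "industry": "Finance"},
    {"subscriber": "D. Shah", "industry": "Finance"},
]


def normalize(rows, column):
    """Move repeated values of `column` into a lookup table keyed by integer id."""
    ids_by_value = {}  # value -> id, assigned in order of first appearance
    normalized = []
    for row in rows:
        value = row[column]
        if value not in ids_by_value:
            ids_by_value[value] = len(ids_by_value) + 1
        # Copy every other attribute and replace the value with its id.
        new_row = {k: v for k, v in row.items() if k != column}
        new_row[column + "_id"] = ids_by_value[value]
        normalized.append(new_row)
    lookup_table = {id_: value for value, id_ in ids_by_value.items()}
    return normalized, lookup_table


subscribers, industries = normalize(rows, "industry")
```

Each industry name is now stored once in `industries`, and every row in `subscribers` carries only an `industry_id` that points to it.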

When a data query is performed on the second set of tables, a join must be performed. Data residing in a data warehouse is often non-normalized because of the amount of data stored and the performance degradation resulting from joins across more than one table when dealing with excessively large data sets. The data used in testing was non-normalized.
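
To show what such a join looks like in practice, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table names, column names and rows are assumptions made for illustration only; they do not describe the schema used in testing.

```python
import sqlite3

# In-memory database holding a hypothetical normalized schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE industry (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE subscriber (
        name TEXT NOT NULL,
        industry_id INTEGER NOT NULL REFERENCES industry(id)
    );
    INSERT INTO industry VALUES (1, 'Finance'), (2, 'Healthcare');
    INSERT INTO subscriber VALUES ('A. Lee', 1), ('B. Cruz', 2), ('C. Diaz', 1);
""")

# The industry name lives only in the lookup table, so reading it back
# alongside each subscriber requires a join on the id column.
joined = conn.execute("""
    SELECT s.name, i.name
    FROM subscriber AS s
    JOIN industry AS i ON i.id = s.industry_id
    ORDER BY s.name
""").fetchall()
conn.close()
```

With three rows the join costs almost nothing. The trade-off described above becomes significant only when the joined tables are very large, which is why warehouses often keep the data non-normalized.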

It was also very, very dirty. But clean data is essential to ensuring that the information a business-intelligence tool generates is valid and useful. An enterprise application's native database almost never contains a clean data set. Changes from migration, upgrades and day-to-day interaction introduce errors. As an example, imagine a database in whose cells an X is supposed to represent "yes" and blank spaces represent "no." If, instead, "Y" or "N" appears in place of X or the empty cell (a common occurrence), your data is dirty.
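
A cleaning pass for the X-or-blank example might look like the Python sketch below. It assumes that a stray "Y" means yes and a stray "N" means no, and that case and surrounding whitespace can be ignored; the `clean_flag` helper is hypothetical. A real data set would need its own rules, and anything unrecognized is rejected here rather than guessed at.

```python
def clean_flag(raw):
    """Map a raw cell to True ("yes") or False ("no").

    Assumed rules: "X" or "Y" means yes; blank, None or "N" means no.
    Any other value raises ValueError so it can be reviewed by hand.
    """
    value = (raw or "").strip().upper()
    if value in ("X", "Y"):
        return True
    if value in ("", "N"):
        return False
    raise ValueError(f"unrecognized flag value: {raw!r}")


# Hypothetical dirty column mixing the intended and the stray encodings.
dirty = ["X", "", "Y", "N", " x ", None]
cleaned = [clean_flag(cell) for cell in dirty]
```

Raising on unknown values, instead of silently treating them as "no," keeps the cleaning step from introducing new errors of its own.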