My.ADVISOR.com Sign-In
ID
Password

Member Center / Sign-Up
Go to Article
Advanced Search 

PERSPECTIVE

11 Steps to Successful Data Warehousing

Mining your corporate data for valuable customer information can improve your business performance. But it's not as simple as it sounds.

By Phillip Blackwood

More and more companies are using data warehousing as a strategy tool to help them win new customers, develop new products, and lower costs. Searching through mountains of data generated by corporate transaction systems can provide insights and highlight critical facts that can significantly improve business performance.

Until recently, data warehousing has been an option mostly for big companies, but the reduced costs of warehousing technology make it practical -- often even a competitive requirement for -- smaller companies as well. Turnkey integrated analytical solutions are reducing the cost, time, and risk involved in data warehouse implementations. While access to the warehouse was previously limited to highly trained analytical specialists, corporate portals now make it possible to grant data access to hundreds or thousands of employees.

Following are some steps to consider in implementing your data warehousing solution.

1. Recognize that the job is probably harder than you expect.

Experts frequently report that 30-to-50 percent of the information in a typical database is missing or incorrect. That situation may not be noticeable -- or may even be acceptable -- in an operational system that focuses on swiftly and accurately processing current transactions. But that percentage of error is totally unacceptable in a data warehousing system designed to sort through millions of historical records to identify trends or select potential customers for a new product. And, even when the data is correct, it may not be usable in a data warehouse environment.

For example, legacy system programmers often use shortcuts to save disk space or CPU cycles, such as using numbers instead of names of cities, that make the data meaningless in a generic environment. Another challenge is that database schema often change over the lifecycle of a project, yet few companies take the time to rebuild historical databases to account for those changes.

2. Understand the data in your existing systems.

The first step in any data warehousing project is to perform a detailed analysis of the status of all databases that will either contribute -- or potentially contribute -- to the data warehouse. An important part of understanding the existing data is determining interrelationships between various systems. Interrelationships must be maintained as the data is moved into the warehouse. In addition, the data warehouse implementation often involves making changes to database schema. You must have a clear understanding of data relationships among heterogeneous systems to determine -- in advance -- how any change may impact the system. Otherwise, it's possible for changes to create inconsistencies that ripple across the entire enterprise, creating enormous headaches.

3. Be sure to recognize equivalent entities.

One of the most important aspects of preparing for a data warehousing project is identifying equivalent entities and heterogeneous systems. The problem arises because the same essential piece of information may appear under different field names in different parts of the organization. For example, two different divisions may be servicing the same customer but have the name entered in a slightly different manner (such as AIG and American International Group). You can use a data transformation product that's capable of fuzzy matching to identify and correct this and similar problems.

More complicated issues arise where corporate entities take a conceptually different approach in the way they manage in-store data. This situation frequently occurs in cases of a merger or acquisition. The database schema of the two organizations were established under two entirely different corporate cultures. Establishing a common database structure can be just as important as merging the corporate cultures after a merger, which is crucial to obtaining the full effects of synergy.

4. Use metadata to support data quality.

Metadata is crucial to a successful data warehousing implementation. By definition, metadata is data about data, such as the tags that indicate the subject of a web document. There are many types of metadata that can be associated with a database to characterize and index data, facilitate or restrict access to data, determine the source and currency of data, etc. One major challenge is trying to synchronize the metadata between different vendor products, different functions, and different metadata stores. A major data center might easily have a 500-page schema layout or COBOL copybooks with 200,000 or more lines. But the chances are that many undocumented changes were made to minimize the work involved in dealing with historical data.

This emphasizes the importance of starting as soon as possible to create and capture metadata for interfaces, business processes, and database requirements. Several vendors have developed products that have the potential to integrate metadata from disparate sources and establish a central repository that can provide information to both administrators and users.

5. Select the right data transformation tools.

Data transformation tools extract data from the operational sources, clean it, and load it into the data warehouse while capturing the history of that process. This transformation process may include the creation and population of new fields from the operational data, summarizing data to an appropriate level for analysis, performing error checking operations to validate the integrity of the data, etc. Look for a tool that makes it possible to map data from source to target with a simple point-and-click interface. It's also useful to track and manage the relationships of interrelated data entities that are affected when changing database structures. Finally, try to find a tool that can capture and store metadata during the conversion process.

6. Take advantage of external resources.

External sources of information, such as data from a customer's transaction processing systems, or market research data provided by a third party, can greatly increase the value of internal information. Rather than simply comparing sales against the same month last year, using external data might make it possible, for example, to compare sales growth against the increase in the overall market. Or suppose that you want to estimate the income of each of your customers. You may be able to obtain a database containing the average income for every ZIP code in the country. This won't provide accurate answers for individual customers, but it will let you obtain a good estimate of an aggregate grouping, such as everyone from 30-to-45 years old in Kansas, that might be used in a targeted marketing campaign.

The integration challenge is even greater when external data sources are involved. In some cases, external data will differ so drastically from existing schema that data transformation algorithms will be required to make use of the external resource.

7. Use new information distribution methods.

The biggest technological improvements in recent years in the data warehousing field have come in the area of information delivery. In the past, highly skilled analysts were needed to prepare information in the format requested by users. Today, information can be delivered in several different ways directly to people who need it. Users can subscribe to regular reports and have them delivered via e-mail. If the report contains data that meets the subscription criteria, a customer-filtered view of the data can be delivered to the user by means of an e-mail attachment. The report data stays securely and economically on the server, and only the pages requested by the authorized user are sent on demand. Another option is to let the user log in, search for data, and open the reports that provide the needed information. Only the pages requested by the user are sent over the network to minimize traffic.

8. Focus on high-payback marketing applications.

Most of the really hot applications in data warehousing right now involve marketing because of the potential for an immediate payback in increased revenues. For example, catalog manufacturers are using data warehousing to match personal characteristics to purchases of specific items, and then they use this empirical data to produce special demographic editions that generate higher sales because they're targeted more closely to customers' needs.

Companies selling more complicated and higher value products, such as brokerages, insurance companies, and credit card firms, are weaving data warehousing into their relationship-based businesses in order to dramatically increase the information flow between their firm and their customer. Innovative banks are even using similar methods to sort out customers whose trading characteristics match the profile of known money launderers.

9. Emphasize early wins to build support throughout the organization.

The wide range of off-the-shelf solutions has made it possible to drastically reduce cost and lead-time requirements for data warehousing applications. Off-the-shelf solutions won't usually complete project objectives, but they often can be used to provide point solutions in a short time that serve as a training and demonstration platform and, most importantly, build momentum for full-scale implementation. Even for the largest-scale applications, you should perform technology surveys to maximize the use of pre-built technology.

10. Don't underestimate hardware requirements.

The hardware requirements for data warehousing database servers are high. The primary reason is the large number of CPU cycles required to slice and dice data over and over again to meet the varying needs of users throughout the organization. Database size also plays in server performance requirements, ranging up to terabytes and higher. For those reasons, be sure to select a scalable platform regardless of how much headroom you have provided in your server specification. The typical data warehouse implementation starts out at the departmental level and grows over time to an enterprise-wide solution. Purchasing servers that can be expanded with additional processors is one possible approach. A more ambitious idea is to combine loosely-coupled systems that let companies spread the database over multiple servers, although the database still appears as a single entity to users.

11. Consider outsourcing your data warehouse development and maintenance.

A large percentage of medium and large companies use outsourcing to avoid the difficulty of locating and the high cost of retaining skilled IT staff members. Most data warehousing applications fit the main criteria for a good outsourcing project -- a large project that has been defined to the point that it doesn't require day-to-day interaction between business and development teams. There have been many cases where an outsourcing team is able to make dramatic improvements in a new or existing data warehouse. Typically these improvements don't necessarily stem from an increased level of skill on the outsourcing team, but flow from the nature of outsourcing. The outsourcing team brings fresh ideas and perspective to their assignment, and is often able to suggest methods and solutions that they've developed and proven successful in previous assignments. The outsourcing team also doesn't have to deal with manpower shortages and conflicting priorities faced by the previous internal team.

By now, you've realized that building a data warehouse is no easy task. With the average cost of a data warehousing system valued at US$1.8 million, the right people, methodology, and experience are critical. The reliance on technology is only a small part in realizing the true business value buried within the multitude of data collected within organizations today. Data warehouses touch the organization at all levels, and the people that design and build the data warehouse must work across the organization. The industry and product experience of a diverse team, coupled with a business focus and proven methodology, can make a huge difference.


What do YOU think about this topic? Share your advice and thoughts using this form.

Your Name

REQUIRED : PUBLIC

Your E-Mail

REQUIRED : PRIVATE

Job, Company

OPTIONAL : PUBLIC

City, State, Country

OPTIONAL : PUBLIC

Your Web Site

OPTIONAL : PUBLIC

Your Comment

Please help everyone by keeping your comments on-topic, using clean language, and not defaming or making personal attacks.


Your e-mail address is required, but it will not be displayed to the public or given to anyone. See our Privacy Policy. Comments become visible after they pass our spam filter, and spammers and abusers are permanently blocked. Please report spam or abuse.

Phillip Blackwood is the vice president of Data Warehouse and Business Intelligence for Syntel Inc., headquartered in Troy, Michigan. Syntel's Enterprise Solutions Group helps leading organizations efficiently and cost-effectively develop advanced technologies, including data warehousing, e-business, customer relationship management, and web solutions. Syntel has more than 1,900 employees worldwide and four fully networked Global Development Centers. 248-619-3584, http://www.syntelinc.com

Printer-friendly
page layout

Keyword Tags: Business Intelligence, Data Acquisition, Data Analysis, Data Mining, Data Replication, Database, Database Development, Marketing, Software, Syntel

ADVISORAMA
Anxiety is love's greatest killer. It makes others feel as you might when a drowning man holds on to you. You want to save him, but you know he will strangle you with his panic.
-- Anais Nin (1903-1977)

ARTICLE INFO

DataBased Advisor

Web Edition: 2000.05.31, Doc #06444

FREE ACCESS FREE ACCESS

SUBSCRIPTION STATUS
You are not signed-in. If you are a subscriber to this publication, sign-in above to access locked articles. To subscribe or renew go to www.AdvisorStore.com.

Subscribe to DATABASED.ADVISOR.com

Get it all -- every current and past ADVISOR tech/business publication, now all-in-one subscription, with new articles and a huge reference library packed with expert advice, how-to and downloads. Subscribe now to get it all.

DataBased.Advisor.com

Subscribe to FileMaker Advisor Magazine

Read the advanced guide to creating custom business database solutions with FileMaker software. Subscribe now to gain access to all the archives and downloads.

FileMaker.Advisor.com

Subscribe to Advisor Basics of FileMaker Pro

Learn the fundamentals of using FileMaker Pro software. Every issue gives you step-by-step instructions on creating the databases you need. Subscribe now!

FileMaker.AdvisorBasics.com

Showcase Your Smarts

Submit your tips, techniques and advice and let Advisor promote your business and build your career. Show the world what you know!

AdvisorTips.com

Use of this or any other site, content, product or service of Advisor Media constitutes acceptance of Terms of Use.
Portions copyright ©1983-2008 Advisor Media, Inc. All Rights Reserved.
Reuse or reproduction of any portion or quantity of Advisor Media's copyrighted content, in any form, for any purpose, requires written permission.
ADVISOR®, the ADVISOR logo, and other names and logos that incorporate ADVISOR are registered trademarks, trademarks or service marks of Advisor Media, Inc. in the United States and/or other countries.
Other trademarks are used for identification, editorial or descriptive purposes and are the property of their owners.
Hosted by Prominic.NET Website powered by
LOTUS SOFTWARE
BLACP01 posted 05/31/2000 modified 01/07/2009 03:42:49 AM ztdbms/ztdbms
domino-144.advisor.com my.advisor.com 01/07/2009 01:43:22 PM