Spatial data warehousing may not be a new concept to many organizations, especially those involved in product marketing, demographic analysis or market segmentation. For other organizations, such as construction or engineering firms, this may not be the case. As organizational lines began to blur and cross-functional data became more important, it became apparent that many information technology systems developed over years and decades did not peacefully coexist. This was a challenge for companies that needed to unleash corporate data to improve financial viability, fully use corporate assets and improve market share. Traditionally, the organizations providing consumer marketing do not interface directly with those providing the services they sell, or at least that was the case in the telecommunications industry. A case study from the telecommunications industry will be used to explain the evolution of spatial data warehousing and how it emerged as a key business asset for managing a business in a competitive environment.

Initial Business Drivers

The term spatial data warehousing was not widely used outside the information technology ranks, but a single business problem demonstrated the need for such a warehouse. The telecommunications industry was faced with the dilemma that competition might adversely impact revenue generation. This was a foreign thought for most large telecom companies; however, it was a problem that had to be addressed in a capital-intensive business. Month-over-month revenue growth was and remains important for a corporation, but it must be monitored in such a way that trends can be identified and acted upon quickly. Large companies not accustomed to market shifts may not be as nimble as they should be, and neither are their IT systems.

This dilemma provided an unusual opportunity to introduce Spatial Data Warehousing and the concept of rapid application development; however there were a couple of immediate obstacles that had to be overcome. The first was the need for a server, so a surplus UNIX server was reactivated in a corporate data center to house the data warehouse. Next, a database management system was required and fortunately the resurrected server had an installed database product that included a basic web portal application. At this point, the basic IT infrastructure was in place to implement the data warehouse.

Financial Management

The first stage of the data warehouse was to gather revenue data through a direct connection to the revenue reporting system. The concept was to gather data at the lowest possible level and aggregate it to the highest level for the corporation. Since there were multiple methodologies used by various organizations, the spatial data warehouse had to adopt a single method of aggregation and this was based on the boundary hierarchy from the enterprise GIS. This was used as the basis for data aggregation for revenue and all future datasets added to the warehouse.
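The roll-up described above can be sketched as follows; the boundary names, levels and amounts are illustrative assumptions, not the company's actual GIS hierarchy.

```python
# Sketch of hierarchical revenue aggregation: each record carries its
# lowest-level entity, and parent levels are derived from a boundary
# hierarchy like the one maintained in the enterprise GIS.

from collections import defaultdict

# Hypothetical boundary hierarchy: service center -> district -> state
HIERARCHY = {
    "SVC-A": ("District-1", "GA"),
    "SVC-B": ("District-1", "GA"),
    "SVC-C": ("District-2", "FL"),
}

def aggregate_revenue(records):
    """Aggregate low-level revenue records to every level of the hierarchy."""
    totals = defaultdict(float)
    for center, amount in records:
        district, state = HIERARCHY[center]
        totals[("service_center", center)] += amount
        totals[("district", district)] += amount
        totals[("state", state)] += amount
        totals[("company", "ALL")] += amount
    return dict(totals)

revenue = [("SVC-A", 120.0), ("SVC-B", 80.0), ("SVC-C", 50.0)]
totals = aggregate_revenue(revenue)
print(totals[("district", "District-1")])  # 200.0
print(totals[("company", "ALL")])          # 250.0
```

Because every dataset added later used the same hierarchy keys, any new data could be compared against revenue at any level without further mapping work.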

Revenue data was only summarized at the State and Company level prior to the start of the data warehouse. After integration in the data warehouse, approximately 98% of the revenue could be directly associated with lower level entities. The remaining 2% had to go through a rigorous process applying business rules to assign the revenue properly at the lowest possible level. This provided far more detailed revenue data than had ever been published before, and it was directly related to entities defined in the corporate GIS.

The next revelation was that capital spending was continuing at high levels even though revenue was declining. A direct connection to the capital management system was established and a data processing procedure similar to that used for the revenue data was implemented. Once the data was in the warehouse, reporting was available at detailed levels showing capital spending relative to revenue growth, and astounding facts surfaced.

Capital spending for growth was taking place in locations where revenue growth was actually negative. By having the data depicted in tabular, graphic and geographic formats from the most detailed level to the corporate summary, all levels of the organization could determine their performance in capital management and revenue production. A byproduct of this effort was the beginning of the discovery that the organizations providing telecommunications services could positively or negatively impact both capital spending and revenue generation. Conversely, the marketing organization could impact capital spending to support marketing campaigns and help limit capital spending in soft markets. Prior to implementation of this basic data warehouse, these correlations could not be made at the operations level. Once the data was available, the next step was measurement and compliance. The on-line reporting system made the data available to all levels of management, including drill down to the extremely detailed data used in the summary calculations.

There are no published dollar values associated with the benefit of the initial deployment of the data warehouse, but interest savings on capital dollars alone through improved capital management was more than enough to pay for the staff and system support for the initial year of service.

Construction Management

The results from the initial implementation of the web based spatial data warehouse were so dramatic that other uses for the system began to materialize. The next opportunity was to provide a reporting system for a construction management system to reduce licensing costs for an older report generation system. Since the revenue and capital system included web reporting, the construction management system users wanted it as well. System integration between the two systems was more challenging than the first, since the data warehouse and construction management systems used different relational database management systems, whereas the revenue, capital and data warehouse systems all used the same database management product. This was solved for the reporting system by leveraging an existing communication mechanism for direct connection to the remote database. Eventually there was a need to add some of the construction management data to the data warehouse, and this required the use of an approved transfer protocol in an Extract, Transform and Load (ETL) process. The implementation of the reporting system alone saved over three hundred thousand dollars a year in ongoing software maintenance charges.
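The ETL flow mentioned above can be sketched in miniature; here sqlite3 stands in for the actual Informix source and the warehouse database, and the table and column names are hypothetical.

```python
# Minimal Extract/Transform/Load sketch for moving construction management
# data into the warehouse's common structure.

import sqlite3

def extract(source):
    """Pull raw job records from the source system."""
    return source.execute("SELECT job_id, cost_cents FROM jobs").fetchall()

def transform(rows):
    """Normalize job ids and tag each row with its originating system."""
    return [(job_id.strip().upper(), cents, "CMS") for job_id, cents in rows]

def load(target, rows):
    """Insert the transformed rows into the warehouse table."""
    target.executemany("INSERT INTO wh_jobs VALUES (?, ?, ?)", rows)

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE jobs (job_id TEXT, cost_cents INTEGER)")
source.executemany("INSERT INTO jobs VALUES (?, ?)", [("j1 ", 12500), ("j2", 990)])

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE wh_jobs (job_id TEXT, cost_cents INTEGER, src TEXT)")

load(target, transform(extract(source)))
print(target.execute("SELECT COUNT(*), SUM(cost_cents) FROM wh_jobs").fetchone())
# (2, 13490)
```

The same three-stage shape applies regardless of the transfer protocol used between the systems; only the extract and load endpoints change.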

Implementation Approach

The implementation approach had to be simple considering this was an unfunded project whose continued operation depended entirely on its success. A novel approach was used in the data warehouse construction initiative, and there are several aspects of the development approach that may be useful in any project.

Guiding Principles

Four guiding principles were adopted as part of the implementation strategy to allow the low cost implementation of a corporate system that would be accepted and used by all levels of the company. “Leverage everything” was a basic survival technique used when funding was not available for the development of a spatial data warehouse.

Providing good system performance was considered a make or break requirement, since a wide array of individuals would be using the system and poor system performance would drive users away and prevent the full utilization of the data warehouse.

Responsiveness to the user community during the system implementation was perceived as a key factor in user acceptance of the data warehouse. System scalability was felt to be critical because if the system could not grow successfully with increased utilization, users would become disenchanted and would not use or support the data warehouse system.

Organization

The organization began with a single developer who worked to determine the feasibility of relating revenue data to capital expenditures. The developer was also managing report development on the construction management system that required a new report engine.

After the feasibility study was completed, the organization was established using individuals who had worked together previously on the enterprise GIS development effort. The group quickly grew from one to three, including a senior systems architect / programmer, a senior programmer / business analyst and a manager / project manager. The small group worked in a “skunkworks®*” like environment, making the development team very nimble and able to anticipate requirements because of their practical experience in the industry and expertise in the application development field. Since the development activities touched sensitive areas of the corporation, the organization remained below the radar in the early stages of system development. The group remained under the direction of finance and operations; however, it worked directly with the IT organization to ensure compliance with corporate IT standards.

The small group eventually grew to seven, but remained relatively small compared to IT sponsored application development groups.

Information Technology Standards

Large companies and government agencies require stringent IT standards to ensure system interoperability, network integrity, and system and data security. In addition, the IT organization developed the roadmap for the future IT infrastructure, so it was important to comply with as many standards as possible, without giving up on the business requirements for the data warehouse and associated applications. Therefore, IT compliance for hardware, software and data communications was an important aspect of the data warehouse implementation.

Hardware

The initial system used for development and production was a surplus HP K460 server with 3 CPUs and 100 GB of RAID 5 storage. This was not a lot of processing power for a data warehouse, but it was a reusable asset that could be leveraged. The HP server was an approved server product and since it was located in a production data center, the hardware standard was met.

* Term used as an alias for Lockheed Martin’s Advanced Development Programs and is a registered trademark of the Lockheed Martin Corporation.

Software

The server was licensed for Oracle® 8.1.3, so additional funding for a relational database management system was not required. As an additional perk, that version of Oracle came with a product called WebDB 2.2, a simple portal application that could be used as the basic framework for further application development. The Oracle database product was an IT approved database, and WebDB was allowed since it was included as part of the approved database product. WebFOCUS® from Information Builders (IBI) was introduced to provide on-line reporting for the Construction Management System, replacing the FOCUS reporting previously used.

Data Communications

The financial management business portion of the data warehouse required data communication with two corporate systems that both used Oracle as the database management system. Since all systems used Oracle, SQL Net was the communication method chosen. SQL Net was not an approved data communications methodology, so a variance was required. Approval was granted since there was not an approved product that could provide the necessary performance at a reasonable cost.

Communication was also required to a third system that used an Informix database, presenting an additional communications issue. The EDA (Enterprise Data Access) middleware from IBI was used since it was an approved product and was installed on the Informix system.

Application Development Methodology

Building a successful system depends on using a development approach that considers the ultimate system when initial construction begins. This is difficult when the development window is compressed and detailed user requirements are nonexistent. The key to a successful implementation is adopting a development methodology that allows rapid development and ensures scalability of the developed product. Four development standards were adopted to ensure the short and long term success of the data warehouse. Rapid prototyping was required to meet the scheduling demands for the initial offering of the spatial data warehouse. The specific methodology used was “Extreme Programming” and was adopted since there were no documented user requirements.

Extreme Programming (XP) as defined by Wikipedia “is a software engineering methodology, the most prominent of several agile software development methodologies. Like other agile methodologies, Extreme Programming differs from traditional methodologies primarily in placing a higher value on adaptability than on predictability.

Proponents of XP regard ongoing changes to requirements as a natural, inescapable and desirable aspect of software development projects; they believe that being able to adapt to changing requirements at any point during the project life is a more realistic and better approach than attempting to define all requirements at the beginning of a project and then expending effort to control changes to the requirements.”

This extreme programming methodology proved to be effective since data, applications and reports could be implemented in hours and days instead of months and years. Direct communication between selected subject matter expert users and the developers through an iterative process was very effective and dramatically reduced development time. The developer would learn the details of a business problem directly from the user expert, who could immediately clarify requirements and data questions. The developer would then implement a prototype of the requested data, report or application through a short series of iterations. This process would satisfy the needs of the expert user in a short time period and usually exceed their expectations with the finished product.

Data Standards were adopted to ensure data could be summarized from the most detailed information available. As in most organizations, databases developed independently over the years do not often contain a proper data structure to easily relate similar data from other corporate systems. This is an important obstacle to overcome, but by establishing a single definition for an item in the data warehouse, the combination of data from multiple systems into a common structure can be achieved. The second aspect of the standard is to ensure that the standard selected has a direct relationship to the spatial data standard contained in the corporate GIS. Proper planning and adoption of the data standard allow a consistent method for the summarization of tabular information and the correct spatial location of that information.

An example of this in the telecommunications industry is the use of the Common Language Location Identification (CLLI) codes to describe elements of the telecommunications network. All network items have a CLLI code that provides a direct correlation to the geographic area where the item is located. This naming structure is somewhat convoluted, so to solve this issue, every item had to be associated with the 8 character CLLI code that related to the geographic area of the service center. This allowed the information to be summarized at the service center, District, State, Operations Areas and Company levels. Most items had more detailed location information, but by defining the minimum requirement at the service center level, it was possible to summarize all data in the data warehouse in a consistent manner.
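The rollup through the 8-character CLLI prefix (conventionally four characters of place, two of state and two of building) can be sketched as follows; the codes and area names in the lookup table are hypothetical examples.

```python
# Sketch of CLLI-based rollup: the first 8 characters of any network
# element's CLLI code identify its service-center area, which carries the
# full geographic hierarchy. Mapping values below are invented.

AREA_LOOKUP = {
    "ATLNGAMA": {"service_center": "Atlanta Main", "district": "Atlanta Metro", "state": "GA"},
    "MIAMFLGR": {"service_center": "Miami Grande", "district": "South FL", "state": "FL"},
}

def locate(clli_code):
    """Resolve a network element's CLLI code to its geographic rollup keys."""
    return AREA_LOOKUP[clli_code[:8]]

# An 11-character equipment-level CLLI rolls up through its 8-character prefix.
print(locate("ATLNGAMADS0")["state"])  # GA
```

Defining the 8-character prefix as the minimum location requirement is what made consistent summarization possible even for items whose detailed location data varied in quality.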

Database Design standards were adopted to ensure referential integrity was maintained, scalability was possible and satisfactory performance would continue with system growth. Referential integrity was primarily maintained within the database, however as more complex applications were developed, some data integrity had to be controlled at the application level. The database was partitioned based on geographic areas to improve performance, facilitate data loading, and limit the impact of failures in one partition on the other unaffected partitions.

Report Summarization was a critical part of the data warehouse design since much of the data being published had never been globally available in the past. One additional aspect of the reporting was that the data would be used for evaluation of departments and organizations; therefore the data was scrutinized more than ever before, raising questions about its validity. For this reason, all reports provided drilldown to the source data used for summarization on the respective report. With this feature, any authorized user could validate the data against the source system and be assured the data being reported was accurate. It is interesting that when organizations are measured using data they create, they develop a keen interest in data validity. Providing the drill down capability reduced the level of complaints about data quality. A benefit for senior managers was that they could review summarized results and quickly determine the specific cause of any unsatisfactory results. Since the data was also spatially aligned, geographic comparisons of results proved that “a map is worth a million words”.
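The summary-with-drilldown design can be sketched as follows: each summary figure retains the identifiers of the detail rows it was computed from, so any number can be traced back to its source records. Field names and values here are illustrative only.

```python
# Sketch of summarization that preserves a drilldown trail to source rows.

from collections import defaultdict

detail = [
    {"id": 1, "district": "District-1", "capital": 40.0},
    {"id": 2, "district": "District-1", "capital": 10.0},
    {"id": 3, "district": "District-2", "capital": 25.0},
]

def summarize(rows, key, measure):
    """Group detail rows by `key`, totaling `measure` and keeping row ids."""
    summary = defaultdict(lambda: {"total": 0.0, "detail_ids": []})
    for row in rows:
        bucket = summary[row[key]]
        bucket["total"] += row[measure]
        bucket["detail_ids"].append(row["id"])  # the drilldown trail
    return dict(summary)

report = summarize(detail, "district", "capital")
print(report["District-1"]["total"])       # 50.0
print(report["District-1"]["detail_ids"])  # [1, 2]
```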

Ongoing Support

The long term support of the system was a critical part of the implementation strategy. When the system was first implemented, there were only two modules, so support was not too difficult. As time passed, the number of modules grew to 25, and so did the requirement for real time support. Since the system was managed outside the normal IT structure, an enhancement / defect tracking system was developed as part of the data warehouse system to manage and assign tasks to the support staff. Each developer was assigned a module to support. Users could enter defect information using an on-line system and the defect would be routed to the proper developer for analysis and correction. Subject matter experts could enter new enhancement requests for a module, and these were routed to the system architect for analysis and assignment to a developer to implement. Additional staff members were added to take calls from users for application support, but the total staff never exceeded seven. By maintaining a small staff and using the extreme programming techniques, the system was able to remain cost effective and nimble enough to meet the demands of the ever changing telecommunications industry.
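The routing rule at the heart of that tracking system can be sketched as follows; module and developer names are hypothetical.

```python
# Sketch of the in-house ticket routing: defects go to the developer who
# owns the module, enhancements go to the system architect for analysis
# and assignment. All names below are invented.

MODULE_OWNERS = {"revenue": "dev_a", "capital": "dev_b", "scheduling": "dev_c"}
ARCHITECT = "architect"

def route_ticket(ticket):
    """Return the staff member responsible for handling this ticket."""
    if ticket["kind"] == "defect":
        return MODULE_OWNERS[ticket["module"]]
    return ARCHITECT  # enhancements are analyzed and assigned by the architect

print(route_ticket({"kind": "defect", "module": "capital"}))       # dev_b
print(route_ticket({"kind": "enhancement", "module": "capital"}))  # architect
```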

Additional Drivers For System Enhancements

As the utilization of the data warehouse grew, so did the imagination of the user community and senior managers of the corporation. As a result, the system grew and incorporated a wide range of data and analysis tools.

Asset Management

Asset records are generally created in a GIS and other records management systems that include detailed information such as location and layout. These systems send data to a tabular asset management system where data is debited and credited to keep counts of items including value, quantity, asset type, mortality, general location, tax code, etc. The data warehouse provided the capability to combine data from the records management system, the asset management system and other corporate systems to provide analysis capabilities in an on-line system that would normally be performed by an analyst using a desktop application.

Sarbanes Oxley Controls

Once the various asset management systems were successfully integrated, several “holes” in proper asset management were identified. One such hole was the removal of a plug-in (circuit card) from one location and its installation at another. There was no corporate application that would facilitate the move of these circuit cards from one location to another and effect the change in the proper asset management records. By combining the asset data and providing the application to manage the location change, reason, authorization, etc., a control was implemented for a critical process that was previously uncontrolled.

Target Marketing

When telecommunications companies began marketing Digital Subscriber Line (DSL) services, one of the marketing strategies was that a customer could give up their 2nd telephone line since they could talk and surf the internet at the same time on one line.

As a result, many customers disconnected their 2nd line, causing a loss in revenue and an increase in the level of spare telephone lines. With the number of spare lines increasing, capital spending in those areas was limited. To solve this dilemma, data analysis using various sources of data in the data warehouse identified the locations where an additional 2nd line could be provided to small businesses without a capital expenditure, and marketing campaigns were conducted in those areas. This increased the utilization of existing spare lines and increased revenue from the additional lines recovered through this effort.
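The targeting analysis can be sketched as a simple join of spare-facility counts against small-business prospects per area; all figures and thresholds below are hypothetical.

```python
# Sketch of spare-line targeting: find areas with enough idle facilities to
# absorb a second-line campaign without new capital spending.

spare_lines = {"SVC-A": 120, "SVC-B": 3, "SVC-C": 45}    # spare copper pairs
small_biz   = {"SVC-A": 300, "SVC-B": 250, "SVC-C": 40}  # prospects per area

def campaign_targets(spares, prospects, min_spares=25):
    """Areas with enough spare facilities and at least one prospect."""
    return sorted(
        area for area, n in spares.items()
        if n >= min_spares and prospects.get(area, 0) > 0
    )

print(campaign_targets(spare_lines, small_biz))  # ['SVC-A', 'SVC-C']
```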

Ad Valorem Tax Assessments

Utility, telecommunications and other companies pay taxes on assets in municipal tax areas. The tax liability for telecommunications companies is substantial because of the number of assets and level of revenue derived from those assets. However, taxes may be assessed on abandoned or unused assets in error. Using the power of the data warehouse that combined data from numerous corporate systems, assets that should not be taxed could be identified and a compelling case made to the taxing agency resulting in substantial tax savings.

Hazardous Material Reporting

Electronic equipment placed throughout cities in buildings, huts, underground vaults and cabinets on the side of the road contains batteries to allow continuation of service in the event of a commercial power failure. The types of batteries used in these locations require periodic reporting to the Environmental Protection Agency (EPA). There was not a single database for these batteries, so gathering data for the EPA reports was tedious, even though there was a data collection system used by technicians to collect and update this information. Since the spatial data warehouse contained all asset location data, it made sense for the individual battery inventories to be made part of the data warehouse. This not only simplified the EPA reporting, but also provided a corporate inventory of batteries by type, voltage, vendor and date installed, and provided a method of determining when batteries should be replaced.

Disaster Response

Managing the restoration of telecommunications service after a disaster requires the use of many IT systems. One value added application the data warehouse provided was the ability to determine the number and type of customers affected when commercial power was lost at a remote equipment location. When power is lost, an alarm is sent to an alarm database and the system automatically reverts to battery power. By combining data from the alarm system, equipment locations, battery information and the specific customer types served by the affected equipment, the estimated time the batteries could continue service could be calculated and the number of customers affected by the outage was known. This information was used to prioritize dispatches and to report the number of customers affected by geographic locations such as zip code, city and county, even though the data is stored by wire center or other telecom areas not meaningful to the general public. This was helpful for public relations and for reporting to Public Service Commissions (PSC) and the Federal Emergency Management Agency (FEMA). This system was used beginning in the 2004 hurricane season.
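The triage calculation described above can be sketched as follows; the site data, battery capacity and load figures are illustrative assumptions, not real engineering values.

```python
# Sketch of outage triage: given a power-loss alarm at a remote site,
# estimate remaining battery runtime and count affected customers.

SITES = {
    "RT-101": {
        "battery_amp_hours": 400.0,
        "load_amps": 50.0,
        "customers": {"residential": 480, "business": 35},
    },
}

def outage_estimate(site_id):
    """Return (hours of battery remaining, total customers affected)."""
    site = SITES[site_id]
    hours_left = site["battery_amp_hours"] / site["load_amps"]
    affected = sum(site["customers"].values())
    return hours_left, affected

hours, affected = outage_estimate("RT-101")
print(f"{hours:.1f} h of battery remaining, {affected} customers affected")
# 8.0 h of battery remaining, 515 customers affected
```

Sorting open alarms by the hours-remaining figure gives the dispatch priority list described in the text.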

The Data Warehouse Evolution

The spatial data warehouse began as a simple reporting system, but evolved over time to become a robust data warehouse and application platform.

Simple Report Generation Engine

The first phase of the system was to provide revenue and capital reporting using WebDB. Summary data was stored within the data warehouse; detailed data remained on the source system but was reported using the WebDB report engine. The second part of the system included reporting for the construction management system using WebFocus. The reports were available through the main WebDB portal, with user security controlling access to data hosted on the four remote Informix databases.

On-line Analytical Processing (OLAP)

The next phase of the system was to implement OLAP and add other data to the data warehouse. This change added customer information and demographics to the system so that marketing campaigns could be created to enhance revenue in certain product groups. By combining marketing, financial, capital and construction data into a single data warehouse, decision makers had access to real-time information for network planning, budgeting, and capital management. This information was used for day to day operational decision making, as well as supporting marketing initiatives to match service offerings to available facilities.

On-line Transaction Processing (OLTP)

The final phase of the system included the implementation of OLTP, and one of the first applications developed was a scheduling tool for construction resources. This application provided the tools necessary to schedule material purchases, contractor resources, telecom resources and other resources from other support organizations. The on-line application provided a means of collaboration among groups in various buildings, cities or states. All history for a specific work activity was documented in the system, including scheduling meeting notes, to provide an audit trail for job activities. This system replaced a cumbersome client application for job scheduling.

One other significant transactional system was for the management of electronic plug-in equipment. The system stored transactions entered by field technicians for the movement of electronic equipment from one location to another. The transaction was reviewed and approved, then sent to the asset management system to complete the transaction. This system did not exist until implemented as part of the data warehouse application suite.

Ultimate System Configuration

The spatial data warehouse experienced phenomenal growth requiring various infrastructure changes over a 5 year period.

Hardware

The initial system was a single HP K460 server with 3 CPUs, 2 GB RAM and 100 GB internal RAID 5 storage. The system was installed in a secure data center. The ultimate system went through two iterations of improvement, resulting in 3 server groups located in 2 data centers in separate States. The production system was a Sun Sunfire™ 4500 with 12 CPUs, 12 GB RAM and 3 TB of EMC® storage in a RAID 1 configuration. The Test and Disaster Recovery (DR) server was a Sun E6500 configured exactly the same as the production server. The development server was a Sun E4500 server configured with 4 CPUs, 4 GB RAM and 0.5 TB internal storage. Each system used a silo tape backup system, and the production and Test/DR servers required their own media servers to allow backups to complete in the allotted time frames.

Software

The original core software products remained, although products were upgraded over time to include Oracle 10g, Oracle Application Server and the current version of WebFocus. Connect Direct from Sterling Commerce was added to facilitate structured batch data transfer among those systems requiring that type of interface.

System Interfaces

Interfaces were required initially to two corporate systems, but ultimately included connection to most corporate systems hosted on approximately 25 different servers.

Benefits Attributed to the Data Warehouse

The implementation of the spatial data warehouse provided a significant improvement for managing all aspects of the telecom business. One of the most important aspects of the system was the ability to analyze capital spending relative to revenue growth. Before implementation of the data warehouse, this analysis was only possible at the State level. With the warehouse, operations managers could proactively manage their capital budgets considering the impact on revenue earnings.

Software license savings was another contributor to the success of the data warehouse resulting in first year savings of $300K. Additional savings in subsequent years approached $700K in recurring annual software maintenance expenses by consolidating on-line reporting to the data warehouse system.

Ad Valorem Tax Assessment savings resulted in over $1M in recurring tax savings based on the exclusion of abandoned and non working facilities. Large capital construction projects require substantial capital and interest is paid on facilities until they are placed in a working status. Interest on construction loans was reduced $4.5M over a three year period.

There are other non quantifiable expense savings, capital avoidance and revenue enhancement resulting from the effective utilization of the spatial data warehouse.

Lessons Learned

The most powerful lesson learned from this experience was the need to be aware of when the transition is made to an On-line Transaction Processing (OLTP) system. One of the failures of this implementation was not changing the backup and recovery strategy when OLTP was added to the system. Previously, the backup strategy included weekly full backups and daily incremental backups. At one point there was a system failure affecting data created in one of the transactional systems.

Even though most of the data was recovered, it was an eye opener for the development team. Since transactions were being performed constantly, archive logging and a backup plan supporting point-in-time recovery should have been implemented. Such a plan was implemented immediately after the failure, which could otherwise have been a significant event for the users and caused a negative perception of the system. Because of the prompt effort of the development team, this was avoided. This event also brought the IT organization and the operations application development group closer together, and as a result this working relationship was adopted as a model for the rest of the IT community.

Conclusion

A spatial data warehouse is one of the most powerful assets in any corporation or government agency. Data that was previously available to only selected users is now readily available to users throughout the corporation. Although this system did not begin as a decision support system, it evolved into one of the most effective and critical decision support systems. Having the latitude to develop the system in a “skunkworks” like environment allowed the innovativeness of the team members to be unleashed, and the results surprised everyone in the corporation. The system is still used today and will be for many years. New companies can design their IT systems to provide the effectiveness of a spatial data warehouse from the beginning. For established companies with many systems built over the years as individual systems, the approach described in this paper provides an extremely effective management tool for a reasonable cost.

William P. Geer

Geer Services, Inc.