Data Warehouses and Enterprise Content Management
In an ideal Enterprise Content Management (ECM) system, all the content generated by the enterprise would go to a data warehouse. The content would be stored in a manner that allows it to be queried and analyzed in ways that bring out the knowledge it represents.
Because the accumulated content represents the experience and other sources of business-related knowledge of the enterprise, it should ideally be possible to use this knowledge base for making any kind of business decision.
That is the goal ECM strives for. In practice, most enterprises are somewhere on the road to this ideal.
What is Special About a Data Warehouse?
A data warehouse is not just another database of transactions. Firstly, it contains both structured and unstructured content. In most enterprises, unstructured content in the form of word-processed documents, emails, rich media used for marketing and training purposes, and so on is more voluminous than structured content.
Secondly, the data warehouse is optimized differently from transaction databases. While the latter are optimized for fast online transaction processing, data warehouses are optimized for querying and analysis. Features such as tags and other metadata, along with advanced search capabilities, make it easier to query the data warehouse for different kinds of information.
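As a rough illustration of how tags and other metadata can support querying, here is a minimal Python sketch. The ContentItem class, its fields, the find_by_tags function, and the sample items are all hypothetical names invented for this example; a real warehouse would use its own schema and search engine.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """One piece of warehouse content plus its metadata (hypothetical schema)."""
    item_id: str
    kind: str          # "structured" or "unstructured"
    summary: str       # short description of the content
    tags: set[str] = field(default_factory=set)


def find_by_tags(items, required_tags):
    """Return the items that carry every tag in required_tags."""
    required = set(required_tags)
    return [item for item in items if required <= item.tags]


# Made-up sample content, mixing structured and unstructured items.
warehouse = [
    ContentItem("doc-1", "unstructured", "Marketing video brief", {"marketing", "video"}),
    ContentItem("doc-2", "unstructured", "Sales onboarding email thread", {"sales", "training"}),
    ContentItem("tbl-1", "structured", "Quarterly sales figures", {"sales", "finance"}),
]

matches = find_by_tags(warehouse, ["sales"])
print([item.item_id for item in matches])
```

The same tag filter works across both kinds of content, which is the point of attaching metadata to everything that enters the warehouse.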
The most notable characteristic of a data warehouse is that it seeks to accommodate the enterprise content in its entirety, instead of content related to particular departments or functions. This permits inter-departmental exchange of information, an essential requirement for making informed decisions.
Yet another feature of data warehouses is that they allow online access to the content from anywhere in the world. This is important for a global enterprise, as its senior managers could be located anywhere in the world and might need to access the content even while on the road.
The same online accessibility makes it possible to transfer content generated all across the enterprise to the data warehouse without much delay. It is this feature that makes the data warehouse a truly enterprise-wide content repository.
Data stored in a data warehouse is not changed or deleted. It thus becomes possible to access historical data and generate trend reports and historical comparisons. These can be high-value decision-support information for managers.
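The idea that warehouse data is only ever added, never changed or deleted, can be sketched in a few lines of Python. The AppendOnlyStore class, its methods, and the revenue figures below are assumptions made up for illustration, not part of any particular product.

```python
from collections import defaultdict


class AppendOnlyStore:
    """Toy store (hypothetical): records can be added but never changed or deleted."""

    def __init__(self):
        self._records = []

    def append(self, year, department, revenue):
        # Each record is kept as an immutable tuple; the class deliberately
        # offers no update or delete method.
        self._records.append((year, department, revenue))

    def revenue_by_year(self):
        """Aggregate revenue per year across all departments."""
        totals = defaultdict(float)
        for year, _department, revenue in self._records:
            totals[year] += revenue
        return dict(sorted(totals.items()))


store = AppendOnlyStore()
# Made-up figures, for illustration only.
store.append(2021, "sales", 120.0)
store.append(2021, "services", 30.0)
store.append(2022, "sales", 150.0)
store.append(2022, "services", 45.0)

totals = store.revenue_by_year()
change = totals[2022] - totals[2021]   # a simple historical comparison
print(totals, change)
```

Because earlier records are never overwritten, the year-over-year comparison remains possible no matter how much new data arrives.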
Likely Problems With Data Warehousing
Implementing a data warehouse can run into several problems: the design is complex, Web access raises security concerns, and the warehouse must be compatible with existing systems. Careful management is essential to the success of a data-warehousing project.
Data Mining
Data mining involves sorting through large amounts of data and extracting relevant information from it. Given the huge volumes of all kinds of content held in data warehouses, data mining is an essential capability for them.
Data mining is facilitated through the use of metadata that is associated with data sets. Metadata includes such things as tags indicating the nature of the content and executive summaries or short descriptions of the content.
Using modern, sophisticated algorithms, data mining can uncover significant business trends in the historical data held in the warehouse. These trends are often projected forward to forecast future business scenarios.
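To make the idea of projecting a trend into the future concrete, here is a minimal Python sketch. It uses an ordinary least-squares straight line as a deliberately simple stand-in for the more sophisticated algorithms mentioned above; the fit_line function and the yearly figures are invented for this example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up historical data, for illustration only.
years = [2019, 2020, 2021, 2022]
revenue = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_line(years, revenue)
forecast_2023 = slope * 2023 + intercept   # extend the fitted trend by one year
print(round(slope, 2), round(forecast_2023, 2))
```

A straight line only captures a steady trend; real data-mining tools typically handle seasonality, noise, and many interacting variables, so this should be read as a sketch of the principle rather than a forecasting method.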
Data mining is more than just template-based data analysis: it involves some degree of “intelligence” in producing its results.
A data warehouse is a repository that stores the structured and unstructured content generated all across the enterprise. This content is mined to extract meaningful information, such as historical trends. A data warehouse is one option that Enterprise Content Management systems use to present users with a unified interface to all the content.