Data Repositories – An Analyst Perspective

Data Warehouse  Enterprise data warehouses (EDWs) are critical for reporting and Business Intelligence (BI) tasks, but from an analyst's perspective they tend to restrict the flexibility a data analyst has for robust analysis or data exploration. In this model, data is managed and controlled by IT groups and DBAs, and analysts must depend on IT for access and for changes to the data schemas. This tighter control and oversight also means longer lead times for analysts to get data, which generally must come from multiple sources. Another implication is that EDW rules restrict analysts from building their own data sets, which can cause shadow systems to emerge within organizations: locally managed stores, maintained by power users, that contain critical data for constructing analytic data sets.

Analytic Sandbox  Analytic sandboxes enable high-performance computing using in-database processing. This approach creates relationships to multiple data sources within an organization and saves the analyst the time of creating these data feeds on an individual basis. In-database processing for deep analytics enables faster turnaround time for developing and executing new analytic models, while reducing (though not eliminating) the cost associated with data stored in local, “shadow” file systems. In addition, rather than the typical structured data in the EDW, analytic sandboxes can house a greater variety of data, such as web-scale data, raw data, and unstructured data.
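
As a rough illustration of in-database processing, the sketch below uses Python's built-in sqlite3 module as a stand-in for a sandbox database. The table name web_events, its columns, and the sample rows are hypothetical and exist only for this example; a real sandbox would typically sit on a much larger analytic database. The first function lets the database engine compute the aggregate so that only the summary leaves it, while the second pulls every row out and aggregates locally, which is roughly what happens when data is copied into a local "shadow" file.

```python
import sqlite3


def build_sandbox():
    """Create an in-memory database standing in for an analytic sandbox."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table of page-view events (illustrative only).
    conn.execute(
        "CREATE TABLE web_events (user_id INTEGER, page TEXT, seconds REAL)"
    )
    rows = [
        (1, "home", 12.0),
        (1, "pricing", 30.0),
        (2, "home", 8.0),
        (2, "docs", 50.0),
        (3, "home", 10.0),
    ]
    conn.executemany("INSERT INTO web_events VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def avg_seconds_in_database(conn):
    """In-database processing: the engine aggregates, only the summary is returned."""
    cursor = conn.execute(
        "SELECT page, AVG(seconds) FROM web_events GROUP BY page ORDER BY page"
    )
    return dict(cursor.fetchall())


def avg_seconds_client_side(conn):
    """Contrast: extract every row and aggregate locally, as a shadow copy would."""
    totals = {}
    for page, seconds in conn.execute("SELECT page, seconds FROM web_events"):
        count, total = totals.get(page, (0, 0.0))
        totals[page] = (count + 1, total + seconds)
    return {page: total / count for page, (count, total) in totals.items()}


conn = build_sandbox()
in_db = avg_seconds_in_database(conn)
local = avg_seconds_client_side(conn)
print(in_db)
print(in_db == local)
```

Both functions produce the same averages. The difference is how much data crosses the database boundary: one summary row per page in the first case, every event row in the second.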