Can Data Producers and Data Consumers
Find a Better Solution?
Would a business or scientific end user, a data analyst or data scientist, an IT or purchasing professional, and an executive all agree on what makes for the ideal informatics solution? Probably not - which is why we often end up with divergent IT stacks that support conflicting requirements.
End users want speed and agility to perform their analytics. Now isn’t soon enough, and all too often the data raises more questions than it answers, leading to new queries. Even if everyone agrees on the question, one of the biggest challenges is determining what data – including where it’s located and whether it’s been scrubbed – can answer it. For some queries, imperfect data now is better than clean data in weeks or months.
The Rise of Frustration-Driven Spreadmarts
Frustrated, users start exporting data to flat files from reporting tools like Business Objects, then creating spreadmarts in Excel to combine data from other sources. This process is difficult to repeat and error-prone. So is the reverse: building analyses in Excel, or loading data from Excel back into analytics tools - an effort that must be repeated every time the analysis needs updating.
Understandably, end users like the idea of self-service, because enduring multiple cycles of IT requests is simply too frustrating.
Data analysts and data scientists spend too much time manually finding and assembling data for end users. They and IT want to give end users access to the wealth of data generated both inside and outside the organization’s firewalls.
Avoiding Stealth or Shadow IT
But their preference is for governed self-service, or IT-supported rapid data-request prototyping. This helps prevent stealth or shadow IT efforts that can create havoc within organizations. When individuals find their own ways to provision data for analysis, the result is ungovernable - not to mention non-repeatable across the enterprise and open to errors and security risks.
C-suite and purchasing executives want to control data costs without hamstringing operations. And with data management and data analytics becoming central to business strategy, executives look to data as the holy grail of insights that lead to better decisions - driving profits, innovation, and business advantage.
While IT doesn’t need total control, it does need visibility into how data analysis is being performed and what tools are being used. How, then, do organizations get to single-source-of-truth (SSOT) systems that everyone can rely on?
A Platform for Ad-Hoc Analytics: Bringing it All Together
One way to support the needs of various data producers and data consumers in an organization is through an IT-supported platform for ad-hoc analytics. This provides governed self-service access to the wide array of available data while also allowing IT to rapidly operationalize new data requests into the governed IT stack.
Note: self-service can mean two things. It can mean end users have direct access to the data, or that IT has a better understanding of data across the organization and is better able to quickly prototype new data requests. Realistically, it’s a blend of the two.
With an IT-managed platform that lets users provision their own data, the demand on IT from circular ad-hoc queries is greatly reduced. Yet, the platform still gives IT visibility into the queries and searches that are being created, which they and data scientists can then use to rapidly prototype new applications to integrate into the governed SSOT system (often a data warehouse).
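The flow described above - self-service queries that IT can still see and mine for operationalization candidates - can be sketched in a few lines. This is a minimal, illustrative toy in Python, not any vendor's API: the class and method names (`GovernedDataPlatform`, `popular_datasets`) are assumptions for the sake of the example.

```python
import datetime


class GovernedDataPlatform:
    """Toy sketch of governed self-service: users provision their own
    data, while IT retains an audit trail of every query."""

    def __init__(self, datasets):
        # datasets: dict mapping dataset name -> list of row dicts
        self.datasets = datasets
        self.audit_log = []  # IT-visible record of all queries

    def query(self, user, dataset, predicate):
        """Self-service query; every call is logged for traceability."""
        self.audit_log.append({
            "user": user,
            "dataset": dataset,
            "when": datetime.datetime.now().isoformat(),
        })
        return [r for r in self.datasets.get(dataset, []) if predicate(r)]

    def popular_datasets(self):
        """IT view: the most-queried datasets are natural candidates for
        operationalizing into the governed SSOT warehouse."""
        counts = {}
        for entry in self.audit_log:
            counts[entry["dataset"]] = counts.get(entry["dataset"], 0) + 1
        return sorted(counts, key=counts.get, reverse=True)
```

In this sketch, end users get immediate answers via `query`, while IT reads `audit_log` and `popular_datasets` to decide which ad-hoc requests deserve rapid prototyping into the governed stack.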
This approach meets the needs of data producers and consumers alike, adding speed and agility to the process while protecting organizational data and the system overall with a single version of the truth.
Protecting Data While Simultaneously Enabling Self-Service: Can It Be Done?
To build such a platform requires the creation of a semantic layer to catalog all information within an organization – as well as public data sources relevant to its users. This helps users understand the data available to them, from which they can seamlessly identify and unify the relevant data for their analysis.
Self-Service Data Discovery: The ‘eCommerce’ Approach
The ideal solution becomes a self-service discovery portal that delivers an ecommerce-like shopping experience, where users search for data of interest (both known and unknown to them). The solution would also assist in identifying and unifying the best, most relevant data sources, regardless of structure, for the analysis at hand.
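To make the semantic layer and shopping-style search concrete, here is a minimal Python sketch, assuming a toy in-memory catalog. The names (`SemanticCatalog`, `CatalogEntry`) and the crude tag-plus-text scoring are illustrative assumptions, not a real product's API.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One dataset in the semantic layer: metadata, not the data itself."""
    name: str
    description: str
    tags: set = field(default_factory=set)  # e.g. dictionary/ontology terms


class SemanticCatalog:
    """Toy discovery portal: register dataset metadata, then search it
    the way a shopper searches an online store."""

    def __init__(self):
        self.entries = []

    def register(self, entry):
        self.entries.append(entry)

    def search(self, *terms):
        """Rank entries by how many search terms hit their tags or text."""
        def score(entry):
            text = (entry.name + " " + entry.description).lower()
            return sum(
                (t.lower() in entry.tags) + (t.lower() in text)
                for t in terms
            )
        hits = [(score(e), e) for e in self.entries]
        return [e for s, e in sorted(hits, key=lambda p: -p[0]) if s > 0]
```

A real semantic layer would sit over many source systems and use far richer ranking, but the shape is the same: users search metadata first, then provision only the data they actually need.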
Finding data quickly, some have proposed, is more like visiting an online library, where a data librarian provides the structure for users to find the materials they need - when they need them - for their research. Instead of being perceived as gatekeepers, IT, data analysts, and data scientists - working through a well-managed platform - become beacons, shining a light on the pathways to information and insight.
There are opportunities for more holistic data initiatives. The best solutions provide rapid, self-service or guided access to semi-governed data. They move ad-hoc queries to an IT-owned platform for traceability and repeatability. And they enable rapid prototyping of new requests, then operationalize them in the governed IT stack.
To deploy the right platform to support your ad-hoc data analytics, make sure yours:
- supports an ecommerce-like experience for accessing data
- recommends data relevant to the search
- correlates all structured, semi-structured, and unstructured data
- enriches the data with scientifically relevant dictionaries and ontologies
- simplifies provisioning data to business intelligence tools, like PerkinElmer TIBCO® Spotfire
Are you meeting the needs of both your data producers and data consumers?
Learn more at PerkinElmer.