Tackling Data Challenges in Translational Medicine

When scientists gather, it doesn't take long for the conversation to turn to the difficulties of corralling and analyzing data. In fact, a recent PerkinElmer survey shows that more than half of life science researchers consider a lack of data transparency and collaborative methods the key obstacle to precision medicine. New data-generating laboratory technologies are driving an urgent need to better manage and share an unwieldy influx of data.

Researchers across life science disciplines, including translational medicine, are crying out for more practical ways to improve access to data and to collaborate around that information.

Data Democratization

The goal is to democratize data: to let scientists access and analyze relevant data through scalable tools provided by IT, rather than through a specific request for each dataset or analysis. Researchers are no longer satisfied waiting for IT or bioinformaticians to run reports; they want applications they can actually use, at the bench, to speed their work.

Right now, the National Center for Advancing Translational Sciences says it takes, on average, 14 years and $2 billion to bring a drug to market – and perhaps another decade before it’s available to all patients who need it.

Tackling the data challenge could go a long way toward shortening that timeline and reducing that cost.

Tough Questions On Data

Some of the most pressing questions around data management stem from the most basic need: bringing useful, appropriate data together, and making it searchable and sharable, to solve problems. 

Translational researchers have a wide variety of medically relevant data sources available to them, from omics to adverse events to electronic health records and more. Tapping into the right data at the right time can help these researchers:

  • Determine new uses for approved or partially developed drugs
  • Analyze trends and predict potential directions for further research
  • Translate discoveries from basic research into useful clinical implementations
  • Analyze clinical data and outcomes to guide new discoveries and treatments

Here are some of the lingering questions that need answers in order to truly democratize data:

Question 1: How to Bring Data Together?

Most organizations still struggle to find all the data that might be helpful to them, or that data is captured in silos that are difficult to penetrate, let alone aggregate. Oftentimes, the people who need the data aren't even aware it exists. Figuring out the best way to usefully aggregate data remains a challenge, and further complexity is added when layering in who is permitted to access specific datasets and how that access is controlled.

Question 2: How to Compare Data?

Once data is aggregated, researchers must be able to determine whether they are actually comparing appropriate or related data sets. The difficulty often stems from non-standard ontologies that make it hard to map one dataset to another. And if two items look similar but are in fact very different, how can the scientist tell?
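
To make the mapping problem concrete, here is a minimal sketch in Python (the synonym table, labels, and site data are hypothetical) of how free-text terms from two sources might be harmonized to a shared vocabulary before comparison. A production system would rely on curated ontologies and terminology services rather than a hand-built dictionary, but the principle is the same.

```python
# Minimal sketch: harmonize terminology before comparing datasets.
# The synonym table, labels, and site data below are hypothetical.
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "myocardial infarct": "myocardial infarction",
    "mi": "myocardial infarction",
    "high blood pressure": "hypertension",
    "htn": "hypertension",
}

def to_canonical(term: str) -> str:
    """Map a free-text label to a canonical term; pass through anything unknown."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)

# Two sites describe the same conditions with different labels.
site_a = ["Heart attack", "HTN"]
site_b = ["Myocardial infarct", "High blood pressure"]

print([to_canonical(t) for t in site_a])  # ['myocardial infarction', 'hypertension']
print([to_canonical(t) for t in site_b])  # ['myocardial infarction', 'hypertension']
```

Only after this kind of mapping can a scientist trust that similar-looking records from different sources really refer to the same thing.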

Question 3: When to Normalize Data?

Aggregating and integrating data inherently changes it; it is possible to manipulate the data without meaning to. Some therefore favor normalizing data as early as possible in the integration process, arguing it is best to align the data well before analysis. Others say normalizing all data – some of which may never be used – is too time-consuming and expensive; they prefer to normalize later, so the data is closer to raw at analysis time. That can make analysis more effective because more context is preserved, but the data is harder to share.
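
As an illustration of what "normalizing early" can look like in practice, here is a minimal pandas sketch (the lab exports, column names, and units are hypothetical) that aligns two differently structured extracts to a shared schema before they are combined. The same transformation could instead be deferred and applied at analysis time, which is exactly the trade-off described above.

```python
import pandas as pd

# Hypothetical raw exports from two labs: same measurement, different schema and units.
lab_a = pd.DataFrame({"subject": ["S1", "S2"], "glucose_mg_dl": [90.0, 110.0]})
lab_b = pd.DataFrame({"patient_id": ["S3", "S4"], "glucose_mmol_l": [5.0, 6.1]})

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Align to a shared schema: one 'subject' column, glucose reported in mg/dL."""
    out = df.rename(columns={"patient_id": "subject"}).copy()
    if "glucose_mmol_l" in out.columns:
        # 1 mmol/L of glucose is roughly 18 mg/dL
        out["glucose_mg_dl"] = out.pop("glucose_mmol_l") * 18.016
    return out[["subject", "glucose_mg_dl"]]

# "Early" normalization: both sources are aligned before landing in the shared store.
combined = pd.concat([normalize(lab_a), normalize(lab_b)], ignore_index=True)
print(combined)
```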

Question 4: Who Analyzes Data?

In most organizations today, a small subset of the research organization – data scientists and bioinformaticians – performs data analyses. This creates a bottleneck, and until most bench researchers have the tools and skills to analyze the volumes of data they encounter, it will be difficult to scale analysis capabilities. To date, the default solution has been to hire more data scientists.

Delivering Answers

To help researchers and scientists analyze data themselves, more quickly and efficiently, we're building scientifically relevant applications in an intuitive, simple, and repeatable framework on the PerkinElmer Signals™ platform. We're delivering workflow-based applications on Signals for uses ranging from Translational to Medical Review to Screening and more.

Powered by TIBCO Spotfire®, the Signals platform makes everyday data mining easier and more intuitive for researchers. It gives scientists a single platform that combines best-in-class technology for big data storage, search, applied semantic knowledge, and analytics in a solution they can understand and use, leading to faster insights and greater collaboration.

PerkinElmer Signals is our answer to the four basic, yet pressing questions above. With it, we’re providing an out-of-the-box cloud solution that can handle the wide array of experimental and clinical data available to translational scientists. Without IT intervention, they can integrate, search, retrieve, and analyze the trove of relevant data from across internal and external sources.

If you’ve got questions about Precision Medicine data management tools and ROI, download our white paper.


Collaborative Data Storage & Security: Critical Needs of the Precision Medicine Data Life Cycle – Part 2


As we discussed in the previous post on precision medicine, the quantity of data being generated in the life sciences is reaching staggering proportions - especially in the field of genomics. This post is the second in our Critical Needs of Precision Medicine Data Life Cycle series. 

Growing Biological Data Analysis Costs 

Although generating raw DNA sequence data has become progressively less expensive, the associated costs of data analysis have continued to grow. Because of the challenge of securing sufficient computational resources, cloud computing has become increasingly important in the development and execution of large-scale biological data analyses.

Scalability and collaboration are often cited as the primary motivations for cloud computing by both commercial and academic scientists. The utility and scalability of the cloud make it an attractive option not only for multi-site collaborative research projects, but also for smaller labs lacking adequate computational infrastructure to meet current and future needs.

Cloud Computing Defined

So, what exactly is Cloud Computing? Gartner Group - a world-class IT consulting organization - describes it as “a style of computing in which massively scalable IT-related capabilities are provided ‘as a service’ using Internet technologies to multiple external customers.”

The group also predicts that “By 2020, a corporate ‘no-cloud’ policy will be as rare as a ‘no-internet’ policy is today”.  To put it simply, scientific or business users who want to run complex applications or store very large datasets no longer have to rely on in-house computing infrastructure but can simply rent the services of a cloud services vendor, do their work, get their results, and then release the resources back to the cloud.
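
To make the rent-and-release cycle concrete, here is a minimal sketch using the AWS SDK for Python (boto3). The AMI ID, instance type, and region are placeholders, and credentials, networking, and data transfer are omitted; it simply shows a compute node being requested, used, and then returned to the cloud.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# "Rent" a compute node from the cloud (placeholder AMI and instance type).
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t3.medium",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]

# ... upload data, run the analysis, download the results ...

# Release the resources back to the cloud when the work is done.
ec2.terminate_instances(InstanceIds=[instance_id])
```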

50 Years of Cloud Computing

Cloud computing is by no means a new phenomenon. In fact, it traces its roots back over 50 years to the computer clusters of the 1960s in which groups of computers were networked together to function as a single computing entity. 

Computer clusters eventually grew, alongside the development of the internet, into grid computing – a form of distributed computing. One key cloud development – MapReduce – was implemented by Google to regenerate its entire index of the web. MapReduce (and open-source adaptations such as Hadoop) allows large datasets to be broken into smaller pieces that can be spread among different computers – a key element of today's life sciences cloud computing.
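
The divide-and-recombine idea behind MapReduce can be shown in a few lines of plain Python. This toy sketch (with made-up sequence reads, using local processes to stand in for cluster nodes) counts occurrences of each read: the dataset is split into chunks, each chunk is counted independently (the "map" phase), and the partial counts are merged (the "reduce" phase). Hadoop or Spark apply the same pattern across many machines.

```python
from collections import Counter
from multiprocessing import Pool

# Made-up sequence reads standing in for a large dataset.
reads = ["ACGT", "ACGA", "ACGT", "TTGA", "ACGA", "ACGT"]

def map_chunk(chunk):
    """Map phase: count occurrences of each read within one chunk."""
    return Counter(chunk)

if __name__ == "__main__":
    chunks = [reads[i::3] for i in range(3)]           # split across 3 "nodes"
    with Pool(processes=3) as pool:
        partials = pool.map(map_chunk, chunks)         # map phase, in parallel
    total = sum(partials, Counter())                   # reduce phase: merge counts
    print(total)  # Counter({'ACGT': 3, 'ACGA': 2, 'TTGA': 1})
```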

Amazon Web Services & Life Sciences R&D

Today, Amazon Web Services (AWS) is the market leader for cloud computing in general, as well as for the life science R&D sector. AWS is currently used to create scalable and highly available IT infrastructures to store, compute, and share data. It is also the technology platform used to deliver cloud-scale architecture for PerkinElmer Signals for Translational.

Private, Public & Hybrid Cloud Computing

Cloud computing services can be offered as either public or private clouds, or as a hybrid model combining elements of the two. Public clouds allow users to 'rent' the hardware and software needed to process or store their data, and to release those resources back to the cloud when they are no longer needed.

Private clouds are generally preferred by large organizations that cite data security as a primary concern. Private clouds have the advantages of the cloud model, while keeping the infrastructure itself contained behind their own firewall. 

A third model - the hybrid cloud - allows companies to keep key data within their firewall while extending selective activities out to public clouds. 

Cloud Computing Security

Because healthcare data is subject to certain regulations that other industry sectors might not face, both commercial and academic sectors have in the past voiced concerns over data security and integrity in cloud services.

However, cloud security has surpassed the security measures at most private data centers - and cloud solutions are well positioned to be turned into data-security aggregators. Given the protections afforded to their mission-critical data and intellectual property, the life sciences industry as a whole is beginning to embrace cloud technology.

For example, Pfizer's adoption of the Amazon Virtual Private Cloud - which permits a company to extend its firewall and other security measures to the cloud (though at some cost in operating efficiency) - illustrates this shift. Hybrid cloud solutions are also very popular because they can be deployed to provide the information required for research while maintaining personal or confidential information on a separate system.

The Promise of Cloud Computing

The most promising aspect of cloud technology “is realizing the pairing of the cloud with big data, analytical tools and mobile devices, especially in healthcare where it can provide around-the-clock monitoring at a fraction of the cost of traditional on-premises tools.”

Cloud solutions can be leveraged to allow data mashups between public and private data sets, which in turn enhances the quality and accessibility of life sciences and clinical trial data.

At PerkinElmer Informatics, this is precisely what we are trying to achieve: leveraging best-in-class cloud technology and data analytics to help disseminate information faster and more efficiently, and to provide deep insights into Translational Medicine data.

For further information, check out this webinar, and stay tuned to find out about other Critical Needs of the Precision Medicine Data Life Cycle.


 


How Analytics Centers of Excellence Improve Service & Save Costs


Centers of Excellence: Centralizing Expertise
The “Center of Excellence” as a business model has an assortment of definitions and uses. In general, such “centers” are established to reduce time to value, often by spreading multidisciplinary knowledge, expertise, best business practices and solution delivery methods more broadly across organizations.

They have been identified as “an organizing mechanism to align People, Process, Technology, and Culture” or - for business intelligence applications - as “execution models to enable the corporate or strategic vision to create an enterprise that uses data and analytics for business value.” Still others define these centers as “a premier organization providing an exceptional product or service in an assigned sphere of expertise and within a specified field of technology, business or government…”


Using a CoE to Improve Business Intelligence
In considering how the Center of Excellence (CoE) concept might improve business intelligence (BI), analytics, and the use of data in science-based organizations, PerkinElmer Informatics has developed an Analytics Center of Excellence to deliver services to our customers.

As a framework, the CoE offers ongoing service coverage by experts from a variety of domains, including IT & architecture, statistics and advanced analytics, data integration & ETL, visualization engineering, and scientific workflows. In many cases an expert is located at your facility and leverages a wider range of remote staff to provide support, reduce costs, and eliminate red tape and paperwork.

There are four pillars to our Analytics CoE for your organization: 

Architecture Services
Mainly for IT, this covers architecture strategy, sizing and capacity planning, security and authentication, connectivity and integration planning, and library management

Governance Services
Centralized planning, execution, and monitoring of projects; a Program Management approach to managing multiple work streams; Steering Committee participation; SOPs and best practices; and change management

Value Sustainment Services
Expertise for subject matter consulting, support, hypercare, roadmap and future planning, and analytics core competency

Training & Enablement Services
Training needs assessment, training plans, courseware development, training delivery and mentoring

Cost Savings with Standardized BI Solutions
PerkinElmer's Analytics CoE leverages TIBCO Spotfire® to help our customers get the most out of this technology as quickly as possible - from the experts. Very often - especially at mid-size to large enterprises - the question is asked, “Why aren't we standardized on a single BI solution?”

It’s a good question.

Rather than investing time, effort, and money in evaluating, implementing, maintaining, and updating several BI solutions - not to mention training staff to use them - considerable cost savings can be gained by deploying a standard business intelligence solution across the enterprise. And the savings can be further supplemented because the Analytics CoE covers both foreseen and unforeseen needs.

Under an Analytics CoE implementation, cost savings are derived from:

  • Economy of scale from a suite of informatics services
  • Reduced administration efforts for both customer and vendor
  • “Just-in-time” project delivery that engages the right resources at the right time


Reducing the Pharma Services Budget
After converting to the Analytics CoE model, a top-25 pharmaceutical company saved 50% on its TIBCO Spotfire®-related services budget. This was possible because the services were bid out once - not for every service engagement. Purchasing service engagements was significantly less fragmented, and the high costs of supporting multiple tools and platforms and responding to RFPs were greatly reduced.

Standardizing on an Ongoing Service Model
Centralizing around a formal service model focuses management of the vendor relationship on a single partner – who truly becomes a partner as they manage projects across multiple domains and departments. 

The Analytics CoE model - also known as a competency center or capability center - oversees deployments, consolidation of services, dashboard setup, and platform upgrades, all without the additional burden of new RFPs, vetting of new vendors, and establishing new relationships.

The benefits of standardizing on an ongoing service model, centered on a standard BI platform, include:

  • Holistic approach to deploying analytics solutions across the organization
  • Cost savings from reducing the number of tools used 
  • IT organization isn’t spread too thin as it no longer has to support multiple systems
  • Greater departmental sharing
  • Improvements beyond the distributed model

In addition, there are numerous reasons for analytical organizations to adopt an Analytics CoE:
  • A Program Management function that manages multiple project workstreams and chairs Steering Committee meetings, providing management insight into solution delivery.
  • High-quality subject matter expertise (SME) available for your projects; SMEs are pulled in as needed and billed against the CoE.
  • Significant savings over typical daily rates – up to 50%.
  • Flexible engagement period.
  • Hourly fees that move from the FTE model to “pay for what you use,” further reducing costs.
  • Multiple projects billed against the Analytics CoE.


Are you ready for true service excellence in your data-driven organization? Find out if PerkinElmer’s Analytics Center of Excellence is a good fit.


Contact us at informatics.insights@PERKINELMER.COM

Addressing Critical Needs Of Precision Medicine Data Life Cycle

With the amount of data currently being generated, we are in a unique position to find diagnoses and treatments for a multitude of diseases. However, the progress of lab technologies in generating data is now beset by another challenge: making sense of the copious amounts of immensely heterogeneous data.

In the 2010 paper “The $1,000 genome, the $100,000 analysis?”, the author rightly points out that regardless of how cheap human genome sequencing gets, ‘clinical grade’ interpretation and analysis are needed to make coherent clinical sense of the data. However, as mentioned in one of our previous blogs (Beyond Genomics: Translational Medicine Goes Data Mining), genomic data cannot work in isolation within a biological context, and integrating knowledge from different biological silos is the next big challenge. The clinical utility of all this data will be determined by our ability to mine it appropriately by addressing some very critical pain points in the data life cycle, briefly discussed below:

1) Collaborative Data Storage & Security

As the availability of computational resources becomes a constraint, cloud computing is becoming increasingly important in the development and execution of large-scale biological data analyses. Its on-demand scalability is an attractive option, especially for multi-site collaborative research projects. Healthcare data is subject to certain regulations that other industry sectors might not face, such as requirements that data be stored in on-premises private data centers. However, cloud security has surpassed the security measures at most private data centers, and cloud solutions are well positioned to become data-security aggregators. Furthermore, on-premises solutions cannot provide the same level of scalability as cloud computing without significantly increasing infrastructure costs. This, coupled with the multi-site collaborative research projects common in healthcare, makes cloud solutions an attractive, scalable, on-demand option. This article can help you assess whether an on-premises or cloud solution is better suited to your needs.

2) Facilitating rapid transfer and data processing: Support for Distributed Research

Tools that allow for the processing and storage of extremely large data sets in a distributed computing environment are a foundation for Big Data processing tasks. These tools include the Hadoop Distributed File System (HDFS) and Spark. HDFS facilitates rapid data transfer among nodes and drastically lowers the risk of system failure, whereas Spark can process data from a variety of repositories, including HDFS, NoSQL databases, and relational data stores (e.g., Apache Hive). These technologies help organizations move away from traditional data warehouses toward a data lake, where data can be stored in its original structure.
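
As a minimal sketch of how this looks in practice, the PySpark snippet below reads variant data from HDFS and summarizes it per gene. The HDFS path, column names, and quality threshold are hypothetical; the point is that the same code runs whether the data lake sits on a single machine or a large cluster.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("variant-summary").getOrCreate()

# Read columnar variant data directly from the distributed file system.
variants = spark.read.parquet("hdfs:///data/cohort1/variants.parquet")

per_gene = (
    variants.filter(variants.quality >= 30)    # drop low-quality calls
            .groupBy("gene")
            .count()
            .orderBy("count", ascending=False)
)

per_gene.show(10)   # top 10 genes by number of retained variants
spark.stop()
```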

3) Access to public or legacy databases: Data Type Flexibility

A large amount of data currently sits in public databases, and the need to integrate it with your own data can be of utmost importance. Any Big Data platform for life sciences data needs to deploy technology that allows seamless access to data stored in resources such as the Gene Expression Omnibus (GEO), tranSMART, and OHDSI, to name just a few.
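
As one small example of programmatic access to a public resource, the sketch below queries GEO through NCBI's public E-utilities API and lists matching record IDs (the search term is just an illustration; tranSMART and OHDSI tools expose their own interfaces).

```python
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "gds",                                   # GEO DataSets database
    "term": "breast cancer[title] AND Homo sapiens[organism]",
    "retmode": "json",
    "retmax": 5,
}
resp = requests.get(ESEARCH, params=params, timeout=30)
resp.raise_for_status()

ids = resp.json()["esearchresult"]["idlist"]
print("Matching GEO record IDs:", ids)
```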

4) Searching and analyzing data in real time: Accessible Data 

The ability to search and query data quickly is a critical step in the data life cycle. Tools such as Elasticsearch vastly improve the ability to query and mine your data by searching an index instead of querying the text directly. This allows for a seamless flow of information from the data lake to the user.
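
Here is a minimal sketch using the official Elasticsearch Python client (8.x-style API); the index name, fields, and local URL are hypothetical. A record is indexed once, and subsequent queries run against the inverted index rather than scanning the raw text.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a record: Elasticsearch analyzes the fields and builds an inverted index.
es.index(index="subjects", id="S1", document={
    "subject_id": "S1",
    "diagnosis": "HER2-positive metastatic breast cancer",
    "treatment": "trastuzumab plus chemotherapy",
})
es.indices.refresh(index="subjects")   # make the new document searchable

# Query the index (not the raw text) for matching records.
hits = es.search(index="subjects", query={"match": {"diagnosis": "breast cancer"}})
for hit in hits["hits"]["hits"]:
    print(hit["_source"]["subject_id"], "-", hit["_source"]["diagnosis"])
```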

5) Enriching or curating your data: Information Intelligence

A comprehensive environment should further integrate tools that enrich data by adding context for deeper and more meaningful integration of data from different sources. Tools such as Attivio achieve precisely this by semantically enriching the data across structured and unstructured silos and thereby making the eventual analysis more powerful.

6) User-Friendly Advanced Data Exploration Applications

Providing the end user with a state-of-the-art, user-friendly workflow is a two-tiered challenge. First, the ability to reuse analytics workflows for reproducible analysis of biomedical data is becoming increasingly important. Second, visually aided data exploration is an important component of combining scientific data and disseminating complex knowledge. The ability to interact with data - to slice and dice it in different ways while working within a reproducible analytics workflow - can help end users identify unexpected patterns and further refine their hypotheses. Visual data analytics platforms such as TIBCO Spotfire® allow self-service access to all relevant data and let end users take an exploratory approach to their data, making informed decisions based not just on interactive dashboards but on best-in-class statistical analysis.

The large-scale nature of biological data means that we need an agile, integrated environment that implements the right tools to tackle data storage, management, integration, and eventual analysis. All components of the data life cycle need to work in sync so that end users can make informed decisions in real time and at scale. In subsequent blogs, we intend to tackle each of these pain points in detail to see how a turnkey solution can be created for Translational Medicine applications.

Want to learn how you can configure your data analytics workflow to address the critical needs of your Precision Medicine Research? Join David John for a dedicated webinar on April 24th. 


Precision Medicine: Can Informatics Help Meet Expectations?


A little over two years ago, the Precision Medicine Initiative was announced to much fanfare at the prospect of one day being able to provide individuals with the right treatment at the right time, based largely on their unique genetic makeup, as well as environment and lifestyle factors.

Since the sequencing of the human genome, efforts have been underway to leverage genetic testing to find more tailored treatments and therapies for individuals’ conditions. The White House says “precision medicine is already transforming the way diseases like cancer and mental health conditions are treated,” and points to molecular testing and genetics to determine the best possible treatment. It put $215 million behind the initiative, which is collecting genetic data from volunteers, to be shared broadly with researchers and others involved in finding precision - or personalized - medicine solutions.

From a consumer perspective, precision medicine “is not yet delivering customized care.” In fact, there have been some troubling misdiagnoses and - more generally - difficulty for medical professionals in making individual treatment decisions based on the genetic data currently available to them.

Potential for Precision Medicine

Excitement at the potential for precision medicine, however, has not dimmed. Instead, government agencies, research institutes, medical professionals, and technology vendors are working hard to deliver on its promise.

In 2016, the U.S. FDA issued two sets of draft guidance to streamline regulatory oversight for the Next-Generation Sequencing (NGS) tests used to sequence a person's genome. The guidance will help developers of NGS-based tests as the FDA works to ensure the tests are safe and accurate.

The first draft focuses on standards for designing, developing, and validating NGS-based in vitro diagnostics used for diagnosing germline (hereditary) diseases. The second draft guidance focuses on helping NGS-based test developers use data from FDA-recognized public genome databases to support clinical validity.

Data-Driven Precision Medicine

Bioinformatics joins NGS and drug discovery as technologies that will - in part - drive the global precision medicine market to nearly $173 billion by the end of 2024. A global market study noted that “proper storage of genome data plays a crucial part in this segment,” and reported that acute data storage and data privacy issues remain to be solved.

Since precision medicine is a data-driven initiative, it makes sense that standards apply to how data – whether clinical or research – is collected, stored, analyzed, and used to support disease research, translational medicine, and drug discovery. PerkinElmer welcomes efforts to standardize big data analytics for precision medicine.

Meeting these expectations will require informatics platforms that can:

  • Support translational research with designated workflows
  • Securely consolidate public databases and patient information in a single solution
  • Provide analytical and visualization capabilities for data from a host of sources – electronic health records, clinical lab records, genetic testing and more
  • Integrate and aggregate data for cohort analysis
  • Leverage the cloud to increase access to the broadest range of data at a low cost
  • Enable self-service and effective collaboration within and across organizations

Want to leverage informatics to make the excitement for precision medicine a reality? Deploying the right informatics solutions can set you on the right path.

Download our white paper, The Need for Informatics Solutions in Translational Medicine, to learn how our platform - designed to address the complexities of translational research - enables researchers to more quickly and easily identify and manage biomarkers essential to precision medicine.


Beyond Genomics: Translational Medicine Goes Data Mining


We are fortunate to live in a time of growing life expectancy across most world populations, but this has also resulted in an increased prevalence of chronic diseases. Accounting for more than 70 percent of healthcare spending in the developed world, treatments for chronic diseases are typically costly, prolonged, and - in many cases - largely ineffective. 

This is further compounded by the extremely low probability - less than 10 percent - that a drug will make it from Phase 1 to approval during the course of a clinical trial. This statistic is disheartening not only for the scientists, physicians, and patients involved, but also for the drug development industry at large. In addition, the average cost of clinical trials - before approval - has reached $30-40 million across all therapeutic areas in the U.S.

A Drug Development Strategy Focused on Efficacy

This unsustainable situation has driven the notion that money could be better spent to pursue more effective treatments. Over the last couple of decades, the drug development industry has been looking at strategies to select more efficacious drugs while controlling the ever-spiraling costs related to their development.

This has led to the evolution of the multifaceted discipline, Translational Medicine (TM), which has been called “bidirectional” since it “seeks to coordinate the use of new knowledge in clinical practice and to incorporate clinical observations and questions into scientific hypotheses in the laboratory.” The beauty of the translational approach is that it applies research findings from genes, proteins, cells, tissues, organs, and animals to clinical research in patient populations, with an explicit aim of predicting outcomes in specific patients. Essentially, it promotes a “bench-to-bedside” approach, where basic research is used to develop new therapeutic strategies that are tested clinically. 

From Bench-to-Bedside…and Back Again

However, translational medicine also works “bedside-to-bench,” since learning from clinical observations can provide optimal feedback on the application of new treatments and potential improvements.

Recent technological advances have endowed us with the ability to test this “Bench-to-Bedside-to-Bench” approach. Today we can investigate the molecular signature of patients to identify biomarkers (or surrogate clinical endpoints) that allow us to stratify the patient population and administer the drug only to those who have some hope of responding to it.

Herceptin, arguably ‘the first personalized treatment for cancer,’ is a good example of the benefits of a translational approach. A 30-year success story in the making, Herceptin was discovered by scientists who used genomics technologies to identify over-expression of HER2, which leads to a particularly aggressive form of breast cancer. Adding Herceptin to chemotherapy has been shown to slow the progression of HER2-positive metastatic breast cancer. Read the story here.

Human Genome Project – Translational’s Driving Force

The world's largest collaborative biological project, the ‘Human Genome Project,’ successfully mapped 95% of the human genome. The sequencing of the human genome holds benefits for a wide range of fields and is perceived as the driving force behind Translational Medicine applications. We can now look forward to a time where the focus will start shifting to a more ‘individualized approach to medicine,’ perhaps even a focus on disease prevention as opposed to treating symptoms of disease. 

The viewpoint of the Personalised Medicine Coalition, that “physicians [will] combine their knowledge and judgment with a network of linked databases that help them interpret and act upon a patient’s genomic information,” further shows faith in this unification of art and science in medicine.

The Human Genome Project may have spearheaded technological advances in the genomics and bioinformatics fields, but many challenges remain for TM to cross over into clinical utility and become mainstream. 

Beyond Genomics Knowledge

For one, Translational Medicine can no longer solely rely on the ever-present bounty of genomics knowledge (though it will keep us busy for quite some time). All biologists know that genetics doesn’t work in isolation. Yes, it helps to identify biomarkers and particular molecular signatures, but the integration of knowledge from different biological silos is the next big challenge – and opportunity. That challenge (and opportunity) is data.

The goal is to effectively mine data brought together from live experiments, external ‘open access’ sources, legacy and real-world data portals, clinical and preclinical data systems, and more.

PerkinElmer Informatics will examine the challenges associated with the data integration needs of translational researchers, to deliver on the promise of Translational Medicine.

Download this article which features insights into how translational is simultaneously reducing expenses and improving patient health.