Faster insights and better science in the search for desperately needed new therapies

In a previous post we noted the increasing importance of biologics as therapeutic agents, with 37% of the drugs approved by the FDA in 2017 being biologic entities. A recent article in Chemical & Engineering News (June 4, 2018, pp 28-33) focused on activities in immuno-oncology, where biologic checkpoint inhibitors are being tested in combination with other immunotherapies: there are currently ca. 250 small molecule- and antibody-based immunotherapies in clinical studies, and more than 1,100 clinical trials in 2017 combined a checkpoint inhibitor with another treatment.

With this increasingly urgent drive to discover and develop novel biotherapeutics in areas such as oncology, it is crucial that researchers are equipped with the best possible tools to capture, manage and exploit all the available data, and we commented that “In the area of SAR and bioSAR, underlying chemical structural and bio-sequence intelligence are key requirements for meaningful exploration and analysis, and these are often only available in separate and distinct applications with different user interfaces, when ideally they should be accessible through a unified chemistry/bio-sequence search and display application, supported by a full range of substructure and sequence analysis and display tools.”

In this post we drill into these requirements in more detail and discuss how an ideal bioSAR tool should support faster insights and better science in the search for desperately needed new therapies. 

As we noted previously, researchers are struggling with a data deluge, and need effective tools to locate, extract, sift and filter relevant data for further detailed visualization and analysis. With biologics, these applications will need to understand and manage bio-sequences. An immediate requirement will be to allow sequence searching, using a standard tool such as BLAST, to search across internal and external sequence collections, to collect and import the appropriate hits in a standard format, and to link them to other pertinent properties (bioactivity, toxicity, physicochemical, DMPK, production, etc.).
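The speed of BLAST-style searching comes from exact-match "seeding": short subsequences (k-mers) of the query are looked up in a pre-built index of the database, and only sequences sharing a seed are considered for full alignment. A minimal sketch of that seeding idea, in plain Python with toy sequences (purely illustrative, not a real BLAST implementation or any product's API):

```python
# Toy illustration of the k-mer "seeding" idea behind BLAST-style search:
# index every k-mer in the database sequences, then look up the query's
# k-mers to find candidate hits worth a full alignment.
from collections import defaultdict

def build_kmer_index(sequences, k=3):
    """Map each k-mer to the (sequence id, offset) positions where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((seq_id, i))
    return index

def candidate_hits(query, index, k=3):
    """Return database sequence ids sharing at least one k-mer with the query."""
    hits = set()
    for i in range(len(query) - k + 1):
        for seq_id, _ in index.get(query[i:i + k], []):
            hits.add(seq_id)
    return hits

# Hypothetical mini "database" of two protein fragments
db = {"seqA": "MKTAYIAKQR", "seqB": "GGGSGGGSGG"}
idx = build_kmer_index(db)
print(candidate_hits("AYIAK", idx))  # seqA shares k-mers with the query; seqB does not
```

A real tool then extends and scores each seed match; the point here is only that indexing makes the initial candidate search fast enough to run across large internal and external collections.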

With a tractable data set on hand, researchers will want to explore sequences to try to discern particular motifs or sequence differences that are correlated with bioactivity or desired physicochemical or DMPK profiles, and thus potentially amenable to further manipulation and enhancement. 

The sequences must be aligned, for example with Clustal Omega, and visualizations should present sequences so that sequence differences are immediately highlighted and monomer substitutions can be explored for potential links to biotherapeutic activity. LOGO plots to investigate the distribution of monomers in a set of sequences, and annotations to highlight and share areas of interest, will also help researchers get to insights more quickly.
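Once sequences are aligned, both kinds of view described above reduce to simple per-position bookkeeping. A minimal Python sketch (toy sequences with gaps shown as '-'; function names are illustrative, not any product's API): one function reports differences against a reference, the other computes the per-column monomer counts that underlie a LOGO-style plot.

```python
# Sketch: given already-aligned sequences (equal length, '-' for gaps),
# (1) report where a variant differs from a reference sequence, and
# (2) tally per-position monomer frequencies, the numbers behind a LOGO plot.
from collections import Counter

def diff_positions(reference, variant):
    """Return (position, ref_residue, variant_residue) for each mismatch."""
    return [(i, r, v)
            for i, (r, v) in enumerate(zip(reference, variant))
            if r != v]

def column_frequencies(aligned):
    """Per-position residue counts across a set of aligned sequences."""
    return [Counter(col) for col in zip(*aligned)]

ref     = "MKT-AYIAKQRQISFVK"
variant = "MKTSAYIAKHRQISFVK"
print(diff_positions(ref, variant))  # [(3, '-', 'S'), (9, 'Q', 'H')]

freqs = column_frequencies([ref, variant])
print(freqs[9])  # Counter({'Q': 1, 'H': 1}) — a split position worth inspecting
```

In a production tool the same counts would drive highlighting and LOGO rendering; the underlying computation is no more than this.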

If scientists want a deeper dive into the underlying structure of the sequence or a region, immediate access to a detailed and interactive 3D rendering of the biomolecule’s structure can provide a different lens through which to understand how different monomer substitutions may impact protein folding or active site binding, and thus activity.

There may also be cases where required specialized analysis or visualization capabilities are only available in a separate in-house developed, third party, or open-source application, and the provision of an extensible Web Services framework will enable these to be quickly linked in to an enhanced analysis pipeline that can then be shared with colleagues and collaborators.

A bioSAR system providing the capabilities discussed above, equipped with an intuitive and unified user interface catering to novice and power users alike, will enable researchers to derive incisive insights faster and make better-informed scientific decisions in the search for novel biotherapeutic agents targeting some of the world’s most pressing unmet clinical needs.

Accelerate analysis of sequence differences relative to a reference sequence.


Rapid, Incisive Data Analysis for Lead Discovery

In a previous post we commiserated with drug discovery scientists and their IT colleagues in their daily struggles to deal with the ever-increasing research data deluge. We then attempted to ease their pain by exploring modern informatics tools and applications that guide them to rapidly and intelligently identify, locate, search, extract and organize tractable sets of relevant data (internal or external, structured or unstructured, small molecule or biologic) for detailed analysis and visualization. 

This post moves on to the improvements needed in the next stage of the drug discovery/lead identification/lead optimization workflow, where scientists will want to analyze data sets to derive insights, answer pressing scientific questions, and make research decisions such as:

Identifying the most promising chemical scaffold and substituent set with oncologic activity

Exploring this set further to improve candidate compounds’ DMPK and toxicity profiles

Dropping compounds which have been explored to the point that no further progress seems achievable.

Things aren’t all bad, surely, as researchers can choose from an extensive (some might even say overwhelmingly scary) array of analysis and visualization tools to wield on their data set. Such tools and applications will often have their own idiosyncratic user interfaces and steep learning curves, and it can be difficult for scientists to become expert in all the tools that they might want to use, or even to know which are the most appropriate tools, and the order in which to use them for optimum effect.

An ideal informatics platform should be well aware of these impediments to rapid and incisive data analysis and advanced structure-activity relationship (SAR) visualization. Instead of daunting the researcher with a potentially overwhelming menu of tools and applications, an effective system should offer a guided pathway embodying scientifically meaningful best practices to select and apply the most appropriate tools and techniques to the active data set. 

Faced with a potentially daunting set of chemical structures, bioassay results, physicochemical properties and DMPK/tox profiles, researchers will want simple access to a rich set of viewing options that can help present the data in an intuitive way, including gallery views, forms, combined chemical compound and sequence views, and 3D chemical structure overlay. These advanced visualizations should also be bolstered by a rich set of dynamic charting options, and robust statistical analyses. 

Data sets can be enriched and explored with more precision if the system can calculate additional physicochemical properties, and then filter and cluster the data based on observed and calculated parameters and structural descriptors to help in identifying promising lead series. Researchers also need to be able to explore and display data at multiple levels within a data hierarchy, from plate to compound level, and compound to chemical series or therapeutic project level for ease of navigation. 
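Filtering on a mix of observed and calculated parameters, as described above, is conceptually just a predicate applied to each compound's property record. A hedged sketch in Python (compound IDs and property values are invented for illustration; the Lipinski-style thresholds are one common example of such a filter, not a prescription):

```python
# Illustrative filter on calculated physicochemical properties.
# All values and IDs below are made up; real systems would pull these
# from registration and property-calculation services.
compounds = [
    {"id": "CMPD-001", "mw": 342.4, "logp": 2.1, "hbd": 1, "hba": 4},
    {"id": "CMPD-002", "mw": 612.8, "logp": 5.9, "hbd": 4, "hba": 9},
    {"id": "CMPD-003", "mw": 428.5, "logp": 4.2, "hbd": 2, "hba": 6},
]

def passes_ro5(c):
    """Simple Lipinski-style rule-of-five check on calculated properties."""
    return (c["mw"] <= 500 and c["logp"] <= 5
            and c["hbd"] <= 5 and c["hba"] <= 10)

leads = [c["id"] for c in compounds if passes_ro5(c)]
print(leads)  # ['CMPD-001', 'CMPD-003']
```

The value of a guided platform is that such filters are composed interactively and re-applied consistently across plate-, compound-, and series-level views, rather than rebuilt by hand for every question.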

Creating a SAR table should be simple and easy with point-and-click configuration. The resultant SAR tables should be amenable to exploration in depth with a powerful set of chemical structural and biosequence analysis and visualization tools, including ultra-fast search and R-group analysis of chemical series, and biosequence search and alignment. These advanced chemistry and biologics SAR tools and workflows will be the subject of subsequent blog posts in this series.

Modern systems should appeal both to power users, who will want unfettered access to advanced tools, and also to occasional users, who will want to extract immediate value with a minimal learning curve. A productive system should provide ready-to-use templates so that occasional users can immediately start to confidently explore their data, and as groups establish and refine their analysis workflows, these can be captured in shared templates for quick and consistent analyses between collaborating research groups. Adding an auto-update capability, so that saved queries and filters are automatically re-run as the underlying data is updated, will save time and increase productivity. 

The end result of using such an ideal, modern informatics system is more rapid and incisive SAR analyses, increased researcher productivity as scientists focus on science rather than learning new user interfaces, and faster insights and better-informed scientific decision making. If you are a medicinal chemist, assay biologist, or ADMET analytical chemist, you can leverage self-guided analytics and visualizations for faster lead discovery. Watch this webinar for a quick demo:

Discover and automatically update project data.

Configure project-specific SAR tables and annotate for compounds of interest.

Perform compound series analysis to identify R-groups of interest.

Perform sequence analysis to identify amino acid changes of interest.

Find Clinical Candidates Faster: Watch webinar now. 

SAR Trek

In a previous blog post we highlighted the day-to-day informatics problems facing IT/IS staff and researchers in biopharma companies as they struggle to discover and develop better drugs faster and more cheaply. Key among these was the challenge of dealing with the data deluge – more complex data at greater volumes, in multiple formats and often stored in disparate internal and external data silos. As Jerry Karabelas, former head of R&D at Novartis, quipped in an updated and repurposed phrase from Coleridge: “Data, data everywhere, and not a drug I think.” And that was at the turn of the 21st century; things have only got worse since then, with the term “big data” now getting over four million hits in Google.

Typical therapeutic research projects continually generate and amass data – chemical structures; sequences; formulations; primary, secondary, and high-content assay results; observed and predicted physicochemical properties; DMPK study results; instrument data; sample genealogy and provenance; progress reports, etc. – and researchers are then charged with the responsibility of making sense of all the data, to advance and explore hypotheses, deduce insights, and decide which compounds and entities to pursue, which formulations or growth conditions to optimize, and which to drop or shelve. 

A usual first step will be to collect together all the relevant data and get it into a form that is amenable to further searching and refinement: but this poses a potentially challenging set of questions – what data exists, where is it, what format is it in, and how much of it is there? Answering these questions may then be complicated if the data resides in different, possibly disconnected, potentially legacy systems: e.g. chemical structures in an aging corporate registry, sequences in a newer system, assay results in another database, DMPK values buried inside an electronic lab notebook, and instrument data in an unconnected LIMS or LES. 

So the researcher is faced with knowing:

(a) Which systems exist, where they are located, and what they contain, 

(b) How to search each of them to find the required data, 

(c) How to extract the desired information from each source in the correct usable format, and 

(d) How to meld or mash-up these various disparate data sets to generate a project corpus for further analysis and refinement. 

They are still likely to get frustrated along the way by things like different query input paradigms (e.g. pre-designed and inflexible search forms, or the need to write SQL queries), slow search response times, and either too many or too few results to generate a tractable data set. If they opt to start with an overlarge hit list, they can try to whittle the list down by tightening their search parameters, or by locating and subtracting items with undesirable properties, but in most cases they will be faced with a slew of somewhat different hit files which then need to be sensibly merged through a sequence of cumbersome list logic operations (e.g. intersect the pyrrolidine substructure search compounds with the bioassay IC50 < 0.5 nanomolar hits, and then see if any of those match the required physicochemical property and DMPK profiles in the third and fourth files). This trial-and-error approach is inefficient, unpredictable, potentially unreliable, and time-consuming.
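The list-logic merge described above is, at bottom, a chain of set operations. A small Python sketch makes the pyrrolidine/IC50/property/DMPK example concrete (compound IDs and membership are invented for illustration):

```python
# Hypothetical hit lists from four separate searches, as sets of compound IDs.
substructure_hits = {"CMPD-001", "CMPD-007", "CMPD-042", "CMPD-113"}  # pyrrolidine SSS
potency_hits      = {"CMPD-007", "CMPD-042", "CMPD-256"}              # IC50 < 0.5 nM
property_pass     = {"CMPD-042", "CMPD-113", "CMPD-256"}              # physchem profile
dmpk_pass         = {"CMPD-042", "CMPD-999"}                          # DMPK profile

# Intersect substructure and potency hits, then require the property
# and DMPK filters as well.
candidates = substructure_hits & potency_hits
final = candidates & property_pass & dmpk_pass
print(final)  # {'CMPD-042'}
```

The operations themselves are trivial; the pain the post describes comes from having to export, reconcile, and re-import each hit file by hand across disconnected systems before any such intersection can even be attempted.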

Fortunately, modern systems such as PerkinElmer Signals™ Lead Discovery are now available to overcome these challenges and to equip scientists with efficient tools to rapidly locate and assemble accurate, comprehensive and workable data sets for detailed and scientifically intelligent refinement and analysis. Prerequisites include a future-proof, flexible, and extensible underlying informatics infrastructure and platform that can intelligently and flexibly handle and stage all types of R&D data, now and in the future: data or text, structured or unstructured, internal or external. Establishing an informatics platform like this and making all the relevant data instantly accessible to researchers removes the data-wrangling challenges (a)–(d) discussed above and delivers immediate productivity and outcomes benefits, as researchers are free to focus on science rather than software.

Rather than struggling to remember where data is located, and how to search it, scientists can now be intelligently guided. Signals Lead Discovery lists and clearly presents the relevant available data (including internal and external sources) and offers simple and consistent yet flexible searching paradigms to query the underlying content. Modern indexing techniques (including blazing fast, patent-pending, no-SQL chemical searching, and a full range of biological sequence searching tools) ensure rapid response times to searches with immediate feedback to see whether a query is delivering the required number of hits. Intuitive views of the data in tables and forms with advanced display capabilities built on Spotfire also give immediate visual feedback about the evolving content of a hit set as it is refined, and data drill down is always available to get a more granular view of the underlying data. 

Once the researcher has adequately shaped and refined a data set to contain all the required relevant data, it is then immediately available for further detailed analysis and visualization, using Signals Lead Discovery’s powerful set of built-in workflows and tools, or via RESTful APIs with external and third-party tools. This downstream analysis and visualization will be the subject of future blog posts in this series. This video shows how guided search and analytics can power your SAR analysis quickly and effectively.

Light at the end of the lead discovery tunnel?

Drug discovery is hard (nine out of ten drug candidates fail), time-consuming (typically 10-15 years), and expensive (Tufts’ 2016 estimate: $2.87Bn). But things are getting better, right? In 2017, although the EMA approved only 35 new active substances, FDA drug approvals hit a 21-year high, with 46 new molecular entities approved, the highest number since 1996. This was a mix of 29 small molecules and, demonstrating their increasing therapeutic importance, 17 biologics (nine antibodies, five peptides, two enzymes, and an antibody-drug conjugate). But the FDA counted only 33% of the 46 approvals as new classes of compound, so the rest came from older classes, which probably entered the R&D pipeline 15-20 years ago.

Is this bumper crop of 2017 new approvals some reflection of major advances in drug discovery techniques and technology that primed the R&D pipeline at the turn of the century? Or is it just an artifact of the FDA approval process and timeline? Hard to say either way, but in the long game of drug development, scientists and researchers will be keen to jump on any improvements that can be made now. 

What contributes to the three-fold challenge that makes drug discovery and development hard, time-consuming and expensive? Surely the plethora of “latest things” – personalized and translational medicine, biomarkers, the cloud, AI, NLP, CRISPR, data lakes, etc. – will lead to better drugs sooner and more cheaply? At the highest level, probably; but down in the trenches, researchers and their IT and data scientist colleagues are engaged in an ever-increasing daily struggle to develop and run more complex assays, to capture and manage larger volumes of variable and disparate data, and to handle a mix of small molecule and biologic entities; then to make sense of this data deluge and draw conclusions and insights; and often to do this with inflexible and hard-to-maintain home-grown or legacy systems that can no longer keep pace.

Let’s look at some of these challenges in more detail.

The Sneakernet

Informatics systems built on traditional RDBMSs require expensive database administrators just to keep them functioning, and much time and budget has to be devoted to fixing issues and keeping up with software and system upgrades. This leaves little or no time to make enhancements, or to adjust the system to incorporate a new assay or to manage and index a novel data type; it delays IT staff in making even the simplest requested change, and may spur researchers to go rogue and revert to using spreadsheets and sneakernet to capture and share data.

The Data Scientist’s inbox

Organizing and indexing the variety and volume of data and datatypes generated in modern drug discovery research is an ongoing challenge. Scientists want timely and complete access to the data, with reasonable response times to searches, and easy-to-use display forms and tables. 

Older legacy informatics systems did a reasonable job of capturing, indexing, linking and presenting basic structured chemistry, physical property and bioassay data, but at the cost of devising, setting up, and maintaining an unwieldy array of underlying files and forms. Extending a bioassay to capture additional data, reading in a completely new instrument data file, or linking two previously disconnected data elements all require modifications to the underlying data schema and forms, and add to the growing backlog of unaddressed enhancement tasks in the data scientist’s inbox.

In addition to managing well-structured data, scientists increasingly want combined access to unstructured data such as text contained in comments or written reports, and legacy systems have very limited capabilities to incorporate and index such material in a usable way, so potentially valuable information is ignored when making decisions or drawing insights.

Lack of tools for meaningful exploration

Faced with the research data deluge, scientists want to get to just the right data, in the right format, and with the right tools on hand for visualization and analysis. But the challenge is to know what data exists, where, and in what format. Legacy systems often provide data catalogs to help find what is available, and offer simple, brute-force search tools, but often response times are inadequate, and hit lists contain far too few or too many results to be useful. Iterative searches may help to focus a hit set on a lead series or assay type of interest, but often the searcher is left trying to make sense of a series of slightly different hit lists by using cumbersome list logic operations to arrive at the correct intersection list that has all the specified substructure/dose response/physical property range parameters.

Once a tractable hit set is available, the researcher is then challenged to locate and use the appropriate tools to explore structure activity relationships (SARs), develop and test hypotheses, and identify promising candidates for more detailed evaluation. Such tools are often hard to find, and each may come with its own idiosyncratic user interface, with a steep and challenging learning curve. Time is also spent designing and tweaking display forms to present the data in the best way, and every change slows down decision making. Knowing which tools and forms to use, in what order, and on which sets of data can be frustrating, and lead to incomplete or misleading analyses or conclusions. 

In the area of SAR and bioSAR, underlying chemical structural and biosequence intelligence are key requirements for meaningful exploration and analysis, and these are often only available in separate and distinct applications with different user interfaces, when ideally they should be accessible through a unified chemistry/biosequence search and display application, supported by a full range of substructure and sequence analysis and display tools.

R&D Management

Lab, section, and therapeutic area managers are all challenged to help discover, develop, and deliver better drugs faster and more cheaply. They want their R&D teams to be working at peak efficiency, with the best tools available to meet current and future demands. This first requires the foundation of a future-proof, flexible, and extensible platform. Next, any system built on the platform must be able to intelligently and flexibly handle all types of R&D data, now and in the future, structured or unstructured. Research scientists can then exploit this well-managed data with tools that guide them through effective and timely search and retrieval; analysis workflows; and advanced SAR visual analytics. This will lead to better science and faster insights to action. 

Follow us on social media to be notified of the next blog in this series 


ELNs: Selecting the Right Solution for Your Organization

What can be more thrilling – and terrifying – than getting the budget to purchase new technology for your lab? For those with the funds and the responsibility to select an electronic laboratory notebook (ELN) for your organization, the decision can be lasting. You don’t want to get it wrong. 

According to experts, the most important thing you can do is buy a solution that addresses challenges and achieves stated goals, not simply evaluate and purchase a product. This requires homework on the front end, to carefully and deliberately outline the criteria that a successful deployment must meet.

What key outcomes do you need to achieve?

What data must be entered, and how should it be stored so that you can do with it what you need?

What is the workflow?

How will it enable collaboration? Improve data quality? Protect IP?

How easy is it to maintain, scale, and upgrade?

To select the best ELN for your needs, avoid seeking a mere replication or exact automation of your paper methods. It is important to assess your needs beyond a simple translation of paper-based tasks; use this as an opportunity to let technology drive enhancements – in productivity, collaboration, standards – that you may not have envisioned from a paper process.

Prepare for Change

With a strong understanding of how an ELN will improve your situation, you’re ready for step two: gaining buy-in from leadership to engage in change management. Moving researchers, lab analysts, and others from a paper notebook or computer-based methods (spreadsheets, for example) is difficult without gaining recognition that a better way exists. Involve leadership and users as you go through the process of selecting an ELN for your lab.

Preparing for change helps create the understanding of what, exactly, is required. Are there existing human or system challenges that must be overcome? What systems must the ELN connect to? What instruments? Is visualization needed? Should you use an on-premise solution, a hosted one, or one in the cloud?

Articulate Desired Outcomes

ELNs can solve many problems and bring many benefits. What is most important? At a high level, organizations often seek productivity gains – that researchers, for example, can more quickly and accurately plan experiments, record results, and search data. Collaboration is another important goal, so that the organization benefits from the shared knowledge of its researchers and analysts – and external sources as well. Other goals include:

Improved data quality, consistency, integrity

Knowledge management

Reduced repetition, rework

Data standardization and centralization

Efficient data search, retrieval, and comparison

IP protection

To gain knowledge from the ELN, it helps if the solution can store your data in a structured manner so that it can then be transferred to an analytics platform. Interfacing ELNs with visualization and analytics software, or layering on tailored modules and add-on applications, enables scientists to query and correlate data and generate actionable insights.

Strong ROI

The right ELN can offer a strong and swift return on investment. The financial case for an ELN stems from:

Enhanced R&D productivity and cost effectiveness, as much as 7.5% per user

A 25% improvement in your intellectual property: your legal team is able to file more patents by taking advantage of high-quality scientific data, readily available and entered in a compliant way

Efficiency improvements of 20-30% over paper-based methods, redirecting 8-10 hours of each user’s time per week for savings of 10% of the annual FTE rate.

Two ELN Choices from PerkinElmer

Recognizing the many goals and needs of organizations, PerkinElmer offers two types of ELNs. For years, the on-premise, feature-rich, enterprise E-Notebook has been the market-leading ELN for major pharmaceutical, food and beverage, oil and gas, agriculture, and other industries. As many organizations adapt to cloud technologies and the added value they bring, we introduced the next digital transformation: a cloud-based ELN built on our Signals™ platform, the Signals™ Notebook.

Rather than attempting to “webify” E-Notebook, the Signals Notebook leverages more modern technology for workflow and decision support and global collaboration. Because it’s 100% web-based, with no downloads needed, no hardware to buy, and no IT assets to maintain, Signals Notebook provides immediate ROI for budget-minded science teams.

The highly configurable E-Notebook focuses on enterprise needs for collaboration, data quality, and increased productivity. This is accomplished with dedicated workflows for different scientific disciplines, and the data control and security that IT demands.

Both solutions enable analytics integration, which means organizations can build knowledge management systems from their ELN deployments. More than a notebook to store data, the ELN becomes the place people go to for information and to draw insight.

Is gaining better insights, faster, a main objective for you? Does your agenda include improving your laboratory efficiency and making research, experimentation, reporting, and collaborating easier for your scientists? 

If so, talk to PerkinElmer about our electronic lab notebook solutions.

Think Pink with ChemDraw

by Nessa Carson

Drawing with Flair

After much tweaking, I have a set of ChemDraw® settings, not unlike the sturdily-bonded Totally Synthetic stylesheet that went around the chemistry community a few years ago – but with a fetching pink background. In various workplaces, I’ve become famous for that bold look. 

Robinson’s total synthesis of tropinone, in my style1

My settings include: 

Thick bonds that give a crisp look to the molecules on screen, and are clear at the back of a presentation room

Explicit labels on terminal carbons, a requirement kept from my grad school supervisor’s rules

And even a special font such that serifs are present only on capital ‘I’, to distinguish it from nearby vertical bonds. 

However, the only comment when anyone sees my drawings is, “why is your ChemDraw pink?!”

I’m sure this is sometimes said with a hint of derision, but get over it. If I’m going to be staring at a screen for a portion of my day, it might as well be colorful. This is what I love about ChemDraw®: the chance to make it your own.

Personalization is not only about the aesthetics. Features I typically suggest to newbie organic chemists include enabling autosave, snapping only the desired toolbars to the window edges, and switching default ChemNMR solvent from DMSO to CDCl3. And if you’re in pink, turning off ‘Print Background Color’ - though this option might only affect me. Once you’ve set these options, remember to save your personal stylesheet in the ChemDraw Items folder, so you are ready to go. 

Perfect Molecules, Every Time

Every chemist is unique, but most of us can be fussy about how our precious molecules look on screen. To this end, I recommend the Clean Structure feature. Highlight the molecule, or part of it, and press Ctrl+Shift+K (on Windows). Most of the time, if your bonds aren’t already perfect, they’ll snap to the default length, and optimize angles to avoid overlapping substituents. In particularly crowded structures, any computational effort may not match your individual preferences, and you’ll have to do some post-clean tinkering to get it exactly how you want. Luckily, this further personalization is not hard! 

As an example, I’ve drawn blockbuster chemotherapy drug taxol. Just kidding: I pressed Ctrl+Shift+N to automatically produce a structure from the name.

The structure of this natural product is too complex to depict with the usual 120° bond angles, particularly with its bridged 6-8 ring system. Sadly, my personal stylesheet falls down here. The molecule looks worse than with default settings – those explicit methyl groups have something to answer for! 


Since this compound is so complex, I like to use abbreviations to keep the groups from taking up too much space. I’ve condensed COPh groups to ‘Bz’ and COMe groups to ‘Ac’ – ChemDraw® recognizes these common abbreviations and continues to determine molecular weight and other properties correctly. You’ll notice, I’ve removed some of my explicit methyls for clarity’s sake. 


That looks better, but my abbreviations still overlap, and there are one or two portions I’d like to adjust. At this point, I reorient some of the groups. I can rewrite the NHBz as BzNH, then center it with Ctrl+Shift+C. I’ve also right-oriented the benzoyl group with Ctrl+Shift+R, and added in a new bond for the southeast acetate to avoid overlap. I selected any straggling atoms and the entire ester side-chain, and tweaked their exact positions with the arrow keys, to achieve my own variant of perfection. 

Lastly, I like to present important molecules with a frame, which can be sized automatically by selecting the entire molecule (double-clicking on any atom or bond), right-clicking, and choosing Add Frame. 

My final tip is for thesis-writers. If your reactions are too long to fit into a word processor document without being annoyingly resized, keep all bonds to identical length by using the Ctrl+K Scale feature at any time. 

Standardization Isn’t Always a Bad Thing

Don’t throw away your default settings yet! Default stylesheets look great in print, and are required for publication in major journals. Most chemists will also agree on roughly standard ways to transcribe flatter, more linear molecules to a 2D screen. For the majority of your drawings, ChemDraw® will provide perfect chain bond angles (press 1 or 0 when an atom is selected) and Kekulé structure aromatic rings (press 3 with an atom selected) without effort – saving you time for the perfecting strokes that matter.

1 Robinson, R. LXIII.–A synthesis of tropinone. J. Chem. Soc., Trans., 1917, 111, 762–768

Try for free ChemOffice® Professional, our robust, scientifically-intelligent research productivity suite that builds on the foundation of ChemDraw®.

About the Author

Nessa Carson is a synthetic organic chemist based in southeast England. Nessa graduated from the University of Illinois at Urbana-Champaign with an MS degree in organic chemistry, working with Prof. Scott E. Denmark.

Nessa is also a freelance writer, with a regular column in Chemistry World. She tweets as @SuperScienceGrl, where she mostly enthuses over new papers and complains about fluorine. 

Nessa can be reached at 

Tackling Data Challenges in Translational Medicine

When scientists gather, it doesn’t seem to take long for the conversation to steer to difficulties in corralling and analyzing data. In fact, a recent PerkinElmer survey shows that more than half of life science researchers said a lack of data transparency and collaborative methods is the key obstacle to precision medicine. New data-generating laboratory technologies are driving an urgent need to better manage and share an unwieldy influx of data.

Researchers across life science disciplines, including translational medicine, are crying out for more practical ways to improve access to data and collaboration around that information.

Data Democratization

The goal has been to democratize data: enabling scientists to access and analyze relevant data through scalable tools provided by IT, rather than through specific requests for each dataset or analysis. Researchers are no longer satisfied waiting for IT or bioinformaticians to run reports; they want applications they can actually use, at the bench, to speed their work. 

Right now, the National Center for Advancing Translational Sciences says it takes, on average, 14 years and $2 billion to bring a drug to market – and perhaps another decade before it’s available to all patients who need it.

Tackling the data challenge could go a long way toward shortening that timeline and reducing that cost.

Tough Questions On Data

Some of the most pressing questions around data management stem from the most basic need: bringing useful, appropriate data together, and making it searchable and sharable, to solve problems. 

Translational researchers have a wide variety of medically relevant data sources available to them, from omics to adverse events to electronic health records and more. Tapping into the right data at the right time can help these researchers:

Determine new uses for approved or partially developed drugs

Analyze trends and predict potential direction for further research

Translate discoveries from basic research into useful clinical implementations

Analyze clinical data and outcomes to guide new discoveries and treatments

Here are some of the lingering questions that need answers in order to truly democratize data:

Question 1: How to Bring Data Together?

Most organizations still struggle to find all the data that might be helpful to them, or the data is captured in silos that are difficult to penetrate, let alone aggregate. Oftentimes, the people who need the data aren’t even aware it exists. Figuring out the best way to usefully aggregate data remains a challenge. Further complexity comes from determining who is permitted to access specific datasets and how that access is controlled.

Question 2: How to Compare Data?

Once data is aggregated, researchers must be able to determine whether they are accurately comparing appropriate or related data sets. Difficulty often stems from non-standard ontologies that make it hard to map different datasets to each other. And if two items look similar but are in fact very different, how can the scientist tell? 

Question 3: When to Normalize Data?

Aggregating and integrating data inherently changes it; it is possible to manipulate the data without meaning to. Some therefore favor normalizing data as early as possible in the integration process, arguing it’s best to align the data well before analysis. Others say normalizing all data – some of which you may never use – is too time-consuming and expensive. They favor aggregating later, so the data is closer to its raw form at analysis time. That can make analysis more effective because more context is preserved, but the data is harder to share.
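To make the “normalize early” position concrete, here is a minimal sketch, assuming hypothetical assay records that arrive with mixed concentration units. Normalizing to one canonical unit (nM) at ingestion time means every later analysis compares like with like. The record fields and compound names are invented for illustration.

```python
# "Normalize early" sketch: hypothetical assay records arrive with mixed
# concentration units; convert each IC50 to a canonical unit (nM) at
# ingestion so downstream analyses compare like with like.

UNIT_TO_NM = {"nM": 1.0, "uM": 1_000.0, "mM": 1_000_000.0}

def normalize_record(record):
    """Return a copy of the record with its IC50 expressed in nM."""
    factor = UNIT_TO_NM[record["unit"]]
    return {**record, "ic50_nM": record["ic50"] * factor, "unit": "nM"}

raw = [
    {"compound": "CMPD-1", "ic50": 0.5, "unit": "uM"},  # 0.5 uM -> 500 nM
    {"compound": "CMPD-2", "ic50": 12,  "unit": "nM"},  # already in nM
]
normalized = [normalize_record(r) for r in raw]
```

The “normalize late” camp would instead keep `raw` untouched and apply such a conversion only inside each analysis, preserving the original context at the cost of shareability.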

Question 4: Who Analyzes Data?

In most organizations today, a small subset of the research organization – data scientists and bioinformaticians – performs data analyses. This creates a bottleneck. But until most bench researchers have the tools and skills to analyze the volumes of data they encounter, it will be difficult to scale analysis capabilities. So far, the stopgap has been to employ more data scientists.

Delivering Answers

To help researchers and scientists analyze data themselves, more quickly and efficiently, we’re building scientifically relevant applications in an intuitive, simple, and repeatable framework on the PerkinElmer Signals™ platform. We’re delivering workflow-based applications on the Signals platform for uses from Translational to Medical Review to Screening and more.  

Powered by TIBCO Spotfire®, the Signals platform makes everyday data mining easier and more intuitive for researchers. Scientists get a single platform that combines best-in-class technology for big data storage, search, semantic knowledge, and analytics in a solution they can understand and use, leading to faster insights and greater collaboration.

PerkinElmer Signals is our answer to the four basic, yet pressing questions above. With it, we’re providing an out-of-the-box cloud solution that can handle the wide array of experimental and clinical data available to translational scientists. Without IT intervention, they can integrate, search, retrieve, and analyze the trove of relevant data from across internal and external sources.

If you’ve got questions about Precision Medicine data management tools and ROI, download our white paper.

When Platforms Launch Discovery

Platform software development is trending these days, and for good reason. Platform development lets developers get to a robust foundation that addresses broad essentials, like security and reporting, but then enables dedicated teams to build innovative applications on top of that foundation to serve specific users. 

Forbes, quoting blogger Jonathan Clarks, described platforms as structures on which multiple products can be built. By trending toward platforms, the logic functions of applications can be separated out, “so that an IT structure can be built for change,” Clarks argued. He said companies “invest in platforms in the hope that future products can be developed faster and cheaper than if they built them stand-alone.” 

We’ve been convinced that a platform approach is most effective, both for our customers and for us. One of our main motivators is finding better ways to help the scientists, researchers, business analysts and others we serve quickly make sense of the overabundant data they encounter daily. The PerkinElmer Signals™ platform is enabling us to achieve this goal. 

The PerkinElmer Signals Platform

With Signals, we’re taking data integration and analysis to the next level, beyond data visualization. It consists of applications that enable deeper data insights based on context from available research data, search queries, workflows and more.

The PerkinElmer Signals portfolio includes cloud-scale products as well as scalable on-premises offerings that can grow with demand. Signals also leverages TIBCO Spotfire®-enabled scientific workflows and data visualizations. From a basis that empowers data-driven scientific and business decisions, Signals has branched out to offer self-guided data discovery, self-guided data analysis, and visual analytics in the fields of translational, screening, medical review, and lead discovery, with more to come in 2018. We’ve even got Signals Notebook, a web-based electronic notebook for scientific research data management, and have built the Signals vision into existing products, such as E-Notebook.

This approach lets us help customers corral the explosion of data across their enterprises while addressing its complexity with specific solutions. The foundation uses tools for big data storage, search, semantics, and analytics to help users get a much clearer and broader view, much more quickly, from an array of disparate data. Each application builds on scientific knowledge to create scientifically accurate and relevant workflows that deliver deeper insights from the data. You get a truly modern, intuitive tool to collaborate, search, wrangle, and manage data in familiar yet flexible workflows.

PerkinElmer Signals Lead Discovery

Take Signals Lead Discovery: the application creates greater data awareness by helping users find data they might not even know exists. Using agile guided search and query, Signals anticipates needs and provides flexibility for on-the-fly exploration of compounds and their activity against a target.

It also brings the benefit of index-based search to lead discovery. A patent-pending algorithm lets users search chemical structures within the Apache Lucene-based indexing system, meaning a single query can capture chemical-structure search constraints along with other search attributes. 

Secondly, it is suitable for both chemical and biological activity data. You can shape and annotate biological activity data into a hyper-scalable structure to perform precise quantitative searches that are seamlessly integrated with structure search. You can rapidly execute queries like “Retrieve all assay results for compounds containing a certain substructure where the activity in one of the assays is less than 15 nM.”
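To illustrate the shape of such a combined query, here is a minimal sketch with invented records. The structural test is a naive SMILES-substring stand-in for real chemical substructure matching (which a system like Signals Lead Discovery performs inside its index); the compound, assay, and threshold values are all hypothetical.

```python
# Sketch of a combined query: one pass applies a structural constraint AND
# a quantitative activity threshold. The substring check below is only a
# naive stand-in for true chemical substructure matching.

assay_results = [
    {"compound": "c1ccccc1O", "assay": "KinaseA", "ic50_nM": 8.0},
    {"compound": "c1ccccc1O", "assay": "KinaseB", "ic50_nM": 40.0},
    {"compound": "CCO",       "assay": "KinaseA", "ic50_nM": 5.0},
]

def query(results, substructure, max_ic50_nM):
    """Assay results for compounds containing `substructure`
    with activity below the threshold."""
    return [
        r for r in results
        if substructure in r["compound"] and r["ic50_nM"] < max_ic50_nM
    ]

# "Compounds containing a benzene ring with activity below 15 nM"
hits = query(assay_results, "c1ccccc1", 15.0)
```

The point of an indexing system is that both constraints resolve in a single indexed lookup rather than a linear scan like this one.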

The lead discovery workflow, for example, enables scientists to:

discover data of interest 

immediately confirm the meaningful intersection of compounds of interest with assay results they require 

seamlessly deposit that data into a fit-for-purpose SAR analysis template 

Using search features, scientists can find their project, navigate assay hierarchies, discover how much data is available, merge substructure search results with their selected project, and begin filtering for assays of interest. Or Signals Lead Discovery can show results for compounds within a range. This data is then ready for SAR analysis.

No Programming

Importantly, no new programming skills are needed to work with Signals. Its composable wizardry lets users interchange components to configure the workflows they need, resulting in rapid, agile application development. Because users don’t have to master an entirely new query syntax or anticipate every join operation in advance, they sidestep the rigid up-front schema work that indexing systems are designed to avoid.

With a platform built on best-in-class technologies like TIBCO Spotfire®, PerkinElmer Signals frees organizations from being overly dependent on IT and programmers. Self-service discovery and analytics, driven by powerful visualizations and easy configuration, keep scientists focused on their science. When tools do a better job of managing and presenting the data, scientists, researchers, and business analysts gain more time for critical thinking and analysis, essential for discovery.

Are you ready for Signals? We’d love to show you around the Signals platform and the applications best suited for you. 

Making the Move to Paperless Experiment Data Capture

A good laboratory notebook is the lifeblood of a successful laboratory, serving as a vital repository for valuable experiment data. There's nothing more frustrating than looking to replicate an experiment that a former colleague ran, only to find that the experimental details were limited (or absent), forcing you to re-optimize the entire procedure and waste valuable time and resources. In a recent Nature survey, over 70% of researchers said they had tried and failed to reproduce another scientist's experiments, and more than 50% had failed to reproduce their own, underscoring the importance of keeping detailed experiment records. 

Several years ago, as an undergraduate working in a synthetic organic chemistry laboratory, I made it a habit to be incredibly detailed in the experimental information and observation sections of my paper notebook. However, when re-running previous experiments I found it painfully laborious to either completely re-write all the experimental information and observations, or to reference a previous experiment. Even more painful? The fact that I more often than not referenced the wrong experiment! 

As I began to collaborate more with my chemistry colleagues, my new challenge was sharing experimental procedures easily. I had to remember which paper notebook and experiment number contained the experiment of interest, which meant spending a lot of time rummaging through 10+ paper notebooks to find a single experiment. Paper notebooks also take up valuable bench space, and, as anyone who runs dozens of experiments at a time knows all too well, every inch of bench and hood space is valuable.

When I got to graduate school, the laboratory I joined was transitioning to PerkinElmer Signals™ Notebook, a browser-based electronic laboratory notebook (ELN). Signals Notebook has transformed the way I take notes and keep track of my lab experiments, for the better.

ELNs: What’s in Them for You?

A few crucial benefits of using ELNs – and in particular, Signals Notebook – are captured below:

Electronic Notebooks Facilitate Collaboration: 

As mentioned above, there was nothing more frustrating than having to search through several paper notebooks to find a single experiment to share with a colleague. The good news is that, unlike paper notebooks, electronic laboratory notebooks are accessible from any computer and can be easily shared with other researchers, greatly facilitating collaboration. It is also possible to grant permissions to colleagues, allowing them to search through a notebook even while I am actively using it. 

Electronic Notebooks Store Information Online: 

We live in an ‘everything’s in the cloud’ 21st-century world, and storing your valuable experiments in a structured online format allows easy, password-protected access for individual researchers and collaborators. Experiments and data are secure 24/7. Looking ahead, more tools will be developed that cater to electronic records, so researchers will continue to enjoy the time savings, accuracy, and other perks of electronic laboratory notebooks as the paperless digital framework of the laboratory expands.

Electronic Laboratory Notebooks are Searchable: 

Have a specific structure you're looking for? ELNs let you simply enter your structure of interest, and the notebook returns every experiment that contains that specific structure or a substructure of it. What’s more, if colleagues have shared their experiments with you, it's also possible to search across their notebooks, further catalyzing collaboration. No more rummaging through old paper notebooks to find a single experiment.


Simplified search capabilities

Electronic Laboratory Notebooks Are Green: 

Each year, the world produces a staggering 300 million tons of paper. Eliminating paper notebooks eliminates most use of paper in a laboratory (profound, I know!). This is a wonderful benefit for the environment, and it also minimizes costs for researchers who find themselves continually buying paper notebooks.

Electronic Laboratory Notebooks Allow For Easy Upload of Data: 

Take a ton of TLCs? Run a bunch of NMR or HPLC analyses on a given reaction? With ELNs, you seamlessly drag and drop the PDFs into your notebook, ensuring that your data is always online for your own benefit, and giving you the ability to share it with colleagues and supervisors for ongoing research updates. 


Easily transfer schemes from Signals to ChemDraw and vice-versa

Signals Notebook Runs Through ChemDraw®: 

If you make molecules, you use ChemDraw, simple as that. An electronic laboratory notebook with ChemDraw built in allows for easy export/import of experimental information and reaction schemes for research updates/presentations, saving you the time of having to redraw compounds/reaction schemes.


Designing Reactions with Signals Notebook Individual Edition

21st Century Science Made Simpler

Save yourself numerous headaches and get an electronic laboratory notebook for your lab. From a technological point of view, most research communities have been dragged into the 21st century kicking and screaming because of the large upfront costs of new instruments and technologies designed to save time and resources. Signals Notebook is different. The ELN platform is inexpensive and greatly catalyzes the fast-paced, collaborative workflow of modern science happening in 21st-century labs. At this point in my career, I can't imagine going back to a paper notebook.

To download your free trial of ChemOffice Professional with Signals Notebook Individual Edition built in please click here.

About the Author:

Rick Betori graduated magna cum laude from Baylor University with a B.S. in Biochemistry, where he was a Departmental Fellow in Chemistry and Biochemistry. 

Rick is currently a doctoral candidate in the Department of Chemistry at Northwestern University working in the laboratory of Professor Karl Scheidt. Rick is a National Institutes of Health Chemistry-Biology Interface Predoctoral Fellow and a Northwestern Department of Chemistry Departmental Fellow. Rick’s research focuses on the use of natural product inspired small molecule chemical probes to interrogate the biological role of telomerase in cancer cell development and maintenance.

Rick can be reached at

A Teacher’s How-to Guide for Unlocking the Mysteries of Organic Chemistry

Organic Chemistry in the High School Classroom – with ChemDraw® 

Several hundred thousand high school students take an advanced chemistry elective every year.  Most of these students follow a traditional American curriculum centered on topics covered in a standard introductory college course, with units like electron structure and periodicity, bonding, thermodynamics, kinetics, acids and bases, and the like.  The Advanced Placement Chemistry program codifies this curriculum into a standard form that is followed by thousands of chemistry teachers.  But there is another interesting option for a second year high school chemistry elective: organic chemistry!  And tools like ChemDraw® make it easy to build exciting and powerful lessons focused around OChem.  

The high school organic course we have taught at Lakeside School for the past 16 years is a hybrid between a traditional college OChem course covering the intricacies of organic structures and reactions, and a more topical course that explores the world of polymers and other macromolecules, petroleum and biofuels, soaps and surfactants, dyes and pigments, and food chemistry.  I use ChemDraw extensively in nearly every unit, from making images for my lectures and homework packets to creating activities and labs for my students to work on during class.  Here are some examples of how I use the amazing tools that ChemDraw offers to help my students unlock the mysteries of organic chemistry.

Molecular Structure

I start the year exploring the many levels of molecular structure inherent in organic molecules with a series of molecular modeling labs.  We start by looking at the variation present in constitutional isomers.  Students build different branched alkanes, name them, and find their corresponding constitutional isomers.

the constitutional isomers of heptane, C7H16

From there we move to looking at how conformational changes can affect molecular structure, starting with the conformers of butane and working up to the conformational equilibria present in substituted cyclohexanes.

gauche butane; conformers of cis-1,3-dimethylcyclohexane

Stereochemistry is a rich area to explore with high school students.  We spend some time looking at simple enantiomers, and then move into more complex aspects of stereochemistry like diastereomers and meso compounds.

The enantiomers of 1-bromo-1-fluoroethane

I make extensive use of the amazing “Name to Structure” function in ChemDraw to look up real chiral molecules that the students might have heard of so that we can study their 3-D structure together.

Dexmethylphenidate, one of the pharmacologically active configurational isomers found in the drug Ritalin

While the “Name to Structure” feature is a real time-saver, the one I use even more frequently is “Structure to Name,” which allows me to instantly answer those persistent student questions of the type, “But how would you name THIS?”

“Easy, kid: it’s 1-((3S,5R,7R)-5-amino-3-chloro-7-methyloctahydrocyclopenta[c]pyran-4-yl)ethan-1-one, of course”

This function is even sophisticated enough to handle obscure structural features like prochiral centers.

A bromochlorocarbon with a prochiral center in the 3-position

Organic Reactions, Labs, and Spectroscopy

In addition to looking at levels of molecular structure, ChemDraw is very helpful in making up lessons and activities around organic reactions, labs, and NMR spectroscopy.  We spend quite a bit of time learning some basic organic reactions, and it’s easy to sketch out practice exercises and make up quizzes and tests:

Practice with some organic redox reactions

The clip art library makes illustrating lab handouts easy.  We usually do a fun and engaging steam distillation of lavender oil as our first real “wet lab” of the year.

ground glass labware used in our steam distillation

My students usually end the year learning spectroscopy, and ChemDraw has some cool features that I use in teaching students about Nuclear Magnetic Resonance.  For any given structure, ChemDraw can predict 1H and 13C NMR spectra, assign peaks, and calculate coupling constants.  

An estimated H-NMR spectrum for 3-methylbutan-2-ol

After learning the basics of reading spectra, I make up a bunch of “unknown” spectra for the students to analyze and have them try to predict the corresponding structures.  For many students, studying spectroscopy is one of the most engaging topics we look at all year.

From isomers to spectroscopy, ChemDraw makes my high school organic class possible.  I’d love to hear from other science teachers who are also teaching organic chemistry to high school students, and learn how you use ChemDraw and other tools to create interesting and meaningful lessons for your students.

About the author
Hans de Grys teaches organic chemistry to students at Lakeside School in Seattle, WA, where he is also the Assistant Director of the Upper School.  His favorite things to do with his students are making cold-processed soap in the lab, studying spectroscopy (especially NMR), and synthesizing banana oil and biodiesel.  You can follow him on Twitter @chiral_guy

About ChemDraw
For more information on the power of ChemDraw, please visit our website