Storing and sharing analytical data for maximum commercial benefit

Data-mining, viewing and comparison of data across the organisation are just a few of the issues surrounding the archival of analytical data to facilitate knowledge management. Here, Kevin Smith examines the issues in depth and introduces a new solution.

As laboratory throughput increases, the volume and variety of data generated are growing exponentially, putting real pressure on science-based organisations to manage analytical data more effectively.

The major challenge for lab managers is the long-term, secure storage of the raw data, method details and results that accumulate. Providing the means to easily search, explore, retrieve and share any piece of data for inspection, visualisation and manipulation is the next hurdle, and this is where the real commercial benefits and competitive advantages lie. Such challenges can be almost insurmountable in laboratories where there are various types of instruments, data systems and file formats from many different manufacturers.

Help is now at hand in the form of eRecordManager, a new software package launched following the acquisition of spectroscopy software specialist Galactic Industries Corporation by informatics specialist Thermo LabSystems.

As its name suggests, eRecordManager is a solution for the management of electronic records. The 'eRecord' aspect refers to 'electronic records', their secure archiving and the ability to retrieve records in the future. The 'manager' aspect refers to knowledge management and the ability to share all this information across the organisation.

Much of an organisation's analytical data is required to be stored for a lengthy period to be used as evidence in consumer investigations, potential patent infringement or intellectual property protection cases. This new software package meets the requirement for long-term secure storage of spectral and chromatographic data from multiple types of instrumentation (such as FT-IR, GC, LC, MS, NMR, UV-Vis, Raman and NIR) and multiple data formats, while eliminating the reliance on the original instrument software, operating system and hardware to search, restore, view and manipulate the data.

In fact, the system can store any entity that can be captured as a file. The issue is whether the data requires translation in order to be viewed. If an image is a JPG, PNG or GIF, it can be viewed in almost any software, so it could be retrieved from the eRecordManager archive and viewed locally. As the product and the data archival software market evolve, it is anticipated that viewers will be developed for many other platform-neutral formats, including images and PDF documents.

Data that was created 10 years ago might still be needed in 10 or 15 years' time to resolve questions of product liability or regulatory compliance. Imagine the headache of archiving so many data file formats and keeping them easily accessible to search and restore. It will be music to the ears of lab managers that, with this new software package, they are no longer required to retain legacy computer and software systems in order to access the archive of proprietary, often binary, data file formats.

Knowledge management

Spectra and chromatograms are the fundamental providers of data on which calculated results and subsequent conclusions are based. Putting this data into one place in a common format provides the ability to data-mine, compare and visualise instrument data that is vital in improving R&D productivity.

In many data-intensive production environments, the ability to easily access data from throughout the organisation also aids the development of predictive models and new ways of analysing samples, which are impossible when the data is scattered across the company in individual instrument workstations. Access to past research avoids redundancies such as repeating work on similar recipes, formulations and process mixtures.

The ability to search and retrieve chromatographic data is a powerful technique for analysing and comparing both intermediate and finished products. A number of manufacturers archive batch-to-batch runs and have constructed searchable libraries that, in turn, could be used for future quality evaluation, customer complaint or process investigations and for retrieving data in response to inquiries from trade and standards authorities. In distilling, where maturation periods can last up to 40 years, the volume of analytical data for archival and subsequent retrieval can be a major challenge. eRecordManager can help organisations to improve efficiency by providing a structured central repository of knowledge. Moreover, there is no requirement to retain legacy computer and software systems in order to access the archive of proprietary, binary data file formats.

This is all possible because of a library of over 150 powerful file converters that automatically generate XML versions of the original data. The archived information can be viewed and reworked on virtually any platform long into the future, effectively future-proofing a customer's data.

XML (eXtensible Markup Language) has become the standard format for data storage and exchange, mainly because it allows the accurate representation of any data structure. Such acceptance ensures XML will predominate for many years to come, regardless of the evolution of operating systems and computer hardware. XML files are ASCII text-based and therefore retain the 'knowledge' in the data. Many large organisations, each with much more at stake than Thermo LabSystems, have made significant commitments to XML. Its advantages include:

• It is publicly available and managed by the not-for-profit World Wide Web Consortium (W3C).
• Being both self-describing and open ASCII text, XML-based files are essentially future-proof.
• Public-domain schemas for specific data types guide its use and allow documents to be externally validated.
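To make the idea of a self-describing text record concrete, here is a minimal Python sketch that writes a short chromatographic trace to XML and reads it back. The element and attribute names (`trace`, `axes`, `points`, `p`) and the sample values are invented for illustration; they are assumptions and do not represent the format eRecordManager or its file converters actually produce.

```python
import xml.etree.ElementTree as ET


def trace_to_xml(instrument, x_unit, y_unit, points):
    """Serialise (x, y) pairs as a self-describing XML string.

    All element and attribute names are hypothetical, not a real schema.
    """
    root = ET.Element("trace", {"instrument": instrument})
    axes = ET.SubElement(root, "axes")
    ET.SubElement(axes, "x", {"unit": x_unit})
    ET.SubElement(axes, "y", {"unit": y_unit})
    data = ET.SubElement(root, "points")
    for x, y in points:
        # repr() keeps the full float value, so the round trip is lossless
        ET.SubElement(data, "p", {"x": repr(x), "y": repr(y)})
    return ET.tostring(root, encoding="unicode")


def xml_to_points(xml_text):
    """Recover the (x, y) pairs using nothing but a generic XML parser."""
    root = ET.fromstring(xml_text)
    return [(float(p.get("x")), float(p.get("y"))) for p in root.iter("p")]


# Illustrative values only: retention time in minutes, detector signal in mV
sample = [(0.0, 0.01), (0.5, 0.02), (1.0, 1.35), (1.5, 0.03)]
xml_text = trace_to_xml("GC", "min", "mV", sample)
restored = xml_to_points(xml_text)
```

Because the units and instrument type travel inside the text file itself, any standard XML parser can recover the data points without the software that originally acquired them.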

The new package archives both the original raw data files from the instrument software and the normalised representation in XML. Users with access to this archive can view the normalised version of the data from any workstation. In addition, either XML or original data files can be retrieved for use with other software applications, though the latter relies on the original software and hardware being available.

Real access to real data

Because of this archival of XML-based files, Thermo LabSystems claims eRecordManager is unique in terms of its ability to free so many types of data files from the software applications that created them and to make them available for viewing and manipulation.

Using any workstation, the user is able to view the real data as acquired by the instrument, including 2D representations of data from techniques such as HPLC, GC, MS, UV and NIR, all of which produce analytical data acquired for qualitative and quantitative purposes. eRecordManager allows the user to view right down to the individual XY data points of a chromatographic trace as they came off the detector. It is possible to expand the signal to see precisely, for example, where the original chromatography data system positioned the baseline and individual peak characteristics.
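As a rough illustration of what access to individual XY points allows, the Python sketch below loads points from an XML record, expands a retention-time window and locates the highest point inside it. The XML layout, the sample values and the helper names (`load_points`, `window`, `apex`) are all assumptions made for this example and are not taken from eRecordManager.

```python
import xml.etree.ElementTree as ET

# Hypothetical archived record; layout and values are illustrative only
ARCHIVED = """<trace instrument="HPLC">
  <points>
    <p x="1.0" y="0.02"/>
    <p x="1.1" y="0.40"/>
    <p x="1.2" y="1.85"/>
    <p x="1.3" y="0.55"/>
    <p x="1.4" y="0.03"/>
    <p x="2.0" y="0.02"/>
  </points>
</trace>"""


def load_points(xml_text):
    """Read every <p> element as an (x, y) pair of floats."""
    root = ET.fromstring(xml_text)
    return [(float(p.get("x")), float(p.get("y"))) for p in root.iter("p")]


def window(points, x_min, x_max):
    """Keep only the points whose x value lies in [x_min, x_max]."""
    return [(x, y) for x, y in points if x_min <= x <= x_max]


def apex(points):
    """Return the point with the largest y value."""
    return max(points, key=lambda xy: xy[1])


points = load_points(ARCHIVED)
zoomed = window(points, 1.05, 1.35)   # expand the region around one peak
peak = apex(zoomed)                   # highest detector reading in that region
```

None of this is possible with an archived picture of a report, which holds only the rendered image and not the underlying numbers.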

Thermo LabSystems contends that conventional data archiving systems only allow the restoration of data into the original data system application, or as a 'picture' of a report that, in fact, only provides half the story. Organisations that rely on archiving a graphical representation of a final report would be advised to consider its limits. The content of their archived file is restricted to the data the scientist included in the report. What if a colleague wishes to view other information not incorporated in the report? Furthermore, this type of picture file offers extremely limited scope to view, rework or manipulate the real data.

A further problem with relying on mechanisms such as Windows Metafiles is that the same fonts and symbol sets must be available on both the original workstation where the image was created and the workstation where it is viewed. Unlike systems that rely on uniformity in computing environments, application-independent data archived by eRecordManager is complete and self-describing, ensuring that the electronic record will not change when it is viewed on different workstations. This is clearly a prerequisite for any system that must be relied upon during patent litigation or consumer complaint enquiries.

Kevin Smith is Director of Electronic Record Management at Thermo LabSystems. Call Thermo LabSystems on +44 (0) 161 942 3000 or email info@thermolabsystems.com
