Knowledge management: do not drown the enterprise in data

1st April 2013

Knowledge is power, and nothing has changed that through the ages. However, gaining knowledge is problematic, not because there is too little information but because of the enormous amount available. In our daily life we are flooded with unnecessary information. The internet has increased the accessibility of information by orders of magnitude. And we see the same development at the lab bench: more data is created and more information is stored. Freek Varossieau reports.

The challenge of gathering data in the current and future work environment is on! A modern laboratory produces more and more data with increasingly sophisticated instrumentation and software. The increasing regulatory pressure (21 CFR Part 11, Record Maintenance, GxP, etc.) to store data on-line for longer periods does not simplify this. Although some LIMS products do offer the ability to manage information and workflows, they do not support the large amount of unstructured data a laboratory generates. A different approach is needed.

Data growth

The past few years' growth in data has been exponential and will increase even further. A late 2000 study by the University of California at Berkeley demonstrated an ongoing information explosion (the so-called 'Content Big Bang').

According to the study, roughly 12 terabytes of data were generated from the beginning of mankind until 1999. This large number will be dwarfed in the near future.

The study clearly demonstrated that the amount of data will double every year for years to come. In 2000, 3 billion terabytes were generated; in 2001, 6 billion terabytes; and in 2002 the figure is expected to double again to 12 billion terabytes. This staggering growth is mainly caused by the growth in so-called 'fixed content data' (Fig. 1).
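
The doubling described above can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: it takes the article's figure of 3 billion terabytes for 2000 as its starting point and assumes strict annual doubling. The function name `project_volume` is hypothetical.

```python
def project_volume(start_year, start_volume, end_year):
    """Return {year: volume}, assuming the volume doubles every year."""
    return {
        year: start_volume * 2 ** (year - start_year)
        for year in range(start_year, end_year + 1)
    }


# Volumes in billions of terabytes, starting from the figure quoted for 2000.
projection = project_volume(2000, 3, 2002)
for year, volume in projection.items():
    print(year, volume)
```

Extending `end_year` shows how quickly strict doubling compounds: after ten years the volume is 1,024 times the starting value.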

Fixed content data

So the growth is exponential, but what exactly is 'fixed content data'? In the laboratory there are two types of data:

* Structured data: subject to change; transaction orientated; multi-table (database) orientated. Examples are LIMS, ERP and other databases.

* Unstructured data: not subject to change; often large; contains valuable information. Examples are machine readable data (ie instrument data files), videos, X-rays, documents, reports, methods and procedures.

These fixed content data are the basis of all 'knowledge assets'.

This tremendous growth makes the data a real challenge to collect, organise, search and archive. Some applications, like LC-MS and NMR, produce more than 13 data files per experiment.

Fixed content data is the major source of information growth within an organisation. Easy and quick access to the information is where the difference is made.


One of the driving forces behind the data growth is new regulation. The Part 11 guidelines of the US FDA regarding the management and maintenance of electronic records and signatures place tremendous pressure on the pharmaceutical and food industries to manage these records for secure long term retention.

The 21 CFR Part 11 guidelines are focused on the management of human readable and machine readable electronic files (e-records), to ensure the authenticity and integrity of the data.

They also provide guidelines for electronic signatures (e-sigs). Currently, the FDA enforces these guidelines through company audits and enforcement letters.

Companies must have policies for short and long term archiving of data according to 21 CFR Part 11. Storage alone is not enough: data must also be readily available when required.

In a recent publication, the FDA released its draft guidelines on Electronic Record Maintenance. They cover the archiving of data and require that policies be in place.

The following control factors are given:

* Data encoded within an electronic record.

* Metadata for an electronic record.

* Media (eg, disk, tape, or flash memory devices) that record data and metadata.

* Hardware used to retrieve and display the electronic record.

* Software (both application programs and operating systems) used to read, process, and display electronic records.

* The processes of extracting and presenting information in human readable form.
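
One way to picture these control factors is as fields that travel with every archived record. The Python sketch below is an assumption for illustration only and is not taken from the FDA draft: the class name `ArchivedRecord`, its field names, the sample values and the use of a SHA-256 checksum to detect changes to the data are all hypothetical choices.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ArchivedRecord:
    """Hypothetical container tracking the control factors for one e-record."""
    data: bytes       # data encoded within the electronic record
    metadata: dict    # metadata, e.g. instrument or analyst
    media: str        # medium holding the record, e.g. "disk" or "tape"
    hardware: str     # hardware needed to retrieve and display the record
    software: str     # application and operating system needed to read it
    checksum: str = field(init=False)

    def __post_init__(self):
        # Store a fingerprint of the data at archive time.
        self.checksum = hashlib.sha256(self.data).hexdigest()

    def is_intact(self):
        """True if the data still matches the fingerprint taken at archive time."""
        return hashlib.sha256(self.data).hexdigest() == self.checksum

    def render(self):
        """Present the record in human readable form (assumes UTF-8 text data)."""
        return self.data.decode("utf-8")


# Invented sample values, for illustration only.
record = ArchivedRecord(
    data=b"pH 7.2",
    metadata={"instrument": "pH meter"},
    media="disk",
    hardware="x86 workstation",
    software="example viewer 1.0",
)
print(record.is_intact(), record.render())
```

Keeping the hardware and software requirements next to the data is one way to make sure that a record can still be read and presented years after it was archived.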

Access to archived data must be well organised. The implementation of archival systems will be an important factor in the new IT infrastructures.

Software solutions

With this changing scope in mind, software vendors have developed applications that address data collection, management and archiving (Fig. 2).

Some of these software solutions focus on basic data management or serve as a data archiving plug-in for a LIMS. Depending on the need, these applications might suffice.

The growing demand for immediate availability of information, and for collaboration around it across an organisation, requires a different approach.

So-called Knowledge Engineering Systems support not only raw data management but also human readable reports.

Flexible and comprehensive indexing of both types of records ensures simple access to knowledge assets.
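
As a rough illustration of such indexing, the Python sketch below builds a small keyword index over both raw data files and human readable reports, so that a single search returns both. It is an assumption about how an index of this kind might work, not a description of any vendor's product; the function names, file names and keywords are invented for the example.

```python
from collections import defaultdict


def build_index(records):
    """Map each lower-cased keyword to the set of record names tagged with it."""
    index = defaultdict(set)
    for name, keywords in records.items():
        for keyword in keywords:
            index[keyword.lower()].add(name)
    return index


def search(index, *keywords):
    """Return the sorted names of records tagged with all of the given keywords."""
    matches = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*matches)) if matches else []


# Hypothetical records: one raw instrument file and one report.
records = {
    "run_0413.raw": ["LC-MS", "batch 42", "raw data"],
    "batch42_report.pdf": ["LC-MS", "batch 42", "report"],
}
index = build_index(records)
results = search(index, "lc-ms", "batch 42")
print(results)
```

Because the raw file and the report share the same keywords, one query finds both, which is the point of indexing the two record types together.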

Knowledge Engineering is focused on making information available in the enterprise. Benefits of Knowledge Engineering are improved information exchange in the organisation and the related increase in productivity.

The future, paperless?

Will these knowledge engineering systems eventually lead to the so-called paperless organisation (Fig. 3)? Mankind has worked with paper for many hundreds of years and this will not change quickly.

Yet there are many advantages to a paperless environment. Well implemented, it enables rapid and controlled access to valuable information. Reduction of the administrative burden and the improvement in collaboration are just the start of the many savings companies can achieve by implementing a knowledge engineering system.


Freek Varossieau is with Scientific Software International BV, Willemstad, The Netherlands.




© Setform Limited