Petroleum, on which modern society was built and on which it now depends, is a diminishing resource with growing environmental, political, and economic disadvantages.

The ideal alternative would be chemically identical to petroleum, allowing broad and rapid adoption, derived from renewable resources, scalable to support current and future demands, domestically derived, and cost competitive without subsidies.

Many technologies offer potential alternatives to fossil fuels. Biofuels (mainly ethanol) already power a significant proportion of motor vehicles in some regions. Production of such fuels relies on microbes metabolizing organic material to produce alcohol. While this technology shows promise, questions remain over its long-term commercial viability, and its feedstocks compete with food crops.

Approaches to alternative fuel production can be divided into three categories:

  • Use of microbes to metabolize organic materials to produce combustible gases
  • Use of natural vegetable oils
  • Production of alcohols through microbial or enzymatic activity, usually from a sugar or starch feedstock

These approaches share one drawback: they need specialized distribution systems and/or engine modifications to exploit the new fuel. An alternative is to create drop-in compatible biofuels that can be used in current combustion engines. LS9, a California biotechnology company, has developed Renewable Petroleum™ technologies to meet this need.

LS9 is developing a biological, fermentation-based process that starts from renewable raw materials and offers compelling economics. The company is engineering a wide range of DesignerMicrobes™ that are used in a proprietary 1-step fermentation process to produce renewable fuels and sustainable chemicals. LS9’s Renewable Petroleum™ technology enables the rapid and widespread adoption of renewable transportation fuels. Patent-pending UltraClean™ fuels are custom engineered to have higher energetic content than ethanol or butanol; to have fuel properties that are essentially indistinguishable from those of gasoline, diesel, and jet fuel; and to be distributed in existing pipeline infrastructure and run in any vehicle.

The use of engineered bacteria and recombinant DNA technology to produce such fuels requires precise data management and analysis. This paper examines how test management can improve reporting and efficiency in the research and development of renewable petroleum.

LS9 – the renewable petroleum company

Pushing the frontiers of synthetic biology and industrial biotechnology, LS9’s unique technology provides a means to genetically control the structure and function of its fuels and chemicals, enabling a product portfolio that meets the diverse demands of the petroleum economy.

LS9 has developed a new means of efficiently converting fatty acid intermediates into petroleum replacement products via fermentation of renewable sugars. LS9 has also discovered and engineered a new class of enzymes and their associated genes to efficiently convert fatty acids into hydrocarbons. LS9 believes this pathway is the most cost-, resource-, and energy-efficient way to produce petroleum-replacement products and industrial chemicals. This translates into efficient land and feedstock use and directly addresses the tension between food production and fuel and chemical production.

LS9 can identify DNA and protein sequences that result in improved production of high energy content fuels, which can be mass produced in bioreactors using recombinant DNA technology.  The high energy compounds are subsequently purified and made available as fuel, making LS9’s technologies compatible with existing machinery and vehicles, with no need to modify engine components.

In order to achieve this goal, LS9 is currently engaged in a substantial screening programme to determine a protein sequence of the ideal composition.

Using error-prone PCR to generate optimal protein sequences

One approach LS9 uses to optimize proteins as catalysts for biofuel production is random mutagenesis by error-prone PCR (polymerase chain reaction), a method widely used in the industry. The gene of interest is amplified under reaction conditions that introduce random errors into the copied sequence, producing mutated DNA strands. These mutated forms of the original gene are then ligated into plasmids, and the resulting proteins are expressed after transformation into E. coli. Unlike in traditional screening programmes, specific information about the object to be tested (the expressed protein) is unknown. To manage this experiment, users create database records for the products of the error-prone PCR in ActivityBase’s Object Manager. Records for the test plates, and details of the position of each object on a plate, are inserted into the database in the same Object Manager process.

For error-prone PCR products, the database-generated Object IDs are overridden. Instead, the combination of plate ID and well reference serves as the Object ID, and a script handles this customized Object ID generation and the well assignment. Because the sequence of a PCR product is unknown at this stage, a custom Object ID is a user-friendly way to identify it. Such IDs also help identify the products to be rearrayed from the primary screen to the next round.
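
The plate ID plus well reference scheme can be illustrated with a short Python sketch. This is not LS9's script or any ActivityBase API. The 96-well layout, the zero-padded well format, the "-" separator and the plate ID "EP0001" are all assumptions made for the example.

```python
from string import ascii_uppercase


def well_references(rows=8, cols=12):
    """Yield well references (A01 ... H12) in row-major order.

    The 8 x 12 default assumes a standard 96-well plate.
    """
    for row in ascii_uppercase[:rows]:
        for col in range(1, cols + 1):
            yield f"{row}{col:02d}"


def custom_object_ids(plate_id, rows=8, cols=12, sep="-"):
    """Map each well reference to an Object ID built from plate ID + well.

    The separator and ID format are hypothetical, chosen for readability.
    """
    return {well: f"{plate_id}{sep}{well}" for well in well_references(rows, cols)}


ids = custom_object_ids("EP0001")  # "EP0001" is a made-up plate ID
print(len(ids))
print(ids["A01"], ids["H12"])
```

Because the ID encodes the physical location, a hit can be traced back to its source well without a database lookup, which is what makes such IDs convenient when rearraying.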


Figure 1: Overview of error prone PCR workflow

Object Manager is used for creating the error prone PCR objects. Plate creation and well assignments are done using Plate Creator.

Object IDs are assigned using the automatic ID generation functionality. Individual proteins are screened for potential improvements and the activity of the individual protein is determined by measuring the degree of fluorescence on interaction with a substrate.


Figure 2: ActivityBase XE can perform both inter-well comparison and intra-well kinetic analysis. In this example, individual proteins are measured for their respective activity over time

Screening results are captured in an ActivityBase XE result template by importing delimited text files, although XE can import almost any file format. Statistical analysis is performed in the results template using the visualizations available in the application, and the results are verified directly within these views using both automatic and manual methods. When the XE template is closed, the results are automatically written back to the database against Object ID, plate ID and well reference. Proteins that display promising activity are then sequenced, and the sequence is combined with the associated Object ID and well data by updating the object record in the database using Object Manager.
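
As a rough illustration of what importing a delimited reader file involves, the Python sketch below groups kinetic readings by plate ID and well reference. The column names (plate_id, well, time_s, fluorescence) and the tab delimiter are assumptions. Real instrument exports differ, and this does not show how ActivityBase XE itself parses files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited export from a plate reader.
RAW = (
    "plate_id\twell\ttime_s\tfluorescence\n"
    "EP0001\tA01\t0\t102.0\n"
    "EP0001\tA01\t60\t131.0\n"
    "EP0001\tA02\t0\t99.5\n"
    "EP0001\tA02\t60\t101.0\n"
)


def load_readings(text, delimiter="\t"):
    """Group (time, fluorescence) pairs by (plate ID, well reference)."""
    readings = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        key = (row["plate_id"], row["well"])
        readings[key].append((float(row["time_s"]), float(row["fluorescence"])))
    return dict(readings)


readings = load_readings(RAW)
for key, points in readings.items():
    print(key, points)
```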


Figure 3: Protein sequence data is included in the object detail form. In this example, the sequence of the tested protein has been included in an additional field

Saturation library testing

Once potential candidates have been identified, these proteins are subjected to saturation mutagenesis. Here, specific amino acids are substituted in the protein to determine the effect on activity. Unlike error-prone PCR, this technique means the exact sequence of each protein is known before screening, so database-generated Object IDs are used at this stage. Proteins are then screened, and the results are again analyzed and written back to the database using ActivityBase XE.

In this case, Object Manager is used for object and plate creation and for well assignments.
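
Because every variant in a saturation library is known in advance, the library can be enumerated before screening. The Python sketch below lists all 19 substitutions at one position of a protein sequence. It works at the protein level for illustration only, since in the laboratory the substitutions are made at the codon level. The parent sequence and the mutation label format (original residue, 1-based position, new residue) are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def saturation_library(parent, position):
    """Return {mutation label: variant sequence} for one 1-based position.

    Every standard amino acid except the original residue is substituted in.
    """
    if not 1 <= position <= len(parent):
        raise ValueError("position is outside the parent sequence")
    original = parent[position - 1]
    library = {}
    for aa in AMINO_ACIDS:
        if aa == original:
            continue  # skip the unchanged (wild-type) residue
        variant = parent[:position - 1] + aa + parent[position:]
        library[f"{original}{position}{aa}"] = variant
    return library


parent = "MKTAYIAK"  # hypothetical parent sequence, not an LS9 protein
library = saturation_library(parent, 4)
print(len(library))
print(library["A4G"])
```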


Figure 4: Specific changes in the gene sequence produce a wide variety of mutant proteins for screening

Data analysis

ActivityBase Suite includes a comprehensive tool kit for the creation, deployment and maintenance of analysis templates. Users can use either Microsoft® Excel or the ActivityBase XE template format. The template design utilities in the XE Designer module enable the creation of both well-based and plate-based calculations. Individual formulae can be applied at different levels within the experiment, so users maintain a single formula rather than one formula per cell in a spreadsheet. Screening campaigns at LS9 focus on determining protein activity and hence suitability as an improved catalyst. The results can be presented using visualization tools within ActivityBase to display the relationship between protein sequence and activity.

The statistics engine in ActivityBase XE gives users access to more than 75 different curve-fitting models. Erroneous data points can be excluded either automatically or interactively. For complex analysis, ActivityBase XE can also integrate with third-party analysis packages: at LS9, users combine deconvoluted data from a third-party package (JMP) with the screening data generated within ActivityBase XE.
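
The source does not say which rule ActivityBase XE applies when it excludes data points automatically. As a generic illustration only, the Python sketch below flags replicate values by their modified z-score, which is based on the median absolute deviation (MAD). The 3.5 cutoff is a common rule of thumb and is an assumption here, as are the example values.

```python
from statistics import median


def exclude_outliers(values, threshold=3.5):
    """Split values into (kept, excluded) using the modified z-score.

    modified z = 0.6745 * |x - median| / MAD; points above the threshold
    are excluded. If MAD is zero, nothing is excluded.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values), []
    kept, excluded = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (kept if score <= threshold else excluded).append(v)
    return kept, excluded


replicates = [10.1, 9.8, 10.3, 10.0, 25.0]  # made-up replicate readings
kept, excluded = exclude_outliers(replicates)
print(kept)
print(excluded)
```

A median-based rule is used in this sketch because a single extreme reading barely moves the median, whereas it can inflate a mean and standard deviation enough to hide itself.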

LS9 Development and ActivityBase XE

LS9 required a solution that would store many thousands of protein products, record specific mutations, run multiple assay types and ultimately link all relevant information together. IDBS’ ActivityBase XE provided the answer for its high-throughput, plate-based work.

Using ActivityBase XE almost exclusively and in a unique way, LS9 is recording proteins and their various properties to create an exclusive biofuel that is less carbon intensive and more cost effective.

Using error-prone PCR, LS9 amplifies the DNA sequence for a protein of interest, generating a library of nucleotide sequences that are mutated forms of the parent sequence. These mutated DNA products are then introduced into host cells, and the resulting proteins are isolated and screened.

Testable objects are placed on plates and screened for activity, which is measured by two methods. The first is a kinetic approach, based on measuring a protein’s rate of activity through increases in fluorescence (using in-well calculations and curve fitting in XE). The second is gas chromatography, which quantifies the different types of product created.
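
The kinetic measurement can be sketched as a straight-line fit of fluorescence against time, with the slope taken as the rate of activity. This is a minimal Python illustration that assumes a linear response over the measured window. XE offers many curve-fitting models, and the source does not state which one LS9 uses. The time points and readings are made up.

```python
def initial_rate(times, signals):
    """Least-squares slope of signal versus time (signal units per time unit)."""
    n = len(times)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired time/signal readings")
    mean_t = sum(times) / n
    mean_s = sum(signals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
    return sxy / sxx


times = [0, 60, 120, 180]          # seconds (hypothetical read schedule)
signals = [100.0, 130.0, 160.0, 190.0]  # fluorescence units (made-up values)
rate = initial_rate(times, signals)
print(rate)  # fluorescence units per second
```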

The physical creation of plates is replicated in ActivityBase XE using Object Manager, but no sequence information can be uploaded, as the objects will not yet have been sequenced. The data from screening is then captured in ActivityBase XE in a results template, and a VB script returns a list of objects that have achieved the desired characteristics. Additional information from the results is also gathered, including plate ID, well reference, location on plate and sequence (if it exists). The output is used in Object Manager to create new plates using objects cherry picked from the original screened plates. These new plates are then put through secondary screening.
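
The hit selection and cherry-picking step might look like the following Python sketch. It stands in for, and is not a translation of, the VB script mentioned above. The record fields, the activity threshold, the destination plate ID "SS0001" and the 96-well row-major fill order are all assumptions.

```python
def plate_wells(rows=8, cols=12):
    """Well references A01..H12 in row-major order (assumed 96-well layout)."""
    return [f"{chr(ord('A') + r)}{c:02d}" for r in range(rows) for c in range(1, cols + 1)]


def cherry_pick(results, threshold, dest_plate_id):
    """Select objects at or above the threshold and assign destination wells.

    Hits are ordered by descending activity and placed into the new plate
    in row-major order. Returns a pick list of source/destination mappings.
    """
    hits = sorted(
        (r for r in results if r["activity"] >= threshold),
        key=lambda r: r["activity"],
        reverse=True,
    )
    wells = plate_wells()
    if len(hits) > len(wells):
        raise ValueError("more hits than wells on one destination plate")
    return [
        {
            "object_id": hit["object_id"],
            "source_plate": hit["plate_id"],
            "source_well": hit["well"],
            "dest_plate": dest_plate_id,
            "dest_well": well,
        }
        for hit, well in zip(hits, wells)
    ]


# Made-up primary screening results.
results = [
    {"object_id": "EP0001-A01", "plate_id": "EP0001", "well": "A01", "activity": 0.42},
    {"object_id": "EP0001-A02", "plate_id": "EP0001", "well": "A02", "activity": 1.31},
    {"object_id": "EP0001-B07", "plate_id": "EP0001", "well": "B07", "activity": 0.97},
]
picklist = cherry_pick(results, threshold=0.9, dest_plate_id="SS0001")
for row in picklist:
    print(row["object_id"], "->", row["dest_plate"], row["dest_well"])
```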

The DNA sequences for the objects used in the secondary screen are determined and Object Manager is used to update the mutation field for each object record in ActivityBase.
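
One plausible way to derive the content of a mutation field is to compare each sequenced mutant with its parent, position by position. The Python sketch below does this for substitutions only, and assumes the two sequences are already aligned and of equal length. The label format (original, 1-based position, replacement) and the example sequences are assumptions, not the format LS9 stores.

```python
def point_mutations(parent, mutant):
    """List substitutions between two equal-length sequences, e.g. 'A4G'.

    Insertions and deletions are out of scope: they would need a proper
    sequence alignment first.
    """
    if len(parent) != len(mutant):
        raise ValueError("sequences must be the same length (substitutions only)")
    return [
        f"{p}{i}{m}"
        for i, (p, m) in enumerate(zip(parent, mutant), start=1)
        if p != m
    ]


parent = "MKTAYIAK"  # hypothetical parent sequence
mutant = "MKTGYIAR"  # hypothetical sequenced hit
mutations = point_mutations(parent, mutant)
print(mutations)
```

The same function works on nucleotide or protein sequences, since it only compares characters.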


Figure 5: Screenshot of Object Manager

Integrating ActivityBase XE with other systems

Whilst the screening users at LS9 are able to use the visualizations in the ActivityBase XE Runner module for verification of results, the reporting of this data is achieved through seamless integration with other applications and data repositories.

After verification, the results from high throughput screening are stored in the ActivityBase database, which resides on an Oracle platform. LS9 maintains various other databases, also on Oracle, and takes advantage of this common platform to integrate their data with ActivityBase in many different ways. LS9 uses a database customization to summarize data and perform statistical calculations, and summary tables are created from this output.

The integrated data is then queried using Reporter, a tool within the ActivityBase Suite that runs parameterized SQL queries against any Oracle data source. The output of the queries is automatically launched in a statistical analysis and deconvolution program. The statistics that are generated are then loaded back into the database to be linked with all of the screening data. Further reports are automatically launched in SmartViz, whose multi-dimensional visualizations make it easy to relate mutant proteins to their statistical scores. This is particularly useful for LS9 in deciphering how each mutation may contribute to a protein’s enhanced functionality (see Figure 6). Visualizations are also available to plot the activity of a specific protein against the position of mutations along a protein chain. Clicking on a data point within a SmartViz visualization displays contextual information for the selected mutant.
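
The pattern of a parameterized query followed by a statistical score can be sketched in Python. Everything here is a stand-in: an in-memory SQLite database replaces Oracle, the table and column names are invented, and the Z score is taken to be the usual (value - mean) / standard deviation computed per plate, which the source does not confirm. Oracle drivers typically use named bind variables rather than the "?" placeholder shown.

```python
import sqlite3
from statistics import mean, stdev

# In-memory SQLite stands in for the Oracle data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE screening_result (object_id TEXT, plate_id TEXT, activity REAL)"
)
conn.executemany(
    "INSERT INTO screening_result VALUES (?, ?, ?)",
    [
        ("EP0001-A01", "EP0001", 1.0),
        ("EP0001-A02", "EP0001", 2.0),
        ("EP0001-A03", "EP0001", 3.0),
        ("EP0002-A01", "EP0002", 9.0),
    ],
)


def plate_z_scores(conn, plate_id):
    """Run a parameterized query for one plate and return {object_id: z}.

    The plate ID is passed as a bound parameter, never formatted into the
    SQL string. Needs at least two results with differing activity.
    """
    rows = conn.execute(
        "SELECT object_id, activity FROM screening_result "
        "WHERE plate_id = ? ORDER BY object_id",
        (plate_id,),
    ).fetchall()
    activities = [activity for _, activity in rows]
    mu, sigma = mean(activities), stdev(activities)
    return {object_id: (activity - mu) / sigma for object_id, activity in rows}


z = plate_z_scores(conn, "EP0001")
for object_id, score in z.items():
    print(object_id, round(score, 2))
```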


Figure 6: ActivityBase SmartViz - displaying the relationship between algorithmic estimate and actual Z Score


Accommodating a wide range of screening methodologies, ActivityBase XE offers LS9 a secure and searchable application to record protein activity for an alternative biofuel. Originally designed to manage discovery data in pharmaceutical research, ActivityBase XE can be adapted to varied workflows, including biofuels research.

Registration of potential biofuel proteins, screening, analysis and visualization are all performed within ActivityBase. Furthermore, integration points allow the meaningful exchange of information and results between ActivityBase XE and other applications. Using ActivityBase XE, all screening data is captured and stored in one accessible location, providing LS9 with the flexibility to record the results of error prone PCR, and measure protein activity through inter-well comparison and intra-well kinetic analysis. Using applications such as SmartViz, LS9 can measure significant changes in proteins, and view the results through statistical calculations, graphical information and summary tables.