Section 6: Challenges and limitations of high-throughput in vitro testing


While high-throughput in vitro testing (HT testing) offers many potential benefits (detailed in Section 5 of this primer), the critical question is whether these tests will improve our ability to accurately identify and predict hazardous effects of chemicals and the risks they present to the human population. Today there are a number of key limitations and challenges to using HT tests to assess chemical risks.

If HT assays are to be used more extensively in chemical testing, a central question must be asked: Can tests conducted in vitro accurately reflect the effects a chemical would have in the more complex and complete environment of a whole animal? In other words, can they accurately predict adverse outcomes in whole animals, including people? This question has several dimensions.

Key Challenges

In vivo versus in vitro

Traditional toxicity testing aims to determine whether a particular dose of a chemical results in an observable change in the health or normal functioning of the whole animal. In contrast, HT tests examine whether and by what mechanism a chemical induces changes at cellular and molecular levels. Such changes may be precursor events leading to an actual disease outcome, in some cases picking up effects that can’t easily be detected or measured in traditional whole animal tests.

Both EPA and the National Research Council acknowledge that HT methods do not capture all relevant processes—at least not yet—that occur within the more complex system of a whole tissue or whole organism. To quote an EPA study, “The most widely held criticism of this in vitro-to-in vivo prediction approach is that genes or cells are not organisms and that the emergent properties of tissues and organisms are key determinants of whether a particular chemical will be toxic.”

Coverage of the full biological response landscape

Determining whether a chemical perturbs a biological pathway requires that all key events in the pathway—and any auxiliary molecular activity associated with that pathway, such as epigenetic processes (see epigenetics below)—are included in the battery of HT assays being used. In other words, it’s impossible to detect an adverse effect if it’s not being tested for. As Dr. Robert Kavlock, Deputy Assistant Administrator for Science at the EPA, has stated, “And then another lack that we have is we’re looking at 467 [HT] assays right now. We may need to have 2,000 or 3,000 assays before we cover enough human biology to be comfortable that when we say something doesn’t have an effect, that we’ve covered all the bases correctly.” (The Researchers Perspective Podcast, 2010.)

Likewise, during the NexGen Public Conference in February 2011, Dr. Linda Birnbaum, Director of the National Institute of Environmental Health Sciences (NIEHS), pointed to significant gaps in our understanding of biological pathways. She described gene targets relevant to disease pathways involved in diabetes that are not currently included in the ToxCast HT battery of assays. These gene targets were identified by experts during an NIEHS workshop on chemicals and their relationship to obesity and diabetes. It will be critical for ToxCast-like efforts to continuously mine and integrate the latest science into their HT assay batteries.

EPA’s use of HT in vitro testing must ensure adequate coverage of the biological response landscape, which is composed of numerous, complex, and interconnected biological pathways involved in the progression of adverse health outcomes. Image: Kyoto Encyclopedia of Genes and Genomes

Accounting for chemical metabolism

When chemicals are studied in whole animals, the effects observed are dependent in part on how the body metabolizes the substance. One critically important factor challenging the predictive ability of in vitro testing is whether and to what extent such methods capture the mechanisms animals use to metabolize chemical substances. The toxicity—or lack of toxicity—of a chemical is not always derived from the chemical itself, but rather from the rate at which it is broken down and the nature of the breakdown products (called metabolites). A classic example is the polycyclic aromatic hydrocarbon (PAH) benzo[a]pyrene: it is the metabolites of the chemical that are mutagenic and carcinogenic. Metabolism can also work in reverse, of course, rendering a toxic chemical less toxic or non-toxic.

Many of the HT assays utilized in ToxCast and other HT systems lack explicit metabolizing capabilities. EPA is exploring ways to better account for whole-animal capabilities such as metabolism in HT testing, but until there is greater confidence that these complexities are accounted for, this factor will continue to limit the extent to which in vitro HT test data can be considered fully predictive of in vivo effects.
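To make the metabolism issue concrete, the sketch below illustrates the kind of reverse-dosimetry calculation researchers use to relate an in vitro active concentration to an external dose, assuming a simple well-stirred liver model. The physiological values, clearance parameters, and example chemical are placeholders for illustration, not EPA or ToxCast defaults.

```python
# Illustrative sketch only: scaling in vitro intrinsic clearance to a
# whole-body steady-state concentration (reverse dosimetry), assuming a
# simple well-stirred liver model. Parameter values are placeholders.

Q_LIVER = 90.0      # hepatic blood flow, L/h (assumed adult value)
GFR = 6.7           # glomerular filtration rate, L/h (assumed)
BODY_WEIGHT = 70.0  # kg

def hepatic_clearance(cl_int, fu):
    """Well-stirred model: whole-liver clearance (L/h) from scaled
    intrinsic clearance cl_int (L/h) and fraction unbound fu."""
    return Q_LIVER * fu * cl_int / (Q_LIVER + fu * cl_int)

def css_per_unit_dose(cl_int, fu, dose_rate=1.0):
    """Steady-state plasma concentration (mg/L) for a continuous oral
    dose_rate in mg/kg/day, assuming complete absorption."""
    total_cl = hepatic_clearance(cl_int, fu) + GFR * fu   # L/h
    return dose_rate * BODY_WEIGHT / 24.0 / total_cl      # mg/L

def oral_equivalent_dose(ac50_uM, mol_weight, cl_int, fu):
    """Dose (mg/kg/day) predicted to produce a steady-state
    concentration equal to the assay AC50."""
    ac50_mg_per_L = ac50_uM * mol_weight / 1000.0
    return ac50_mg_per_L / css_per_unit_dose(cl_int, fu)

# Example: a hypothetical chemical active at 5 uM in an HT assay
print(oral_equivalent_dose(ac50_uM=5.0, mol_weight=250.0,
                           cl_int=20.0, fu=0.1))
```

The point of the sketch is simply that without some estimate of metabolic clearance (here, cl_int), an in vitro activity concentration cannot be translated into a human-relevant dose.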

Ability to account for diversity in the population

HT in vitro tests using genetically-identical cell lines may not accurately predict the effects of chemicals on the diverse human population.

Another challenge is the ability to account for real-world diversity among the human population that influences susceptibilities to toxic chemical exposures. Individual differences in our genomes, epigenomes, life stage, gender, pre-existing health conditions and other characteristics are integral to determining the ultimate health effect of a chemical exposure. This is a challenge, of course, for traditional animal toxicity tests as well as for newer HT methods.

Traditional animal toxicity tests typically use inbred, genetically identical (isogenic) animal strains to generate results that must then be extrapolated to predict a chemical’s effect in the much more diverse human population. Similarly, newer HT methods typically use homogenous populations of cells or components drawn from such cells. While there are often good reasons to start with genetically homogenous populations of animals or cells, their use limits the ability to make accurate predictions about effects in very diverse human populations.

This challenge has not escaped federal researchers, who are testing thousands of compounds on different human cell lines to better account for differential susceptibility to effects. Indeed, recently published scientific research reveals that genetically diverse cell lines responded differently to certain compounds in an HT testing system, suggesting that using a diversity of cell lines is one approach to incorporating genetic diversity in the population.
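As a simple illustration of why cell-line diversity matters, the sketch below summarizes hypothetical potency values for a single compound across several genetically distinct cell lines; the cell-line names and AC50 values are invented for illustration.

```python
# Minimal sketch (hypothetical data): the same compound's potency can
# vary widely across genetically distinct human cell lines in an HT screen.

import statistics

ac50_uM = {           # half-maximal activity concentration per cell line
    "line_A": 3.2,
    "line_B": 4.1,
    "line_C": 28.0,   # a markedly less sensitive line
    "line_D": 2.7,
    "line_E": 5.5,
}

values = list(ac50_uM.values())
print("median AC50 (uM):", statistics.median(values))
print("most / least sensitive line:",
      min(ac50_uM, key=ac50_uM.get), "/", max(ac50_uM, key=ac50_uM.get))
print("fold range across lines:", round(max(values) / min(values), 1))
```

A ten-fold spread like the one in this toy example would signal that a single cell line could substantially under- or over-estimate potency for some segments of the population.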

Accounting for multiple exposures and different timing of exposure

We know that in the real world we are exposed to a complex mixture of chemicals, not one chemical at a time. And we are learning that the timing of such exposures—early in fetal development, early childhood, or in adulthood—influences the health outcome. Capturing this complexity of exposure presents a fundamental challenge to the use of HT testing (as well as to traditional toxicity testing).
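One screening heuristic sometimes applied to same-pathway mixtures is concentration addition, in which the contributions of individual chemicals are summed as "toxic units." The sketch below illustrates the idea with hypothetical exposure levels and AC50 values; it is offered only as an example of how mixture complexity might be approximated, not as a description of how ToxCast currently handles mixtures.

```python
# Sketch of a concentration-addition ("toxic units") screen, under the
# assumption that the components act on the same pathway.
# Exposure concentrations and AC50 values are hypothetical.

exposure_uM = {"chem_1": 0.5, "chem_2": 2.0, "chem_3": 0.1}
ac50_uM     = {"chem_1": 10.0, "chem_2": 40.0, "chem_3": 1.0}

toxic_units = sum(exposure_uM[c] / ac50_uM[c] for c in exposure_uM)
print(f"summed toxic units: {toxic_units:.2f}")   # >= 1 would flag concern
```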

Accounting for different patterns of exposure

The ultimate impact of a chemical exposure on our health may be quite different depending on the duration, frequency, and level of exposure. For example, the effects of acute, high-dose exposures can be quite different from those that result from continuous, low-dose exposures. Issues relating to the frequency and duration of exposure have been acknowledged by agency experts in a peer-reviewed publication: “A related challenge is the understanding of what short timescale (hours to days) in vitro assays can tell us about long-timescale (months to years) processes that lead to in vivo toxicity end points such as cancer.” The frequency and duration of real-world chemical exposures will need to be either directly addressed in HT assays or otherwise integrated into the interpretation of HT testing data.
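To illustrate why the pattern of exposure matters, the following sketch uses a one-compartment toxicokinetic model with assumed parameters to compare the internal concentrations produced by the same total dose delivered as a single acute bolus versus spread evenly over a month. All values are hypothetical.

```python
# Illustrative sketch, one-compartment toxicokinetics with assumed
# parameters: identical total doses, very different internal exposures.

KE = 0.35          # first-order elimination rate, 1/day (assumed)
VD = 15.0          # volume of distribution, L (assumed)
TOTAL_DOSE = 300.0 # mg

# Acute: one bolus of the full dose
peak_acute = TOTAL_DOSE / VD

# Chronic: the same total amount delivered evenly over 30 days
rate = TOTAL_DOSE / 30.0                 # mg/day
css_chronic = rate / (KE * VD)           # steady-state concentration

print(f"acute peak concentration:   {peak_acute:6.2f} mg/L")
print(f"chronic steady-state level: {css_chronic:6.2f} mg/L")
```

In this toy case the acute peak is roughly ten times the chronic steady-state level, which is exactly the kind of difference a short-duration in vitro assay cannot resolve on its own.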

Determining a significant and adverse level of perturbation

At some point, an informed decision will need to be made as to what level of chemically induced perturbation observed in an HT assay is considered sufficiently indicative or predictive of an adverse effect in a human. In other words, even if an assay performs perfectly (i.e., yields no false positives or false negatives – see False Negatives/False Positives below), determining how to interpret and translate HT data into a measure of actual toxicity in humans is a challenge—further complicated by the issues of individual and population variability discussed earlier.

Just as in our efforts to deal with data from existing testing methods, there will need to be decision rules that govern how to extrapolate HT data to humans so as to measure the intensity of effect at a given dose, not just whether or not there is an effect. Translating and interpreting such data to inform decisions about toxicity and risk to humans will also require transparent and clear delineation of where value judgments or assumptions enter into decision-making.
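The sketch below illustrates one way such a decision rule could look in practice: inverting a fitted concentration-response (Hill) curve to find the concentration that produces a chosen benchmark level of perturbation. The Hill parameters and the candidate benchmark levels are assumptions for illustration; selecting the benchmark is precisely the kind of value judgment described above, not something the assay itself determines.

```python
# Illustrative sketch: turning a fitted concentration-response curve into
# a "point of departure" by choosing a benchmark response level.

def hill_response(conc, top, ac50, n):
    """Response at a given concentration (same units as top)."""
    return top * conc**n / (ac50**n + conc**n)

def benchmark_concentration(bmr, top, ac50, n):
    """Concentration producing a benchmark response bmr, obtained by
    inverting the Hill equation."""
    return ac50 * (bmr / (top - bmr)) ** (1.0 / n)

# Hypothetical fitted parameters for one assay/chemical pair
TOP, AC50, N = 100.0, 8.0, 1.5       # % of max response, uM, Hill slope

for bmr in (5.0, 10.0, 20.0):        # candidate "adverse" response levels
    bmc = benchmark_concentration(bmr, TOP, AC50, N)
    print(f"benchmark response {bmr:>4.0f}% -> concentration {bmc:6.2f} uM")
```

Moving the benchmark from 5% to 20% in this toy example shifts the resulting concentration severalfold, which is why the choice of cutoff needs to be made transparently.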

Insufficient accounting for epigenetic effects

Epigenetics is a burgeoning field of science that studies how gene expression and function can be altered by means other than a change in the sequence of DNA, i.e., a mutation. Epigenetic programming of our genes is critical for normal human development and function. For example, epigenetics is the reason why the single fertilized egg we all began as differentiates into the more than 200 different types of cells that make up our adult bodies.

Evidence is increasing that certain chemicals can interfere with normal epigenetic patterns. For example, epigenetic changes induced by tributyltin have been shown to influence the programming of stem cells to become fat cells instead of bone cells. The current ToxCast battery of assays is quite limited in explicitly measuring epigenetic effects of chemicals (see slide 13 of this NIEHS presentation and this description of one of the few such assays currently available).

False Negatives/False Positives

Fundamental to the success of HT assays is their ability to correctly identify chemicals that are – or are not – of concern. EPA’s approach to validating HT tests largely involves testing chemicals with already well-defined hazard characteristics based on traditional animal testing. By using well-studied chemicals, the agency plans to determine the extent to which HT assays accurately detect those hazards identified in the whole-animal studies.

Determining how accurately an HT assay identifies a chemical’s hazards includes assessing the “false positive” and “false negative” rates of the test. If the false negative rates are high in the HT assays used to screen chemicals for further assessment, a potentially hazardous chemical could be erroneously determined to be of low concern and set aside. As Dr. Kavlock explained, “You want to have as few false negatives as possible in the system, because if you put something low in a priority queue, we may never get to it, and so you really want to have confidence that when you say something is negative, it really does have a low potential.” (The Researchers Perspective Podcast, 2010.)
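As a simple illustration of these performance measures, the sketch below computes sensitivity, specificity, and the false negative rate for a hypothetical HT assay evaluated against in vivo reference calls; all of the counts are invented.

```python
# Minimal sketch (hypothetical counts): evaluating an HT assay against
# reference in vivo calls. The false negative rate, not just overall
# accuracy, determines how often hazardous chemicals get deprioritized.

true_pos = 42    # assay positive, in vivo hazard confirmed
false_pos = 15   # assay positive, no in vivo hazard
true_neg = 120   # assay negative, no in vivo hazard
false_neg = 9    # assay negative, but in vivo hazard exists

sensitivity = true_pos / (true_pos + false_neg)       # hazards detected
specificity = true_neg / (true_neg + false_pos)       # non-hazards cleared
false_neg_rate = false_neg / (true_pos + false_neg)   # hazards missed

print(f"sensitivity:         {sensitivity:.2f}")
print(f"specificity:         {specificity:.2f}")
print(f"false negative rate: {false_neg_rate:.2f}")
```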

During the 2011 NIEHS Workshop on the “Role of Environmental Chemicals in the Development of Obesity and Diabetes,” experts examining organotins and phthalates noted that ToxCast high-throughput assays did not successfully identify chemicals known to interfere with PPAR—a protein important for proper lipid and fatty acid metabolism—in assays designed to flag this interference.

New chemical testing approaches offer great potential to address some of the long-standing limitations of chemical risk assessment (which are detailed in Section 1 of this primer). But as discussed above, there are several major challenges—many also applicable to traditional toxicity testing—that need to be met in the development and use of HT testing and other newer approaches. Moreover, new challenges will continue to arise as a consequence of the ever-evolving nature of science. While there may not be immediate solutions to the challenges we face, it is profoundly important that current limitations are acknowledged, characterized, and communicated to decision-makers and stakeholders so that new data or new assessments can be appropriately interpreted and used.

To learn about how new chemical testing technology is being incorporated into risk assessment, proceed to Section 7: Advancing the Next Generation (NextGen) of Risk Assessment

For a commentary on the need for public engagement in the development and use of these new methods, proceed to Section 8: The Need for a Public Interest Perspective