Methodology: How the data was collected

EDF and Google Earth Outreach, along with researchers at Colorado State University, spent several years developing and testing these systems, proving that we could deliver accurate results in real-world conditions.

The scientists involved in this project published the methodology and mapping results in a peer-reviewed paper in Environmental Science & Technology in March 2017.

Data collection

This study uses two sets of vehicles, each with a different instrument package: a single research vehicle, based at Colorado State University, used for experiments and intensive local monitoring; and a set of Google Street View vehicles deployed to cities around the country. All of the data for the maps is generated by the Google vehicles.

The research vehicle

The CSU research vehicle measures methane concentrations with a LiCor 7700 methane analyzer. The analyzer is an open-path design that measures methane in the air as it passes through the analyzer space. Location is determined with a Hemispheres A100 GPS unit with a precision of 24 inches. Wind speed and direction are measured with a Gill WindMaster sonic anemometer.

These sensors record data ten times per second, which corresponds to about three feet of travel when driving at 20 mph. The CSU vehicle also uses a methane carbon isotope analyzer to discriminate between biological and fossil fuel sources of methane when required. This instrument records data approximately once per second, corresponding to about 29 feet of street-level observation at 20 mph.
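The distance between consecutive samples follows from the driving speed divided by the sampling rate. The short sketch below shows that arithmetic; the function and constant names are illustrative only and do not come from the study.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def sample_spacing_ft(speed_mph: float, samples_per_second: float) -> float:
    """Return the distance in feet traveled between consecutive samples."""
    if samples_per_second <= 0:
        raise ValueError("samples_per_second must be positive")
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second / samples_per_second


# Ten samples per second at 20 mph: a little under three feet per sample.
spacing_10hz = sample_spacing_ft(20, 10)
```

The same function can be applied to the slower instruments by passing their sampling rates.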

The Google vehicles

The Google Street View vehicles that collect data for this study use a Picarro methane concentration analyzer with a closed-path design and a sampling frequency of one data point every half-second, resulting in a sample about every 15 feet when driving at 20 mph. These vehicles are also equipped with a Hemisphere GPS Receiver MD-100 GPS unit and a Climatronics Sonimometer RS232 anemometer for measuring wind speed and direction.

Drivers are asked to drive every public street in a study area at least twice to ensure accuracy. Re-driving helps confirm the persistence of elevated methane readings and characterize their emission rate accurately. Drivers do not view the data as it is being collected.

The survey areas

EDF and CSU carefully select survey areas to ensure that a representative sample of a city’s local distribution system is mapped. When surveying large cities in particular, it is impractical to map every street, so survey areas are selected to represent the variety of conditions in the city: different landscapes, types of pipe material, and pipe ages.

Measuring leaks in downtown areas presents special challenges: Tall buildings can interfere with GPS signals and create wind tunnels that reduce the ability to locate leaks. Because of this, EDF and CSU generally choose other neighborhoods in which to measure leaks.

Data Variables: Location, Attribution, Magnitude, Uncertainty

Location of Methane Source

Ambient methane concentrations in the atmosphere average around 1.8 parts per million (ppm). There is natural variability, so average daily concentrations can range between 1.75 and 2.2 ppm in some cities and on some days, depending on local sources and winds. Such variability is well within the sensitivity of the methane analyzers in the vehicles; these instruments can precisely measure methane concentrations down to 0.02 ppm.

To determine when the vehicle is driving through an area where methane readings are elevated above the background levels, researchers examine not just the concentration values, but also variability. When driving through air near a natural gas leak, the concentrations are higher but also uneven, much like the swirling patterns of white and brown in a coffee cup when first adding milk. When driving through air that is unaffected by a local leak, the instrument reads very stable values (a statistically low standard deviation), like the even color of coffee when the milk is well mixed. The CSU team has established thresholds for this variability that are used to determine where methane readings rise above background levels.
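As a rough illustration of the variability idea, the sketch below flags readings where the standard deviation over a short trailing window exceeds a threshold. It looks only at variability, whereas the researchers also consider the concentration values themselves. The window length and threshold are hypothetical placeholders; the thresholds actually established by the CSU team are not given here.

```python
from statistics import pstdev


def flag_elevated(readings_ppm, window=10, std_threshold_ppm=0.05):
    """Return one boolean per reading.

    A reading is flagged when the standard deviation of the trailing
    window (up to `window` samples) exceeds `std_threshold_ppm`.
    Both parameters are hypothetical values chosen for illustration.
    """
    flags = []
    for i in range(len(readings_ppm)):
        start = max(0, i - window + 1)
        chunk = readings_ppm[start:i + 1]
        # A single sample has no variability, so it is never flagged.
        flags.append(len(chunk) > 1 and pstdev(chunk) > std_threshold_ppm)
    return flags


well_mixed = [1.90, 1.91, 1.90, 1.91, 1.90]  # stable, like stirred coffee
plume = [1.90, 2.40, 1.95, 3.10, 2.00]       # uneven, like swirling milk
```

With these made-up readings, none of the well-mixed samples are flagged, while the uneven samples are flagged from the second reading onward.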

Air movement and below-ground migration of natural gas can cause the plume location to differ from the leak location. Based on validation data, the researchers are confident that indicated leak locations are accurate to within 400 feet.


Attribution

Biological processes can cause elevated methane concentrations in air, so it is important to determine whether elevated readings are due to natural gas leaks, vehicle emissions, or biological sources such as landfills. For the overwhelming majority of natural gas leaks, elevated methane concentrations are observed over a distance of 500 feet or less.

In contrast, methane emissions from landfills or wetlands cause elevated readings over thousands of feet.

An area is identified as having a natural gas leak if elevated methane readings extend over a driving distance of 500 feet or less.

Elevated concentrations that persist for more than 500 feet are screened to see whether the measurements are adjacent to known biological source areas; if so, the elevated readings are set aside. Areas of elevated readings longer than 500 feet that are not adjacent to known biological sources of methane require further investigation.
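The screening rule described above can be written as a small decision function. This is only a sketch of the logic as stated in the text: the function name and labels are invented for illustration, and treating an extent of exactly 500 feet as a leak is an assumption.

```python
MAX_LEAK_EXTENT_FT = 500  # distance cutoff taken from the text


def screen_elevated_area(extent_ft: float, near_biological_source: bool) -> str:
    """Classify a stretch of elevated readings by its driving distance."""
    if extent_ft < 0:
        raise ValueError("extent_ft must be non-negative")
    # Assumption: an extent of exactly 500 ft is treated as a leak.
    if extent_ft <= MAX_LEAK_EXTENT_FT:
        return "natural gas leak"
    if near_biological_source:
        return "set aside"
    return "needs further investigation"
```

For example, a 120-foot stretch would be classified as a natural gas leak, while a 2,000-foot stretch next to a landfill would be set aside.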

There is evidence that some large-area observations of elevated methane concentrations arise from long stretches of natural gas distribution lines with many leaks. In other cases, the most likely cause is city buses or other vehicles powered by compressed natural gas. In some highly complex situations, both vehicle and natural gas sources overlap to generate complex patterns that are difficult to resolve.

Under these scenarios, attribution of elevated methane readings to leaks from natural gas local distribution infrastructure may be difficult. In other words, the leak patterns for local distribution leaks and non-local distribution sources (like traffic or biological sources) may be similar if elevated methane readings are detected over a large continuous area. CSU excludes these complex situations that present unclear attribution, resulting in conservative estimates of leak frequency and magnitude. Ongoing work will study ways to better resolve these complex patterns.


Magnitude

The CSU research team developed a way to estimate the leakage rates from verified sources. To develop this leak rate estimate, researchers conducted “controlled release” experiments, in which methane was released into the air at varying rates and vehicles were driven through the resulting plume. They found that methane concentration patterns differ distinctly with leak rate, allowing researchers to assign leaks to bins of small (0 to 13 cubic feet per hour), medium (13 to 85 cubic feet per hour) and large (above 85 cubic feet per hour) leak rates. Due to variability in wind and other uncertainties, it is not possible to provide more precise estimates of leak rates at this time.
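The three bins can be expressed as a simple lookup. The sketch below uses the bin edges from the text; the function name is illustrative, and because the text does not say which bin a rate of exactly 13 cubic feet per hour falls into, assigning it to the lower bin is an assumption.

```python
def leak_size_bin(rate_cfh: float) -> str:
    """Map an estimated leak rate (cubic feet per hour) to a size bin."""
    if rate_cfh < 0:
        raise ValueError("rate_cfh must be non-negative")
    # Assumption: a rate of exactly 13 is placed in the lower bin.
    if rate_cfh <= 13:
        return "small"
    # "Large" is defined as above 85, so exactly 85 is medium.
    if rate_cfh <= 85:
        return "medium"
    return "large"
```

Note that this only labels a rate that has already been estimated; it does not reproduce how the researchers infer the rate from concentration patterns.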

Results were validated through blind controlled releases followed by a comparison of predicted versus observed results. In addition, approximately seven leaks were measured using comprehensive in-place tenting methods, the results of which were compared to the emissions rates using the procedures described above. Both validation efforts demonstrate the accuracy of the technique.


Uncertainty

The methane analyzers used in this study are high-precision, high-accuracy instruments when used according to the manufacturer’s specifications, so it is extremely unlikely that any elevated readings result from instrument error. False positives and false negatives are both possible, but are not evenly spread across the study area. Statistical analysis shows that the probability of missing an observable leak, when the area is driven three times, is less than 6%.

This probability is taken into account when estimating the overall leak frequency for a city or part of a city. When driving past known leaks, the study’s best data indicates that a single drive-by has a 76% chance of observing the leak.
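The single-pass and multi-pass figures can be related with a back-of-the-envelope calculation, assuming each pass detects a leak independently and with the same probability. That independence assumption is made here for illustration only and is not stated in the source; the study's own statistical analysis may be more involved.

```python
def miss_probability(p_detect_single: float, passes: int) -> float:
    """Probability that a leak is missed on every one of `passes` drives.

    Assumes each pass is independent with the same detection
    probability, which is a simplifying assumption.
    """
    if not 0.0 <= p_detect_single <= 1.0:
        raise ValueError("p_detect_single must be between 0 and 1")
    if passes < 1:
        raise ValueError("passes must be at least 1")
    return (1.0 - p_detect_single) ** passes


miss_two_passes = miss_probability(0.76, 2)    # about 0.058
miss_three_passes = miss_probability(0.76, 3)  # about 0.014
```

Under this simplified model, both two and three passes give a miss probability below the 6% figure quoted above.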

The probability of missing or identifying a leak depends on the magnitude of the leak, and researchers are actively working to assess the precision and accuracy of the collection methods. Researchers expect these statistics to be included in published assessments.

Data Analysis: Algorithm

Because this project accumulates large amounts of data (about 194,000 data entries per hour), an automated way to analyze the patterns in the data was needed. The CSU team developed a set of data analysis codes that automatically process the data in a number of steps.

First, data are checked for quality and are corrected for any known influences (e.g., the travel time of air from the front bumper to the methane analyzer).

Then the data are examined for elevated readings and the repeated observations of elevated readings are collected.

Finally, data undergo additional manual screening for outliers and verification. Where possible, these readings are also compared against leak locations known to the utility.
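The travel-time correction mentioned in the first step can be sketched as a timestamp shift followed by a position lookup. This is a minimal illustration, not the CSU team's code: the function names, data layout, and lag value are all hypothetical.

```python
from bisect import bisect_left


def correct_inlet_lag(samples, lag_seconds):
    """Shift methane timestamps back by the air travel time.

    `samples` is a list of (timestamp_s, methane_ppm) tuples. Air measured
    at time t entered the inlet at roughly t - lag_seconds.
    """
    return [(t - lag_seconds, ppm) for t, ppm in samples]


def nearest_fix(gps_fixes, t):
    """Return the GPS fix closest in time to `t`.

    `gps_fixes` is a time-sorted list of (timestamp_s, lat, lon) tuples.
    """
    if not gps_fixes:
        raise ValueError("gps_fixes must not be empty")
    times = [fix[0] for fix in gps_fixes]
    i = bisect_left(times, t)
    # Compare only the fixes immediately before and after time t.
    candidates = gps_fixes[max(0, i - 1):i + 1]
    return min(candidates, key=lambda fix: abs(fix[0] - t))


# Hypothetical data: one reading and a 2-second lag (illustrative value).
corrected = correct_inlet_lag([(10.0, 2.5)], lag_seconds=2.0)
fixes = [(7.0, 40.0, -105.0), (8.2, 40.1, -105.1), (10.0, 40.2, -105.2)]
location = nearest_fix(fixes, corrected[0][0])
```

Here the reading recorded at 10.0 seconds is attributed to where the vehicle was at about 8 seconds, rather than where it was when the analyzer reported the value.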

Future Goals and Launch

The current industry standard practice for leak surveys is relatively labor intensive, requiring two or more highly trained drivers plus well-trained walking survey workers.

The types of gas analyzers utilities typically use for leak detection vary widely. While some utilities use older flame ionization detector (FID) systems, others have adopted newer laser-based instruments. The FID analyzers are not able to detect small changes above background methane levels, and thus require a higher minimum methane concentration to identify a leak. The FIDs are also unable to discriminate between methane and other hydrocarbons like gasoline.

The laser-based instruments are more effective at identifying methane specifically, even at lower concentrations. Older laser instruments from the 1990s and early 2000s have a precision of about 1 ppm and a minimum detectable concentration of about 3 ppm, making them less sensitive and precise than the newest instruments used for this project. Even among laser-based systems, performance differs with instrument age.

This project takes advantage of improved sensor technology and sophisticated data analysis, allowing minimally trained drivers to gather the data needed for rapid leak detection.

The underlying data used in the development of these maps is now public and available on request.

Over the next year, the data analysis and map publication will become increasingly automated. EDF will continue to estimate city and/or sub-city leak rates and validate data with utilities. Over the long term, the team hopes to enhance understanding of how this study’s techniques can be applied to other vehicle-based environmental observations.


Request access to dataset

We invite researchers, utilities and regulators to explore and use the data from this project.


Project experts

Steve Hamburg, Chief Scientist

Joe von Fischer, Associate Professor, Colorado State University

Millie Baird, Managing Director, Office of the Chief Scientist