The datasets presented here are divided into three categories: Source data, Intermediate data, and Output data. Source data consists of the raw temperature reports that form the foundation of our averaging system. Source observations are provided as originally reported and will contain many quality control and redundancy issues. Intermediate data is constructed from the Source data by merging redundant records, identifying a variety of quality control problems, and creating monthly averages from daily reports when necessary. Finally, the Berkeley Earth averaging process generates a variety of Output data, including a set of gridded temperature fields, regional averages, and bias-corrected station data.
The definitive repository for Source and Intermediate data is the SVN, which is built nightly. The Source and Intermediate data files presented here are offered to assist users who cannot use MATLAB.
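As an illustration of the monthly-averaging step mentioned for Intermediate data, the following is a minimal Python sketch, not the Berkeley Earth implementation. The tuple layout of the daily reports, the function name `monthly_means`, and the `min_days` completeness threshold are all assumptions made for this example.

```python
from collections import defaultdict


def monthly_means(daily, min_days=20):
    """Average daily temperature reports into (year, month) means.

    daily: iterable of (year, month, day, temperature) tuples; the
        temperature may be None when a report is missing.
    min_days: hypothetical completeness threshold chosen for this
        example; it is not a documented Berkeley Earth setting.
    """
    buckets = defaultdict(list)
    for year, month, _day, temp in daily:
        if temp is not None:
            buckets[(year, month)].append(temp)
    # Keep only months with enough valid daily reports.
    return {
        key: sum(values) / len(values)
        for key, values in sorted(buckets.items())
        if len(values) >= min_days
    }
```

Months with too few valid days are dropped here rather than averaged, on the assumption that a sparse month would give a misleading mean.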
Gridded datasets are available below in NetCDF format.
- Average Temperature (TAVG) - equal area
- TAVG - 1 degree latitude longitude (lat long 1)
- Maximum Temperature (TMAX) - equal area
- TMAX - lat long 1
- Minimum Temperature (TMIN) - equal area
- TMIN - lat long 1
Time Series Data
Time series data is available on the results by location page: http://berkeleyearth.lbl.gov/country-list/. Time series for individual stations, cities, states, countries, and continents can all be retrieved. Current global and hemispheric series are available here:
- Global: http://berkeleyearth.lbl.gov/regions/global-land
- Northern Hemisphere: http://berkeleyearth.lbl.gov/regions/northern-hemisphere
- Southern Hemisphere: http://berkeleyearth.lbl.gov/regions/southern-hemisphere
Experimental datasets are listed below. These are new products in a late stage of development, and are included here so that potential users can give us feedback. For inquiries about this data please contact .
¼-degree gridded TAVG for CONUS and Europe. The files are located here: http://berkeleyearth.lbl.gov/auto/Global/Gridded/
Breakpoint Adjusted Station Data
During the Berkeley Earth averaging process we compare each station to other stations in its local neighborhood, which allows us to identify discontinuities and other heterogeneities in the time series from individual weather stations. The averaging process is then designed to automatically compensate for various biases that appear to be present. After the average field is constructed, it is possible to create a set of estimated bias corrections that suggest what the weather station might have reported had apparent biasing events not occurred. This breakpoint-adjusted data set provides a collection of adjusted, homogeneous station data that is recommended for users who want to avoid heterogeneities in station temperature data.
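The idea of comparing a station with its neighbors can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Berkeley Earth algorithm: it assumes a single step change, equally long and complete series, and an unweighted neighbor mean. The function name `find_breakpoint` and the `min_segment` parameter are hypothetical.

```python
from statistics import mean


def find_breakpoint(station, neighbors, min_segment=3):
    """Return (index, shift) for the largest single step change.

    station: list of temperatures for the target station.
    neighbors: list of equally long lists from nearby stations.
    Subtracting the neighbor mean removes weather shared across the
    region, so a step in the difference series points at a change
    specific to the target station.
    """
    diff = [s - mean(col) for s, col in zip(station, zip(*neighbors))]
    best_index, best_shift = None, 0.0
    for i in range(min_segment, len(diff) - min_segment + 1):
        # Change in the mean difference before and after position i.
        shift = mean(diff[i:]) - mean(diff[:i])
        if abs(shift) > abs(best_shift):
            best_index, best_shift = i, shift
    return best_index, best_shift


# Synthetic data: the station reads 1.5 degrees warm from index 4 onward.
neighbors = [
    [10.0, 11.0, 12.0, 11.0, 10.0, 11.0, 12.0, 11.0],
    [12.0, 13.0, 14.0, 13.0, 12.0, 13.0, 14.0, 13.0],
]
station = [11.0, 12.0, 13.0, 12.0, 12.5, 13.5, 14.5, 13.5]

index, shift = find_breakpoint(station, neighbors)  # index == 4, shift == 1.5
# Estimate what the station might have reported without the step.
adjusted = station[:index] + [value - shift for value in station[index:]]
```

A real homogenization procedure would need to handle multiple breakpoints, missing values, and statistical significance, none of which this sketch attempts.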
The source files we used to create the Berkeley Earth database are available in a common format.
This includes all time series from the originating datasets. Because the same data are often reported by multiple agencies, each site has on average 3-4 time series. Only limited quality control flagging has been performed at this stage.
Data have been collapsed so that there is only one time series per location. Quality control procedures have been completed and their output is reported via a series of quality "flags". Users of this data set will have to consider these flags and remove any data they don't want to use. Seasonality is preserved in this data set.
Same as "Single-valued" except that all values flagged as bad via the quality control processes have been removed. This dataset is recommended for users who require relatively clean data. However, no adjustments have been made for heterogeneities and other biasing events. Please consider the breakpoint-adjusted station data above if you wish to avoid heterogeneity.
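For users who work from the flagged data and want to apply their own filtering, the removal of flagged values might look like the following Python sketch. The record layout (date, value, set of flag codes), the function name `drop_flagged`, and the numeric flag codes are assumptions made for illustration; consult the data files themselves for the actual format and flag definitions.

```python
def drop_flagged(records, bad_flags):
    """Keep only records that carry none of the rejected flags.

    records: iterable of (date, value, flags) tuples, where flags is a
        collection of flag codes attached to that value.
    bad_flags: flag codes the user chooses to reject (hypothetical
        codes in this example).
    """
    bad = set(bad_flags)
    return [
        (date, value)
        for date, value, flags in records
        if not (set(flags) & bad)
    ]


# Hypothetical example: flag 1 is treated as "bad", flag 7 as informational.
records = [
    ("1950-01", -2.4, set()),
    ("1950-02", 35.0, {1}),
    ("1950-03", 3.1, {7}),
]
clean = drop_flagged(records, bad_flags={1})
```

Passing a different `bad_flags` set lets a user keep values that the stricter quality-controlled dataset would have removed.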