Geoscientific Model Development An interactive open-access journal of the European Geosciences Union

Journal metrics

  • IF: 4.252
  • IF 5-year: 4.890
  • CiteScore: 4.49
  • SNIP: 1.539
  • SJR: 2.404
  • IPP: 4.28
  • h5-index: 40
  • Scimago H index: 51
Discussion papers
https://doi.org/10.5194/gmd-2018-250
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.

Development and technical paper | 20 Nov 2018

Review status
This discussion paper is a preprint. It is a manuscript under review for the journal Geoscientific Model Development (GMD).

Evaluation of lossless and lossy algorithms for the compression of scientific datasets in NetCDF-4 or HDF5 formatted files

Xavier Delaunay (1), Aurélie Courtois (1), and Flavien Gouillon (2)
  • (1) Thales Services, 290 allée du Lac, 31670 Labège, France
  • (2) CNES, Centre spatial de Toulouse, 18 avenue Edouard Belin, 31401 Toulouse, France

Abstract. The increasing volume of scientific datasets makes compression necessary to reduce data storage and transmission costs, particularly for the oceanographic and meteorological datasets generated by Earth observation mission ground segments. These data are mostly produced as NetCDF-formatted files. The NetCDF-4/HDF5 file formats are widely used in the global scientific community because of the features they offer. In particular, HDF5 provides dynamically loaded filter plugins, which allow users to write filters, such as compression/decompression filters, that process the data when reading it from or writing it to disk. In this work, we evaluate the performance of lossy and lossless compression/decompression methods through NetCDF-4 and HDF5 tools on analytical and real scientific floating-point datasets. We also introduce the Digit Rounding algorithm, a new relative-error-bounded data reduction method inspired by the Bit Grooming algorithm. The Digit Rounding algorithm allows a high compression ratio while preserving a given number of significant digits in the dataset. It achieves a higher compression ratio than the Bit Grooming algorithm while keeping a similar compression speed.
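
The idea of reducing data while preserving a given number of significant digits, and then applying a lossless compressor, can be illustrated with a minimal sketch. The code below is not the Digit Rounding algorithm, nor Bit Grooming as published: it is a simplified mantissa-truncation ("bit shaving") stand-in, written only to show why zeroing low-order mantissa bits bounds the relative error and helps a lossless coder. The function names `shave_mantissa` and `compressed_size`, the synthetic signal, and the use of zlib in place of an HDF5 compression filter are all assumptions made for this illustration.

```python
import math
import struct
import zlib


def shave_mantissa(value: float, nsd: int) -> float:
    """Zero low-order mantissa bits of a float64, keeping about nsd digits.

    Hypothetical helper: plain truncation, not Digit Rounding or Bit Grooming.
    """
    # Number of explicit mantissa bits needed for nsd decimal digits.
    keep_bits = min(52, math.ceil(nsd * math.log2(10)))
    drop_bits = 52 - keep_bits
    # Reinterpret the float64 as a 64-bit unsigned integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    # Mask that clears the drop_bits least significant mantissa bits.
    mask = ~((1 << drop_bits) - 1) & 0xFFFFFFFFFFFFFFFF
    (shaved,) = struct.unpack("<d", struct.pack("<Q", bits & mask))
    return shaved


def compressed_size(values) -> int:
    """Size in bytes of the zlib-compressed float64 representation."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return len(zlib.compress(raw, 6))


if __name__ == "__main__":
    # Synthetic, smooth signal standing in for a geophysical variable.
    data = [273.15 + 100.0 * math.sin(0.001 * i) for i in range(10000)]
    shaved = [shave_mantissa(v, 3) for v in data]
    print("lossless only :", compressed_size(data), "bytes")
    print("shaved + zlib :", compressed_size(shaved), "bytes")
```

With `nsd = 3`, the sketch keeps 10 explicit mantissa bits, so the relative error of each value stays below 2^-10. The zeroed trailing bits form long runs of identical bytes, which is what the lossless stage exploits.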

Interactive discussion
Status: open (until 15 Jan 2019)
Viewed
Total article views: 266 (including HTML, PDF, and XML), calculated since 20 Nov 2018
  • HTML: 203
  • PDF: 60
  • XML: 3
  • Total: 266
  • BibTeX: 3
  • EndNote: 4
Viewed (geographical distribution)  
Total article views: 264 (including HTML, PDF, and XML) Thereof 263 with geography defined and 1 with unknown origin.
Latest update: 10 Dec 2018
Publications Copernicus
Short summary
This work evaluates the performance of lossy and lossless compression/decompression of NetCDF-4/HDF5 floating-point datasets. It also introduces the Digit Rounding algorithm, a relative-error-bounded data reduction method inspired by the Bit Grooming algorithm. Digit Rounding allows a high compression ratio while preserving a given number of significant digits in the dataset, and achieves a higher compression ratio than the Bit Grooming algorithm while keeping a similar compression speed.