Python gzip module

GZip is a GNU Project application for compressing and decompressing files. The Python gzip module provides a simple interface for compressing and decompressing files in the same format, much as the gzip and gunzip programs would. The data compression itself is provided by the zlib module.

The gzip module contains the definition of the GzipFile class and its methods. It also contains the convenience functions open(), compress(), and decompress().

Let’s discuss these functions:

open() Function

This function opens a gzip-compressed file in binary or text mode and returns a file object. Its filename argument can be an actual file name (a string or a bytes object) or an existing file object to read from or write to. By default, files are opened in ‘rb’ mode, meaning binary data is read. The mode argument accepts the following values:

  • Binary mode – ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’, ‘wb’, ‘x’, ‘xb’.
  • Text mode – ‘rt’, ‘at’, ‘wt’, or ‘xt’.

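This function also accepts a compresslevel argument, an integer from 0 to 9, where 0 means no compression and 9 (the default) gives the highest compression at the slowest speed. When a file is opened in text mode, the GzipFile object is wrapped in a TextIOWrapper object, and the encoding, errors, and newline arguments are passed on to it.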
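As a minimal sketch of text mode, the snippet below writes and reads a str instead of bytes. The file name notes.txt.gz, the temporary directory, and the compression level 6 are assumptions chosen for illustration.

```python
import gzip
import os
import tempfile

# Hypothetical file name, placed in a temporary directory for illustration.
path = os.path.join(tempfile.mkdtemp(), "notes.txt.gz")

# 'wt' opens the file for writing text; the string is encoded, then compressed.
with gzip.open(path, "wt", compresslevel=6, encoding="utf-8") as f:
    f.write("Python - Batteries included\n")

# 'rt' decompresses the data and decodes it back into a str.
with gzip.open(path, "rt", encoding="utf-8") as f:
    text = f.read()

print(text)
```

In text mode, read() returns a str, so no manual decode() call is needed.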

compress() Function

This function compresses the given bytes data and returns a bytes object containing the compressed data. The compression level is set with the compresslevel argument, which defaults to 9.
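To get a feel for the compression level, the following sketch compresses the same data at levels 1 and 9 and prints the resulting sizes. The sample data is made up for illustration, and the exact sizes depend on the input and the underlying zlib version.

```python
import gzip

# Repetitive sample data (an assumption for illustration) compresses well.
data = b"Python - Batteries included\n" * 1000

fast = gzip.compress(data, compresslevel=1)  # fastest, usually larger output
best = gzip.compress(data, compresslevel=9)  # slowest, usually smaller output

print(len(data), len(fast), len(best))
```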

decompress() Function

This function decompresses a bytes object and returns the decompressed data.
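A short round trip through compress() and decompress(), done entirely in memory, might look like this. The variable names are illustrative only.

```python
import gzip

original = b"Python - Batteries included"

packed = gzip.compress(original)      # compressed bytes in gzip format
restored = gzip.decompress(packed)    # back to the original bytes

print(restored == original)
# Data in gzip format begins with the two magic bytes 0x1f 0x8b.
print(packed[:2] == b"\x1f\x8b")
```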

The following example creates a gzip file by writing compressed data to it.

import gzip

data = b'Python - Batteries included'
with gzip.open("test.txt.gz", "wb") as f:
    f.write(data)

This will create a file named “test.txt.gz” in the current directory. Decompressing it with any gzip-compatible tool yields the file “test.txt”.

Read this compressed file programmatically.

with gzip.open("test.txt.gz", "rb") as f:
    print(f.read())

Output

b'Python - Batteries included'

To compress an existing file, read its contents in binary mode and write the resulting bytes to a gzip file. The following example assumes there is a file named ‘zen.txt’ in the current directory.

# Read the original file as raw bytes.
with open("zen.txt", "rb") as fp:
    data = fp.read()

# Write the bytes through gzip to compress them.
with gzip.open("zen.txt.gz", "wb") as f:
    f.write(data)

Retrieve the uncompressed file from the gzip archive.

with gzip.open("zen.txt.gz", "rb") as f:
    bindata = f.read()

with open("zen1.txt", "wb") as fp:
    fp.write(bindata)

The above code will create ‘zen1.txt’ in the current directory, containing the same data as in ‘zen.txt’.
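Reading a whole file into memory is fine for small inputs. For larger files, one common alternative is to copy the data in chunks with shutil.copyfileobj(). The sketch below is a variation on the examples above, not a replacement for them; it creates its own stand-in ‘zen.txt’ in a temporary directory so that it runs on its own.

```python
import gzip
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "zen.txt")
dst = os.path.join(workdir, "zen.txt.gz")

# Stand-in content (an assumption) so the sketch is self-contained.
with open(src, "wb") as f:
    f.write(b"Beautiful is better than ugly.\n" * 100)

# Copy in chunks instead of loading the whole file into memory.
with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)

# Read the compressed file back to confirm the contents survived.
with gzip.open(dst, "rb") as f:
    roundtrip = f.read()

print(len(roundtrip))
```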

In addition to these convenience functions, the gzip module also provides the GzipFile class, which behaves like a regular binary file object and offers methods such as read() and write(). This class’s constructor accepts file name, mode, and compression level arguments, which have the same meanings as described above.

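When the mode argument is ‘w’ or ‘wb’, the GzipFile object provides a write() method that compresses the given data and writes it to a gzip file. GzipFile works only in binary mode, so text modes such as ‘wt’ are not accepted; use gzip.open() when text mode is needed.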

f=gzip.GzipFile("testnew.txt.gz","wb")
data=b'Python - Batteries included'
f.write(data)
f.close()

This will create the testnew.txt.gz file. Decompressing it with any tool yields the testnew.txt file, which contains the text ‘Python - Batteries included’.

To decompress a gzip file using a GzipFile object, create it with the mode ‘rb’ and read the uncompressed data using the read() method.

f = gzip.GzipFile("testnew.txt.gz", "rb")
data = f.read()
f.close()
print(data)
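GzipFile can also wrap an existing file object through its fileobj argument, which makes it possible to compress into memory rather than to disk. The following is a minimal sketch using io.BytesIO; the variable names are illustrative.

```python
import gzip
import io

# Compress into an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as f:
    f.write(b"Python - Batteries included")

# Closing the GzipFile does not close the underlying buffer.
compressed = buffer.getvalue()

# Decompress by wrapping the compressed bytes in a new buffer.
with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as f:
    result = f.read()

print(result)
```

Output

b'Python - Batteries included'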
