loongson/pypi/: zlib-state-0.1.9 metadata and description
Low-level interface to the zlib library that enables capturing the decoding state
author | Sean MacAvaney |
author_email | sean.macavaney@gmail.com |
description_content_type | text/markdown |
requires_python | >=3.6 |
File | Tox results | History |
---|---|---|
zlib_state-0.1.9-cp310-cp310-manylinux_2_27_loongarch64.whl | | |
zlib-state
Low-level interface to the zlib library that enables capturing the decoding state.
Install
From PyPI:
pip install zlib-state
From source:
pip install .
Tested on Ubuntu/macOS/Windows with Python 3.7-3.12.
GzipStateFile
Wraps Decompressor as a buffered reader.
Based on my benchmarking, this is somewhat slower than Python's gzip module.
A typical usage pattern looks like:
    import zlib_state

    TARGET_LINE = 5000  # pick back up after around the 5,000th line
    # Specify keep_last_state=True to tell object to grab and keep the state and pos after each block
    with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz', keep_last_state=True) as f:
        for i, line in enumerate(f):
            if i == TARGET_LINE:
                state, pos = f.last_state, f.last_state_pos

    with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz') as f:
        f.zseek(pos, state)
        remainder = f.read()
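To check the speed remark on your own data, a rough timing harness along these lines may help. It is only a sketch: `time_line_iteration` and `compare_readers` are hypothetical helper names, not part of zlib_state, and the only zlib_state behaviour it assumes is what the example above shows (`GzipStateFile` takes a path, works as a context manager, and can be iterated line by line). If zlib_state is not installed, only the standard-library baseline is measured.

```python
import gzip
import time


def time_line_iteration(opener, path, repeats=3):
    """Return the best wall-clock time (seconds) to iterate over every line of `path`.

    `opener` is any callable that takes a path and returns a context manager
    yielding an iterable of lines (hypothetical helper, for illustration only).
    """
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        with opener(path) as f:
            for _line in f:
                pass
        best = min(best, time.perf_counter() - start)
    return best


def compare_readers(path):
    """Time gzip.open and, if it can be imported, zlib_state.GzipStateFile."""
    results = {'gzip.open': time_line_iteration(gzip.open, path)}
    try:
        import zlib_state
    except ImportError:
        # zlib_state is not installed; report the baseline only.
        return results
    results['zlib_state.GzipStateFile'] = time_line_iteration(zlib_state.GzipStateFile, path)
    return results
```

Calling `compare_readers('testdata/frankenstein.txt.gz')` returns a dict of seconds per reader. Taking the best of a few repeats reduces noise from disk caching, but results will still vary with file size and platform.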
Decompressor
Very basic decompression object that's picky and unforgiving.
Based on my benchmarking, this can iterate over gzip files faster than Python's gzip module.
A typical usage pattern looks like:
    import zlib_state

    decomp = zlib_state.Decompressor(32 + 15)  # from zlib; 32 indicates gzip header, 15 window size
    block_count = 0
    with open('testdata/frankenstein.txt.gz', 'rb') as f:
        while not decomp.eof():
            needed_input = decomp.needs_input()
            if needed_input > 0:
                # decomp needs more input, and it tells you how much.
                decomp.feed_input(f.read(needed_input))
            # next_chunk may be empty (e.g., if finished with gzip headers) or may contain data.
            # It sends as much as it has left in its output buffer, or asks zlib to continue.
            next_chunk = decomp.read()  # you can also pass a maximum size to take and/or a buffer to write to
            if decomp.block_boundary():
                block_count += 1
                # When it reaches the end of a deflate block, it always stops. At these times, you can grab the state
                # if you wish.
                if block_count == 4:  # resume after the 4th block
                    state = decomp.get_state()  # includes zdict, bits, byte -- everything it needs to resume from pos
                    pos = decomp.total_in()  # the current position in the binary file to resume from
        print(f'{block_count} blocks processed')

        # resume from somewhere in the file. Only possible spots are the block boundaries, given the state
        f.seek(pos)
        decomp = zlib_state.Decompressor(-15)  # from zlib; 15 window size, negative means no headers
        decomp.set_state(*state)
        while not decomp.eof():
            needed_input = decomp.needs_input()
            if needed_input > 0:
                # decomp needs more input, and it tells you how much.
                decomp.feed_input(f.read(needed_input))
            next_chunk = decomp.read()
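The two window-size arguments above (`32 + 15` and `-15`) and the role of the saved dictionary can be illustrated with the standard library's `zlib` alone. This is a sketch of the underlying idea, not of how zlib_state is implemented. It assumes a block boundary that falls on a byte boundary, which is forced here with `Z_SYNC_FLUSH`; boundaries in real gzip files are often not byte-aligned, which is presumably why the saved state also carries `bits` and `byte`. The sample text and variable names are made up for the example.

```python
import gzip
import zlib

text = b"It was on a dreary night of November. " * 50

# wbits = 32 + 15: auto-detect a zlib or gzip header, 32 KiB window.
gz_bytes = gzip.compress(text)
with_header = zlib.decompressobj(32 + 15)
roundtrip = with_header.decompress(gz_bytes)

# wbits = -15: raw deflate with no header or trailer, 32 KiB window.
first, second = text[:1000], text[1000:]
comp = zlib.compressobj(wbits=-15)
# Z_SYNC_FLUSH ends the current block on a byte boundary but keeps the
# compressor's history, so later blocks can still refer back to `first`.
part_one = comp.compress(first) + comp.flush(zlib.Z_SYNC_FLUSH)
part_two = comp.compress(second) + comp.flush()

# Resume at the boundary with a fresh raw decompressor, primed with the
# (up to) 32 KiB of output that preceded the boundary.
resumed = zlib.decompressobj(wbits=-15, zdict=first[-32768:])
tail = resumed.decompress(part_two)
```

Here `roundtrip` equals `text` and `tail` equals `second`. The dictionary matters because deflate back-references can point into data produced before the boundary, and a decompressor started without that history has nothing to resolve them against.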