Age | Commit message | Author |
|
non-splitted -> non-split
uncompressible -> incompressible
|
|
I happened to find these in the Linux source tree.
|
|
MicroLZMA is yet another header format variant where the first
byte of a raw LZMA stream (without the end of stream marker) has
been replaced with a bitwise-negation of the lc/lp/pb properties
byte. MicroLZMA was created to be used in EROFS but can be used
by other things too where wasting a minimal amount of space for
headers is important.
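As a rough illustration of the first-byte convention, the sketch
below recovers lc/lp/pb from that byte. It relies on the standard
LZMA properties encoding, (pb * 5 + lp) * 9 + lc. The struct, the
function name, and the lc + lp <= 4 restriction (borrowed from
LZMA2) are assumptions made for this example, not the actual
xz_dec_microlzma code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded properties. */
struct lzma_props {
	uint32_t lc; /* literal context bits */
	uint32_t lp; /* literal position bits */
	uint32_t pb; /* position bits */
};

/*
 * Recover lc/lp/pb from the first byte of a MicroLZMA stream.
 * The byte holds the bitwise negation of the usual LZMA
 * properties byte, which is (pb * 5 + lp) * 9 + lc.
 */
static bool microlzma_props(uint8_t first_byte, struct lzma_props *p)
{
	uint32_t props = (uint8_t)~first_byte;

	/* Largest valid value: lc = 8, lp = 4, pb = 4. */
	if (props > (4 * 5 + 4) * 9 + 8)
		return false;

	p->pb = props / (9 * 5);
	props -= p->pb * 9 * 5;
	p->lp = props / 9;
	p->lc = props - p->lp * 9;

	/* Assumed LZMA2-style restriction: lc + lp <= 4. */
	return p->lc + p->lp <= 4;
}
```

With the common settings lc=3, lp=0, pb=2 the properties byte is
0x5D, so the stream would start with 0xA2.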
This is implemented using most of the LZMA2 code as is so the
amount of new code is small. The API has a few extra features
compared to the XZ decoder. On the other hand, the API lacks
XZ_BUF_ERROR support which is important to take into account
when using this API.
MicroLZMA doesn't support BCJ filters. In theory they could be
added later as there are many unused/reserved values for the
first byte of the compressed stream but in practice it is
somewhat unlikely to happen due to a few implementation reasons.
Thanks to Gao Xiang (EROFS developer) for testing and feedback.
|
|
This might matter, for example, if the underlying type of
enum xz_check was a signed char. In such a case the validation
wouldn't catch an unsupported header.
With most compilers it already worked correctly but it's better
to change it for portability and conformance. This may increase
the code size by a few bytes though. An alternative would be to use
an unsigned int instead of enum xz_check but using an enumeration
looks cleaner.
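The difference can be sketched with a signed char typedef standing
in for an enum whose underlying type is signed. The names and the
CHECK_MAX value are hypothetical. The conversion of bytes 0x80..0xFF
to signed char is implementation-defined and typically yields a
negative value, which is why validating after the assignment is
fragile.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHECK_MAX 15 /* hypothetical largest valid Check ID */

/* Stand-in for an enum whose underlying type is signed char. */
typedef signed char check_t;

/* Fragile: assign first, validate afterwards. */
static bool valid_after_assign(uint8_t header_byte)
{
	/* Bytes 0x80..0xFF typically turn negative here. */
	check_t check = (check_t)header_byte;

	return check <= CHECK_MAX;
}

/* Portable: validate the raw byte, then assign. */
static bool valid_before_assign(uint8_t header_byte)
{
	check_t check;

	if (header_byte > CHECK_MAX)
		return false;

	check = (check_t)header_byte;
	(void)check; /* a real decoder would store it */
	return true;
}
```

On common compilers valid_after_assign(0x80) wrongly accepts the
byte, while valid_before_assign(0x80) rejects it everywhere.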
|
|
It's a more logical place even if the resetting needs to be done
only once per LZMA2 stream (if lzma_reset() is called in the middle
of an LZMA2 stream, .len will already be 0).
|
|
|
|
|
|
When "unsigned long" is 32 bits and GCC or Clang is in gnu89 mode,
the 64-bit constant doesn't become "unsigned long long" like it
would in C99. This is because in gnu89 mode "unsigned long long"
is a GNU extension to C89 and isn't considered when selecting the
type of the integer constant.
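One way to sidestep the type selection rules is an explicit ULL
suffix on the constant. The sketch below assumes that approach (the
text above does not spell out the exact fix) and uses it in a small
table-driven CRC64 with the polynomial used by .xz. The function
names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

static uint64_t crc64_table[256];

/* Fill the lookup table. Must be called once before crc64(). */
static void crc64_init(void)
{
	/*
	 * The ULL suffix forces "unsigned long long" even in gnu89
	 * mode with a 32-bit "unsigned long".
	 */
	static const uint64_t poly = 0xC96C5795D7870F42ULL;
	uint32_t i, j;
	uint64_t r;

	for (i = 0; i < 256; ++i) {
		r = i;
		for (j = 0; j < 8; ++j)
			r = (r >> 1) ^ (poly & ~((r & 1) - 1));

		crc64_table[i] = r;
	}
}

/* Update a CRC64; pass 0 as crc for the first call. */
static uint64_t crc64(const uint8_t *buf, size_t size, uint64_t crc)
{
	crc = ~crc;

	while (size != 0) {
		crc = crc64_table[*buf++ ^ (crc & 0xFF)] ^ (crc >> 8);
		--size;
	}

	return ~crc;
}
```

A quick sanity check is the standard test string "123456789",
whose CRC64 with this polynomial is 0x995DC9BBDF1939FA.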
The CRC64 support was added in 2013 and the code has been broken
on 32-bit platforms unless one modified the Makefile to set C99
or a newer C standard. I didn't want to omit -std=gnu89 because
Linux still uses it and xz_crc64.c (which isn't in Linux) was
the only place that wasn't compatible with -std=gnu89.
Thanks to bzt for reporting the problem.
|
|
Thanks to Alexander A. Klimov.
|
|
With valid files, the safety margin described in lib/decompress_unxz.c
ensures that these buffers cannot overlap. But if the uncompressed size
of the input is larger than the caller thought, which is possible when
the input file is invalid/corrupt, the buffers can overlap. Obviously
the result will then be garbage (and usually the decoder will return
an error too) but no other harm will happen when such an over-run occurs.
This change only affects uncompressed LZMA2 chunks and so this
should have no effect on performance.
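A minimal sketch of the idea, assuming the fix amounts to copying
uncompressed chunks with memmove() instead of memcpy(). memcpy()
has undefined behavior when the source and destination overlap,
while memmove() is defined for that case. The helper name is
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy an uncompressed chunk from the input to the output.
 * With in-place decompression and corrupt input, out and in may
 * overlap, so memmove() is used instead of memcpy().
 */
static void copy_uncompressed(uint8_t *out, const uint8_t *in, size_t size)
{
	memmove(out, in, size);
}
```

For example, copying six bytes from buf + 2 to buf inside one
eight-byte buffer is well defined with this helper.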
|
|
s->dict.allocated was initialized to 0 but never set after
a successful allocation, thus the code always thought that
the dictionary buffer has to be reallocated.
Thanks to Yu Sun from Cisco Systems for reporting this bug.
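The kind of bookkeeping involved can be sketched as follows. The
struct and function are simplified stand-ins rather than the real
decoder code, and the reallocs counter exists only to make the
behavior visible.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down dictionary state. */
struct dict {
	uint8_t *buf;
	uint32_t allocated; /* current size of buf in bytes */
	unsigned reallocs;  /* illustration only: counts allocations */
};

/* Make sure buf can hold at least size bytes. */
static bool dict_ensure(struct dict *d, uint32_t size)
{
	if (d->allocated >= size)
		return true;

	/* Drop the old buffer before trying to replace it. */
	d->allocated = 0;
	free(d->buf);
	d->buf = malloc(size);
	if (d->buf == NULL)
		return false;

	/* The step the bug omitted: remember the new size. */
	d->allocated = size;
	++d->reallocs;
	return true;
}
```

Without the assignment to allocated, every call would free and
allocate the buffer again even when it was already big enough.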
|
|
|
|
GCC7 is more strict than previous versions; add missing fall through
annotations. Style is modeled after similar annotations in xz_dec_lzma2.c.
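For illustration, a fall through annotation of the kind GCC 7
recognizes looks like this. The enum and function are hypothetical
and only show where the comment goes.

```c
/* Hypothetical decoder states, for illustration only. */
enum seq { SEQ_HEADER, SEQ_BODY, SEQ_DONE };

/*
 * Count how many steps remain from the given state. Each case
 * intentionally continues into the next one.
 */
static int steps_remaining(enum seq s)
{
	int steps = 0;

	switch (s) {
	case SEQ_HEADER:
		++steps;
		/* fall through */
	case SEQ_BODY:
		++steps;
		/* fall through */
	case SEQ_DONE:
		break;
	}

	return steps;
}
```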
|
|
|
|
This drops "if EXPERT" from BCJ filter configuration, and the
default configuration enables only the BCJ filter(s) that are
likely needed on the target arch: x86 BCJ filter is enabled
on x86, PowerPC filter on PowerPC, and so on.
Patches to do this were made by Florian Fainelli, and a typo was
fixed by Paul Bolle.
|
|
|
|
xz_dec_run() could incorrectly return XZ_BUF_ERROR if
all of the following were true:
- The caller knows how many bytes of output to expect
and only provides that much output space.
- When the last output bytes are decoded, the
caller-provided input buffer ends right before
the LZMA2 end of payload marker. So LZMA2 won't
provide more output anymore, but it won't know it
yet and thus won't return XZ_STREAM_END yet.
- A BCJ filter is in use and it hasn't left any
unfiltered bytes in the temp buffer. This can happen
with any BCJ filter, but in practice it's more likely
with filters other than the x86 BCJ.
This fixes <https://bugzilla.redhat.com/show_bug.cgi?id=735408>
where Squashfs thinks that a valid file system is corrupt.
Thanks to Jindrich Novy for telling me that such a bug report
exists, Phillip Lougher for providing excellent debug info,
and other people on #fedora-ppc.
This also fixes a similar bug in single-call mode where the
uncompressed size of an XZ Block using BCJ + LZMA2 was 0 bytes
and the caller provided no output space. Many empty .xz files
don't contain any Blocks and thus don't trigger this bug.
This also tweaks a closely related detail: xz_dec_bcj_run()
could call xz_dec_lzma2_run() to decode into temp buffer when
it was known to be useless. This was harmless although it
wasted a minuscule number of CPU cycles.
|
|
The min_t macro is defined in <linux/kernel.h>.
On x86 <linux/kernel.h> is included indirectly
via <asm/unaligned.h>, thus the missing include
wasn't caught on x86.
Since <linux/kernel.h> always includes <asm/byteorder.h>,
there's no need to include the latter explicitly.
Thanks to Russel King and Imre Kaloz.
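Outside the kernel, where <linux/kernel.h> is not available, a
simplified stand-in for min_t() could look like the sketch below.
It is an assumption for illustration, not the kernel's definition:
it casts both arguments to the given type before comparing, and
unlike the kernel macro it evaluates its arguments more than once.
The copy_len() helper is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's min_t(): compare both
 * arguments after casting them to the given type.
 */
#define min_t(type, x, y) \
	((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* Pick how many bytes can be copied given two limits. */
static size_t copy_len(size_t in_avail, uint32_t out_avail)
{
	return min_t(size_t, in_avail, out_avail);
}
```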
|
|
|
|
No .xz encoder creates files with empty LZMA2 streams,
but such files would still be valid and decompressors
must accept them.
Note that empty .xz files are a different thing than
empty LZMA2 streams. This bug didn't affect typical .xz
files that had no uncompressed data.
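To make the distinction concrete: an LZMA2 stream is a sequence of
chunks, each starting with a control byte, and a control byte of
0x00 ends the stream. An empty LZMA2 stream is therefore a single
0x00 byte. The sketch below classifies control bytes accordingly;
the enum and function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum chunk_kind {
	CHUNK_END,          /* 0x00: end of the LZMA2 stream */
	CHUNK_UNCOMPRESSED, /* 0x01 or 0x02: stored data */
	CHUNK_LZMA,         /* 0x80..0xFF: LZMA-compressed data */
	CHUNK_INVALID       /* 0x03..0x7F: not allowed */
};

static enum chunk_kind classify_control(uint8_t control)
{
	if (control == 0x00)
		return CHUNK_END;

	if (control >= 0x80)
		return CHUNK_LZMA;

	if (control <= 0x02)
		return CHUNK_UNCOMPRESSED;

	return CHUNK_INVALID;
}

/* Nonzero if the stream ends before any chunk of data. */
static int lzma2_stream_is_empty(const uint8_t *buf, size_t size)
{
	return size >= 1 && classify_control(buf[0]) == CHUNK_END;
}
```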
|
|
|
|
I clearly wasn't fully awake with the commit ac313d.
Thanks to Andrew Morton.
|
|
It's not needed to keep the stack usage of xz_dec_bcj_run()
low because the BCJ filters get inlined into bcj_apply(),
and that is not inlined into xz_dec_bcj_run().
Thanks to Andrew Morton.
|
|
Thanks to Andrew Morton.
|
|
Thanks to Andrew Morton.
|
|
|
|
It's no longer needed because initramfs decompression
uses the regular xz_dec module.
The definitions of memeq(), memzero(), and get_le32()
macros were moved to be done after all headers have
been included. Shouldn't matter in practice but looks
safer just in case some of those names appear in other
headers in the future.
|
|
This applies to xz_crc32_table. It's needed by the
pre-boot code on some architectures.
|
|
In Linux 2.6.31 (or so) and earlier, the initramfs
decompression had its own compiled copy of the
decompression code that got thrown away after the
kernel had booted. It required that all functions were
marked with __init when built for initramfs decompression.
Nowadays zlib and LZO have a wrapper that requires that
the respective decompressor code has been enabled (=y)
in the kernel config. Only the wrapper is marked with
__init. This patch helps do the same with the XZ
decompressor.
|
|
|
|
Caught by checkpatch.pl.
|
|
|
|
I thought it was more readable to write it there
explicitly, but since C will put a \0 there anyway,
relying on that can save one byte in code size.
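As an illustration, assume the string in question is something
like the six-byte .xz header magic (FD 37 7A 58 5A 00); the text
above does not name it. Only five bytes need to be written out,
because the terminating null of the string literal serves as the
sixth. The macro and function names are assumptions for this
sketch.

```c
#include <string.h>

/*
 * Six magic bytes: FD 37 7A 58 5A 00. Only five are spelled out.
 * The sixth is the terminating null that C appends to every
 * string literal.
 */
#define HEADER_MAGIC "\3757zXZ"
#define HEADER_MAGIC_SIZE 6

/* Nonzero if buf starts with the six magic bytes. */
static int has_magic(const unsigned char *buf)
{
	return memcmp(buf, HEADER_MAGIC, HEADER_MAGIC_SIZE) == 0;
}
```

Note that "\375" is a three-digit octal escape, so the following
'7' is a separate character.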
|
|
The variables in structures in xz_dec_lzma2.c were reordered
so that the variables that the code references most are near
the beginning of the structure within 128 bytes. This allows
instructions that are three bytes smaller to access the
variables, and saves around 700-900 bytes in code size.
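The effect can be sketched with offsetof(). The example assumes
x86-style addressing, where a member at an offset below 128 can be
reached with a one-byte displacement instead of a four-byte one.
The structure is a made-up stand-in, not the real layout in
xz_dec_lzma2.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state, not the real layout. */
struct dec_state {
	/* Hot fields first, so they sit at small offsets. */
	uint32_t range;
	uint32_t code;
	uint32_t state;

	/* Bulky data goes last. */
	uint16_t probs[1024];
	uint32_t rarely_used;
};

/* Nonzero if the offset fits in a signed 8-bit displacement. */
static int fits_disp8(size_t offset)
{
	return offset < 128;
}
```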
|
|
Previously the dictionary was preallocated at initialization
time, which is useful since the decoder cannot then later run
out of memory, but in several cases it is just an annoying
limitation.
It is now possible to enable only the needed operation mode(s)
at build time, which saves a few bytes in code size if only
one or two modes are actually needed. Bigger savings would be
possible especially in single-call mode, but I'll think about
that later.
This commit changes the API by adding the mode argument to
xz_dec_init(). A new return value (XZ_MEM_ERROR) was also added,
but it is used only in the new XZ_DYNALLOC operation mode.
|
|
This is enabled at compile time by defining XZ_DEC_ANY_CHECK.
If the Check ID is not supported, xz_dec_run() will return
XZ_UNSUPPORTED_CHECK. In multi-call mode, decoding can
then be continued normally. In single-call mode, decoding
cannot be continued, thus this feature is useful only
in multi-call mode.
|
|
Code that used
#define XZ_INTERNAL_CRC32
will now need to use:
#define XZ_INTERNAL_CRC32 1
This is to make it a little bit easier to use an external CRC32
implementation outside the Linux kernel by using
#define XZ_INTERNAL_CRC32 0
and then providing xz_crc32() e.g. via xz_config.h.
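A minimal sketch of the external route, assuming the macro is
tested with #if rather than #ifdef (which is why it needs an
explicit 0 or 1). The bitwise CRC32 below is a slow stand-in
written for this example. The xz_crc32() signature is assumed to
take a buffer, its size, and the previous CRC value, with 0 for
the first call.

```c
#include <stddef.h>
#include <stdint.h>

/* Ask for an external CRC32 instead of the built-in one. */
#define XZ_INTERNAL_CRC32 0

#if !XZ_INTERNAL_CRC32
/*
 * Hypothetical external implementation: a bitwise CRC32 using
 * the IEEE 802.3 polynomial in its reflected form.
 */
static uint32_t xz_crc32(const uint8_t *buf, size_t size, uint32_t crc)
{
	size_t i;
	int bit;

	crc = ~crc;

	for (i = 0; i < size; ++i) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; ++bit)
			crc = (crc >> 1) ^ (0xEDB88320U & ~((crc & 1) - 1));
	}

	return ~crc;
}
#endif
```

The standard test string "123456789" should give 0xCBF43926.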
|
|
Now xz_dec_bcj.c can be compiled even when no BCJ filters
are enabled (but only if the compiler supports files that
don't export any symbols). It also allows #including it
unconditionally in decompress_unxz.c.
|
|
Thanks to Denys Vlasenko.
|
|
correctly for initramfs decompression.
|
|
If the input file uses a BCJ filter and the output buffer
is too small in single-call mode, the LZMA2 decoder went
into an infinite loop. The actual bug was in the BCJ
decoder though, which called the LZMA2 decoder twice
when the output buffer was too small.
|
|
|
|
in early boot code in the Linux kernel. Thanks to Alain Knaff
for pointing this out.
It's possible that the compiler already avoided divide
instructions, but since there wasn't much to change,
it was simplest to change the code to be sure.
|
|
and initramfs decompression.
Linux 2.6.30 will have bzip2 and lzma support. This commit
adds a wrapper to convert the native XZ decompressor API to
the decompressor API that is used for kernel and initramfs
decompression in 2.6.30.
|
|
external linkage. This is needed to support marking those
functions as static in some situations in the Linux kernel.
XZ_EXTERN may be used for dllimport/dllexport too on some
other operating systems.
|
|
the code slightly in xz_dec_stream.c.
|
|
|
|
|
|
|
|
This makes it possible to use uint32_t as vli_type.
Doing so risks having some integer overflows unless
the caller can ensure that the total amounts of input
and output will stay below 256 MiB.
|