diff options
Diffstat (limited to 'xdelta3.c')
-rw-r--r-- | xdelta3.c | 5337 |
1 files changed, 5337 insertions, 0 deletions
diff --git a/xdelta3.c b/xdelta3.c new file mode 100644 index 0000000..9028d0c --- /dev/null +++ b/xdelta3.c @@ -0,0 +1,5337 @@ +/* xdelta 3 - delta compression tools and library + * Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007. Joshua P. MacDonald + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + + ------------------------------------------------------------------- + + Xdelta 3 + + The goal of this library is to to implement both the (stand-alone) + data-compression and delta-compression aspects of VCDIFF encoding, and + to support a programming interface that works like Zlib + (http://www.gzip.org/zlib.html). See RFC3284: The VCDIFF Generic + Differencing and Compression Data Format. + + VCDIFF is a unified encoding that combines data-compression and + delta-encoding ("differencing"). + + VCDIFF has a detailed byte-code instruction set with many features. + The instruction format supports an immediate size operand for small + COPYs and ADDs (e.g., under 18 bytes). There are also instruction + "modes", which are used to compress COPY addresses by using two + address caches. An instruction mode refers to slots in the NEAR + and SAME caches for recent addresses. NEAR remembers the + previous 4 (by default) COPY addresses, and SAME catches + frequent re-uses of the same address using a 3-way (by default) + 256-entry associative cache of [ADDR mod 256], the encoded byte. + A hit in the NEAR/SAME cache requires 0/1 ADDR bytes. + + VCDIFF has a default instruction table, but an alternate + instruction tables may themselves be be delta-compressed and + included in the encoding header. This allows even more freedom. + There are 9 instruction modes in the default code table, 4 near, 3 + same, VCD_SELF (absolute encoding) and VCD_HERE (relative to the + current position). + + ---------------------------------------------------------------------- + + Algorithms + + Aside from the details of encoding and decoding, there are a bunch + of algorithms needed. + + 1. STRING-MATCH. A two-level fingerprinting approach is used. A + single loop computes the two checksums -- small and large -- at + successive offsets in the TARGET file. The large checksum is more + accurate and is used to discover SOURCE matches, which are + potentially very long. The small checksum is used to discover + copies within the TARGET. Small matching, which is more expensive, + usually dominates the large STRING-MATCH costs in this code - the + more exhaustive the search, the better the results. Either of the + two string-matching mechanisms may be disabled. + + 2. INSTRUCTION SELECTION. The IOPT buffer here represents a queue + used to store overlapping copy instructions. There are two possible + optimizations that go beyond a greedy search. Both of these fall + into the category of "non-greedy matching" optimizations. + + The first optimization stems from backward SOURCE-COPY matching. + When a new SOURCE-COPY instruction covers a previous instruction in + the target completely, it is erased from the queue. Randal Burns + originally analyzed these algorithms and did a lot of related work + (\cite the 1.5-pass algorithm). + + The second optimization comes by the encoding of common very-small + COPY and ADD instructions, for which there are special DOUBLE-code + instructions, which code two instructions in a single byte. + + The cost of bad instruction-selection overhead is relatively high + for data-compression, relative to delta-compression, so this second + optimization is fairly important. With "lazy" matching (the name + used in Zlib for a similar optimization), the string-match + algorithm searches after a match for potential overlapping copy + instructions. In Xdelta and by default, VCDIFF, the minimum match + size is 4 bytes, whereas Zlib searches with a 3-byte minimum. This + feature, combined with double instructions, provides a nice + challenge. Search in this file for "black magic", a heuristic. + + 3. STREAM ALIGNMENT. Stream alignment is needed to compress large + inputs in constant space. See xd3_srcwin_move_point(). + + 4. WINDOW SELECTION. When the IOPT buffer flushes, in the first call + to xd3_iopt_finish_encoding containing any kind of copy instruction, + the parameters of the source window must be decided: the offset into + the source and the length of the window. Since the IOPT buffer is + finite, the program may be forced to fix these values before knowing + the best offset/length. + + 5. SECONDARY COMPRESSION. VCDIFF supports a secondary encoding to + be applied to the individual sections of the data format, which are + ADDRess, INSTruction, and DATA. Several secondary compressor + variations are implemented here, although none is standardized yet. + + One is an adaptive huffman algorithm -- the FGK algorithm (Faller, + Gallager, and Knuth, 1985). This compressor is extremely slow. + + The other is a simple static Huffman routine, which is the base + case of a semi-adaptive scheme published by D.J. Wheeler and first + widely used in bzip2 (by Julian Seward). This is a very + interesting algorithm, originally published in nearly cryptic form + by D.J. Wheeler. !!!NOTE!!! Because these are not standardized, + secondary compression remains off by default. + ftp://ftp.cl.cam.ac.uk/users/djw3/bred3.{c,ps} + -------------------------------------------------------------------- + + Other Features + + 1. USER CONVENIENCE + + For user convenience, it is essential to recognize Gzip-compressed + files and automatically Gzip-decompress them prior to + delta-compression (or else no delta-compression will be achieved + unless the user manually decompresses the inputs). The compressed + represention competes with Xdelta, and this must be hidden from the + command-line user interface. The Xdelta-1.x encoding was simple, not + compressed itself, so Xdelta-1.x uses Zlib internally to compress the + representation. + + This implementation supports external compression, which implements + the necessary fork() and pipe() mechanics. There is a tricky step + involved to support automatic detection of a compressed input in a + non-seekable input. First you read a bit of the input to detect + magic headers. When a compressed format is recognized, exec() the + external compression program and create a second child process to + copy the original input stream. [Footnote: There is a difficulty + related to using Gzip externally. It is not possible to decompress + and recompress a Gzip file transparently. If FILE.GZ had a + cryptographic signature, then, after: (1) Gzip-decompression, (2) + Xdelta-encoding, (3) Gzip-compression the signature could be + broken. The only way to solve this problem is to guess at Gzip's + compression level or control it by other means. I recommend that + specific implementations of any compression scheme store + information needed to exactly re-compress the input, that way + external compression is transparent - however, this won't happen + here until it has stabilized.] + + 2. APPLICATION-HEADER + + This feature was introduced in RFC3284. It allows any application + to include a header within the VCDIFF file format. This allows + general inter-application data exchange with support for + application-specific extensions to communicate metadata. + + 3. VCDIFF CHECKSUM + + An optional checksum value is included with each window, which can + be used to validate the final result. This verifies the correct source + file was used for decompression as well as the obvious advantage: + checking the implementation (and underlying) correctness. + + 4. LIGHT WEIGHT + + The code makes efforts to avoid copying data more than necessary. + The code delays many initialization tasks until the first use, it + optimizes for identical (perfectly matching) inputs. It does not + compute any checksums until the first lookup misses. Memory usage + is reduced. String-matching is templatized (by slightly gross use + of CPP) to hard-code alternative compile-time defaults. The code + has few outside dependencies. + ---------------------------------------------------------------------- + + The default rfc3284 instruction table: + (see RFC for the explanation) + + TYPE SIZE MODE TYPE SIZE MODE INDEX + -------------------------------------------------------------------- + 1. Run 0 0 Noop 0 0 0 + 2. Add 0, [1,17] 0 Noop 0 0 [1,18] + 3. Copy 0, [4,18] 0 Noop 0 0 [19,34] + 4. Copy 0, [4,18] 1 Noop 0 0 [35,50] + 5. Copy 0, [4,18] 2 Noop 0 0 [51,66] + 6. Copy 0, [4,18] 3 Noop 0 0 [67,82] + 7. Copy 0, [4,18] 4 Noop 0 0 [83,98] + 8. Copy 0, [4,18] 5 Noop 0 0 [99,114] + 9. Copy 0, [4,18] 6 Noop 0 0 [115,130] + 10. Copy 0, [4,18] 7 Noop 0 0 [131,146] + 11. Copy 0, [4,18] 8 Noop 0 0 [147,162] + 12. Add [1,4] 0 Copy [4,6] 0 [163,174] + 13. Add [1,4] 0 Copy [4,6] 1 [175,186] + 14. Add [1,4] 0 Copy [4,6] 2 [187,198] + 15. Add [1,4] 0 Copy [4,6] 3 [199,210] + 16. Add [1,4] 0 Copy [4,6] 4 [211,222] + 17. Add [1,4] 0 Copy [4,6] 5 [223,234] + 18. Add [1,4] 0 Copy 4 6 [235,238] + 19. Add [1,4] 0 Copy 4 7 [239,242] + 20. Add [1,4] 0 Copy 4 8 [243,246] + 21. Copy 4 [0,8] Add 1 0 [247,255] + -------------------------------------------------------------------- + + Reading the source: Overview + + This file includes itself in several passes to macro-expand certain + sections with variable forms. Just read ahead, there's only a + little confusion. I know this sounds ugly, but hard-coding some of + the string-matching parameters results in a 10-15% increase in + string-match performance. The only time this hurts is when you have + unbalanced #if/endifs. + + A single compilation unit tames the Makefile. In short, this is to + allow the above-described hack without an explodingMakefile. The + single compilation unit includes the core library features, + configurable string-match templates, optional main() command-line + tool, misc optional features, and a regression test. Features are + controled with CPP #defines, see Makefile.am. + + The initial __XDELTA3_C_HEADER_PASS__ starts first, the _INLINE_ and + _TEMPLATE_ sections follow. Easy stuff first, hard stuff last. + + Optional features include: + + xdelta3-main.h The command-line interface, external compression + support, POSIX-specific, info & VCDIFF-debug tools. + xdelta3-second.h The common secondary compression routines. + xdelta3-decoder.h All decoding routines. + xdelta3-djw.h The semi-adaptive huffman secondary encoder. + xdelta3-fgk.h The adaptive huffman secondary encoder. + xdelta3-test.h The unit test covers major algorithms, + encoding and decoding. There are single-bit + error decoding tests. There are 32/64-bit file size + boundary tests. There are command-line tests. + There are compression tests. There are external + compression tests. There are string-matching tests. + There should be more tests... + + Additional headers include: + + xdelta3.h The public header file. + xdelta3-cfgs.h The default settings for default, built-in + encoders. These are hard-coded at + compile-time. There is also a single + soft-coded string matcher for experimenting + with arbitrary values. + xdelta3-list.h A cyclic list template + + Misc little debug utilities: + + badcopy.c Randomly modifies an input file based on two + parameters: (1) the probability that a byte in + the file is replaced with a pseudo-random value, + and (2) the mean change size. Changes are + generated using an expoential distribution + which approximates the expected error_prob + distribution. + -------------------------------------------------------------------- + + This file itself is unusually large. I hope to defend this layout + with lots of comments. Everything in this file is related to + encoding and decoding. I like it all together - the template stuff + is just a hack. */ + +#ifndef __XDELTA3_C_HEADER_PASS__ +#define __XDELTA3_C_HEADER_PASS__ + +#include <errno.h> +#include <string.h> + +#include "xdelta3.h" + +/*********************************************************************** + STATIC CONFIGURATION + ***********************************************************************/ + +#ifndef XD3_MAIN /* the main application */ +#define XD3_MAIN 0 +#endif + +#ifndef VCDIFF_TOOLS +#define VCDIFF_TOOLS XD3_MAIN +#endif + +#ifndef SECONDARY_FGK /* one from the algorithm preservation department: */ +#define SECONDARY_FGK 0 /* adaptive Huffman routines */ +#endif + +#ifndef SECONDARY_DJW /* semi-adaptive/static Huffman for the eventual */ +#define SECONDARY_DJW 0 /* standardization, off by default until such time. */ +#endif + +#ifndef GENERIC_ENCODE_TABLES /* These three are the RFC-spec'd app-specific */ +#define GENERIC_ENCODE_TABLES 0 /* code features. This is tested but not recommended */ +#endif /* unless there's a real application. */ +#ifndef GENERIC_ENCODE_TABLES_COMPUTE +#define GENERIC_ENCODE_TABLES_COMPUTE 0 +#endif +#ifndef GENERIC_ENCODE_TABLES_COMPUTE_PRINT +#define GENERIC_ENCODE_TABLES_COMPUTE_PRINT 0 +#endif + +#if XD3_ENCODER +#define IF_ENCODER(x) x +#else +#define IF_ENCODER(x) +#endif + +/***********************************************************************/ + +typedef enum { + + /* header indicator bits */ + VCD_SECONDARY = (1 << 0), /* uses secondary compressor */ + VCD_CODETABLE = (1 << 1), /* supplies code table data */ + VCD_APPHEADER = (1 << 2), /* supplies application data */ + VCD_INVHDR = ~7U, + + /* window indicator bits */ + VCD_SOURCE = (1 << 0), /* copy window in source file */ + VCD_TARGET = (1 << 1), /* copy window in target file */ + VCD_ADLER32 = (1 << 2), /* has adler32 checksum */ + VCD_INVWIN = ~7U, + + VCD_SRCORTGT = VCD_SOURCE | VCD_TARGET, + + /* delta indicator bits */ + VCD_DATACOMP = (1 << 0), + VCD_INSTCOMP = (1 << 1), + VCD_ADDRCOMP = (1 << 2), + VCD_INVDEL = ~0x7U, + +} xd3_indicator; + +typedef enum { + VCD_DJW_ID = 1, + VCD_FGK_ID = 16, /* Note: these are not standard IANA-allocated IDs! */ +} xd3_secondary_ids; + +typedef enum { + SEC_NOFLAGS = 0, + + /* Note: SEC_COUNT_FREQS Not implemented (to eliminate 1st Huffman pass) */ + SEC_COUNT_FREQS = (1 << 0), +} xd3_secondary_flags; + +typedef enum { + DATA_SECTION, /* These indicate which section to the secondary + * compressor. */ + INST_SECTION, /* The header section is not compressed, therefore not + * listed here. */ + ADDR_SECTION, +} xd3_section_type; + +typedef enum +{ + XD3_NOOP = 0, + XD3_ADD = 1, + XD3_RUN = 2, + XD3_CPY = 3, /* XD3_CPY rtypes are represented as (XD3_CPY + + * copy-mode value) */ +} xd3_rtype; + +/***********************************************************************/ + +#include "xdelta3-list.h" + +XD3_MAKELIST(xd3_rlist, xd3_rinst, link); + +/***********************************************************************/ + +#define SECONDARY_MIN_SAVINGS 2 /* Secondary compression has to save + at least this many bytes. */ +#define SECONDARY_MIN_INPUT 10 /* Secondary compression needs at + least this many bytes. */ + +#define VCDIFF_MAGIC1 0xd6 /* 1st file byte */ +#define VCDIFF_MAGIC2 0xc3 /* 2nd file byte */ +#define VCDIFF_MAGIC3 0xc4 /* 3rd file byte */ +#define VCDIFF_VERSION 0x00 /* 4th file byte */ + +#define VCD_SELF 0 /* 1st address mode */ +#define VCD_HERE 1 /* 2nd address mode */ + +#define CODE_TABLE_STRING_SIZE (6 * 256) /* Should fit a code table string. */ +#define CODE_TABLE_VCDIFF_SIZE (6 * 256) /* Should fit a compressed code + * table string */ + +#define SECONDARY_ANY (SECONDARY_DJW || SECONDARY_FGK) + +#define ALPHABET_SIZE 256 /* Used in test code--size of the secondary + * compressor alphabet. */ + +#define HASH_PERMUTE 1 /* The input is permuted by random nums */ +#define ADLER_LARGE_CKSUM 1 /* Adler checksum vs. RK checksum */ + +#define HASH_CKOFFSET 1U /* Table entries distinguish "no-entry" from + * offset 0 using this offset. */ + +#define MIN_SMALL_LOOK 2U /* Match-optimization stuff. */ +#define MIN_LARGE_LOOK 2U +#define MIN_MATCH_OFFSET 1U +#define MAX_MATCH_SPLIT 18U /* VCDIFF code table: 18 is the default limit + * for direct-coded ADD sizes */ + +#define LEAST_MATCH_INCR 0 /* The least number of bytes an overlapping + * match must beat the preceding match by. This + * is a bias for the lazy match optimization. A + * non-zero value means that an adjacent match + * has to be better by more than the step + * between them. 0. */ + +#define MIN_MATCH 4U /* VCDIFF code table: MIN_MATCH=4 */ +#define MIN_ADD 1U /* 1 */ +#define MIN_RUN 8U /* The shortest run, if it is shorter than this + * an immediate add/copy will be just as good. + * ADD1/COPY6 = 1I+1D+1A bytes, RUN18 = + * 1I+1D+1A. */ + +#define MAX_MODES 9 /* Maximum number of nodes used for + * compression--does not limit decompression. */ + +#define ENC_SECTS 4 /* Number of separate output sections. */ + +#define HDR_TAIL(s) ((s)->enc_tails[0]) +#define DATA_TAIL(s) ((s)->enc_tails[1]) +#define INST_TAIL(s) ((s)->enc_tails[2]) +#define ADDR_TAIL(s) ((s)->enc_tails[3]) + +#define HDR_HEAD(s) ((s)->enc_heads[0]) +#define DATA_HEAD(s) ((s)->enc_heads[1]) +#define INST_HEAD(s) ((s)->enc_heads[2]) +#define ADDR_HEAD(s) ((s)->enc_heads[3]) + +#define SIZEOF_ARRAY(x) (sizeof(x) / sizeof(x[0])) + +#define TOTAL_MODES(x) (2+(x)->acache.s_same+(x)->acache.s_near) + +/* Template instances. */ +#if XD3_BUILD_SLOW +#define IF_BUILD_SLOW(x) x +#else +#define IF_BUILD_SLOW(x) +#endif +#if XD3_BUILD_FAST +#define IF_BUILD_FAST(x) x +#else +#define IF_BUILD_FAST(x) +#endif +#if XD3_BUILD_FASTER +#define IF_BUILD_FASTER(x) x +#else +#define IF_BUILD_FASTER(x) +#endif +#if XD3_BUILD_FASTEST +#define IF_BUILD_FASTEST(x) x +#else +#define IF_BUILD_FASTEST(x) +#endif +#if XD3_BUILD_SOFT +#define IF_BUILD_SOFT(x) x +#else +#define IF_BUILD_SOFT(x) +#endif +#if XD3_BUILD_DEFAULT +#define IF_BUILD_DEFAULT(x) x +#else +#define IF_BUILD_DEFAULT(x) +#endif + +/* Consume N bytes of input, only used by the decoder. */ +#define DECODE_INPUT(n) \ + do { \ + stream->total_in += (xoff_t) (n); \ + stream->avail_in -= (n); \ + stream->next_in += (n); \ + } while (0) + +/* Update the run-length state */ +#define NEXTRUN(c) do { if ((c) == run_c) { run_l += 1; } \ + else { run_c = (c); run_l = 1; } } while (0) + +/* This CPP-conditional stuff can be cleaned up... */ +#if XD3_DEBUG +#define IF_DEBUG(x) x +#else +#define IF_DEBUG(x) +#endif +#if XD3_DEBUG > 1 +#define IF_DEBUG1(x) x +#else +#define IF_DEBUG1(x) +#endif +#if XD3_DEBUG > 2 +#define IF_DEBUG2(x) x +#else +#define IF_DEBUG2(x) +#endif +#if REGRESSION_TEST +#define IF_REGRESSION(x) x +#else +#define IF_REGRESSION(x) +#endif + +/***********************************************************************/ + +#if XD3_ENCODER +static void* xd3_alloc0 (xd3_stream *stream, + usize_t elts, + usize_t size); + + +static xd3_output* xd3_alloc_output (xd3_stream *stream, + xd3_output *old_output); + +static int xd3_alloc_iopt (xd3_stream *stream, int elts); + +static void xd3_free_output (xd3_stream *stream, + xd3_output *output); + +static int xd3_emit_byte (xd3_stream *stream, + xd3_output **outputp, + uint8_t code); + +static int xd3_emit_bytes (xd3_stream *stream, + xd3_output **outputp, + const uint8_t *base, + usize_t size); + +static int xd3_emit_double (xd3_stream *stream, xd3_rinst *first, + xd3_rinst *second, usize_t code); +static int xd3_emit_single (xd3_stream *stream, xd3_rinst *single, + usize_t code); + +static usize_t xd3_sizeof_output (xd3_output *output); +static void xd3_encode_reset (xd3_stream *stream); + +static int xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos); +static int xd3_source_extend_match (xd3_stream *stream); +static int xd3_srcwin_setup (xd3_stream *stream); +static usize_t xd3_iopt_last_matched (xd3_stream *stream); +static int xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output, + uint32_t num); + +static usize_t xd3_smatch (xd3_stream *stream, + usize_t base, + usize_t scksum, + usize_t *match_offset); +static int xd3_string_match_init (xd3_stream *stream); +static uint32_t xd3_scksum (uint32_t *state, const uint8_t *seg, const int ln); +static int xd3_comprun (const uint8_t *seg, int slook, uint8_t *run_cp); +static int xd3_srcwin_move_point (xd3_stream *stream, + usize_t *next_move_point); + +static int xd3_emit_run (xd3_stream *stream, usize_t pos, + usize_t size, uint8_t run_c); +static usize_t xd3_checksum_hash (const xd3_hash_cfg *cfg, + const usize_t cksum); +static xoff_t xd3_source_cksum_offset(xd3_stream *stream, usize_t low); +static void xd3_scksum_insert (xd3_stream *stream, + usize_t inx, + usize_t scksum, + usize_t pos); + + +#if XD3_DEBUG +static void xd3_verify_run_state (xd3_stream *stream, + const uint8_t *inp, + int x_run_l, + uint8_t x_run_c); +static void xd3_verify_large_state (xd3_stream *stream, + const uint8_t *inp, + uint32_t x_cksum); +static void xd3_verify_small_state (xd3_stream *stream, + const uint8_t *inp, + uint32_t x_cksum); + +#endif /* XD3_DEBUG */ +#endif /* XD3_ENCODER */ + +static int xd3_decode_allocate (xd3_stream *stream, usize_t size, + uint8_t **copied1, usize_t *alloc1); + +static void xd3_compute_code_table_string (const xd3_dinst *code_table, + uint8_t *str); +static void* xd3_alloc (xd3_stream *stream, usize_t elts, usize_t size); +static void xd3_free (xd3_stream *stream, void *ptr); + +static int xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp, + const uint8_t *max, uint32_t *valp); + +#if REGRESSION_TEST +static int xd3_selftest (void); +#endif + +/***********************************************************************/ + +#define UINT32_OFLOW_MASK 0xfe000000U +#define UINT64_OFLOW_MASK 0xfe00000000000000ULL + +#ifndef UINT32_MAX +#define UINT32_MAX 4294967295U +#endif + +#ifndef UINT64_MAX +#define UINT64_MAX 18446744073709551615ULL +#endif + +#if SIZEOF_USIZE_T == 4 +#define USIZE_T_MAX UINT32_MAX +#define xd3_decode_size xd3_decode_uint32_t +#define xd3_emit_size xd3_emit_uint32_t +#define xd3_sizeof_size xd3_sizeof_uint32_t +#define xd3_read_size xd3_read_uint32_t +#elif SIZEOF_USIZE_T == 8 +#define USIZE_T_MAX UINT64_MAX +#define xd3_decode_size xd3_decode_uint64_t +#define xd3_emit_size xd3_emit_uint64_t +#define xd3_sizeof_size xd3_sizeof_uint64_t +#define xd3_read_size xd3_read_uint64_t +#endif + +#if SIZEOF_XOFF_T == 4 +#define XOFF_T_MAX UINT32_MAX +#define xd3_decode_offset xd3_decode_uint32_t +#define xd3_emit_offset xd3_emit_uint32_t +#elif SIZEOF_XOFF_T == 8 +#define XOFF_T_MAX UINT64_MAX +#define xd3_decode_offset xd3_decode_uint64_t +#define xd3_emit_offset xd3_emit_uint64_t +#endif + +#define USIZE_T_OVERFLOW(a,b) ((USIZE_T_MAX - (usize_t) (a)) < (usize_t) (b)) +#define XOFF_T_OVERFLOW(a,b) ((XOFF_T_MAX - (xoff_t) (a)) < (xoff_t) (b)) + +const char* xd3_strerror (int ret) +{ + switch (ret) + { + case XD3_INPUT: return "XD3_INPUT"; + case XD3_OUTPUT: return "XD3_OUTPUT"; + case XD3_GETSRCBLK: return "XD3_GETSRCBLK"; + case XD3_GOTHEADER: return "XD3_GOTHEADER"; + case XD3_WINSTART: return "XD3_WINSTART"; + case XD3_WINFINISH: return "XD3_WINFINISH"; + case XD3_TOOFARBACK: return "XD3_TOOFARBACK"; + case XD3_INTERNAL: return "XD3_INTERNAL"; + case XD3_INVALID_INPUT: return "XD3_INVALID_INPUT"; + } + return NULL; +} + +/***********************************************************************/ + +#define xd3_sec_data(s) ((s)->sec_stream_d) +#define xd3_sec_inst(s) ((s)->sec_stream_i) +#define xd3_sec_addr(s) ((s)->sec_stream_a) + +struct _xd3_sec_type +{ + int id; + const char *name; + xd3_secondary_flags flags; + + /* xd3_sec_stream is opaque to the generic code */ + xd3_sec_stream* (*alloc) (xd3_stream *stream); + void (*destroy) (xd3_stream *stream, + xd3_sec_stream *sec); + void (*init) (xd3_sec_stream *sec); + int (*decode) (xd3_stream *stream, + xd3_sec_stream *sec_stream, + const uint8_t **input, + const uint8_t *input_end, + uint8_t **output, + const uint8_t *output_end); +#if XD3_ENCODER + int (*encode) (xd3_stream *stream, + xd3_sec_stream *sec_stream, + xd3_output *input, + xd3_output *output, + xd3_sec_cfg *cfg); +#endif +}; + +#define BIT_STATE_ENCODE_INIT { 0, 1 } +#define BIT_STATE_DECODE_INIT { 0, 0x100 } + +typedef struct _bit_state bit_state; +struct _bit_state +{ + usize_t cur_byte; + usize_t cur_mask; +}; + +#if SECONDARY_ANY == 0 +#define IF_SEC(x) +#define IF_NSEC(x) x +#else /* yuck */ +#define IF_SEC(x) x +#define IF_NSEC(x) +static int +xd3_decode_secondary (xd3_stream *stream, + xd3_desect *sect, + xd3_sec_stream **sec_streamp); +#if XD3_ENCODER +static int +xd3_encode_secondary (xd3_stream *stream, + xd3_output **head, + xd3_output **tail, + xd3_sec_stream **sec_streamp, + xd3_sec_cfg *cfg, + int *did_it); +#endif +#endif /* SECONDARY_ANY */ + +#if SECONDARY_FGK +static const xd3_sec_type fgk_sec_type; +#define IF_FGK(x) x +#define FGK_CASE(s) \ + s->sec_type = & fgk_sec_type; \ + break; +#else +#define IF_FGK(x) +#define FGK_CASE(s) \ + s->msg = "unavailable secondary compressor: FGK Adaptive Huffman"; \ + return XD3_INTERNAL; +#endif + +#if SECONDARY_DJW +static const xd3_sec_type djw_sec_type; +#define IF_DJW(x) x +#define DJW_CASE(s) \ + s->sec_type = & djw_sec_type; \ + break; +#else +#define IF_DJW(x) +#define DJW_CASE(s) \ + s->msg = "unavailable secondary compressor: DJW Static Huffman"; \ + return XD3_INTERNAL; +#endif + +/***********************************************************************/ + +#include "xdelta3-hash.h" + +/* Process template passes - this includes xdelta3.c several times. */ +#define __XDELTA3_C_TEMPLATE_PASS__ +#include "xdelta3-cfgs.h" +#undef __XDELTA3_C_TEMPLATE_PASS__ + +/* Process the inline pass. */ +#define __XDELTA3_C_INLINE_PASS__ +#include "xdelta3.c" +#undef __XDELTA3_C_INLINE_PASS__ + +/* Secondary compression */ +#if SECONDARY_ANY +#include "xdelta3-second.h" +#endif + +#if SECONDARY_FGK +#include "xdelta3-fgk.h" +static const xd3_sec_type fgk_sec_type = +{ + VCD_FGK_ID, + "FGK Adaptive Huffman", + SEC_NOFLAGS, + (xd3_sec_stream* (*)()) fgk_alloc, + (void (*)()) fgk_destroy, + (void (*)()) fgk_init, + (int (*)()) xd3_decode_fgk, + IF_ENCODER((int (*)()) xd3_encode_fgk) +}; +#endif + +#if SECONDARY_DJW +#include "xdelta3-djw.h" +static const xd3_sec_type djw_sec_type = +{ + VCD_DJW_ID, + "Static Huffman", + SEC_COUNT_FREQS, + (xd3_sec_stream* (*)()) djw_alloc, + (void (*)()) djw_destroy, + (void (*)()) djw_init, + (int (*)()) xd3_decode_huff, + IF_ENCODER((int (*)()) xd3_encode_huff) +}; +#endif + +#if XD3_MAIN || PYTHON_MODULE || SWIG_MODULE || NOT_MAIN +#include "xdelta3-main.h" +#endif + +#if REGRESSION_TEST +#include "xdelta3-test.h" +#endif + +#if PYTHON_MODULE +#include "xdelta3-python.h" +#endif + +#endif /* __XDELTA3_C_HEADER_PASS__ */ +#ifdef __XDELTA3_C_INLINE_PASS__ + +/**************************************************************** + Instruction tables + *****************************************************************/ + +/* The following code implements a parametrized description of the + * code table given above for a few reasons. It is not necessary for + * implementing the standard, to support compression with variable + * tables, so an implementation is only required to know the default + * code table to begin decompression. (If the encoder uses an + * alternate table, the table is included in compressed form inside + * the VCDIFF file.) + * + * Before adding variable-table support there were two functions which + * were hard-coded to the default table above. + * xd3_compute_default_table() would create the default table by + * filling a 256-elt array of xd3_dinst values. The corresponding + * function, xd3_choose_instruction(), would choose an instruction + * based on the hard-coded parameters of the default code table. + * + * Notes: The parametrized code table description here only generates + * tables of a certain regularity similar to the default table by + * allowing to vary the distribution of single- and + * double-instructions and change the number of near and same copy + * modes. More exotic tables are only possible by extending this + * code. + * + * For performance reasons, both the parametrized and non-parametrized + * versions of xd3_choose_instruction remain. The parametrized + * version is only needed for testing multi-table decoding support. + * If ever multi-table encoding is required, this can be optimized by + * compiling static functions for each table. + */ + +/* The XD3_CHOOSE_INSTRUCTION calls xd3_choose_instruction with the + * table description when GENERIC_ENCODE_TABLES are in use. The + * IF_GENCODETBL macro enables generic-code-table specific code. */ +#if GENERIC_ENCODE_TABLES +#define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) xd3_choose_instruction (stream->code_table_desc, prev, inst) +#define IF_GENCODETBL(x) x +#else +#define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) xd3_choose_instruction (prev, inst) +#define IF_GENCODETBL(x) +#endif + +/* This structure maintains information needed by + * xd3_choose_instruction to compute the code for a double instruction + * by first indexing an array of code_table_sizes by copy mode, then + * using (offset + (muliplier * X)) */ +struct _xd3_code_table_sizes { + uint8_t cpy_max; + uint8_t offset; + uint8_t mult; +}; + +/* This contains a complete description of a code table. */ +struct _xd3_code_table_desc +{ + /* Assumes a single RUN instruction */ + /* Assumes that MIN_MATCH is 4 */ + + uint8_t add_sizes; /* Number of immediate-size single adds (default 17) */ + uint8_t near_modes; /* Number of near copy modes (default 4) */ + uint8_t same_modes; /* Number of same copy modes (default 3) */ + uint8_t cpy_sizes; /* Number of immediate-size single copies (default 15) */ + + uint8_t addcopy_add_max; /* Maximum add size for an add-copy double instruction, + all modes (default 4) */ + uint8_t addcopy_near_cpy_max; /* Maximum cpy size for an add-copy double instruction, + up through VCD_NEAR modes (default 6) */ + uint8_t addcopy_same_cpy_max; /* Maximum cpy size for an add-copy double instruction, + VCD_SAME modes (default 4) */ + + uint8_t copyadd_add_max; /* Maximum add size for a copy-add double instruction, + all modes (default 1) */ + uint8_t copyadd_near_cpy_max; /* Maximum cpy size for a copy-add double instruction, + up through VCD_NEAR modes (default 4) */ + uint8_t copyadd_same_cpy_max; /* Maximum cpy size for a copy-add double instruction, + VCD_SAME modes (default 4) */ + + xd3_code_table_sizes addcopy_max_sizes[MAX_MODES]; + xd3_code_table_sizes copyadd_max_sizes[MAX_MODES]; +}; + +/* The rfc3284 code table is represented: */ +static const xd3_code_table_desc __rfc3284_code_table_desc = { + 17, /* add sizes */ + 4, /* near modes */ + 3, /* same modes */ + 15, /* copy sizes */ + + 4, /* add-copy max add */ + 6, /* add-copy max cpy, near */ + 4, /* add-copy max cpy, same */ + + 1, /* copy-add max add */ + 4, /* copy-add max cpy, near */ + 4, /* copy-add max cpy, same */ + + /* addcopy */ + { {6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{4,235,1},{4,239,1},{4,243,1} }, + /* copyadd */ + { {4,247,1},{4,248,1},{4,249,1},{4,250,1},{4,251,1},{4,252,1},{4,253,1},{4,254,1},{4,255,1} }, +}; + +#if GENERIC_ENCODE_TABLES +/* An alternate code table for testing (5 near, 0 same): + * + * TYPE SIZE MODE TYPE SIZE MODE INDEX + * --------------------------------------------------------------- + * 1. Run 0 0 Noop 0 0 0 + * 2. Add 0, [1,23] 0 Noop 0 0 [1,24] + * 3. Copy 0, [4,20] 0 Noop 0 0 [25,42] + * 4. Copy 0, [4,20] 1 Noop 0 0 [43,60] + * 5. Copy 0, [4,20] 2 Noop 0 0 [61,78] + * 6. Copy 0, [4,20] 3 Noop 0 0 [79,96] + * 7. Copy 0, [4,20] 4 Noop 0 0 [97,114] + * 8. Copy 0, [4,20] 5 Noop 0 0 [115,132] + * 9. Copy 0, [4,20] 6 Noop 0 0 [133,150] + * 10. Add [1,4] 0 Copy [4,6] 0 [151,162] + * 11. Add [1,4] 0 Copy [4,6] 1 [163,174] + * 12. Add [1,4] 0 Copy [4,6] 2 [175,186] + * 13. Add [1,4] 0 Copy [4,6] 3 [187,198] + * 14. Add [1,4] 0 Copy [4,6] 4 [199,210] + * 15. Add [1,4] 0 Copy [4,6] 5 [211,222] + * 16. Add [1,4] 0 Copy [4,6] 6 [223,234] + * 17. Copy 4 [0,6] Add [1,3] 0 [235,255] + * --------------------------------------------------------------- */ +static const xd3_code_table_desc __alternate_code_table_desc = { + 23, /* add sizes */ + 5, /* near modes */ + 0, /* same modes */ + 17, /* copy sizes */ + + 4, /* add-copy max add */ + 6, /* add-copy max cpy, near */ + 0, /* add-copy max cpy, same */ + + 3, /* copy-add max add */ + 4, /* copy-add max cpy, near */ + 0, /* copy-add max cpy, same */ + + /* addcopy */ + { {6,151,3},{6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{0,0,0},{0,0,0} }, + /* copyadd */ + { {4,235,1},{4,238,1},{4,241,1},{4,244,1},{4,247,1},{4,250,1},{4,253,1},{0,0,0},{0,0,0} }, +}; +#endif + +/* Computes code table entries of TBL using the specified description. */ +static void +xd3_build_code_table (const xd3_code_table_desc *desc, xd3_dinst *tbl) +{ + usize_t size1, size2, mode; + usize_t cpy_modes = 2 + desc->near_modes + desc->same_modes; + xd3_dinst *d = tbl; + + (d++)->type1 = XD3_RUN; + (d++)->type1 = XD3_ADD; + + for (size1 = 1; size1 <= desc->add_sizes; size1 += 1, d += 1) + { + d->type1 = XD3_ADD; + d->size1 = size1; + } + + for (mode = 0; mode < cpy_modes; mode += 1) + { + (d++)->type1 = XD3_CPY + mode; + + for (size1 = MIN_MATCH; size1 < MIN_MATCH + desc->cpy_sizes; size1 += 1, d += 1) + { + d->type1 = XD3_CPY + mode; + d->size1 = size1; + } + } + + for (mode = 0; mode < cpy_modes; mode += 1) + { + for (size1 = 1; size1 <= desc->addcopy_add_max; size1 += 1) + { + usize_t max = (mode < 2U + desc->near_modes) ? + desc->addcopy_near_cpy_max : + desc->addcopy_same_cpy_max; + + for (size2 = MIN_MATCH; size2 <= max; size2 += 1, d += 1) + { + d->type1 = XD3_ADD; + d->size1 = size1; + d->type2 = XD3_CPY + mode; + d->size2 = size2; + } + } + } + + for (mode = 0; mode < cpy_modes; mode += 1) + { + usize_t max = (mode < 2U + desc->near_modes) ? + desc->copyadd_near_cpy_max : + desc->copyadd_same_cpy_max; + + for (size1 = MIN_MATCH; size1 <= max; size1 += 1) + { + for (size2 = 1; size2 <= desc->copyadd_add_max; size2 += 1, d += 1) + { + d->type1 = XD3_CPY + mode; + d->size1 = size1; + d->type2 = XD3_ADD; + d->size2 = size2; + } + } + } + + XD3_ASSERT (d - tbl == 256); +} + +/* This function generates the static default code table. */ +static const xd3_dinst* +xd3_rfc3284_code_table (void) +{ + static xd3_dinst __rfc3284_code_table[256]; + + if (__rfc3284_code_table[0].type1 != XD3_RUN) + { + xd3_build_code_table (& __rfc3284_code_table_desc, __rfc3284_code_table); + } + + return __rfc3284_code_table; +} + +#if XD3_ENCODER +#if GENERIC_ENCODE_TABLES +/* This function generates the alternate code table. */ +static const xd3_dinst* +xd3_alternate_code_table (void) +{ + static xd3_dinst __alternate_code_table[256]; + + if (__alternate_code_table[0].type1 != XD3_RUN) + { + xd3_build_code_table (& __alternate_code_table_desc, __alternate_code_table); + } + + return __alternate_code_table; +} + +/* This function computes the ideal second instruction INST based on + * preceding instruction PREV. If it is possible to issue a double + * instruction based on this pair it sets PREV->code2, otherwise it + * sets INST->code1. */ +static void +xd3_choose_instruction (const xd3_code_table_desc *desc, xd3_rinst *prev, xd3_rinst *inst) +{ + switch (inst->type) + { + case XD3_RUN: + /* The 0th instruction is RUN */ + inst->code1 = 0; + break; + + case XD3_ADD: + + if (inst->size > desc->add_sizes) + { + /* The first instruction is non-immediate ADD */ + inst->code1 = 1; + } + else + { + /* The following ADD_SIZES instructions are immediate ADDs */ + inst->code1 = 1 + inst->size; + + /* Now check for a possible COPY-ADD double instruction */ + if (prev != NULL) + { + int prev_mode = prev->type - XD3_CPY; + + /* If previous is a copy. Note: as long as the previous + * is not a RUN instruction, it should be a copy because + * it cannot be an add. This check is more clear. */ + if (prev_mode >= 0 && inst->size <= desc->copyadd_add_max) + { + const xd3_code_table_sizes *sizes = & desc->copyadd_max_sizes[prev_mode]; + + /* This check and the inst->size-<= above are == in + the default table. */ + if (prev->size <= sizes->cpy_max) + { + /* The second and third exprs are 0 in the + default table. */ + prev->code2 = sizes->offset + + (sizes->mult * (prev->size - MIN_MATCH)) + + (inst->size - MIN_ADD); + } + } + } + } + break; + + default: + { + int mode = inst->type - XD3_CPY; + + /* The large copy instruction is offset by the run, large add, + * and immediate adds, then multipled by the number of + * immediate copies plus one (the large copy) (i.e., if there + * are 15 immediate copy instructions then there are 16 copy + * instructions per mode). */ + inst->code1 = 2 + desc->add_sizes + (1 + desc->cpy_sizes) * mode; + + /* Now if the copy is short enough for an immediate instruction. */ + if (inst->size < MIN_MATCH + desc->cpy_sizes && + /* TODO: there needs to be a more comprehensive test for this + * boundary condition, merge is now exercising code in which + * size < MIN_MATCH is possible and it's unclear if the above + * size < (MIN_MATCH + cpy_sizes) should be a <= from inspection + * of the default table version below. */ + inst->size >= MIN_MATCH) + { + inst->code1 += inst->size + 1 - MIN_MATCH; + + /* Now check for a possible ADD-COPY double instruction. */ + if ( (prev != NULL) && + (prev->type == XD3_ADD) && + (prev->size <= desc->addcopy_add_max) ) + { + const xd3_code_table_sizes *sizes = & desc->addcopy_max_sizes[mode]; + + if (inst->size <= sizes->cpy_max) + { + prev->code2 = sizes->offset + + (sizes->mult * (prev->size - MIN_ADD)) + + (inst->size - MIN_MATCH); + } + } + } + } + } +} +#else /* GENERIC_ENCODE_TABLES */ + +/* This version of xd3_choose_instruction is hard-coded for the default + table. */ +static void +xd3_choose_instruction (xd3_rinst *prev, xd3_rinst *inst) +{ + switch (inst->type) + { + case XD3_RUN: + inst->code1 = 0; + break; + + case XD3_ADD: + inst->code1 = 1; + + if (inst->size <= 17) + { + inst->code1 += inst->size; + + if ( (inst->size == 1) && + (prev != NULL) && + (prev->size == 4) && + (prev->type >= XD3_CPY) ) + { + prev->code2 = 247 + (prev->type - XD3_CPY); + } + } + + break; + + default: + { + int mode = inst->type - XD3_CPY; + + XD3_ASSERT (inst->type >= XD3_CPY && inst->type < 12); + + inst->code1 = 19 + 16 * mode; + + if (inst->size <= 18 && inst->size >= 4) + { + inst->code1 += inst->size - 3; + + if ( (prev != NULL) && + (prev->type == XD3_ADD) && + (prev->size <= 4) ) + { + if ( (inst->size <= 6) && + (mode <= 5) ) + { + prev->code2 = 163 + (mode * 12) + (3 * (prev->size - 1)) + (inst->size - 4); + + XD3_ASSERT (prev->code2 <= 234); + } + else if ( (inst->size == 4) && + (mode >= 6) ) + { + prev->code2 = 235 + ((mode - 6) * 4) + (prev->size - 1); + + XD3_ASSERT (prev->code2 <= 246); + } + } + } + + XD3_ASSERT (inst->code1 <= 162); + } + break; + } +} +#endif /* GENERIC_ENCODE_TABLES */ + +/*********************************************************************** + Instruction table encoder/decoder + ***********************************************************************/ + +#if GENERIC_ENCODE_TABLES +#if GENERIC_ENCODE_TABLES_COMPUTE == 0 + +/* In this case, we hard-code the result of + * compute_code_table_encoding for each alternate code table, + * presuming that saves time/space. This has been 131 bytes, but + * secondary compression was turned off. */ +static const uint8_t __alternate_code_table_compressed[178] = +{0xd6,0xc3,0xc4,0x00,0x00,0x01,0x8a,0x6f,0x40,0x81,0x27,0x8c,0x00,0x00,0x4a,0x4a,0x0d,0x02,0x01,0x03, +0x01,0x03,0x00,0x01,0x00,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e, +0x0f,0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x00,0x01,0x01,0x01,0x02,0x02,0x02,0x03,0x03,0x03,0x04, +0x04,0x04,0x04,0x00,0x04,0x05,0x06,0x01,0x02,0x03,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x05,0x05,0x05, +0x06,0x06,0x06,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x00,0x02,0x00,0x18,0x13,0x63,0x00,0x1b,0x00,0x54, +0x00,0x15,0x23,0x6f,0x00,0x28,0x13,0x54,0x00,0x15,0x01,0x1a,0x31,0x23,0x6c,0x0d,0x23,0x48,0x00,0x15, +0x93,0x6f,0x00,0x28,0x04,0x23,0x51,0x04,0x32,0x00,0x2b,0x00,0x12,0x00,0x12,0x00,0x12,0x00,0x12,0x00, +0x12,0x00,0x12,0x53,0x57,0x9c,0x07,0x43,0x6f,0x00,0x34,0x00,0x0c,0x00,0x0c,0x00,0x0c,0x00,0x0c,0x00, +0x0c,0x00,0x0c,0x00,0x15,0x00,0x82,0x6f,0x00,0x15,0x12,0x0c,0x00,0x03,0x03,0x00,0x06,0x00,}; + +static int +xd3_compute_alternate_table_encoding (xd3_stream *stream, const uint8_t **data, usize_t *size) +{ + (*data) = __alternate_code_table_compressed; + (*size) = sizeof (__alternate_code_table_compressed); + return 0; +} + +#else + +/* The alternate code table will be computed and stored here. */ +static uint8_t __alternate_code_table_compressed[CODE_TABLE_VCDIFF_SIZE]; +static usize_t __alternate_code_table_compressed_size; + +/* This function generates a delta describing the code table for + * encoding within a VCDIFF file. This function is NOT thread safe + * because it is only intended that this function is used to generate + * statically-compiled strings. */ +int xd3_compute_code_table_encoding (xd3_stream *in_stream, + const xd3_dinst *code_table, + uint8_t *comp_string, + usize_t *comp_string_size) +{ + /* TODO: use xd3_encode_memory() */ + uint8_t dflt_string[CODE_TABLE_STRING_SIZE]; + uint8_t code_string[CODE_TABLE_STRING_SIZE]; + xd3_stream stream; + xd3_source source; + xd3_config config; + int ret; + + memset (& source, 0, sizeof (source)); + + xd3_compute_code_table_string (xd3_rfc3284_code_table (), dflt_string); + xd3_compute_code_table_string (code_table, code_string); + + /* Use DJW secondary compression if it is on by default. This saves + * about 20 bytes. */ + xd3_init_config (& config, XD3_FLUSH | (SECONDARY_DJW ? XD3_SEC_DJW : 0)); + + /* Be exhaustive. */ + config.sprevsz = 1<<11; + config.srcwin_maxsz = CODE_TABLE_STRING_SIZE; + + config.smatch_cfg = XD3_SMATCH_SOFT; + config.smatcher_soft.large_look = 4; + config.smatcher_soft.large_step = 1; + config.smatcher_soft.small_look = 4; + config.smatcher_soft.small_chain = CODE_TABLE_STRING_SIZE; + config.smatcher_soft.small_lchain = CODE_TABLE_STRING_SIZE; + config.smatcher_soft.max_lazy = CODE_TABLE_STRING_SIZE; + config.smatcher_soft.long_enough = CODE_TABLE_STRING_SIZE; + + if ((ret = xd3_config_stream (& stream, & config))) { goto fail; } + + source.size = CODE_TABLE_STRING_SIZE; + source.blksize = CODE_TABLE_STRING_SIZE; + source.onblk = CODE_TABLE_STRING_SIZE; + source.name = ""; + source.curblk = dflt_string; + source.curblkno = 0; + + if ((ret = xd3_set_source (& stream, & source))) { goto fail; } + + if ((ret = xd3_encode_stream (& stream, code_string, CODE_TABLE_STRING_SIZE, + comp_string, comp_string_size, CODE_TABLE_VCDIFF_SIZE))) { goto fail; } + + fail: + + in_stream->msg = stream.msg; + xd3_free_stream (& stream); + return ret; +} + +/* Compute a delta between alternate and rfc3284 tables. As soon as + * another alternate table is added, this code should become generic. + * For now there is only one alternate table for testing. */ +static int +xd3_compute_alternate_table_encoding (xd3_stream *stream, const uint8_t **data, usize_t *size) +{ + int ret; + + if (__alternate_code_table_compressed[0] == 0) + { + if ((ret = xd3_compute_code_table_encoding (stream, xd3_alternate_code_table (), + __alternate_code_table_compressed, + & __alternate_code_table_compressed_size))) + { + return ret; + } + + /* During development of a new code table, enable this variable to print + * the new static contents and determine its size. At run time the + * table will be filled in appropriately, but at least it should have + * the proper size beforehand. */ +#if GENERIC_ENCODE_TABLES_COMPUTE_PRINT + { + int i; + + DP(RINT, "\nstatic const usize_t __alternate_code_table_compressed_size = %u;\n", + __alternate_code_table_compressed_size); + + DP(RINT, "static const uint8_t __alternate_code_table_compressed[%u] =\n{", + __alternate_code_table_compressed_size); + + for (i = 0; i < __alternate_code_table_compressed_size; i += 1) + { + DP(RINT, "0x%02x,", __alternate_code_table_compressed[i]); + if ((i % 20) == 19) { DP(RINT, "\n"); } + } + + DP(RINT, "};\n"); + } +#endif + } + + (*data) = __alternate_code_table_compressed; + (*size) = __alternate_code_table_compressed_size; + + return 0; +} +#endif /* GENERIC_ENCODE_TABLES_COMPUTE != 0 */ +#endif /* GENERIC_ENCODE_TABLES */ + +#endif /* XD3_ENCODER */ + +/* This function generates the 1536-byte string specified in sections 5.4 and + * 7 of rfc3284, which is used to represent a code table within a VCDIFF + * file. */ +void xd3_compute_code_table_string (const xd3_dinst *code_table, uint8_t *str) +{ + int i, s; + + XD3_ASSERT (CODE_TABLE_STRING_SIZE == 6 * 256); + + for (s = 0; s < 6; s += 1) + { + for (i = 0; i < 256; i += 1) + { + switch (s) + { + case 0: *str++ = (code_table[i].type1 >= XD3_CPY ? XD3_CPY : code_table[i].type1); break; + case 1: *str++ = (code_table[i].type2 >= XD3_CPY ? XD3_CPY : code_table[i].type2); break; + case 2: *str++ = (code_table[i].size1); break; + case 3: *str++ = (code_table[i].size2); break; + case 4: *str++ = (code_table[i].type1 >= XD3_CPY ? code_table[i].type1 - XD3_CPY : 0); break; + case 5: *str++ = (code_table[i].type2 >= XD3_CPY ? code_table[i].type2 - XD3_CPY : 0); break; + } + } + } +} + +/* This function translates the code table string into the internal representation. The + * stream's near and same-modes should already be set. */ +static int +xd3_apply_table_string (xd3_stream *stream, const uint8_t *code_string) +{ + int i, s; + int modes = TOTAL_MODES (stream); + xd3_dinst *code_table; + + if ((code_table = stream->code_table_alloc = + (xd3_dinst*) xd3_alloc (stream, sizeof (xd3_dinst), 256)) == NULL) + { + return ENOMEM; + } + + for (s = 0; s < 6; s += 1) + { + for (i = 0; i < 256; i += 1) + { + switch (s) + { + case 0: + if (*code_string > XD3_CPY) + { + stream->msg = "invalid code-table opcode"; + return XD3_INTERNAL; + } + code_table[i].type1 = *code_string++; + break; + case 1: + if (*code_string > XD3_CPY) + { + stream->msg = "invalid code-table opcode"; + return XD3_INTERNAL; + } + code_table[i].type2 = *code_string++; + break; + case 2: + if (*code_string != 0 && code_table[i].type1 == XD3_NOOP) + { + stream->msg = "invalid code-table size"; + return XD3_INTERNAL; + } + code_table[i].size1 = *code_string++; + break; + case 3: + if (*code_string != 0 && code_table[i].type2 == XD3_NOOP) + { + stream->msg = "invalid code-table size"; + return XD3_INTERNAL; + } + code_table[i].size2 = *code_string++; + break; + case 4: + if (*code_string >= modes) + { + stream->msg = "invalid code-table mode"; + return XD3_INTERNAL; + } + if (*code_string != 0 && code_table[i].type1 != XD3_CPY) + { + stream->msg = "invalid code-table mode"; + return XD3_INTERNAL; + } + code_table[i].type1 += *code_string++; + break; + case 5: + if (*code_string >= modes) + { + stream->msg = "invalid code-table mode"; + return XD3_INTERNAL; + } + if (*code_string != 0 && code_table[i].type2 != XD3_CPY) + { + stream->msg = "invalid code-table mode"; + return XD3_INTERNAL; + } + code_table[i].type2 += *code_string++; + break; + } + } + } + + stream->code_table = code_table; + return 0; +} + +/* This function applies a code table delta and returns an actual code table. */ +static int +xd3_apply_table_encoding (xd3_stream *in_stream, const uint8_t *data, usize_t size) +{ + uint8_t dflt_string[CODE_TABLE_STRING_SIZE]; + uint8_t code_string[CODE_TABLE_STRING_SIZE]; + usize_t code_size; + xd3_stream stream; + xd3_source source; + int ret; + + /* The default code table string can be cached if alternate code tables ever become + * popular. */ + xd3_compute_code_table_string (xd3_rfc3284_code_table (), dflt_string); + + source.size = CODE_TABLE_STRING_SIZE; + source.blksize = CODE_TABLE_STRING_SIZE; + source.onblk = CODE_TABLE_STRING_SIZE; + source.name = "rfc3284 code table"; + source.curblk = dflt_string; + source.curblkno = 0; + + if ((ret = xd3_config_stream (& stream, NULL)) || + (ret = xd3_set_source (& stream, & source)) || + (ret = xd3_decode_stream (& stream, data, size, code_string, & code_size, sizeof (code_string)))) + { + in_stream->msg = stream.msg; + goto fail; + } + + if (code_size != sizeof (code_string)) + { + stream.msg = "corrupt code-table encoding"; + ret = XD3_INTERNAL; + goto fail; + } + + if ((ret = xd3_apply_table_string (in_stream, code_string))) { goto fail; } + + fail: + + xd3_free_stream (& stream); + return ret; +} + +/***********************************************************************/ + +static inline void +xd3_swap_uint8p (uint8_t** p1, uint8_t** p2) +{ + uint8_t *t = (*p1); + (*p1) = (*p2); + (*p2) = t; +} + +static inline void +xd3_swap_usize_t (usize_t* p1, usize_t* p2) +{ + usize_t t = (*p1); + (*p1) = (*p2); + (*p2) = t; +} + +/* It's not constant time, but it computes the log. */ +static int +xd3_check_pow2 (usize_t value, usize_t *logof) +{ + usize_t x = 1; + usize_t nolog; + if (logof == NULL) { + logof = &nolog; + } + + *logof = 0; + + for (; x != 0; x <<= 1, *logof += 1) + { + if (x == value) + { + return 0; + } + } + + return XD3_INTERNAL; +} + +static usize_t +xd3_pow2_roundup (usize_t x) +{ + usize_t i = 1; + while (x > i) { + i <<= 1; + } + return i; +} + +static usize_t +xd3_round_blksize (usize_t sz, usize_t blksz) +{ + usize_t mod = sz & (blksz-1); + + XD3_ASSERT (xd3_check_pow2 (blksz, NULL) == 0); + + return mod ? (sz + (blksz - mod)) : sz; +} + +/*********************************************************************** + Adler32 stream function: code copied from Zlib, defined in RFC1950 + ***********************************************************************/ + +#define A32_BASE 65521L /* Largest prime smaller than 2^16 */ +#define A32_NMAX 5552 /* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */ + +#define A32_DO1(buf,i) {s1 += buf[i]; s2 += s1;} +#define A32_DO2(buf,i) A32_DO1(buf,i); A32_DO1(buf,i+1); +#define A32_DO4(buf,i) A32_DO2(buf,i); A32_DO2(buf,i+2); +#define A32_DO8(buf,i) A32_DO4(buf,i); A32_DO4(buf,i+4); +#define A32_DO16(buf) A32_DO8(buf,0); A32_DO8(buf,8); + +static unsigned long adler32 (unsigned long adler, const uint8_t *buf, usize_t len) +{ + unsigned long s1 = adler & 0xffff; + unsigned long s2 = (adler >> 16) & 0xffff; + int k; + + while (len > 0) + { + k = (len < A32_NMAX) ? len : A32_NMAX; + len -= k; + + while (k >= 16) + { + A32_DO16(buf); + buf += 16; + k -= 16; + } + + if (k != 0) + { + do + { + s1 += *buf++; + s2 += s1; + } + while (--k); + } + + s1 %= A32_BASE; + s2 %= A32_BASE; + } + + return (s2 << 16) | s1; +} + +/*********************************************************************** + Run-length function + ***********************************************************************/ + +#if XD3_ENCODER +static int +xd3_comprun (const uint8_t *seg, int slook, uint8_t *run_cp) +{ + int i; + int run_l = 0; + uint8_t run_c = 0; + + for (i = 0; i < slook; i += 1) + { + NEXTRUN(seg[i]); + } + + (*run_cp) = run_c; + + return run_l; +} +#endif + +/*********************************************************************** + Basic encoder/decoder functions + ***********************************************************************/ + +static inline int +xd3_decode_byte (xd3_stream *stream, usize_t *val) +{ + if (stream->avail_in == 0) + { + stream->msg = "further input required"; + return XD3_INPUT; + } + + (*val) = stream->next_in[0]; + + DECODE_INPUT (1); + return 0; +} + +static inline int +xd3_decode_bytes (xd3_stream *stream, uint8_t *buf, usize_t *pos, usize_t size) +{ + usize_t want; + usize_t take; + + /* Note: The case where (*pos == size) happens when a zero-length appheader or code + * table is transmitted, but there is nothing in the standard against that. */ + + while (*pos < size) + { + if (stream->avail_in == 0) + { + stream->msg = "further input required"; + return XD3_INPUT; + } + + want = size - *pos; + take = min (want, stream->avail_in); + + memcpy (buf + *pos, stream->next_in, take); + + DECODE_INPUT (take); + (*pos) += take; + } + + return 0; +} + +#if XD3_ENCODER +static inline int +xd3_emit_byte (xd3_stream *stream, + xd3_output **outputp, + uint8_t code) +{ + xd3_output *output = (*outputp); + + if (output->next == output->avail) + { + xd3_output *aoutput; + + if ((aoutput = xd3_alloc_output (stream, output)) == NULL) + { + return ENOMEM; + } + + output = (*outputp) = aoutput; + } + + output->base[output->next++] = code; + + return 0; +} + +static inline int +xd3_emit_bytes (xd3_stream *stream, + xd3_output **outputp, + const uint8_t *base, + usize_t size) +{ + xd3_output *output = (*outputp); + + do + { + usize_t take; + + if (output->next == output->avail) + { + xd3_output *aoutput; + + if ((aoutput = xd3_alloc_output (stream, output)) == NULL) + { + return ENOMEM; + } + + output = (*outputp) = aoutput; + } + + take = min (output->avail - output->next, size); + + memcpy (output->base + output->next, base, take); + + output->next += take; + size -= take; + base += take; + } + while (size > 0); + + return 0; +} +#endif /* XD3_ENCODER */ + +/********************************************************************* + Integer encoder/decoder functions + **********************************************************************/ + +#define DECODE_INTEGER_TYPE(PART,OFLOW) \ + while (stream->avail_in != 0) \ + { \ + usize_t next = stream->next_in[0]; \ + \ + DECODE_INPUT(1); \ + \ + if (PART & OFLOW) \ + { \ + stream->msg = "overflow in decode_integer"; \ + return XD3_INVALID_INPUT; \ + } \ + \ + PART = (PART << 7) | (next & 127); \ + \ + if ((next & 128) == 0) \ + { \ + (*val) = PART; \ + PART = 0; \ + return 0; \ + } \ + } \ + \ + stream->msg = "further input required"; \ + return XD3_INPUT + +#define READ_INTEGER_TYPE(TYPE, OFLOW) \ + TYPE val = 0; \ + const uint8_t *inp = (*inpp); \ + usize_t next; \ + \ + do \ + { \ + if (inp == max) \ + { \ + stream->msg = "end-of-input in read_integer"; \ + return XD3_INVALID_INPUT; \ + } \ + \ + if (val & OFLOW) \ + { \ + stream->msg = "overflow in read_intger"; \ + return XD3_INVALID_INPUT; \ + } \ + \ + next = (*inp++); \ + val = (val << 7) | (next & 127); \ + } \ + while (next & 128); \ + \ + (*valp) = val; \ + (*inpp) = inp; \ + \ + return 0 + +#define EMIT_INTEGER_TYPE() \ + /* max 64-bit value in base-7 encoding is 9.1 bytes */ \ + uint8_t buf[10]; \ + usize_t bufi = 10; \ + \ + XD3_ASSERT (num >= 0); \ + \ + /* This loop performs division and turns on all MSBs. */ \ + do \ + { \ + buf[--bufi] = (num & 127) | 128; \ + num >>= 7; \ + } \ + while (num != 0); \ + \ + /* Turn off MSB of the last byte. */ \ + buf[9] &= 127; \ + \ + XD3_ASSERT (bufi >= 0); \ + \ + return xd3_emit_bytes (stream, output, buf + bufi, 10 - bufi) + +#define IF_SIZEOF32(x) if (num < (1U << (7 * (x)))) return (x); +#define IF_SIZEOF64(x) if (num < (1ULL << (7 * (x)))) return (x); + +#if USE_UINT32 +static inline uint32_t +xd3_sizeof_uint32_t (uint32_t num) +{ + IF_SIZEOF32(1); + IF_SIZEOF32(2); + IF_SIZEOF32(3); + IF_SIZEOF32(4); + return 5; +} + +static inline int +xd3_decode_uint32_t (xd3_stream *stream, uint32_t *val) +{ DECODE_INTEGER_TYPE (stream->dec_32part, UINT32_OFLOW_MASK); } + +static inline int +xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp, + const uint8_t *max, uint32_t *valp) +{ READ_INTEGER_TYPE (uint32_t, UINT32_OFLOW_MASK); } + +#if XD3_ENCODER +static inline int +xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output, uint32_t num) +{ EMIT_INTEGER_TYPE (); } +#endif +#endif + +#if USE_UINT64 +static inline int +xd3_decode_uint64_t (xd3_stream *stream, uint64_t *val) +{ DECODE_INTEGER_TYPE (stream->dec_64part, UINT64_OFLOW_MASK); } + +#if XD3_ENCODER +static inline int +xd3_emit_uint64_t (xd3_stream *stream, xd3_output **output, uint64_t num) +{ EMIT_INTEGER_TYPE (); } +#endif + +/* These are tested but not used */ +#if REGRESSION_TEST +static int +xd3_read_uint64_t (xd3_stream *stream, const uint8_t **inpp, + const uint8_t *max, uint64_t *valp) +{ READ_INTEGER_TYPE (uint64_t, UINT64_OFLOW_MASK); } + +static uint32_t +xd3_sizeof_uint64_t (uint64_t num) +{ + IF_SIZEOF64(1); + IF_SIZEOF64(2); + IF_SIZEOF64(3); + IF_SIZEOF64(4); + IF_SIZEOF64(5); + IF_SIZEOF64(6); + IF_SIZEOF64(7); + IF_SIZEOF64(8); + IF_SIZEOF64(9); + + return 10; +} +#endif + +#endif + +/*********************************************************************** + Address cache stuff + ***********************************************************************/ + +static int +xd3_alloc_cache (xd3_stream *stream) +{ + if (stream->acache.near_array != NULL) + { + xd3_free (stream, stream->acache.near_array); + } + + if (stream->acache.same_array != NULL) + { + xd3_free (stream, stream->acache.same_array); + } + + if (((stream->acache.s_near > 0) && + (stream->acache.near_array = (usize_t*) + xd3_alloc (stream, stream->acache.s_near, sizeof (usize_t))) + == NULL) || + ((stream->acache.s_same > 0) && + (stream->acache.same_array = (usize_t*) + xd3_alloc (stream, stream->acache.s_same * 256, sizeof (usize_t))) + == NULL)) + { + return ENOMEM; + } + + return 0; +} + +void +xd3_init_cache (xd3_addr_cache* acache) +{ + if (acache->s_near > 0) + { + memset (acache->near_array, 0, acache->s_near * sizeof (usize_t)); + acache->next_slot = 0; + } + + if (acache->s_same > 0) + { + memset (acache->same_array, 0, acache->s_same * 256 * sizeof (usize_t)); + } +} + +static void +xd3_update_cache (xd3_addr_cache* acache, usize_t addr) +{ + if (acache->s_near > 0) + { + acache->near_array[acache->next_slot] = addr; + acache->next_slot = (acache->next_slot + 1) % acache->s_near; + } + + if (acache->s_same > 0) + { + acache->same_array[addr % (acache->s_same*256)] = addr; + } +} + +#if XD3_ENCODER +/* OPT: this gets called a lot, can it be optimized? */ +static int +xd3_encode_address (xd3_stream *stream, usize_t addr, usize_t here, uint8_t* mode) +{ + usize_t d, bestd; + usize_t i, bestm, ret; + xd3_addr_cache* acache = & stream->acache; + +#define SMALLEST_INT(x) do { if (((x) & ~127) == 0) { goto good; } } while (0) + + /* Attempt to find the address mode that yields the smallest integer value + * for "d", the encoded address value, thereby minimizing the encoded size + * of the address. */ + bestd = addr; + bestm = VCD_SELF; + + XD3_ASSERT (addr < here); + + SMALLEST_INT (bestd); + + if ((d = here-addr) < bestd) + { + bestd = d; + bestm = VCD_HERE; + + SMALLEST_INT (bestd); + } + + for (i = 0; i < acache->s_near; i += 1) + { + d = addr - acache->near_array[i]; + + if (d >= 0 && d < bestd) + { + bestd = d; + bestm = i+2; /* 2 counts the VCD_SELF, VCD_HERE modes */ + + SMALLEST_INT (bestd); + } + } + + if (acache->s_same > 0 && acache->same_array[d = addr%(acache->s_same*256)] == addr) + { + bestd = d%256; + bestm = acache->s_near + 2 + d/256; /* 2 + s_near offsets past the VCD_NEAR modes */ + + if ((ret = xd3_emit_byte (stream, & ADDR_TAIL (stream), bestd))) { return ret; } + } + else + { + good: + + if ((ret = xd3_emit_size (stream, & ADDR_TAIL (stream), bestd))) { return ret; } + } + + xd3_update_cache (acache, addr); + + (*mode) += bestm; + + return 0; +} +#endif + +static int +xd3_decode_address (xd3_stream *stream, usize_t here, + usize_t mode, const uint8_t **inpp, + const uint8_t *max, uint32_t *valp) +{ + int ret; + usize_t same_start = 2 + stream->acache.s_near; + + if (mode < same_start) + { + if ((ret = xd3_read_size (stream, inpp, max, valp))) { return ret; } + + switch (mode) + { + case VCD_SELF: + break; + case VCD_HERE: + (*valp) = here - (*valp); + break; + default: + (*valp) += stream->acache.near_array[mode - 2]; + break; + } + } + else + { + if (*inpp == max) + { + stream->msg = "address underflow"; + return XD3_INVALID_INPUT; + } + + mode -= same_start; + + (*valp) = stream->acache.same_array[mode*256 + (**inpp)]; + + (*inpp) += 1; + } + + xd3_update_cache (& stream->acache, *valp); + + return 0; +} + +/*********************************************************************** + Alloc/free +***********************************************************************/ + +static void* +__xd3_alloc_func (void* opaque, usize_t items, usize_t size) +{ + return malloc (items * size); +} + +static void +__xd3_free_func (void* opaque, void* address) +{ + free (address); +} + +static void* +xd3_alloc (xd3_stream *stream, + usize_t elts, + usize_t size) +{ + void *a = stream->alloc (stream->opaque, elts, size); + + if (a != NULL) + { + IF_DEBUG (stream->alloc_cnt += 1); + IF_DEBUG2 (DP(RINT "[stream %p malloc] size %u ptr %p\n", + stream, elts * size, a)); + } + else + { + stream->msg = "out of memory"; + } + + return a; +} + +static void +xd3_free (xd3_stream *stream, + void *ptr) +{ + if (ptr != NULL) + { + IF_DEBUG (stream->free_cnt += 1); + XD3_ASSERT (stream->free_cnt <= stream->alloc_cnt); + IF_DEBUG2 (DP(RINT "[stream %p free] %p\n", + stream, ptr)); + stream->free (stream->opaque, ptr); + } +} + +#if XD3_ENCODER +static void* +xd3_alloc0 (xd3_stream *stream, + usize_t elts, + usize_t size) +{ + void *a = xd3_alloc (stream, elts, size); + + if (a != NULL) + { + memset (a, 0, elts * size); + } + + return a; +} + +static xd3_output* +xd3_alloc_output (xd3_stream *stream, + xd3_output *old_output) +{ + xd3_output *output; + uint8_t *base; + + if (stream->enc_free != NULL) + { + output = stream->enc_free; + stream->enc_free = output->next_page; + } + else + { + if ((output = (xd3_output*) xd3_alloc (stream, 1, sizeof (xd3_output))) == NULL) + { + return NULL; + } + + if ((base = (uint8_t*) xd3_alloc (stream, XD3_ALLOCSIZE, sizeof (uint8_t))) == NULL) + { + xd3_free (stream, output); + return NULL; + } + + output->base = base; + output->avail = XD3_ALLOCSIZE; + } + + output->next = 0; + + if (old_output) + { + old_output->next_page = output; + } + + output->next_page = NULL; + + return output; +} + +static usize_t +xd3_sizeof_output (xd3_output *output) +{ + usize_t s = 0; + + for (; output; output = output->next_page) + { + s += output->next; + } + + return s; +} + +static void +xd3_freelist_output (xd3_stream *stream, + xd3_output *output) +{ + xd3_output *tmp; + + while (output) + { + tmp = output; + output = output->next_page; + + tmp->next = 0; + tmp->next_page = stream->enc_free; + stream->enc_free = tmp; + } +} + +static void +xd3_free_output (xd3_stream *stream, + xd3_output *output) +{ + xd3_output *next; + + again: + if (output == NULL) + { + return; + } + + next = output->next_page; + + xd3_free (stream, output->base); + xd3_free (stream, output); + + output = next; + goto again; +} +#endif /* XD3_ENCODER */ + +void +xd3_free_stream (xd3_stream *stream) +{ + xd3_iopt_buflist *blist = stream->iopt_alloc; + + while (blist != NULL) + { + xd3_iopt_buflist *tmp = blist; + blist = blist->next; + xd3_free (stream, tmp->buffer); + xd3_free (stream, tmp); + } + + xd3_free (stream, stream->large_table); + xd3_free (stream, stream->small_table); + xd3_free (stream, stream->small_prev); + +#if XD3_ENCODER + { + int i; + for (i = 0; i < ENC_SECTS; i += 1) + { + xd3_free_output (stream, stream->enc_heads[i]); + } + xd3_free_output (stream, stream->enc_free); + } +#endif + + xd3_free (stream, stream->acache.near_array); + xd3_free (stream, stream->acache.same_array); + + xd3_free (stream, stream->inst_sect.copied1); + xd3_free (stream, stream->addr_sect.copied1); + xd3_free (stream, stream->data_sect.copied1); + + xd3_free (stream, stream->dec_buffer); + xd3_free (stream, (uint8_t*) stream->dec_lastwin); + + xd3_free (stream, stream->buf_in); + xd3_free (stream, stream->dec_appheader); + xd3_free (stream, stream->dec_codetbl); + xd3_free (stream, stream->code_table_alloc); + +#if SECONDARY_ANY + xd3_free (stream, stream->inst_sect.copied2); + xd3_free (stream, stream->addr_sect.copied2); + xd3_free (stream, stream->data_sect.copied2); + + if (stream->sec_type != NULL) + { + stream->sec_type->destroy (stream, stream->sec_stream_d); + stream->sec_type->destroy (stream, stream->sec_stream_i); + stream->sec_type->destroy (stream, stream->sec_stream_a); + } +#endif + + xd3_free (stream, stream->whole_target.adds); + xd3_free (stream, stream->whole_target.inst); + xd3_free (stream, stream->whole_target.wininfo); + + XD3_ASSERT (stream->alloc_cnt == stream->free_cnt); + + memset (stream, 0, sizeof (xd3_stream)); +} + +#if (XD3_DEBUG > 1 || VCDIFF_TOOLS) +static const char* +xd3_rtype_to_string (xd3_rtype type, int print_mode) +{ + switch (type) + { + case XD3_NOOP: + return "NOOP "; + case XD3_RUN: + return "RUN "; + case XD3_ADD: + return "ADD "; + default: break; + } + if (! print_mode) + { + return "CPY "; + } + switch (type) + { + case XD3_CPY + 0: return "CPY_0"; + case XD3_CPY + 1: return "CPY_1"; + case XD3_CPY + 2: return "CPY_2"; + case XD3_CPY + 3: return "CPY_3"; + case XD3_CPY + 4: return "CPY_4"; + case XD3_CPY + 5: return "CPY_5"; + case XD3_CPY + 6: return "CPY_6"; + case XD3_CPY + 7: return "CPY_7"; + case XD3_CPY + 8: return "CPY_8"; + case XD3_CPY + 9: return "CPY_9"; + default: return "CPY>9"; + } +} +#endif + +/**************************************************************** + Stream configuration + ******************************************************************/ + +int +xd3_config_stream(xd3_stream *stream, + xd3_config *config) +{ + int ret; + xd3_config defcfg; + xd3_smatcher *smatcher = &stream->smatcher; + + if (config == NULL) + { + config = & defcfg; + memset (config, 0, sizeof (*config)); + } + + /* Initial setup: no error checks yet */ + memset (stream, 0, sizeof (*stream)); + + stream->winsize = config->winsize ? config->winsize : XD3_DEFAULT_WINSIZE; + stream->sprevsz = config->sprevsz ? config->sprevsz : XD3_DEFAULT_SPREVSZ; + stream->srcwin_maxsz = config->srcwin_maxsz ? + config->srcwin_maxsz : XD3_DEFAULT_SRCWINSZ; + + if (config->iopt_size == 0) + { + stream->iopt_size = XD3_ALLOCSIZE / sizeof(xd3_rinst); + stream->iopt_unlimited = 1; + } + else + { + stream->iopt_size = config->iopt_size; + } + + stream->getblk = config->getblk; + stream->alloc = config->alloc ? config->alloc : __xd3_alloc_func; + stream->free = config->freef ? config->freef : __xd3_free_func; + stream->opaque = config->opaque; + stream->flags = config->flags; + + /* Secondary setup. */ + stream->sec_data = config->sec_data; + stream->sec_inst = config->sec_inst; + stream->sec_addr = config->sec_addr; + + stream->sec_data.data_type = DATA_SECTION; + stream->sec_inst.data_type = INST_SECTION; + stream->sec_addr.data_type = ADDR_SECTION; + + /* Check static sizes. */ + if (sizeof (usize_t) != SIZEOF_USIZE_T || + sizeof (xoff_t) != SIZEOF_XOFF_T || + (ret = xd3_check_pow2(XD3_ALLOCSIZE, NULL))) + { + stream->msg = "incorrect compilation: wrong integer sizes"; + return XD3_INTERNAL; + } + + /* Check/set secondary compressor. */ + switch (stream->flags & XD3_SEC_TYPE) + { + case 0: + if (stream->flags & XD3_SEC_NOALL) + { + stream->msg = "XD3_SEC flags require a secondary compressor type"; + return XD3_INTERNAL; + } + break; + case XD3_SEC_FGK: + FGK_CASE (stream); + case XD3_SEC_DJW: + DJW_CASE (stream); + default: + stream->msg = "too many secondary compressor types set"; + return XD3_INTERNAL; + } + + /* Check/set encoder code table. */ + switch (stream->flags & XD3_ALT_CODE_TABLE) { + case 0: + stream->code_table_desc = & __rfc3284_code_table_desc; + stream->code_table_func = xd3_rfc3284_code_table; + break; +#if GENERIC_ENCODE_TABLES + case XD3_ALT_CODE_TABLE: + stream->code_table_desc = & __alternate_code_table_desc; + stream->code_table_func = xd3_alternate_code_table; + stream->comp_table_func = xd3_compute_alternate_table_encoding; + break; +#endif + default: + stream->msg = "alternate code table support was not compiled"; + return XD3_INTERNAL; + } + + /* Check sprevsz */ + if (smatcher->small_chain == 1 && + smatcher->small_lchain == 1) + { + stream->sprevsz = 0; + } + else + { + if ((ret = xd3_check_pow2 (stream->sprevsz, NULL))) + { + stream->msg = "sprevsz is required to be a power of two"; + return XD3_INTERNAL; + } + + stream->sprevmask = stream->sprevsz - 1; + } + + /* Default scanner settings. */ +#if XD3_ENCODER + switch (config->smatch_cfg) + { + IF_BUILD_SOFT(case XD3_SMATCH_SOFT: + { + *smatcher = config->smatcher_soft; + smatcher->string_match = __smatcher_soft.string_match; + smatcher->name = __smatcher_soft.name; + if (smatcher->large_look < MIN_MATCH || + smatcher->large_step < 1 || + smatcher->small_look < MIN_MATCH) + { + stream->msg = "invalid soft string-match config"; + return XD3_INVALID; + } + break; + }) + + IF_BUILD_DEFAULT(case XD3_SMATCH_DEFAULT: + *smatcher = __smatcher_default; + break;) + IF_BUILD_SLOW(case XD3_SMATCH_SLOW: + *smatcher = __smatcher_slow; + break;) + IF_BUILD_FASTEST(case XD3_SMATCH_FASTEST: + *smatcher = __smatcher_fastest; + break;) + IF_BUILD_FASTER(case XD3_SMATCH_FASTER: + *smatcher = __smatcher_faster; + break;) + IF_BUILD_FAST(case XD3_SMATCH_FAST: + *smatcher = __smatcher_fast; + break;) + default: + stream->msg = "invalid string match config type"; + return XD3_INTERNAL; + } + + if (config->smatch_cfg == XD3_SMATCH_DEFAULT && + (stream->flags & XD3_COMPLEVEL_MASK) != 0) + { + int level = (stream->flags & XD3_COMPLEVEL_MASK) >> XD3_COMPLEVEL_SHIFT; + + switch (level) + { + case 1: + IF_BUILD_FASTEST(*smatcher = __smatcher_fastest; + break;) + case 2: + IF_BUILD_FASTER(*smatcher = __smatcher_faster; + break;) + case 3: case 4: case 5: + IF_BUILD_FAST(*smatcher = __smatcher_fast; + break;) + case 6: + IF_BUILD_DEFAULT(*smatcher = __smatcher_default; + break;) + default: + IF_BUILD_SLOW(*smatcher = __smatcher_slow; + break;) + IF_BUILD_DEFAULT(*smatcher = __smatcher_default; + break;) + IF_BUILD_FAST(*smatcher = __smatcher_fast; + break;) + IF_BUILD_FASTER(*smatcher = __smatcher_faster; + break;) + IF_BUILD_FASTEST(*smatcher = __smatcher_fastest; + break;) + } + } +#endif + + return 0; +} + +/***************************************************************** + Getblk interface + ***********************************************************/ + +/* This function interfaces with the client getblk function, checks + * its results, etc. */ +static int +xd3_getblk (xd3_stream *stream, xoff_t blkno) +{ + int ret; + xd3_source *source = stream->src; + + if (source->curblk == NULL || + blkno != source->curblkno) + { + if (blkno >= source->blocks) + { + stream->msg = "source file too short"; + return XD3_INTERNAL; + } + + XD3_ASSERT (source->curblk != NULL || blkno != source->curblkno); + + source->getblkno = blkno; + + if (stream->getblk == NULL) + { + stream->msg = "getblk source input"; + return XD3_GETSRCBLK; + } + else if ((ret = stream->getblk (stream, source, blkno)) != 0) + { + stream->msg = "getblk failed"; + return ret; + } + + XD3_ASSERT (source->curblk != NULL); + } + + if (source->onblk != (blkno == source->blocks - 1 ? + source->onlastblk : source->blksize)) + { + stream->msg = "getblk returned short block"; + return XD3_INTERNAL; + } + + return 0; +} + +/*********************************************************** + Stream open/close + ***************************************************************/ + +int +xd3_set_source (xd3_stream *stream, + xd3_source *src) +{ + xoff_t blk_num; + usize_t tail_size, shiftby; + + IF_DEBUG1 (DP(RINT "[set source] size %"Q"u\n", src->size)); + + if (src == NULL || src->size < stream->smatcher.large_look) { return 0; } + + stream->src = src; + + // If src->blksize is a power-of-two, xd3_blksize_div() will use + // shift and mask rather than divide. Check that here. + if (xd3_check_pow2 (src->blksize, &shiftby) == 0) + { + src->shiftby = shiftby; + src->maskby = (1 << shiftby) - 1; + } + else if (src->size <= src->blksize) + { + int x = xd3_pow2_roundup (src->blksize); + xd3_check_pow2 (x, &shiftby); + src->shiftby = shiftby; + src->maskby = (1 << shiftby) - 1; + } + else + { + src->shiftby = 0; + src->maskby = 0; + } + + xd3_blksize_div (src->size, src, &blk_num, &tail_size); + src->blocks = blk_num + (tail_size > 0); + src->onlastblk = xd3_bytes_on_srcblk (src, src->blocks - 1); + src->srclen = 0; + src->srcbase = 0; + + return 0; +} + +void +xd3_abort_stream (xd3_stream *stream) +{ + stream->dec_state = DEC_ABORTED; + stream->enc_state = ENC_ABORTED; +} + +int +xd3_close_stream (xd3_stream *stream) +{ + if (stream->enc_state != 0 && stream->enc_state != ENC_ABORTED) + { + if (stream->buf_leftover != NULL) + { + stream->msg = "encoding is incomplete"; + return XD3_INTERNAL; + } + + if (stream->enc_state == ENC_POSTWIN) + { +#if XD3_ENCODER + xd3_encode_reset (stream); +#endif + stream->current_window += 1; + stream->enc_state = ENC_INPUT; + } + + /* If encoding, should be ready for more input but not actually + have any. */ + if (stream->enc_state != ENC_INPUT || stream->avail_in != 0) + { + stream->msg = "encoding is incomplete"; + return XD3_INTERNAL; + } + } + else + { + switch (stream->dec_state) + { + case DEC_VCHEAD: + case DEC_WININD: + /* TODO: Address the zero-byte ambiguity. Does the encoder + * emit a window or not? If so, then catch an error here. + * If not, need another routine to say + * decode_at_least_one_if_empty. */ + case DEC_ABORTED: + break; + default: + /* If decoding, should be ready for the next window. */ + stream->msg = "EOF in decode"; + return XD3_INTERNAL; + } + } + + return 0; +} + +/************************************************************** + Application header + ****************************************************************/ + +int +xd3_get_appheader (xd3_stream *stream, + uint8_t **data, + usize_t *size) +{ + if (stream->dec_state < DEC_WININD) + { + stream->msg = "application header not available"; + return XD3_INTERNAL; + } + + (*data) = stream->dec_appheader; + (*size) = stream->dec_appheadsz; + return 0; +} + +/********************************************************** + Decoder stuff + *************************************************/ + +#include "xdelta3-decode.h" + +/**************************************************************** + Encoder stuff + *****************************************************************/ + +#if XD3_ENCODER +void +xd3_set_appheader (xd3_stream *stream, + const uint8_t *data, + usize_t size) +{ + stream->enc_appheader = data; + stream->enc_appheadsz = size; +} + +#if XD3_DEBUG +static int +xd3_iopt_check (xd3_stream *stream) +{ + usize_t ul = xd3_rlist_length (& stream->iopt_used); + usize_t fl = xd3_rlist_length (& stream->iopt_free); + + return (ul + fl + (stream->iout ? 1 : 0)) == stream->iopt_size; +} +#endif + +static xd3_rinst* +xd3_iopt_free (xd3_stream *stream, xd3_rinst *i) +{ + xd3_rinst *n = xd3_rlist_remove (i); + xd3_rlist_push_back (& stream->iopt_free, i); + return n; +} + +static void +xd3_iopt_free_nonadd (xd3_stream *stream, xd3_rinst *i) +{ + if (i->type != XD3_ADD) + { + xd3_rlist_push_back (& stream->iopt_free, i); + } +} + +/* When an instruction is ready to flush from the iopt buffer, this + * function is called to produce an encoding. It writes the + * instruction plus size, address, and data to the various encoding + * sections. */ +static int +xd3_iopt_finish_encoding (xd3_stream *stream, xd3_rinst *inst) +{ + int ret; + + /* Check for input overflow. */ + XD3_ASSERT (inst->pos + inst->size <= stream->avail_in); + + switch (inst->type) + { + case XD3_CPY: + { + /* the address may have an offset if there is a source window. */ + usize_t addr; + xd3_source *src = stream->src; + + if (src != NULL) + { + /* If there is a source copy, the source must have its + * source window decided before we can encode. This can + * be bad -- we have to make this decision even if no + * source matches have been found. */ + if (stream->srcwin_decided == 0) + { + if ((ret = xd3_srcwin_setup (stream))) { return ret; } + } + + /* xtra field indicates the copy is from the source */ + if (inst->xtra) + { + XD3_ASSERT (inst->addr >= src->srcbase); + XD3_ASSERT (inst->addr + inst->size <= src->srcbase + src->srclen); + addr = (inst->addr - src->srcbase); + stream->n_scpy += 1; + stream->l_scpy += inst->size; + } + else + { + /* with source window: target copy address is offset by taroff. */ + addr = stream->taroff + (usize_t) inst->addr; + stream->n_tcpy += 1; + stream->l_tcpy += inst->size; + } + } + else + { + addr = (usize_t) inst->addr; + stream->n_tcpy += 1; + stream->l_tcpy += inst->size; + } + + /* Note: used to assert inst->size >= MIN_MATCH, but not true + * for merge operations & identical match heuristics. */ + /* the "here" position is always offset by taroff */ + if ((ret = xd3_encode_address (stream, addr, inst->pos + stream->taroff, + & inst->type))) + { + return ret; + } + + IF_DEBUG1 ({ + static int cnt; + DP(RINT "[iopt copy:%d] pos %"Q"u-%"Q"u addr %"Q"u-%"Q"u size %u\n", + cnt++, + stream->total_in + inst->pos, + stream->total_in + inst->pos + inst->size, + inst->addr, inst->addr + inst->size, inst->size); + }); + break; + } + case XD3_RUN: + { + XD3_ASSERT (inst->size >= MIN_MATCH); + + if ((ret = xd3_emit_byte (stream, & DATA_TAIL (stream), inst->xtra))) { return ret; } + + stream->n_run += 1; + stream->l_run += inst->size; + + IF_DEBUG1 ({ + static int cnt; + DP(RINT "[iopt run:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size); + }); + break; + } + case XD3_ADD: + { + if ((ret = xd3_emit_bytes (stream, & DATA_TAIL (stream), + stream->next_in + inst->pos, inst->size))) { return ret; } + + stream->n_add += 1; + stream->l_add += inst->size; + + IF_DEBUG1 ({ + static int cnt; + DP(RINT "[iopt add:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size); + }); + + break; + } + } + + /* This is the only place stream->unencoded_offset is incremented. */ + XD3_ASSERT (stream->unencoded_offset == inst->pos); + stream->unencoded_offset += inst->size; + + inst->code2 = 0; + + XD3_CHOOSE_INSTRUCTION (stream, stream->iout, inst); + + if (stream->iout != NULL) + { + if (stream->iout->code2 != 0) + { + if ((ret = xd3_emit_double (stream, stream->iout, inst, stream->iout->code2))) { return ret; } + + xd3_iopt_free_nonadd (stream, stream->iout); + xd3_iopt_free_nonadd (stream, inst); + stream->iout = NULL; + return 0; + } + else + { + if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; } + + xd3_iopt_free_nonadd (stream, stream->iout); + } + } + + stream->iout = inst; + + return 0; +} + +/* This possibly encodes an add instruction, iadd, which must remain + * on the stack until the following call to + * xd3_iopt_finish_encoding. */ +static int +xd3_iopt_add (xd3_stream *stream, usize_t pos, xd3_rinst *iadd) +{ + int ret; + usize_t off = stream->unencoded_offset; + + if (pos > off) + { + iadd->type = XD3_ADD; + iadd->pos = off; + iadd->size = pos - off; + + if ((ret = xd3_iopt_finish_encoding (stream, iadd))) { return ret; } + } + + return 0; +} + +/* This function calls xd3_iopt_finish_encoding to finish encoding an + * instruction, and it may also produce an add instruction for an + * unmatched region. */ +static int +xd3_iopt_add_encoding (xd3_stream *stream, xd3_rinst *inst) +{ + int ret; + xd3_rinst iadd; + + if ((ret = xd3_iopt_add (stream, inst->pos, & iadd))) { return ret; } + + if ((ret = xd3_iopt_finish_encoding (stream, inst))) { return ret; } + + return 0; +} + +/* Generates a final add instruction to encode the remaining input. */ +static int +xd3_iopt_add_finalize (xd3_stream *stream) +{ + int ret; + xd3_rinst iadd; + + if ((ret = xd3_iopt_add (stream, stream->avail_in, & iadd))) { return ret; } + + if (stream->iout) + { + if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; } + + xd3_iopt_free_nonadd (stream, stream->iout); + stream->iout = NULL; + } + + return 0; +} + +/* Compact the instruction buffer by choosing the best non-overlapping + * instructions when lazy string-matching. There are no ADDs in the + * iopt buffer because those are synthesized in xd3_iopt_add_encoding + * and during xd3_iopt_add_finalize. */ +static int +xd3_iopt_flush_instructions (xd3_stream *stream, int force) +{ + xd3_rinst *r1 = xd3_rlist_front (& stream->iopt_used); + xd3_rinst *r2; + xd3_rinst *r3; + usize_t r1end; + usize_t r2end; + usize_t r2off; + usize_t r2moff; + usize_t gap; + usize_t flushed; + int ret; + + XD3_ASSERT (xd3_iopt_check (stream)); + + /* Note: once tried to skip this step if it's possible to assert + * there are no overlapping instructions. Doesn't work because + * xd3_opt_erase leaves overlapping instructions. */ + while (! xd3_rlist_end (& stream->iopt_used, r1) && + ! xd3_rlist_end (& stream->iopt_used, r2 = xd3_rlist_next (r1))) + { + r1end = r1->pos + r1->size; + + /* If the instructions do not overlap, continue. */ + if (r1end <= r2->pos) + { + r1 = r2; + continue; + } + + r2end = r2->pos + r2->size; + + /* The min_match adjustments prevent this. */ + XD3_ASSERT (r2end > (r1end + LEAST_MATCH_INCR)); + + /* If r3 is available... */ + if (! xd3_rlist_end (& stream->iopt_used, r3 = xd3_rlist_next (r2))) + { + /* If r3 starts before r1 finishes or just about, r2 is irrelevant */ + if (r3->pos <= r1end + 1) + { + xd3_iopt_free (stream, r2); + continue; + } + } + else if (! force) + { + /* Unless force, end the loop when r3 is not available. */ + break; + } + + r2off = r2->pos - r1->pos; + r2moff = r2end - r1end; + gap = r2end - r1->pos; + + /* If the two matches overlap almost entirely, choose the better match + * and discard the other. The else branch can still create inefficient + * copies, e.g., a 4-byte copy that takes 4 bytes to encode, which + * xd3_smatch() wouldn't allow by its crude efficiency check. However, + * in this case there are adjacent copies which mean the add would cost + * one extra byte. Allow the inefficiency here. */ + if (gap < 2*MIN_MATCH || r2moff <= 2 || r2off <= 2) + { + /* Only one match should be used, choose the longer one. */ + if (r1->size < r2->size) + { + xd3_iopt_free (stream, r1); + r1 = r2; + } + else + { + /* We are guaranteed that r1 does not overlap now, so advance past r2 */ + r1 = xd3_iopt_free (stream, r2); + } + continue; + } + else + { + /* Shorten one of the instructions -- could be optimized + * based on the address cache. */ + usize_t average; + usize_t newsize; + usize_t adjust1; + + XD3_ASSERT (r1end > r2->pos && r2end > r1->pos); + + /* Try to balance the length of both instructions, but avoid + * making both longer than MAX_MATCH_SPLIT . */ + average = gap / 2; + newsize = min (MAX_MATCH_SPLIT, gap - average); + + /* Should be possible to simplify this code. */ + if (newsize > r1->size) + { + /* shorten r2 */ + adjust1 = r1end - r2->pos; + } + else if (newsize > r2->size) + { + /* shorten r1 */ + adjust1 = r1end - r2->pos; + + XD3_ASSERT (r1->size > adjust1); + + r1->size -= adjust1; + + /* don't shorten r2 */ + adjust1 = 0; + } + else + { + /* shorten r1 */ + adjust1 = r1->size - newsize; + + if (r2->pos > r1end - adjust1) + { + adjust1 -= r2->pos - (r1end - adjust1); + } + + XD3_ASSERT (r1->size > adjust1); + + r1->size -= adjust1; + + /* shorten r2 */ + XD3_ASSERT (r1->pos + r1->size >= r2->pos); + + adjust1 = r1->pos + r1->size - r2->pos; + } + + /* Fallthrough above if-else, shorten r2 */ + XD3_ASSERT (r2->size > adjust1); + + r2->size -= adjust1; + r2->pos += adjust1; + r2->addr += adjust1; + + XD3_ASSERT (r1->size >= MIN_MATCH); + XD3_ASSERT (r2->size >= MIN_MATCH); + + r1 = r2; + } + } + + XD3_ASSERT (xd3_iopt_check (stream)); + + /* If forcing, pick instructions until the list is empty, otherwise + * this empties 50% of the queue. */ + for (flushed = 0; ! xd3_rlist_empty (& stream->iopt_used); ) + { + xd3_rinst *renc = xd3_rlist_pop_front (& stream->iopt_used); + if ((ret = xd3_iopt_add_encoding (stream, renc))) + { + return ret; + } + + if (! force) + { + if (++flushed > stream->iopt_size / 2) + { + break; + } + + /* If there are only two instructions remaining, break, + * because they were not optimized. This means there were + * more than 50% eliminated by the loop above. */ + r1 = xd3_rlist_front (& stream->iopt_used); + if (xd3_rlist_end(& stream->iopt_used, r1) || + xd3_rlist_end(& stream->iopt_used, r2 = xd3_rlist_next (r1)) || + xd3_rlist_end(& stream->iopt_used, r3 = xd3_rlist_next (r2))) + { + break; + } + } + } + + XD3_ASSERT (xd3_iopt_check (stream)); + + XD3_ASSERT (!force || xd3_rlist_length (& stream->iopt_used) == 0); + + return 0; +} + +static int +xd3_iopt_get_slot (xd3_stream *stream, xd3_rinst** iptr) +{ + xd3_rinst *i; + int ret; + + if (xd3_rlist_empty (& stream->iopt_free)) + { + if (stream->iopt_unlimited) + { + int elts = XD3_ALLOCSIZE / sizeof(xd3_rinst); + + if ((ret = xd3_alloc_iopt (stream, elts))) + { + return ret; + } + + stream->iopt_size += elts; + } + else + { + if ((ret = xd3_iopt_flush_instructions (stream, 0))) { return ret; } + + XD3_ASSERT (! xd3_rlist_empty (& stream->iopt_free)); + } + } + + i = xd3_rlist_pop_back (& stream->iopt_free); + + xd3_rlist_push_back (& stream->iopt_used, i); + + (*iptr) = i; + + ++stream->i_slots_used; + + return 0; +} + +/* A copy is about to be emitted that extends backwards to POS, + * therefore it may completely cover some existing instructions in the + * buffer. If an instruction is completely covered by this new match, + * erase it. If the new instruction is covered by the previous one, + * return 1 to skip it. */ +static void +xd3_iopt_erase (xd3_stream *stream, usize_t pos, usize_t size) +{ + while (! xd3_rlist_empty (& stream->iopt_used)) + { + xd3_rinst *r = xd3_rlist_back (& stream->iopt_used); + + /* Verify that greedy is working. The previous instruction + * should end before the new one begins. */ + XD3_ASSERT ((stream->flags & XD3_BEGREEDY) == 0 || (r->pos + r->size <= pos)); + /* Verify that min_match is working. The previous instruction + * should end before the new one ends. */ + XD3_ASSERT ((stream->flags & XD3_BEGREEDY) != 0 || (r->pos + r->size < pos + size)); + + /* See if the last instruction starts before the new + * instruction. If so, there is nothing to erase. */ + if (r->pos < pos) + { + return; + } + + /* Otherwise, the new instruction covers the old one, delete it + and repeat. */ + xd3_rlist_remove (r); + xd3_rlist_push_back (& stream->iopt_free, r); + --stream->i_slots_used; + } +} + +/* This function tells the last matched input position. */ +static usize_t +xd3_iopt_last_matched (xd3_stream *stream) +{ + xd3_rinst *r; + + if (xd3_rlist_empty (& stream->iopt_used)) + { + return 0; + } + + r = xd3_rlist_back (& stream->iopt_used); + + return r->pos + r->size; +} + +/********************************************************* + Emit routines + ***********************************************************/ + +static int +xd3_emit_single (xd3_stream *stream, xd3_rinst *single, usize_t code) +{ + int has_size = stream->code_table[code].size1 == 0; + int ret; + + IF_DEBUG1 (DP(RINT "[emit1] %u %s (%u) code %u\n", + single->pos, + xd3_rtype_to_string ((xd3_rtype) single->type, 0), + single->size, + code)); + + if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code))) + { + return ret; + } + + if (has_size) + { + if ((ret = xd3_emit_size (stream, & INST_TAIL (stream), single->size))) + { + return ret; + } + } + + return 0; +} + +static int +xd3_emit_double (xd3_stream *stream, xd3_rinst *first, + xd3_rinst *second, usize_t code) +{ + int ret; + + /* All double instructions use fixed sizes, so all we need to do is + * output the instruction code, no sizes. */ + XD3_ASSERT (stream->code_table[code].size1 != 0 && + stream->code_table[code].size2 != 0); + + if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code))) + { + return ret; + } + + IF_DEBUG1 (DP(RINT "[emit2]: %u %s (%u) %s (%u) code %u\n", + first->pos, + xd3_rtype_to_string ((xd3_rtype) first->type, 0), + first->size, + xd3_rtype_to_string ((xd3_rtype) second->type, 0), + second->size, + code)); + + return 0; +} + +/* This enters a potential run instruction into the iopt buffer. The + * position argument is relative to the target window. */ +static int +xd3_emit_run (xd3_stream *stream, usize_t pos, usize_t size, uint8_t run_c) +{ + xd3_rinst* ri; + int ret; + + if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; } + + ri->type = XD3_RUN; + ri->xtra = run_c; + ri->pos = pos; + ri->size = size; + + return 0; +} + +/* This enters a potential copy instruction into the iopt buffer. The + * position argument is relative to the target window.. */ +int +xd3_found_match (xd3_stream *stream, usize_t pos, + usize_t size, xoff_t addr, int is_source) +{ + xd3_rinst* ri; + int ret; + + if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; } + + ri->type = XD3_CPY; + ri->xtra = is_source; + ri->pos = pos; + ri->size = size; + ri->addr = addr; + + return 0; +} + +static int +xd3_emit_hdr (xd3_stream *stream) +{ + int ret; + int use_secondary = stream->sec_type != NULL; + int use_adler32 = stream->flags & (XD3_ADLER32 | XD3_ADLER32_RECODE); + int vcd_source = xd3_encoder_used_source (stream); + usize_t win_ind = 0; + usize_t del_ind = 0; + usize_t enc_len; + usize_t tgt_len; + usize_t data_len; + usize_t inst_len; + usize_t addr_len; + + if (stream->current_window == 0) + { + usize_t hdr_ind = 0; + int use_appheader = stream->enc_appheader != NULL; + int use_gencodetbl = GENERIC_ENCODE_TABLES && + (stream->code_table_desc != & __rfc3284_code_table_desc); + + if (use_secondary) { hdr_ind |= VCD_SECONDARY; } + if (use_gencodetbl) { hdr_ind |= VCD_CODETABLE; } + if (use_appheader) { hdr_ind |= VCD_APPHEADER; } + + if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + VCDIFF_MAGIC1)) != 0 || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + VCDIFF_MAGIC2)) != 0 || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + VCDIFF_MAGIC3)) != 0 || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + VCDIFF_VERSION)) != 0 || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), hdr_ind)) != 0) + { + return ret; + } + + /* Secondary compressor ID */ +#if SECONDARY_ANY + if (use_secondary && + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + stream->sec_type->id))) + { + return ret; + } +#endif + + /* Compressed code table */ + if (use_gencodetbl) + { + usize_t code_table_size; + const uint8_t *code_table_data; + + if ((ret = stream->comp_table_func (stream, & code_table_data, + & code_table_size))) + { + return ret; + } + + if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), + code_table_size + 2)) || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + stream->code_table_desc->near_modes)) || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), + stream->code_table_desc->same_modes)) || + (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream), + code_table_data, code_table_size))) + { + return ret; + } + } + + /* Application header */ + if (use_appheader) + { + if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), + stream->enc_appheadsz)) || + (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream), + stream->enc_appheader, + stream->enc_appheadsz))) + { + return ret; + } + } + } + + /* try to compress this window */ +#if SECONDARY_ANY + if (use_secondary) + { + int data_sec = 0; + int inst_sec = 0; + int addr_sec = 0; + +# define ENCODE_SECONDARY_SECTION(UPPER,LOWER) \ + ((stream->flags & XD3_SEC_NO ## UPPER) == 0 && \ + (ret = xd3_encode_secondary (stream, \ + & UPPER ## _HEAD (stream), \ + & UPPER ## _TAIL (stream), \ + & xd3_sec_ ## LOWER (stream), \ + & stream->sec_ ## LOWER, \ + & LOWER ## _sec))) + + if (ENCODE_SECONDARY_SECTION (DATA, data) || + ENCODE_SECONDARY_SECTION (INST, inst) || + ENCODE_SECONDARY_SECTION (ADDR, addr)) + { + return ret; + } + + del_ind |= (data_sec ? VCD_DATACOMP : 0); + del_ind |= (inst_sec ? VCD_INSTCOMP : 0); + del_ind |= (addr_sec ? VCD_ADDRCOMP : 0); + } +#endif + + /* if (vcd_target) { win_ind |= VCD_TARGET; } */ + if (vcd_source) { win_ind |= VCD_SOURCE; } + if (use_adler32) { win_ind |= VCD_ADLER32; } + + /* window indicator */ + if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream), win_ind))) + { + return ret; + } + + /* source window */ + if (vcd_source) + { + /* or (vcd_target) { ... } */ + if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), + stream->src->srclen)) || + (ret = xd3_emit_offset (stream, & HDR_TAIL (stream), + stream->src->srcbase))) { return ret; } + } + + tgt_len = stream->avail_in; + data_len = xd3_sizeof_output (DATA_HEAD (stream)); + inst_len = xd3_sizeof_output (INST_HEAD (stream)); + addr_len = xd3_sizeof_output (ADDR_HEAD (stream)); + + /* The enc_len field is a redundency for future extensions.*/ + enc_len = (1 + (xd3_sizeof_size (tgt_len) + + xd3_sizeof_size (data_len) + + xd3_sizeof_size (inst_len) + + xd3_sizeof_size (addr_len)) + + data_len + + inst_len + + addr_len + + (use_adler32 ? 4 : 0)); + + if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), enc_len)) || + (ret = xd3_emit_size (stream, & HDR_TAIL (stream), tgt_len)) || + (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), del_ind)) || + (ret = xd3_emit_size (stream, & HDR_TAIL (stream), data_len)) || + (ret = xd3_emit_size (stream, & HDR_TAIL (stream), inst_len)) || + (ret = xd3_emit_size (stream, & HDR_TAIL (stream), addr_len))) + { + return ret; + } + + if (use_adler32) + { + uint8_t send[4]; + uint32_t a32; + + if (stream->flags & XD3_ADLER32) + { + a32 = adler32 (1L, stream->next_in, stream->avail_in); + } + else + { + a32 = stream->recode_adler32; + } + + send[0] = (a32 >> 24); + send[1] = (a32 >> 16); + send[2] = (a32 >> 8); + send[3] = (a32 & 0xff); + + if ((ret = xd3_emit_bytes (stream, & HDR_TAIL (stream), send, 4))) + { + return ret; + } + } + + return 0; +} + +/**************************************************************** + Encode routines + ****************************************************************/ + +static int +xd3_encode_buffer_leftover (xd3_stream *stream) +{ + usize_t take; + usize_t room; + + /* Allocate the buffer. */ + if (stream->buf_in == NULL && + (stream->buf_in = (uint8_t*) xd3_alloc (stream, stream->winsize, 1)) == NULL) + { + return ENOMEM; + } + + /* Take leftover input first. */ + if (stream->buf_leftover != NULL) + { + XD3_ASSERT (stream->buf_avail == 0); + XD3_ASSERT (stream->buf_leftavail < stream->winsize); + + IF_DEBUG1 (DP(RINT "[leftover] previous %u avail %u\n", stream->buf_leftavail, stream->avail_in)); + + memcpy (stream->buf_in, stream->buf_leftover, stream->buf_leftavail); + + stream->buf_leftover = NULL; + stream->buf_avail = stream->buf_leftavail; + } + + /* Copy into the buffer. */ + room = stream->winsize - stream->buf_avail; + take = min (room, stream->avail_in); + + memcpy (stream->buf_in + stream->buf_avail, stream->next_in, take); + + stream->buf_avail += take; + + if (take < stream->avail_in) + { + /* Buffer is full */ + stream->buf_leftover = stream->next_in + take; + stream->buf_leftavail = stream->avail_in - take; + + IF_DEBUG1 (DP(RINT "[leftover] take %u remaining %u\n", take, stream->buf_leftavail)); + } + else if ((stream->buf_avail < stream->winsize) && !(stream->flags & XD3_FLUSH)) + { + /* Buffer has space */ + IF_DEBUG1 (DP(RINT "[leftover] %u emptied\n", take)); + return XD3_INPUT; + } + + /* Use the buffer: */ + stream->next_in = stream->buf_in; + stream->avail_in = stream->buf_avail; + stream->buf_avail = 0; + + return 0; +} + +/* Allocates one block of xd3_rlist elements */ +static int +xd3_alloc_iopt (xd3_stream *stream, int elts) +{ + int i; + xd3_iopt_buflist* last = + (xd3_iopt_buflist*) xd3_alloc (stream, sizeof (xd3_iopt_buflist), 1); + + if (last == NULL || + (last->buffer = (xd3_rinst*) xd3_alloc (stream, sizeof (xd3_rinst), elts)) == NULL) + { + return ENOMEM; + } + + last->next = stream->iopt_alloc; + stream->iopt_alloc = last; + + for (i = 0; i < elts; i += 1) + { + xd3_rlist_push_back (& stream->iopt_free, & last->buffer[i]); + } + + return 0; +} + +/* This function allocates all memory initially used by the encoder. */ +static int +xd3_encode_init (xd3_stream *stream, int full_init) +{ + int i; + + if (full_init) + { + int large_comp = (stream->src != NULL); + int small_comp = ! (stream->flags & XD3_NOCOMPRESS); + + /* Memory allocations for checksum tables are delayed until + * xd3_string_match_init in the first call to string_match--that way + * identical or short inputs require no table allocation. */ + if (large_comp) + { + usize_t hash_values = (stream->srcwin_maxsz / stream->smatcher.large_step); + + xd3_size_hashtable (stream, + hash_values, + & stream->large_hash); + } + + if (small_comp) + { + /* TODO: This is under devel: used to have min(sprevsz) here, which sort + * of makes sense, but observed fast performance w/ larger tables, which + * also sort of makes sense. @@@ */ + usize_t hash_values = stream->winsize; + + xd3_size_hashtable (stream, + hash_values, + & stream->small_hash); + } + } + + /* data buffers */ + for (i = 0; i < ENC_SECTS; i += 1) + { + if ((stream->enc_heads[i] = + stream->enc_tails[i] = + xd3_alloc_output (stream, NULL)) == NULL) + { + return ENOMEM; + } + } + + /* iopt buffer */ + xd3_rlist_init (& stream->iopt_used); + xd3_rlist_init (& stream->iopt_free); + + if (xd3_alloc_iopt (stream, stream->iopt_size) != 0) { goto fail; } + + XD3_ASSERT (xd3_rlist_length (& stream->iopt_free) == stream->iopt_size); + XD3_ASSERT (xd3_rlist_length (& stream->iopt_used) == 0); + + /* address cache, code table */ + stream->acache.s_near = stream->code_table_desc->near_modes; + stream->acache.s_same = stream->code_table_desc->same_modes; + stream->code_table = stream->code_table_func (); + + return xd3_alloc_cache (stream); + + fail: + + return ENOMEM; +} + +int +xd3_encode_init_full (xd3_stream *stream) +{ + return xd3_encode_init (stream, 1); +} + +int +xd3_encode_init_partial (xd3_stream *stream) +{ + return xd3_encode_init (stream, 0); +} + +/* Called after the ENC_POSTOUT state, this puts the output buffers + * back into separate lists and re-initializes some variables. (The + * output lists were spliced together during the ENC_FLUSH state.) */ +static void +xd3_encode_reset (xd3_stream *stream) +{ + int i; + xd3_output *olist; + + stream->avail_in = 0; + stream->small_reset = 1; + stream->i_slots_used = 0; + + if (stream->src != NULL) + { + stream->src->srcbase = 0; + stream->src->srclen = 0; + stream->srcwin_decided = 0; + stream->match_minaddr = 0; + stream->match_maxaddr = 0; + stream->taroff = 0; + } + + /* Reset output chains. */ + olist = stream->enc_heads[0]; + + for (i = 0; i < ENC_SECTS; i += 1) + { + XD3_ASSERT (olist != NULL); + + stream->enc_heads[i] = olist; + stream->enc_tails[i] = olist; + olist = olist->next_page; + + stream->enc_heads[i]->next = 0; + stream->enc_heads[i]->next_page = NULL; + + stream->enc_tails[i]->next_page = NULL; + stream->enc_tails[i] = stream->enc_heads[i]; + } + + xd3_freelist_output (stream, olist); +} + +/* The main encoding routine. */ +int +xd3_encode_input (xd3_stream *stream) +{ + int ret, i; + + if (stream->dec_state != 0) + { + stream->msg = "encoder/decoder transition"; + return XD3_INTERNAL; + } + + switch (stream->enc_state) + { + case ENC_INIT: + /* Only reached on first time through: memory setup. */ + if ((ret = xd3_encode_init_full (stream))) { return ret; } + + stream->enc_state = ENC_INPUT; + + case ENC_INPUT: + + /* If there is no input yet, just return. This checks for + * next_in == NULL, not avail_in == 0 since zero bytes is a + * valid input. There is an assertion in xd3_avail_input() that + * next_in != NULL for this reason. By returning right away we + * avoid creating an input buffer before the caller has supplied + * its first data. It is possible for xd3_avail_input to be + * called both before and after the first call to + * xd3_encode_input(). */ + if (stream->next_in == NULL) + { + return XD3_INPUT; + } + + enc_flush: + /* See if we should buffer the input: either if there is already + * a leftover buffer, or if the input is short of winsize + * without flush. The label at this point is reached by a goto + * below, when there is leftover input after postout. */ + if ((stream->buf_leftover != NULL) || + (stream->avail_in < stream->winsize && ! (stream->flags & XD3_FLUSH))) + { + if ((ret = xd3_encode_buffer_leftover (stream))) { return ret; } + } + + /* Initalize the address cache before each window. */ + xd3_init_cache (& stream->acache); + + stream->input_position = 0; + stream->min_match = MIN_MATCH; + stream->unencoded_offset = 0; + + stream->enc_state = ENC_SEARCH; + + IF_DEBUG1 (DP(RINT "[WINSTART:%"Q"u] input bytes %u offset %"Q"u\n", + stream->current_window, stream->avail_in, + stream->total_in)); + return XD3_WINSTART; + + case ENC_SEARCH: + + /* Reentrant matching. */ + if (stream->src != NULL) + { + switch (stream->match_state) + { + case MATCH_TARGET: + /* Try matching forward at the start of the target. + * This is entered the first time through, to check for + * a perfect match, and whenever there is a source match + * that extends to the end of the previous window. The + * match_srcpos field is initially zero and later set + * during xd3_source_extend_match. */ + + if (stream->avail_in > 0) + { + /* This call can't fail because the source window is + unrestricted. */ + ret = xd3_source_match_setup (stream, stream->match_srcpos); + XD3_ASSERT (ret == 0); + stream->match_state = MATCH_FORWARD; + } + else + { + stream->match_state = MATCH_SEARCHING; + stream->match_fwd = 0; + } + XD3_ASSERT (stream->match_fwd == 0); + + case MATCH_FORWARD: + case MATCH_BACKWARD: + if (stream->avail_in != 0) + { + if ((ret = xd3_source_extend_match (stream)) != 0) + { + return ret; + } + + /* The search has to make forward progress here + * or else it can get stuck in a match-backward + * (getsrcblk) then match-forward (getsrcblk), + * find insufficient match length, then repeat + * exactly the same search. + */ + stream->input_position += stream->match_fwd; + } + + case MATCH_SEARCHING: + /* Continue string matching. (It's possible that the + * initial match continued through the entire input, in + * which case we're still in MATCH_FORWARD and should + * remain so for the next input window.) */ + break; + } + } + + /* String matching... */ + if (stream->avail_in != 0 && + (ret = stream->smatcher.string_match (stream))) + { + return ret; + } + + stream->enc_state = ENC_INSTR; + + case ENC_INSTR: + /* Note: Jump here to encode VCDIFF deltas w/o using this + * string-matching code. */ + + /* Flush the instrution buffer, then possibly add one more + * instruction, then emit the header. */ + if ((ret = xd3_iopt_flush_instructions (stream, 1)) || + (ret = xd3_iopt_add_finalize (stream))) + { + return ret; + } + + stream->enc_state = ENC_FLUSH; + + case ENC_FLUSH: + /* Note: main_recode_func() bypasses string-matching by setting + * ENC_FLUSH. */ + if ((ret = xd3_emit_hdr (stream))) + { + return ret; + } + + /* Begin output. */ + stream->enc_current = HDR_HEAD (stream); + + /* Chain all the outputs together. After doing this, it looks + * as if there is only one section. The other enc_heads are set + * to NULL to avoid freeing them more than once. */ + for (i = 1; i < ENC_SECTS; i += 1) + { + stream->enc_tails[i-1]->next_page = stream->enc_heads[i]; + stream->enc_heads[i] = NULL; + } + + enc_output: + + stream->enc_state = ENC_POSTOUT; + stream->next_out = stream->enc_current->base; + stream->avail_out = stream->enc_current->next; + stream->total_out += (xoff_t) stream->avail_out; + + /* If there is any output in this buffer, return it, otherwise + * fall through to handle the next buffer or finish the window + * after all buffers have been output. */ + if (stream->avail_out > 0) + { + /* This is the only place xd3_encode returns XD3_OUTPUT */ + return XD3_OUTPUT; + } + + case ENC_POSTOUT: + + if (stream->avail_out != 0) + { + stream->msg = "missed call to consume output"; + return XD3_INTERNAL; + } + + /* Continue outputting one buffer at a time, until the next is NULL. */ + if ((stream->enc_current = stream->enc_current->next_page) != NULL) + { + goto enc_output; + } + + stream->total_in += (xoff_t) stream->avail_in; + stream->enc_state = ENC_POSTWIN; + + IF_DEBUG1 (DP(RINT "[WINFINISH:%"Q"u] in=%"Q"u\n", + stream->current_window, + stream->total_in)); + return XD3_WINFINISH; + + case ENC_POSTWIN: + + xd3_encode_reset (stream); + + stream->current_window += 1; + stream->enc_state = ENC_INPUT; + + /* If there is leftover input to flush, repeat. */ + if ((stream->buf_leftover != NULL) && (stream->flags & XD3_FLUSH)) + { + goto enc_flush; + } + + /* Ready for more input. */ + return XD3_INPUT; + + default: + stream->msg = "invalid state"; + return XD3_INTERNAL; + } +} +#endif /* XD3_ENCODER */ + +/***************************************************************** + Client convenience functions + ******************************************************************/ + +static int +xd3_process_stream (int is_encode, + xd3_stream *stream, + int (*func) (xd3_stream *), + int close_stream, + const uint8_t *input, + usize_t input_size, + uint8_t *output, + usize_t *output_size, + usize_t output_size_max) +{ + usize_t ipos = 0; + usize_t n = min(stream->winsize, input_size); + + (*output_size) = 0; + + stream->flags |= XD3_FLUSH; + + xd3_avail_input (stream, input + ipos, n); + ipos += n; + + for (;;) + { + int ret; + switch((ret = func (stream))) + { + case XD3_OUTPUT: { /* memcpy below */ break; } + case XD3_INPUT: { + n = min(stream->winsize, input_size - ipos); + if (n == 0) { + goto done; + } + xd3_avail_input (stream, input + ipos, n); + ipos += n; + continue; + } + case XD3_GOTHEADER: { /* ignore */ continue; } + case XD3_WINSTART: { /* ignore */ continue; } + case XD3_WINFINISH: { /* ignore */ continue; } + case XD3_GETSRCBLK: + { + stream->msg = "stream requires source input"; + return XD3_INTERNAL; + } + case 0: + { + /* xd3_encode_input/xd3_decode_input never return 0 */ + stream->msg = "invalid return: 0"; + return XD3_INTERNAL; + } + default: + return ret; + } + + if (*output_size + stream->avail_out > output_size_max) + { + stream->msg = "insufficient output space"; + return ENOSPC; + } + + memcpy (output + *output_size, stream->next_out, stream->avail_out); + + *output_size += stream->avail_out; + + xd3_consume_output (stream); + } + done: + return (close_stream == 0) ? 0 : xd3_close_stream (stream); +} + +static int +xd3_process_memory (int is_encode, + int (*func) (xd3_stream *), + int close_stream, + const uint8_t *input, + usize_t input_size, + const uint8_t *source, + usize_t source_size, + uint8_t *output, + usize_t *output_size, + usize_t output_size_max, + int flags) { + xd3_stream stream; + xd3_config config; + xd3_source src; + int ret; + + memset (& stream, 0, sizeof (stream)); + memset (& config, 0, sizeof (config)); + + if (input == NULL || output == NULL) { + stream.msg = "invalid input/output buffer"; + ret = XD3_INTERNAL; + goto exit; + } + + config.flags = flags; + + if (is_encode) + { + config.srcwin_maxsz = source_size; + config.winsize = min(input_size, (usize_t) XD3_DEFAULT_WINSIZE); + config.iopt_size = min(input_size / 32, XD3_DEFAULT_IOPT_SIZE); + config.iopt_size = max(config.iopt_size, 128U); + config.sprevsz = xd3_pow2_roundup (config.winsize); + } + + if ((ret = xd3_config_stream (&stream, &config)) != 0) + { + goto exit; + } + + if (source != NULL) + { + memset (& src, 0, sizeof (src)); + src.size = source_size; + + src.blksize = source_size; + src.onblk = source_size; + src.curblk = source; + src.curblkno = 0; + + if ((ret = xd3_set_source (&stream, &src)) != 0) + { + goto exit; + } + } + + if ((ret = xd3_process_stream (is_encode, + & stream, + func, 1, + input, input_size, + output, + output_size, + output_size_max)) != 0) + { + goto exit; + } + + exit: + if (ret != 0) { + IF_DEBUG1 (DP(RINT "process_memory: %d: %s", ret, stream.msg)); + } + xd3_free_stream(&stream); + return ret; +} + +int +xd3_decode_stream (xd3_stream *stream, + const uint8_t *input, + usize_t input_size, + uint8_t *output, + usize_t *output_size, + usize_t output_size_max) +{ + return xd3_process_stream (0, stream, & xd3_decode_input, 1, + input, input_size, + output, output_size, output_size_max); +} + +int +xd3_decode_memory (const uint8_t *input, + usize_t input_size, + const uint8_t *source, + usize_t source_size, + uint8_t *output, + usize_t *output_size, + usize_t output_size_max, + int flags) { + return xd3_process_memory (0, & xd3_decode_input, 1, + input, input_size, + source, source_size, + output, output_size, output_size_max, + flags); +} + + +#if XD3_ENCODER +int +xd3_encode_stream (xd3_stream *stream, + const uint8_t *input, + usize_t input_size, + uint8_t *output, + usize_t *output_size, + usize_t output_size_max) +{ + return xd3_process_stream (1, stream, & xd3_encode_input, 1, + input, input_size, + output, output_size, output_size_max); +} + +int +xd3_encode_memory (const uint8_t *input, + usize_t input_size, + const uint8_t *source, + usize_t source_size, + uint8_t *output, + usize_t *output_size, + usize_t output_size_max, + int flags) { + return xd3_process_memory (1, & xd3_encode_input, 1, + input, input_size, + source, source_size, + output, output_size, output_size_max, + flags); +} +#endif + + +/************************************************************* + String matching helpers + *************************************************************/ + +#if XD3_ENCODER +/* Do the initial xd3_string_match() checksum table setup. + * Allocations are delayed until first use to avoid allocation + * sometimes (e.g., perfect matches, zero-length inputs). */ +static int +xd3_string_match_init (xd3_stream *stream) +{ + const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS); + const int DO_LARGE = (stream->src != NULL); + + if (DO_LARGE && stream->large_table == NULL) + { + if ((stream->large_table = + (usize_t*) xd3_alloc0 (stream, stream->large_hash.size, sizeof (usize_t))) == NULL) + { + return ENOMEM; + } + } + + if (DO_SMALL) + { + /* Subsequent calls can return immediately after checking reset. */ + if (stream->small_table != NULL) + { + /* The target hash table is reinitialized once per window. */ + /* TODO: This would not have to be reinitialized if absolute + * offsets were being stored. */ + if (stream->small_reset) + { + stream->small_reset = 0; + memset (stream->small_table, 0, + sizeof (usize_t) * stream->small_hash.size); + } + + return 0; + } + + if ((stream->small_table = + (usize_t*) xd3_alloc0 (stream, + stream->small_hash.size, + sizeof (usize_t))) == NULL) + { + return ENOMEM; + } + + /* If there is a previous table needed. */ + if (stream->smatcher.small_lchain > 1 || + stream->smatcher.small_chain > 1) + { + if ((stream->small_prev = + (xd3_slist*) xd3_alloc (stream, + stream->sprevsz, + sizeof (xd3_slist))) == NULL) + { + return ENOMEM; + } + } + } + + return 0; +} + +#if XD3_USE_LARGEFILE64 +/* This function handles the 32/64bit ambiguity -- file positions are 64bit + * but the hash table for source-offsets is 32bit. */ +static xoff_t +xd3_source_cksum_offset(xd3_stream *stream, usize_t low) +{ + xoff_t scp = stream->srcwin_cksum_pos; + xoff_t s0 = scp >> 32; + + usize_t sr = (usize_t) scp; + + if (s0 == 0) { + return low; + } + + /* This should not be >= because srcwin_cksum_pos is the next + * position to index. */ + if (low > sr) { + return (--s0 << 32) | low; + } + + return (s0 << 32) | low; +} +#else +static xoff_t +xd3_source_cksum_offset(xd3_stream *stream, usize_t low) +{ + return (xoff_t) low; +} +#endif + +/* This function sets up the stream->src fields srcbase, srclen. The + * call is delayed until these values are needed to encode a copy + * address. At this point the decision has to be made. */ +static int +xd3_srcwin_setup (xd3_stream *stream) +{ + xd3_source *src = stream->src; + xoff_t length, x; + + /* Check the undecided state. */ + XD3_ASSERT (src->srclen == 0 && src->srcbase == 0); + + /* Avoid repeating this call. */ + stream->srcwin_decided = 1; + + /* If the stream is flushing, then the iopt buffer was able to + * contain the complete encoding. If no copies were issued no + * source window is actually needed. This prevents the VCDIFF + * header from including source base/len. xd3_emit_hdr checks for + * srclen == 0. */ + if (stream->enc_state == ENC_INSTR && stream->match_maxaddr == 0) + { + goto done; + } + + /* Check for overflow, srclen is usize_t - this can't happen unless + * XD3_DEFAULT_SRCBACK and related parameters are extreme - should + * use smaller windows. */ + length = stream->match_maxaddr - stream->match_minaddr; + + x = (xoff_t) USIZE_T_MAX; + if (length > x) + { + stream->msg = "source window length overflow (not 64bit)"; + return XD3_INTERNAL; + } + + /* If ENC_INSTR, then we know the exact source window to use because + * no more copies can be issued. */ + if (stream->enc_state == ENC_INSTR) + { + src->srcbase = stream->match_minaddr; + src->srclen = (usize_t) length; + XD3_ASSERT (src->srclen); + goto done; + } + + /* Otherwise, we have to make a guess. More copies may still be + * issued, but we have to decide the source window base and length + * now. */ + src->srcbase = stream->match_minaddr; + src->srclen = max ((usize_t) length, stream->avail_in + (stream->avail_in >> 2)); + if (src->size < src->srcbase + (xoff_t) src->srclen) + { + /* Could reduce srcbase, as well. */ + src->srclen = src->size - src->srcbase; + } + + XD3_ASSERT (src->srclen); + done: + /* Set the taroff. This convenience variable is used even when + stream->src == NULL. */ + stream->taroff = src->srclen; + return 0; +} + +/* Sets the bounding region for a newly discovered source match, prior + * to calling xd3_source_extend_match(). This sets the match_maxfwd, + * match_maxback variables. Note: srcpos is an absolute position + * (xoff_t) but the match_maxfwd, match_maxback variables are usize_t. + * Returns 0 if the setup succeeds, or 1 if the source position lies + * outside an already-decided srcbase/srclen window. */ +static int +xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos) +{ + xd3_source *src = stream->src; + usize_t greedy_or_not; + + stream->match_maxback = 0; + stream->match_maxfwd = 0; + stream->match_back = 0; + stream->match_fwd = 0; + + /* This avoids a non-blocking endless loop caused by scanning + * backwards across a block boundary, only to find not enough + * matching bytes to beat the current min_match due to a better lazy + * target match: the re-entry to xd3_string_match() repeats the same + * long match because the input position hasn't changed. TODO: if + * ever duplicates are added to the source hash table, this logic + * won't suffice to avoid loops. See testing/regtest.cc's + * TestNonBlockingProgress test! */ + if (srcpos != 0 && srcpos == stream->match_last_srcpos) + { + goto bad; + } + + /* Going backwards, the 1.5-pass algorithm allows some + * already-matched input may be covered by a longer source match. + * The greedy algorithm does not allow this. */ + if (stream->flags & XD3_BEGREEDY) + { + /* The greedy algorithm allows backward matching to the last + matched position. */ + greedy_or_not = xd3_iopt_last_matched (stream); + } + else + { + /* The 1.5-pass algorithm allows backward matching to go back as + * far as the unencoded offset, which is updated as instructions + * pass out of the iopt buffer. If this (default) is chosen, it + * means xd3_iopt_erase may be called to eliminate instructions + * when a covering source match is found. */ + greedy_or_not = stream->unencoded_offset; + } + + /* Backward target match limit. */ + XD3_ASSERT (stream->input_position >= greedy_or_not); + stream->match_maxback = stream->input_position - greedy_or_not; + + /* Forward target match limit. */ + XD3_ASSERT (stream->avail_in > stream->input_position); + stream->match_maxfwd = stream->avail_in - stream->input_position; + + /* Now we take the source position into account. It depends whether + * the srclen/srcbase have been decided yet. */ + if (stream->srcwin_decided == 0) + { + /* Unrestricted case: the match can cover the entire source, + * 0--src->size. We compare the usize_t + * match_maxfwd/match_maxback against the xoff_t + * src->size/srcpos values and take the min. */ + xoff_t srcavail; + + if (srcpos < (xoff_t) stream->match_maxback) + { + stream->match_maxback = srcpos; + } + + srcavail = src->size - srcpos; + if (srcavail < (xoff_t) stream->match_maxfwd) + { + stream->match_maxfwd = srcavail; + } + + goto good; + } + + /* Decided some source window. */ + XD3_ASSERT (src->srclen > 0); + + /* Restricted case: fail if the srcpos lies outside the source window */ + if ((srcpos < src->srcbase) || (srcpos > (src->srcbase + (xoff_t) src->srclen))) + { + goto bad; + } + else + { + usize_t srcavail; + + srcavail = (usize_t) (srcpos - src->srcbase); + if (srcavail < stream->match_maxback) + { + stream->match_maxback = srcavail; + } + + srcavail = (usize_t) (src->srcbase + (xoff_t) src->srclen - srcpos); + if (srcavail < stream->match_maxfwd) { + stream->match_maxfwd = srcavail; + } + + goto good; + } + + good: + stream->match_state = MATCH_BACKWARD; + stream->match_srcpos = srcpos; + stream->match_last_srcpos = srcpos; + return 0; + + bad: + stream->match_state = MATCH_SEARCHING; + return 1; +} + +/* This code is experimental, and I'm having trouble benchmarking + * it reliably. */ +#if 0 +static inline int +xd3_forward_match(const uint8_t *s1c, const uint8_t *s2c, size_t n) +{ + size_t i = 0; +#if UNALIGNED_OK + size_t nint = n / sizeof(int); + + if (nint >> 3) + { + size_t j = 0; + const int *s1 = (const int*)s1c; + const int *s2 = (const int*)s2c; + size_t nint_8 = nint - 8; + + while (i <= nint_8 && + s1[i++] == s2[j++] && + s1[i++] == s2[j++] && + s1[i++] == s2[j++] && + s1[i++] == s2[j++] && + s1[i++] == s2[j++] && + s1[i++] == s2[j++] && + s1[i++] == s2[j++] && + s1[i++] == s2[j++]) { } + + i = (i - 1) * sizeof(int); + } +#endif + + while (i < n && s1c[i] == s2c[i]) + { + i++; + } + return i; +} +#else +static inline usize_t +xd3_forward_match(const uint8_t *s1c, + const uint8_t *s2c, + usize_t n) { + usize_t i = 0; + while (i < n && s1c[i] == s2c[i]) + { + i++; + } + return i; +} +#endif + + +/* This function expands the source match backward and forward. It is + * reentrant, since xd3_getblk may return XD3_GETSRCBLK, so most + * variables are kept in xd3_stream. There are two callers of this + * function, the string_matching routine when a checksum match is + * discovered, and xd3_encode_input whenever a continuing (or initial) + * match is suspected. The two callers do different things with the + * input_position, thus this function leaves that variable untouched. + * If a match is taken the resulting stream->match_fwd is left + * non-zero. */ +static int +xd3_source_extend_match (xd3_stream *stream) +{ + int ret; + xd3_source *src = stream->src; + xoff_t matchoff; /* matchoff is the current right/left-boundary of + the source match being tested. */ + usize_t streamoff; /* streamoff is the current right/left-boundary + of the input match being tested. */ + xoff_t tryblk; /* tryblk, tryoff are the block, offset position + of matchoff */ + usize_t tryoff; + usize_t tryrem; /* tryrem is the number of matchable bytes */ + usize_t matched; + + XD3_ASSERT (src != NULL); + + /* Does it make sense to compute backward match AFTER forward match? */ + if (stream->match_state == MATCH_BACKWARD) + { + /* Note: this code is practically duplicated below, substituting + * match_fwd/match_back and direction. Consolidate? */ + matchoff = stream->match_srcpos - stream->match_back; + streamoff = stream->input_position - stream->match_back; + xd3_blksize_div (matchoff, src, &tryblk, &tryoff); + + /* this loops backward over source blocks */ + while (stream->match_back < stream->match_maxback) + { + /* see if we're backing across a source block boundary */ + if (tryoff == 0) + { + tryoff = src->blksize; + tryblk -= 1; + } + + if ((ret = xd3_getblk (stream, tryblk))) + { + /* if search went too far back, continue forward. */ + if (ret == XD3_TOOFARBACK) + { + break; + } + + /* could be a XD3_GETSRCBLK failure. */ + return ret; + } + + /* TODO: This code can be optimized similar to xd3_match_forward() */ + for (tryrem = min (tryoff, stream->match_maxback - + stream->match_back); + tryrem != 0; + tryrem -= 1, stream->match_back += 1) + { + if (src->curblk[tryoff-1] != stream->next_in[streamoff-1]) + { + goto doneback; + } + + tryoff -= 1; + streamoff -= 1; + } + } + + doneback: + stream->match_state = MATCH_FORWARD; + } + + XD3_ASSERT (stream->match_state == MATCH_FORWARD); + + matchoff = stream->match_srcpos + stream->match_fwd; + streamoff = stream->input_position + stream->match_fwd; + xd3_blksize_div (matchoff, src, & tryblk, & tryoff); + + /* Note: practically the same code as backwards case above: same comments */ + while (stream->match_fwd < stream->match_maxfwd) + { + if (tryoff == src->blksize) + { + tryoff = 0; + tryblk += 1; + } + + if ((ret = xd3_getblk (stream, tryblk))) + { + /* if search went too far back, continue forward. */ + if (ret == XD3_TOOFARBACK) + { + break; + } + + /* could be a XD3_GETSRCBLK failure. */ + return ret; + } + + tryrem = min(stream->match_maxfwd - stream->match_fwd, + src->blksize - tryoff); + + matched = xd3_forward_match(src->curblk + tryoff, + stream->next_in + streamoff, + tryrem); + tryoff += matched; + streamoff += matched; + stream->match_fwd += matched; + + if (tryrem != matched) + { + break; + } + } + + stream->match_state = MATCH_SEARCHING; + + /* If the match ends short of the last instruction end, we probably + * don't want it. There is the possibility that a copy ends short + * of the last copy but also goes further back, in which case we + * might want it. This code does not implement such: if so we would + * need more complicated xd3_iopt_erase logic. */ + if (stream->match_fwd < stream->min_match) + { + stream->match_fwd = 0; + } + else + { + usize_t total = stream->match_fwd + stream->match_back; + + /* Correct the variables to remove match_back from the equation. */ + usize_t target_position = stream->input_position - stream->match_back; + usize_t match_length = stream->match_back + stream->match_fwd; + xoff_t match_position = stream->match_srcpos - stream->match_back; + xoff_t match_end = stream->match_srcpos + stream->match_fwd; + + /* At this point we may have to erase any iopt-buffer + * instructions that are fully covered by a backward-extending + * copy. */ + if (stream->match_back > 0) + { + xd3_iopt_erase (stream, target_position, total); + } + + stream->match_back = 0; + + /* Update ranges. The first source match occurs with both + values set to 0. */ + if (stream->match_maxaddr == 0 || + match_position < stream->match_minaddr) + { + stream->match_minaddr = match_position; + } + + if (match_end > stream->match_maxaddr) + { + /* Note: per-window */ + stream->match_maxaddr = match_end; + } + + if (match_end > stream->maxsrcaddr) + { + /* Note: across windows */ + stream->maxsrcaddr = match_end; + } + + IF_DEBUG1 ({ + static int x = 0; + DP(RINT "[source match:%d] <inp %"Q"u %"Q"u> <src %"Q"u %"Q"u> (%s) [ %u bytes ]\n", + x++, + stream->total_in + target_position, + stream->total_in + target_position + match_length, + match_position, + match_position + match_length, + (stream->total_in + target_position == match_position) ? "same" : "diff", + match_length); + }); + + if ((ret = xd3_found_match (stream, + /* decoder position */ target_position, + /* length */ match_length, + /* address */ match_position, + /* is_source */ 1))) + { + return ret; + } + + /* If the match ends with the available input: */ + if (target_position + match_length == stream->avail_in) + { + /* Setup continuing match for the next window. */ + stream->match_state = MATCH_TARGET; + stream->match_srcpos = match_end; + } + } + + return 0; +} + +/* Update the small hash. Values in the small_table are offset by + * HASH_CKOFFSET (1) to distinguish empty buckets from real offsets. */ +static void +xd3_scksum_insert (xd3_stream *stream, + usize_t inx, + usize_t scksum, + usize_t pos) +{ + /* If we are maintaining previous duplicates. */ + if (stream->small_prev) + { + usize_t last_pos = stream->small_table[inx]; + xd3_slist *pos_list = & stream->small_prev[pos & stream->sprevmask]; + + /* Note last_pos is offset by HASH_CKOFFSET. */ + pos_list->last_pos = last_pos; + } + + /* Enter the new position into the hash bucket. */ + stream->small_table[inx] = pos + HASH_CKOFFSET; +} + +#if XD3_DEBUG +static int +xd3_check_smatch (const uint8_t *ref0, const uint8_t *inp0, + const uint8_t *inp_max, usize_t cmp_len) +{ + usize_t i; + + for (i = 0; i < cmp_len; i += 1) + { + XD3_ASSERT (ref0[i] == inp0[i]); + } + + if (inp0 + cmp_len < inp_max) + { + XD3_ASSERT (inp0[i] != ref0[i]); + } + + return 1; +} +#endif /* XD3_DEBUG */ + +/* When the hash table indicates a possible small string match, it + * calls this routine to find the best match. The first matching + * position is taken from the small_table, HASH_CKOFFSET is subtracted + * to get the actual position. After checking that match, if previous + * linked lists are in use (because stream->smatcher.small_chain > 1), + * previous matches are tested searching for the longest match. If + * (stream->min_match > MIN_MATCH) then a lazy match is in effect. + */ +static usize_t +xd3_smatch (xd3_stream *stream, + usize_t base, + usize_t scksum, + usize_t *match_offset) +{ + usize_t cmp_len; + usize_t match_length = 0; + usize_t chain = (stream->min_match == MIN_MATCH ? + stream->smatcher.small_chain : + stream->smatcher.small_lchain); + const uint8_t *inp_max = stream->next_in + stream->avail_in; + const uint8_t *inp; + const uint8_t *ref; + + SMALL_HASH_DEBUG1 (stream, stream->next_in + stream->input_position); + + XD3_ASSERT (stream->min_match + stream->input_position <= stream->avail_in); + + base -= HASH_CKOFFSET; + + again: + + IF_DEBUG2 (DP(RINT "smatch at base=%u inp=%u cksum=%u\n", base, + stream->input_position, scksum)); + + /* For small matches, we can always go to the end-of-input because + * the matching position must be less than the input position. */ + XD3_ASSERT (base < stream->input_position); + + ref = stream->next_in + base; + inp = stream->next_in + stream->input_position; + + SMALL_HASH_DEBUG2 (stream, ref); + + /* Expand potential match forward. */ + while (inp < inp_max && *inp == *ref) + { + ++inp; + ++ref; + } + + cmp_len = inp - (stream->next_in + stream->input_position); + + /* Verify correctness */ + XD3_ASSERT (xd3_check_smatch (stream->next_in + base, + stream->next_in + stream->input_position, + inp_max, cmp_len)); + + /* Update longest match */ + if (cmp_len > match_length) + { + ( match_length) = cmp_len; + (*match_offset) = base; + + /* Stop if we match the entire input or have a long_enough match. */ + if (inp == inp_max || cmp_len >= stream->smatcher.long_enough) + { + goto done; + } + } + + /* If we have not reached the chain limit, see if there is another + previous position. */ + while (--chain != 0) + { + /* Calculate the previous offset. */ + usize_t prev_pos = stream->small_prev[base & stream->sprevmask].last_pos; + usize_t diff_pos; + + if (prev_pos == 0) + { + break; + } + + prev_pos -= HASH_CKOFFSET; + + if (prev_pos > base) + { + break; + } + + base = prev_pos; + + XD3_ASSERT (stream->input_position > base); + diff_pos = stream->input_position - base; + + /* Stop searching if we go beyond sprevsz, since those entries + * are for unrelated checksum entries. */ + if (diff_pos & ~stream->sprevmask) + { + break; + } + + goto again; + } + + done: + /* Crude efficiency test: if the match is very short and very far back, it's + * unlikely to help, but the exact calculation requires knowing the state of + * the address cache and adjacent instructions, which we can't do here. + * Rather than encode a probably inefficient copy here and check it later + * (which complicates the code a lot), do this: + */ + if (match_length == 4 && stream->input_position - (*match_offset) >= 1<<14) + { + /* It probably takes >2 bytes to encode an address >= 2^14 from here */ + return 0; + } + if (match_length == 5 && stream->input_position - (*match_offset) >= 1<<21) + { + /* It probably takes >3 bytes to encode an address >= 2^21 from here */ + return 0; + } + + /* It's unlikely that a window is large enough for the (match_length == 6 && + * address >= 2^28) check */ + return match_length; +} + +#if XD3_DEBUG +static void +xd3_verify_small_state (xd3_stream *stream, + const uint8_t *inp, + uint32_t x_cksum) +{ + uint32_t state; + uint32_t cksum = xd3_scksum (&state, inp, stream->smatcher.small_look); + + XD3_ASSERT (cksum == x_cksum); +} + +static void +xd3_verify_large_state (xd3_stream *stream, + const uint8_t *inp, + uint32_t x_cksum) +{ + uint32_t cksum = xd3_lcksum (inp, stream->smatcher.large_look); + XD3_ASSERT (cksum == x_cksum); +} +static void +xd3_verify_run_state (xd3_stream *stream, + const uint8_t *inp, + int x_run_l, + uint8_t x_run_c) +{ + int slook = stream->smatcher.small_look; + uint8_t run_c; + int run_l = xd3_comprun (inp, slook, &run_c); + + XD3_ASSERT (run_l == 0 || run_c == x_run_c); + XD3_ASSERT (x_run_l > slook || run_l == x_run_l); +} +#endif /* XD3_DEBUG */ + +/* This function computes more source checksums to advance the window. + * Called at every entrance to the string-match loop and each time + * stream->input_position reaches the value returned as + * *next_move_point. NB: this is one of the most expensive functions + * in this code and also the most critical for good compression. + * + * TODO: really would like a good test for this logic. how? + * Update: testing/regtest.cc has some basic tests, more would be nice. + * TODO: optimize the inner loop + */ +static int +xd3_srcwin_move_point (xd3_stream *stream, usize_t *next_move_point) +{ + xoff_t logical_input_cksum_pos; + + XD3_ASSERT(stream->srcwin_cksum_pos <= stream->src->size); + if (stream->srcwin_cksum_pos == stream->src->size) + { + *next_move_point = USIZE_T_MAX; + return 0; + } + + /* Begin by advancing at twice the input rate, up to half the + * maximum window size. */ + logical_input_cksum_pos = min((stream->total_in + stream->input_position) * 2, + (stream->total_in + stream->input_position) + + (stream->srcwin_maxsz / 2)); + + /* If srcwin_cksum_pos is already greater, wait until the difference + * is met. */ + if (stream->srcwin_cksum_pos > logical_input_cksum_pos) + { + *next_move_point = stream->input_position + + (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos); + return 0; + } + + /* A long match may have extended past srcwin_cksum_pos. Don't + * start checksumming already-matched source data. */ + if (stream->maxsrcaddr > stream->srcwin_cksum_pos) + { + stream->srcwin_cksum_pos = stream->maxsrcaddr; + } + + if (logical_input_cksum_pos < stream->srcwin_cksum_pos) + { + logical_input_cksum_pos = stream->srcwin_cksum_pos; + } + + /* Advance at least one source block. With the command-line + * defaults this means: + * + * if (src->size <= srcwin_maxsz), index the entire source at once + * using the position of the first non-match. This is good for + * small inputs, especially when the content may have moved anywhere + * in the file (e.g., tar files). + * + * if (src->size > srcwin_maxsz), index at least one block (which + * the command-line sets to 1/32 of srcwin_maxsz) ahead of the + * logical position. This is good for different reasons: when a + * long match spanning several source blocks is encountered, this + * avoids computing checksums for those blocks. If the data can + * move anywhere, this is bad. + */ + logical_input_cksum_pos += stream->src->blksize; + + IF_DEBUG1 (DP(RINT "[srcwin_move_point] T=%"Q"u S=%"Q"u/%"Q"u\n", + stream->total_in + stream->input_position, + stream->srcwin_cksum_pos, + logical_input_cksum_pos)); + + while (stream->srcwin_cksum_pos < logical_input_cksum_pos && + stream->srcwin_cksum_pos < stream->src->size) + { + xoff_t blkno; + xoff_t blkbaseoffset; + usize_t blkrem; + ssize_t oldpos; + ssize_t blkpos; + int ret; + xd3_blksize_div (stream->srcwin_cksum_pos, + stream->src, &blkno, &blkrem); + oldpos = blkrem; + blkpos = xd3_bytes_on_srcblk_fast (stream->src, blkno); + + if (oldpos + stream->smatcher.large_look > (usize_t) blkpos) + { + stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize; + continue; + } + + if ((ret = xd3_getblk (stream, blkno))) + { + /* TOOFARBACK should never occur here, since we read forward. */ + if (ret == XD3_TOOFARBACK) + { + ret = XD3_INTERNAL; + } + return ret; + } + + /* This inserts checksums for the entire block, in reverse, + * starting from the end of the block. This logic does not test + * stream->srcwin_cksum_pos because it always advances it to the + * start of the next block. + * + * oldpos is the srcwin_cksum_pos within this block. blkpos is + * the number of bytes available. Each iteration inspects + * large_look bytes then steps back large_step bytes. The + * if-stmt above ensures at least one large_look of data. */ + blkpos -= stream->smatcher.large_look; + blkbaseoffset = stream->src->blksize * blkno; + + do + { + uint32_t cksum = xd3_lcksum (stream->src->curblk + blkpos, + stream->smatcher.large_look); + usize_t hval = xd3_checksum_hash (& stream->large_hash, cksum); + + stream->large_table[hval] = + (usize_t) (blkbaseoffset + + (xoff_t)(blkpos + HASH_CKOFFSET)); + + IF_DEBUG (stream->large_ckcnt += 1); + + blkpos -= stream->smatcher.large_step; + } + while (blkpos >= oldpos); + + stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize; + } + + if (stream->srcwin_cksum_pos >= stream->src->size) + { + /* This invariant is needed for xd3_source_cksum_offset() */ + stream->srcwin_cksum_pos = stream->src->size; + *next_move_point = USIZE_T_MAX; + return 0; + } + + /* How long until this function should be called again. */ + XD3_ASSERT(stream->srcwin_cksum_pos >= logical_input_cksum_pos); + *next_move_point = stream->input_position + 1 + + (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos); + return 0; +} + +#endif /* XD3_ENCODER */ + +/******************************************************************** + TEMPLATE pass + *********************************************************************/ + +#endif /* __XDELTA3_C_INLINE_PASS__ */ +#ifdef __XDELTA3_C_TEMPLATE_PASS__ + +#if XD3_ENCODER + +/******************************************************************** + Templates + *******************************************************************/ + +/* Template macros */ +#define XD3_TEMPLATE(x) XD3_TEMPLATE2(x,TEMPLATE) +#define XD3_TEMPLATE2(x,n) XD3_TEMPLATE3(x,n) +#define XD3_TEMPLATE3(x,n) x ## n +#define XD3_STRINGIFY(x) XD3_STRINGIFY2(x) +#define XD3_STRINGIFY2(x) #x + +static int XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream); + +static const xd3_smatcher XD3_TEMPLATE(__smatcher_) = +{ + XD3_STRINGIFY(TEMPLATE), + XD3_TEMPLATE(xd3_string_match_), +#if SOFTCFG == 1 + 0, 0, 0, 0, 0, 0, 0 +#else + LLOOK, LSTEP, SLOOK, SCHAIN, SLCHAIN, MAXLAZY, LONGENOUGH +#endif +}; + +static int +XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream) +{ + const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS); + const int DO_LARGE = (stream->src != NULL); + const int DO_RUN = (1); + + const uint8_t *inp; + uint32_t scksum = 0; + uint32_t scksum_state; + uint32_t lcksum = 0; + usize_t sinx; + usize_t linx; + uint8_t run_c; + size_t run_l; + int ret; + usize_t match_length; + usize_t match_offset = 0; + usize_t next_move_point; + + /* If there will be no compression due to settings or short input, + * skip it entirely. */ + if (! (DO_SMALL || DO_LARGE || DO_RUN) || + stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; } + + if ((ret = xd3_string_match_init (stream))) { return ret; } + + /* The restartloop label is reached when the incremental loop state + * needs to be reset. */ + restartloop: + + /* If there is not enough input remaining for any kind of match, + skip it. */ + if (stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; } + + /* Now reset the incremental loop state: */ + + /* The min_match variable is updated to avoid matching the same lazy + * match over and over again. For example, if you find a (small) + * match of length 9 at one position, you will likely find a match + * of length 8 at the next position. */ + if (xd3_iopt_last_matched (stream) > stream->input_position) + { + stream->min_match = max(MIN_MATCH, + 1 + xd3_iopt_last_matched(stream) - + stream->input_position); + } + else + { + stream->min_match = MIN_MATCH; + } + + /* The current input byte. */ + inp = stream->next_in + stream->input_position; + + /* Small match state. */ + if (DO_SMALL) + { + scksum = xd3_scksum (&scksum_state, inp, SLOOK); + } + + /* Run state. */ + if (DO_RUN) + { + run_l = xd3_comprun (inp, SLOOK, & run_c); + } + + /* Large match state. We continue the loop even after not enough + * bytes for LLOOK remain, so always check stream->input_position in + * DO_LARGE code. */ + if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in)) + { + /* Source window: next_move_point is the point that + * stream->input_position must reach before computing more + * source checksum. */ + if ((ret = xd3_srcwin_move_point (stream, & next_move_point))) + { + return ret; + } + + lcksum = xd3_lcksum (inp, LLOOK); + } + + /* TRYLAZYLEN: True if a certain length match should be followed by + * lazy search. This checks that LEN is shorter than MAXLAZY and + * that there is enough leftover data to consider lazy matching. + * "Enough" is set to 2 since the next match will start at the next + * offset, it must match two extra characters. */ +#define TRYLAZYLEN(LEN,POS,MAX) ((MAXLAZY) > 0 && (LEN) < (MAXLAZY) \ + && (POS) + (LEN) <= (MAX) - 2) + + /* HANDLELAZY: This statement is called each time an instruciton is + * emitted (three cases). If the instruction is large enough, the + * loop is restarted, otherwise lazy matching may ensue. */ +#define HANDLELAZY(mlen) \ + if (TRYLAZYLEN ((mlen), (stream->input_position), (stream->avail_in))) \ + { stream->min_match = (mlen) + LEAST_MATCH_INCR; goto updateone; } \ + else \ + { stream->input_position += (mlen); goto restartloop; } + + /* Now loop over one input byte at a time until a match is found... */ + for (;; inp += 1, stream->input_position += 1) + { + /* Now we try three kinds of string match in order of expense: + * run, large match, small match. */ + + /* Expand the start of a RUN. The test for (run_l == SLOOK) + * avoids repeating this check when we pass through a run area + * performing lazy matching. The run is only expanded once when + * the min_match is first reached. If lazy matching is + * performed, the run_l variable will remain inconsistent until + * the first non-running input character is reached, at which + * time the run_l may then again grow to SLOOK. */ + if (DO_RUN && run_l == SLOOK) + { + usize_t max_len = stream->avail_in - stream->input_position; + + IF_DEBUG (xd3_verify_run_state (stream, inp, run_l, run_c)); + + while (run_l < max_len && inp[run_l] == run_c) { run_l += 1; } + + /* Output a RUN instruction. */ + if (run_l >= stream->min_match && run_l >= MIN_RUN) + { + if ((ret = xd3_emit_run (stream, stream->input_position, + run_l, run_c))) { return ret; } + + HANDLELAZY (run_l); + } + } + + /* If there is enough input remaining. */ + if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in)) + { + if ((stream->input_position >= next_move_point) && + (ret = xd3_srcwin_move_point (stream, & next_move_point))) + { + return ret; + } + + linx = xd3_checksum_hash (& stream->large_hash, lcksum); + + IF_DEBUG (xd3_verify_large_state (stream, inp, lcksum)); + + if (stream->large_table[linx] != 0) + { + /* the match_setup will fail if the source window has + * been decided and the match lies outside it. You + * could consider forcing a window at this point to + * permit a new source window. */ + xoff_t adj_offset = + xd3_source_cksum_offset(stream, + stream->large_table[linx] - + HASH_CKOFFSET); + if (xd3_source_match_setup (stream, adj_offset) == 0) + { + if ((ret = xd3_source_extend_match (stream))) + { + return ret; + } + + /* Update stream position. match_fwd is zero if no + * match. */ + if (stream->match_fwd > 0) + { + HANDLELAZY (stream->match_fwd); + } + } + } + } + + /* Small matches. */ + if (DO_SMALL) + { + sinx = xd3_checksum_hash (& stream->small_hash, scksum); + + /* Verify incremental state in debugging mode. */ + IF_DEBUG (xd3_verify_small_state (stream, inp, scksum)); + + /* Search for the longest match */ + if (stream->small_table[sinx] != 0) + { + match_length = xd3_smatch (stream, + stream->small_table[sinx], + scksum, + & match_offset); + } + else + { + match_length = 0; + } + + /* Insert a hash for this string. */ + xd3_scksum_insert (stream, sinx, scksum, stream->input_position); + + /* Maybe output a COPY instruction */ + if (match_length >= stream->min_match) + { + IF_DEBUG1 ({ + static int x = 0; + DP(RINT "[target match:%d] <inp %u %u> <cpy %u %u> " + "(-%d) [ %u bytes ]\n", + x++, + stream->input_position, + stream->input_position + match_length, + match_offset, + match_offset + match_length, + stream->input_position - match_offset, + match_length); + }); + + if ((ret = xd3_found_match (stream, + /* decoder position */ + stream->input_position, + /* length */ match_length, + /* address */ match_offset, + /* is_source */ 0))) + { + return ret; + } + + /* Copy instruction. */ + HANDLELAZY (match_length); + } + } + + /* The logic above prevents excess work during lazy matching by + * increasing min_match to avoid smaller matches. Each time we + * advance stream->input_position by one, the minimum match + * shortens as well. */ + if (stream->min_match > MIN_MATCH) + { + stream->min_match -= 1; + } + + updateone: + + /* See if there are no more incremental cksums to compute. */ + if (stream->input_position + SLOOK == stream->avail_in) + { + goto loopnomore; + } + + /* Compute next RUN, CKSUM */ + if (DO_RUN) { NEXTRUN (inp[SLOOK]); } + if (DO_SMALL) + { + scksum = xd3_small_cksum_update (&scksum_state, inp, SLOOK); + } + if (DO_LARGE && (stream->input_position + LLOOK < stream->avail_in)) + { + lcksum = xd3_large_cksum_update (lcksum, inp, LLOOK); + } + } + + loopnomore: + return 0; +} +#endif /* XD3_ENCODER */ +#endif /* __XDELTA3_C_TEMPLATE_PASS__ */ |