| 1 | 2021-08-14 Jim Meyering <meyering@fb.com>
|
|---|
| 2 |
|
|---|
| 3 | version 3.7
|
|---|
| 4 | * NEWS: Record release date.
|
|---|
| 5 |
|
|---|
| 6 | 2021-08-09 Jim Meyering <meyering@fb.com>
|
|---|
| 7 |
|
|---|
| 8 | tests: provide an awk-based seq replacement
|
|---|
| 9 | ...so we can continue to use seq, but use the wrapper when needed.
|
|---|
| 10 | * tests/init.cfg (seq): Some systems lack seq.
|
|---|
| 11 | Provide a replacement.
|
|---|
| 12 | * tests/hash-collision-perf: Use seq once again.
|
|---|
| 13 | * tests/long-pattern-perf: Likewise. And remove a comment about seq.
|
|---|
| 14 |
|
|---|
| 15 | 2021-08-09 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 16 |
|
|---|
| 17 | grep: simplify EGexecute
|
|---|
| 18 | * src/dfasearch.c (EGexecute): Remove a label and goto.
|
|---|
| 19 | This also makes the machine code a bit shorter, on x86-64 gcc.
|
|---|
| 20 |
|
|---|
| 21 | grep: simplify data movement slightly
|
|---|
| 22 | * src/grep.c (fillbuf): Simplify movement of saved data.
|
|---|
| 23 |
|
|---|
| 24 | grep: pointer-integer cast nit
|
|---|
| 25 | * src/grep.c (ALIGN_TO): When converting pointers to unsigned
|
|---|
| 26 | integers, convert to uintptr_t not size_t, as size_t in theory
|
|---|
| 27 | might be too narrow.
|
|---|
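The pointer-to-integer point in the ALIGN_TO entry can be sketched as follows. This is a minimal illustration, not grep's actual macro: `align_up` is a hypothetical helper, and it assumes the alignment is a power of two and that the rounded-up pointer still lies within the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not grep's ALIGN_TO): round PTR up to the next
   multiple of ALIGNMENT, which must be a nonzero power of two.  The
   caller must ensure the result stays inside the same object.  */
static char *
align_up (char *ptr, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);

  /* uintptr_t, where provided, can represent any object pointer value.
     size_t only has to represent object sizes, so in theory it could
     be narrower than a pointer.  */
  uintptr_t addr = (uintptr_t) ptr;
  size_t misalignment = addr & (alignment - 1);
  return misalignment == 0 ? ptr : ptr + (alignment - misalignment);
}
```

Calling `align_up (p, 16)` on an already aligned pointer returns it unchanged; otherwise it advances by fewer than 16 bytes.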
| 28 |
|
|---|
| 29 | tests: use awk, not seq
|
|---|
| 30 | Portability problem reported by Dagobert Michelsen in:
|
|---|
| 31 | https://lists.gnu.org/r/grep-devel/2021-08/msg00004.html
|
|---|
| 32 | * tests/hash-collision-perf, tests/long-pattern-perf:
|
|---|
| 33 | Don’t assume seq is installed; use awk instead.
|
|---|
| 34 |
|
|---|
| 35 | 2021-08-08 Jim Meyering <meyering@fb.com>
|
|---|
| 36 |
|
|---|
| 37 | build: update gnulib to latest
|
|---|
| 38 |
|
|---|
| 39 | build: update gnulib to latest
|
|---|
| 40 |
|
|---|
| 41 | 2021-08-06 Kevin Locke <kevin@kevinlocke.name>
|
|---|
| 42 |
|
|---|
| 43 | doc: usage: --group-separator/--no-group-separator
|
|---|
| 44 | * src/grep.c (usage): Document --group-separator
|
|---|
| 45 | and --no-group-separator.
|
|---|
| 46 |
|
|---|
| 47 | doc: man: add --group-separator/--no-group-separator
|
|---|
| 48 | * doc/grep.in.1:
|
|---|
| 49 | Add copy of docs for --group-separator from doc/grep.texi.
|
|---|
| 50 | Add copy of docs for --no-group-separator from doc/grep.texi.
|
|---|
| 51 |
|
|---|
| 52 | 2021-08-06 Jim Meyering <meyering@fb.com>
|
|---|
| 53 |
|
|---|
| 54 | build: update gnulib to latest
|
|---|
| 55 |
|
|---|
| 56 | 2021-06-19 Mateusz Okulus <mmokulus@gmail.com>
|
|---|
| 57 |
|
|---|
| 58 | doc: note that -H is a GNU extension in man page, too
|
|---|
| 59 | * doc/grep.in.1 (-H): Mention that this is a GNU extension.
|
|---|
| 60 |
|
|---|
| 61 | 2021-06-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 62 |
|
|---|
| 63 | build: update gnulib submodule to latest
|
|---|
| 64 |
|
|---|
| 65 | 2021-06-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 66 |
|
|---|
| 67 | build: update gnulib submodule to latest
|
|---|
| 68 |
|
|---|
| 69 | 2021-06-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 70 |
|
|---|
| 71 | doc: improve examples and wording
|
|---|
| 72 | * doc/grep.texi (The Backslash Character and Special Expressions)
|
|---|
| 73 | (Usage): Improve doc (Bug#48948).
|
|---|
| 74 |
|
|---|
| 75 | 2021-01-31 Jim Meyering <meyering@fb.com>
|
|---|
| 76 |
|
|---|
| 77 | doc: man: fix -L description and improve -l's
|
|---|
| 78 | * doc/grep.texi (-L): Remove erroneous sentence about stopping early.
|
|---|
| 79 | With -L, grep cannot stop scanning early.
|
|---|
| 80 | (-l): Tweak existing wording.
|
|---|
| 81 | * doc/grep.in.1: Remove the -L sentence here, too.
|
|---|
| 82 | (-l): Copy the sentence from grep.texi, to clarify: it's only per-file
|
|---|
| 83 | scanning that stops upon match. Reported by Robert Bruntz
|
|---|
| 84 | in http://debbugs.gnu.org/46179
|
|---|
| 85 |
|
|---|
| 86 | 2021-01-05 Jim Meyering <meyering@fb.com>
|
|---|
| 87 |
|
|---|
| 88 | build: avoid long-string warnings in gnulib tests
|
|---|
| 89 | * configure.ac (GNULIB_TEST_WARN_CFLAGS): Add
|
|---|
| 90 | -Woverlength-strings to avoid clang warnings.
|
|---|
| 91 |
|
|---|
| 92 | 2021-01-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 93 |
|
|---|
| 94 | doc: further clarify regexp structure
|
|---|
| 95 | * doc/grep.texi (Fundamental Structure)
|
|---|
| 96 | (Back-references and Subexpressions, Basic vs Extended):
|
|---|
| 97 | Further clarifications.
|
|---|
| 98 |
|
|---|
| 99 | maint: copy bootstrap, tests/init.sh from Gnulib
|
|---|
| 100 |
|
|---|
| 101 | doc: update grep.texi cite to 2021
|
|---|
| 102 |
|
|---|
| 103 | maint: run "make update-copyright"
|
|---|
| 104 |
|
|---|
| 105 | build: update gnulib submodule to latest
|
|---|
| 106 |
|
|---|
| 107 | 2020-12-30 Jim Meyering <meyering@fb.com>
|
|---|
| 108 |
|
|---|
| 109 | build: update gnulib to latest
|
|---|
| 110 | * gnulib: update for clang-10 warning-avoidance
|
|---|
| 111 | fixes in hash and regex-tests.
|
|---|
| 112 |
|
|---|
| 113 | maint: add parentheses to avoid new clang-10 warning
|
|---|
| 114 | * src/dfasearch.c (regex_compile): Parenthesize arith-OR vs
|
|---|
| 115 | ternary, to placate clang-10.
|
|---|
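The precedence rule behind this kind of warning can be illustrated with a small sketch. The flag names and both functions are hypothetical and do not reproduce the code in regex_compile; the sketch only shows that `|` binds more tightly than `?:`, so the two readings can differ.

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only.  */
enum { OPT_BASE = 1, OPT_ICASE = 2 };

/* An unparenthesized `base | icase ? OPT_ICASE : 0` is parsed like
   this: the bitwise OR becomes the condition of the ternary, and
   BASE is lost from the result.  */
static int
combine_as_parsed (int base, int icase)
{
  return (base | icase) ? OPT_ICASE : 0;
}

/* Explicit parentheses express the usual intent: OR the base flags
   with an optional extra bit.  */
static int
combine_parenthesized (int base, int icase)
{
  return base | (icase ? OPT_ICASE : 0);
}
```

With `base == OPT_BASE` and `icase == 0`, the first form yields `OPT_ICASE` while the second yields `OPT_BASE`, which is why compilers suggest making the grouping explicit.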
| 116 |
|
|---|
| 117 | 2020-12-29 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 118 |
|
|---|
| 119 | doc: clarify special chars and }
|
|---|
| 120 | * doc/grep.texi (Fundamental Structure)
|
|---|
| 121 | (Character Classes and Bracket Expressions)
|
|---|
| 122 | (The Backslash Character and Special Expressions, Anchoring)
|
|---|
| 123 | (Basic vs Extended): Clarify which characters are special,
|
|---|
| 124 | and why \ is needed before } in grep even though } is not special.
|
|---|
| 125 | Use Posix terminology for ordinary and special characters and for
|
|---|
| 126 | interval expressions.
|
|---|
| 127 |
|
|---|
| 128 | 2020-12-29 Marek Suppa <mr@shu.io>
|
|---|
| 129 |
|
|---|
| 130 | doc: fix missing right curly brace
|
|---|
| 131 | * doc/grep.texi (Basic vs Extended Regular Expressions): Mention that
|
|---|
| 132 | the right curly brace (}) meta-character must be backslash-escaped.
|
|---|
| 133 | It had been omitted from the list.
|
|---|
| 134 |
|
|---|
| 135 | 2020-12-25 Jim Meyering <meyering@fb.com>
|
|---|
| 136 |
|
|---|
| 137 | build: update gnulib to latest
|
|---|
| 138 |
|
|---|
| 139 | grep: use of --unix-byte-offsets (-u) now elicits a warning
|
|---|
| 140 | * NEWS (Change in behavior): Mention this.
|
|---|
| 141 | * src/grep.c (main): Warn about each use of obsolete
|
|---|
| 142 | --unix-byte-offsets (-u).
|
|---|
| 143 | * doc/grep.in.1 (-u): Remove its documentation.
|
|---|
| 144 |
|
|---|
| 145 | 2020-12-23 Helge Kreutzmann <debian@helgefjell.de>
|
|---|
| 146 |
|
|---|
| 147 | doc: adjust man page syntax
|
|---|
| 148 | * doc/grep.in.1: Mark some manual names with B<...>.
|
|---|
| 149 | Mark PATTERNS with I<...>.
|
|---|
| 150 | Drop final period in SEE ALSO.
|
|---|
| 151 | With suggestions from several members of the manpage-l10n
|
|---|
| 152 | translation community. This resolves https://bugs.gnu.org/45353
|
|---|
| 153 |
|
|---|
| 154 | 2020-11-26 Jim Meyering <meyering@fb.com>
|
|---|
| 155 |
|
|---|
| 156 | grep: avoid performance regression with many patterns
|
|---|
| 157 | * src/grep.c (hash_pattern): Switch from PJW to DJB2, to avoid an
|
|---|
| 158 | O(N) to O(N^2) performance regression due to hash collisions with
|
|---|
| 159 | patterns from e.g., seq 500000|tr 0-9 A-J
|
|---|
| 160 | Reported by Frank Heckenbach in https://bugs.gnu.org/44754
|
|---|
| 161 | * NEWS (Bug fixes): Mention it.
|
|---|
| 162 | * tests/hash-collision-perf: New file.
|
|---|
| 163 | * tests/Makefile.am (TESTS): Add it.
|
|---|
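For readers unfamiliar with DJB2, a minimal standalone sketch of that hash follows. It assumes the conventional formulation (seed 5381, multiplier 33); the name `djb2_hash` and its signature are hypothetical and differ from grep's hash_pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone DJB2 hash: h = h * 33 + byte, seeded with
   5381, reduced modulo the number of buckets.  */
static size_t
djb2_hash (const char *buf, size_t len, size_t n_buckets)
{
  assert (n_buckets != 0);
  size_t h = 5381;
  for (size_t i = 0; i < len; i++)
    h = h * 33 + (unsigned char) buf[i];
  return h % n_buckets;
}
```

Because every byte is multiplied through by 33 on each later step, inputs that differ only in byte order, such as "AB" and "BA", generally land in different buckets.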
| 164 |
|
|---|
| 165 | build: update gnulib to latest for warning fixes
|
|---|
| 166 | * gnulib: Update submodule to latest.
|
|---|
| 167 | * src/grep.c (printf_errno): Reflect gnulib's renaming: change
|
|---|
| 168 | _GL_ATTRIBUTE_FORMAT_PRINTF to
|
|---|
| 169 | _GL_ATTRIBUTE_FORMAT_PRINTF_STANDARD
|
|---|
| 170 |
|
|---|
| 171 | tests: enable warnings for the gnulib-tests subdir
|
|---|
| 172 | * gnulib-tests/Makefile.am (AM_CFLAGS): Enable gnulib
|
|---|
| 173 | warning options for these tests.
|
|---|
| 174 | * configure.ac (GNULIB_TEST_WARN_CFLAGS): Disable the same three
|
|---|
| 175 | warning options that coreutils does, and a few more for GCC11.
|
|---|
| 176 |
|
|---|
| 177 | 2020-11-08 Jim Meyering <meyering@fb.com>
|
|---|
| 178 |
|
|---|
| 179 | maint: post-release administrivia
|
|---|
| 180 | * NEWS: Add header line for next release.
|
|---|
| 181 | * .prev-version: Record previous version.
|
|---|
| 182 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 183 |
|
|---|
| 184 | version 3.6
|
|---|
| 185 | * NEWS: Record release date.
|
|---|
| 186 |
|
|---|
| 187 | 2020-11-05 Jim Meyering <meyering@fb.com>
|
|---|
| 188 |
|
|---|
| 189 | build: update gnulib to latest for test improvements
|
|---|
| 190 |
|
|---|
| 191 | 2020-11-03 Jim Meyering <meyering@fb.com>
|
|---|
| 192 |
|
|---|
| 193 | build: update gnulib to latest for C++-ready dfa.h and test-verify.c fix
|
|---|
| 194 |
|
|---|
| 195 | 2020-11-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 196 |
|
|---|
| 197 | grep: remove GREP_OPTIONS
|
|---|
| 198 | * NEWS: Mention this.
|
|---|
| 199 | * doc/grep.in.1:
|
|---|
| 200 | Remove GREP_OPTIONS documentation.
|
|---|
| 201 | * doc/grep.texi (Environment Variables):
|
|---|
| 202 | Move GREP_OPTIONS stuff into a “no longer implemented” paragraph.
|
|---|
| 203 | * src/grep.c (prepend_args, prepend_default_options): Remove.
|
|---|
| 204 | (main): Do not look at GREP_OPTIONS.
|
|---|
| 205 | * tests/Makefile.am (TESTS_ENVIRONMENT):
|
|---|
| 206 | * tests/init.cfg (vars_): Remove GREP_OPTIONS.
|
|---|
| 207 |
|
|---|
| 208 | 2020-11-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 209 |
|
|---|
| 210 | grep: use RE_NO_SUB when calling regex solely to check syntax
|
|---|
| 211 | * src/dfasearch.c (regex_compile): New parameter. All callers changed.
|
|---|
| 212 | (GEAcompile): Move setting syntax for regex into regex_compile()
|
|---|
| 213 | function. This addresses a performance problem exposed by extreme
|
|---|
| 214 | regular expressions, as described in https://bugs.gnu.org/43862 .
|
|---|
| 215 |
|
|---|
| 216 | tests: add the test for bugfix in gnulib's dfa
|
|---|
| 217 | * tests/ere.tests: Add new test.
|
|---|
| 218 |
|
|---|
| 219 | 2020-11-01 Jim Meyering <meyering@fb.com>
|
|---|
| 220 |
|
|---|
| 221 | grep: avoid erroneous matches for e.g., a+a+a+
|
|---|
| 222 | * gnulib: Update to latest, for dfa's invalid-merge fix.
|
|---|
| 223 | * NEWS (Bug fixes): Mention this.
|
|---|
| 224 |
|
|---|
| 225 | 2020-10-11 Jim Meyering <meyering@fb.com>
|
|---|
| 226 |
|
|---|
| 227 | grep: -P: report input filename upon PCRE execution failure
|
|---|
| 228 | Without this, it could be tedious to determine which input
|
|---|
| 229 | file evokes a PCRE-execution-time failure.
|
|---|
| 230 | * src/pcresearch.c (Pexecute): When failing, include the
|
|---|
| 231 | error-provoking file name in the diagnostic.
|
|---|
| 232 | * src/grep.c (input_filename): Make extern, since used above.
|
|---|
| 233 | * src/search.h (input_filename): Declare.
|
|---|
| 234 | * tests/filename-lineno.pl: Test for this.
|
|---|
| 235 | ($no_pcre): Factor out.
|
|---|
| 236 | * NEWS (Bug fixes): Mention this.
|
|---|
| 237 |
|
|---|
| 238 | 2020-10-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 239 |
|
|---|
| 240 | grep: minor kwset cleanups
|
|---|
| 241 | * src/kwsearch.c (Fexecute):
|
|---|
| 242 | Assume C99 to put declarations nearer uses.
|
|---|
| 243 | * src/kwset.c (bmexec): Omit unnecessary test.
|
|---|
| 244 | * src/kwset.h (struct kwsmatch): Make OFFSET and SIZE individual
|
|---|
| 245 | elements, not arrays of size 1 (a revenant of an earlier API).
|
|---|
| 246 | All uses changed.
|
|---|
| 247 |
|
|---|
| 248 | 2020-10-11 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 249 |
|
|---|
| 250 | grep: remove unused code
|
|---|
| 251 | * src/kwsearch.c (Fcompile, Fexecute): Remove unused code. These are
|
|---|
| 252 | no longer used after commit 016e590a8198009bce0e1078f6d4c7e037e2df3c.
|
|---|
| 253 |
|
|---|
| 254 | 2020-10-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 255 |
|
|---|
| 256 | build: update gnulib submodule to latest
|
|---|
| 257 |
|
|---|
| 258 | 2020-10-05 Jim Meyering <meyering@fb.com>
|
|---|
| 259 |
|
|---|
| 260 | tests: correct filename-lineno.pl
|
|---|
| 261 | * tests/filename-lineno.pl: Remove a stray envvar
|
|---|
| 262 | that somehow slipped into expected output string.
|
|---|
| 263 |
|
|---|
| 264 | 2020-10-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 265 |
|
|---|
| 266 | tests: fix tests when PCRE is not used
|
|---|
| 267 | * tests/Makefile.am (TESTS_ENVIRONMENT):
|
|---|
| 268 | Set PATH before setting PCRE_WORKS, so that the latter test
|
|---|
| 269 | uses the just-built grep.
|
|---|
| 270 | * tests/filename-lineno.pl (invalid-re-P-paren)
|
|---|
| 271 | (invalid-re-P-star-paren): Adjust non-PCRE case to match
|
|---|
| 272 | recently-changed behavior.
|
|---|
| 273 |
|
|---|
| 274 | build: update gnulib submodule to latest
|
|---|
| 275 |
|
|---|
| 276 | 2020-10-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 277 |
|
|---|
| 278 | doc: document --include/--exclude better
|
|---|
| 279 | Problem reported by John Ruckstuhl (Bug#43782).
|
|---|
| 280 | * doc/grep.texi (File and Directory Selection):
|
|---|
| 281 | Document what happens if contradictory options are given,
|
|---|
| 282 | or if no option matches a file name.
|
|---|
| 283 | * doc/grep.in.1:
|
|---|
| 284 |
|
|---|
| 285 | 2020-10-01 Jim Meyering <meyering@fb.com>
|
|---|
| 286 |
|
|---|
| 287 | maint: add technically-required quotes
|
|---|
| 288 | * configure.ac: Quote args of AC_CONFIG_AUX_DIR, AC_CONFIG_SRCDIR
|
|---|
| 289 | and AC_CHECK_FUNCS_ONCE.
|
|---|
| 290 |
|
|---|
| 291 | 2020-09-28 Jim Meyering <meyering@fb.com>
|
|---|
| 292 |
|
|---|
| 293 | tests: restore deleted -P tests
|
|---|
| 294 | v3.4-almost-45-g8577dda deleted these two -P-using tests because a
|
|---|
| 295 | grep built without PCRE support would fail those tests. This sets
|
|---|
| 296 | an envvar with the equivalent of the result from the require_pcre_
|
|---|
| 297 | function and restores the now-guarded tests. Tested by running this:
|
|---|
| 298 | ./configure --disable-perl-regexp && make check
|
|---|
| 299 | * tests/Makefile.am (PCRE_WORKS): Set this envvar.
|
|---|
| 300 | * tests/filename-lineno.pl: Restore invalid-re-P-paren and
|
|---|
| 301 | invalid-re-P-star-paren, now each with a guard.
|
|---|
| 302 |
|
|---|
| 303 | 2020-09-27 Jim Meyering <meyering@fb.com>
|
|---|
| 304 |
|
|---|
| 305 | maint: post-release administrivia
|
|---|
| 306 | * NEWS: Add header line for next release.
|
|---|
| 307 | * .prev-version: Record previous version.
|
|---|
| 308 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 309 |
|
|---|
| 310 | version 3.5
|
|---|
| 311 | * NEWS: Record release date.
|
|---|
| 312 |
|
|---|
| 313 | maint: avoid autoconf warnings * configure.ac (AC_HEADER_STDC): Remove. It's been assumed for ages. * m4/pcre.m4 (gl_FUNC_PCRE): Use AS_HELP_STRING, not AC_HELP_STRING.
|
|---|
| 314 |
|
|---|
| 315 | build: update gnulib to latest
|
|---|
| 316 |
|
|---|
| 317 | 2020-09-26 Jim Meyering <meyering@fb.com>
|
|---|
| 318 |
|
|---|
| 319 | build: update gnulib to latest
|
|---|
| 320 |
|
|---|
| 321 | tests: skip stack-overflow test when built with ASAN
|
|---|
| 322 | * tests/stack-overflow: Skip this test when the binary was built
|
|---|
| 323 | with ASAN, to avoid spurious failures.
|
|---|
| 324 |
|
|---|
| 325 | 2020-09-25 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 326 |
|
|---|
| 327 | build: update gnulib submodule to latest
|
|---|
| 328 |
|
|---|
| 329 | build: update gnulib submodule to latest
|
|---|
| 330 |
|
|---|
| 331 | 2020-09-24 Jim Meyering <meyering@fb.com>
|
|---|
| 332 |
|
|---|
| 333 | tests: fix surrogate-pair test to work on 16-bit wchar_t systems
|
|---|
| 334 | * tests/surrogate-pair: Avoid new failure on systems with
|
|---|
| 335 | 16-bit wchar_t. Detect the condition and exit before the
|
|---|
| 336 | otherwise-failing tests. Remove the now-incorrect in-loop
|
|---|
| 337 | test for that alternate failure mode. This was exposed by
|
|---|
| 338 | testing on gcc119.fsffrance.org, a power8 AIX 7.2 system.
|
|---|
| 339 |
|
|---|
| 340 | 2020-09-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 341 |
|
|---|
| 342 | grep: don't assume PCRE in tests
|
|---|
| 343 | * tests/filename-lineno.pl: Remove invalid-re-P-paren and
|
|---|
| 344 | invalid-re-P-star-paren as they assume PCRE support, which
|
|---|
| 345 | causes a false alarm "grep: Perl matching not supported in a
|
|---|
| 346 | --disable-perl-regexp build" on platforms without PCRE.
|
|---|
| 347 |
|
|---|
| 348 | grep: pacify Sun C 5.15
|
|---|
| 349 | This suppresses a false alarm '"grep.c", line 720: warning:
|
|---|
| 350 | initializer will be sign-extended: -1'.
|
|---|
| 351 | * src/grep.c (uword_max): New static constant.
|
|---|
| 352 | (initialize_unibyte_mask): Use it.
|
|---|
| 353 |
|
|---|
| 354 | 2020-09-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 355 | Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 356 |
|
|---|
| 357 | grep: fix more Turkish-eyes bugs
|
|---|
| 358 | Fix more bugs recently uncovered by Norihiro Tanaka (Bug#43577).
|
|---|
| 359 | * NEWS: Mention new bug report.
|
|---|
| 360 | * src/grep.c (ok_fold): New static var.
|
|---|
| 361 | (setup_ok_fold): New function.
|
|---|
| 362 | (fgrep_icase_charlen): Reject single-byte characters
|
|---|
| 363 | if they match some multibyte characters when ignoring case.
|
|---|
| 364 | This part of the patch is partly derived from
|
|---|
| 365 | <https://bugs.gnu.org/43577#14>, which means it is:
|
|---|
| 366 | (main): Call setup_ok_fold if ok_fold might be needed.
|
|---|
| 367 | * src/searchutils.c (kwsinit): With the grep.c changes,
|
|---|
| 368 | this code can now revert to classic 7th Edition Unix style;
|
|---|
| 369 | aborting would be wrong.
|
|---|
| 370 | * tests/turkish-eyes: Add tests for these bugs.
|
|---|
| 371 |
|
|---|
| 372 | 2020-09-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 373 |
|
|---|
| 374 | build: update gnulib submodule to latest
|
|---|
| 375 | * NEWS: Mention Bug#43577, which this fixes.
|
|---|
| 376 |
|
|---|
| 377 | grep: fix recently-introduced performance glitch
|
|---|
| 378 | * src/grep.c (main): Do not double-increment n_patterns.
|
|---|
| 379 | update_patterns increments n_patterns now; do not increment it
|
|---|
| 380 | again, as the incorrect count would hurt performance heuristics later.
|
|---|
| 381 |
|
|---|
| 382 | 2020-09-22 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 383 |
|
|---|
| 384 | doc: improve --line-buffered doc
|
|---|
| 385 | * doc/grep.texi (Other Options): Document --line-buffered more
|
|---|
| 386 | carefully, and say what happens when it is not used. Problem
|
|---|
| 387 | reported by Dan Jacobson (Bug#35339).
|
|---|
| 388 |
|
|---|
| 389 | tests: port timeout test to Alpine
|
|---|
| 390 | Problem reported by Bruno Haible in:
|
|---|
| 391 | https://lists.gnu.org/r/grep-devel/2020-09/msg00080.html
|
|---|
| 392 | * tests/init.cfg (require_timeout_): Check that ‘timeout 0.01
|
|---|
| 393 | sleep 0.02’ works as expected, to avoid spurious test failure
|
|---|
| 394 | on Alpine.
|
|---|
| 395 |
|
|---|
| 396 | 2020-09-22 Jim Meyering <meyering@fb.com>
|
|---|
| 397 |
|
|---|
| 398 | tests: test for many-regexp N^2 RSS regression
|
|---|
| 399 | * tests/many-regex-performance: New test for this performance
|
|---|
| 400 | regression.
|
|---|
| 401 | * tests/Makefile.am: Add it.
|
|---|
| 402 | * NEWS (Bug fixes): Describe it.
|
|---|
| 403 |
|
|---|
| 404 | 2020-09-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 405 |
|
|---|
| 406 | grep: avoid unnecessary regex compilation
|
|---|
| 407 | Grep resorts to using the regex engine when the precision of either
|
|---|
| 408 | -o or --color is required, or when the pattern is not supported by
|
|---|
| 409 | our DFA engine (e.g., backref). Otherwise, grep would perform regex
|
|---|
| 410 | compilation solely to check the syntax. This change makes grep skip
|
|---|
| 411 | that compilation in the common case for which it is unnecessary.
|
|---|
| 412 |
|
|---|
| 413 | The compilation we are avoiding is quite costly, consuming O(N^2)
|
|---|
| 414 | RSS for N regular expressions.
|
|---|
| 415 |
|
|---|
| 416 | * src/dfasearch.c (GEAcompile): Add new argument, and avoid unneeded
|
|---|
| 417 | compilation of regex.
|
|---|
| 418 | * src/grep.c (compile_fp_t): Update prototype.
|
|---|
| 419 | (main): Update caller.
|
|---|
| 420 | * src/kwsearch.c (Fcompile): Update caller and add new argument.
|
|---|
| 421 | * src/pcresearch.c (Pcompile): Add new argument.
|
|---|
| 422 | * src/search.h (GEAcompile, Fcompile, Pcompile): Update prototype.
|
|---|
| 423 |
|
|---|
| 424 | 2020-09-22 Jim Meyering <meyering@fb.com>
|
|---|
| 425 |
|
|---|
| 426 | build: update gnulib to latest
|
|---|
| 427 |
|
|---|
| 428 | tests: skip stack-overflow test on midnightbsd*
|
|---|
| 429 | * tests/stack-overflow: skip_ when run on this OS. See details
|
|---|
| 430 | in https://lists.gnu.org/r/grep-devel/2020-09/msg00062.html
|
|---|
| 431 | * tests/Makefile.am (host_triplet): Export.
|
|---|
| 432 |
|
|---|
| 433 | 2020-09-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 434 |
|
|---|
| 435 | doc: say how to match chars by code
|
|---|
| 436 | From a suggestion in Bug#41004.
|
|---|
| 437 | * doc/grep.texi (Character Encoding, Matching Non-ASCII):
|
|---|
| 438 | New sections. Move some material from Environment Variables
|
|---|
| 439 | into these sections.
|
|---|
| 440 |
|
|---|
| 441 | 2020-09-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 442 |
|
|---|
| 443 | * src/dfasearch.c (struct dfa_comp): Fix out-of-date comment.
|
|---|
| 444 |
|
|---|
| 445 | grep: "grep '\)'" reports an error again
|
|---|
| 446 | * src/grep.c (try_fgrep_pattern): With -G, pass \) through to
|
|---|
| 447 | GEAcompile so that it can complain. This fixes an unexpected
|
|---|
| 448 | change in behavior from grep 3.4 and earlier.
|
|---|
| 449 | * tests/filename-lineno.pl: Add tests for this sort of thing.
|
|---|
| 450 |
|
|---|
| 451 | grep: tweak by using mempcpy
|
|---|
| 452 | * src/grep.c (try_fgrep_pattern): Tweak previous change
|
|---|
| 453 | by using mempcpy.
|
|---|
| 454 |
|
|---|
| 455 | 2020-09-18 Jim Meyering <meyering@fb.com>
|
|---|
| 456 |
|
|---|
| 457 | grep: make echo .|grep '\.' match once again
|
|---|
| 458 | The same applied for many other backslash-escaped bytes, not just
|
|---|
| 459 | metacharacters. The switch to rawmemchr in v3.4-almost-10-g9393b97
|
|---|
| 460 | made some parts of the code require the usually-guaranteed newline
|
|---|
| 461 | sentinel at the end of each pattern. Before, some consumers used a
|
|---|
| 462 | (correct) pattern length and did not care that try_fgrep_pattern could
|
|---|
| 463 | transform a pattern (with sentinel) like "\\.\n" to "..\n", thus
|
|---|
| 464 | violating that assumption.
|
|---|
| 465 | * src/grep.c (try_fgrep_pattern): Preserve the invariant
|
|---|
| 466 | that each regexp is newline-terminated.
|
|---|
| 467 | * tests/backslash-dot: New file. Test for this.
|
|---|
| 468 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 469 |
|
|---|
| 470 | tests: triple-backref: print a reference to glibc bug
|
|---|
| 471 | * tests/triple-backref (MALLOC_CHECK_): And tell glibc not to
|
|---|
| 472 | bother with a core dump. Suggested by Pádraig Brady.
|
|---|
| 473 |
|
|---|
| 474 | 2020-09-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 475 |
|
|---|
| 476 | grep: be more consistent about diagnostic format
|
|---|
| 477 | * NEWS: Mention this.
|
|---|
| 478 | * bootstrap.conf (gnulib_modules): Remove 'quote'.
|
|---|
| 479 | * src/grep.c: Do not include quote.h.
|
|---|
| 480 | (grep, grepdirent, grepdesc): Put the three unusual diagnostics
|
|---|
| 481 | into the same "grep: FOO: message" form that grep uses elsewhere.
|
|---|
| 482 | * tests/binary-file-matches, tests/in-eq-out-infloop:
|
|---|
| 483 | Adjust tests to match new diagnostic format.
|
|---|
| 484 |
|
|---|
| 485 | 2020-09-17 Jim Meyering <meyering@fb.com>
|
|---|
| 486 |
|
|---|
| 487 | build: update gnulib to latest
|
|---|
| 488 |
|
|---|
| 489 | 2020-09-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 490 |
|
|---|
| 491 | * tests/triple-backref: Add comment.
|
|---|
| 492 |
|
|---|
| 493 | 2020-09-17 Jim Meyering <meyering@fb.com>
|
|---|
| 494 |
|
|---|
| 495 | tests: make new test executable, to placate distcheck
|
|---|
| 496 | * tests/binary-file-matches: Make this executable.
|
|---|
| 497 |
|
|---|
| 498 | tests: add coverage for code that emits the new diagnostic
|
|---|
| 499 | * tests/binary-file-matches: New file.
|
|---|
| 500 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 501 |
|
|---|
| 502 | maint: avoid syntax-check failure
|
|---|
| 503 | * src/grep.c (grep): Lower-case the "B" in "Binary file... matches"
|
|---|
| 504 | diagnostic that we now emit to stderr. This avoids the following
|
|---|
| 505 | when running "make syntax-check":
|
|---|
| 506 | maint.mk: found capitalized error message
|
|---|
| 507 | make: *** [maint.mk:469: sc_error_message_uppercase] Error 1
|
|---|
| 508 |
|
|---|
| 509 | 2020-09-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 510 |
|
|---|
| 511 | Send "Binary file FOO matches" to stderr
|
|---|
| 512 | * NEWS, doc/grep.texi: Mention this change (Bug#29668).
|
|---|
| 513 | * src/grep.c (grep): Send "Binary file FOO matches" to stderr
|
|---|
| 514 | instead of stdout.
|
|---|
| 515 | * tests/encoding-error, tests/invalid-multibyte-infloop:
|
|---|
| 516 | * tests/null-byte, tests/pcre-count, tests/surrogate-pair:
|
|---|
| 517 | * tests/symlink, tests/unibyte-binary:
|
|---|
| 518 | Adjust tests to match new behavior. In all cases this
|
|---|
| 519 | simplifies the tests, which is a good sign.
|
|---|
| 520 |
|
|---|
| 521 | Suppress "Binary file FOO matches" if -I
|
|---|
| 522 | Problem reported by Jason Franklin (Bug#33552).
|
|---|
| 523 | * NEWS: Mention this.
|
|---|
| 524 | * src/grep.c (grep): Do not output "Binary file FOO matches" if -I.
|
|---|
| 525 | * tests/encoding-error: Add test for this bug.
|
|---|
| 526 |
|
|---|
| 527 | 2020-09-15 Jim Meyering <meyering@fb.com>
|
|---|
| 528 |
|
|---|
| 529 | maint: keep two blank lines before each old Noteworthy line.
|
|---|
| 530 | * NEWS: Insert a blank line.
|
|---|
| 531 |
|
|---|
| 532 | 2020-09-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 533 |
|
|---|
| 534 | build: update gnulib submodule to latest
|
|---|
| 535 |
|
|---|
| 536 | 2020-09-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 537 |
|
|---|
| 538 | build: update gnulib submodule to latest
|
|---|
| 539 |
|
|---|
| 540 | 2020-09-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 541 |
|
|---|
| 542 | build: update gnulib submodule to latest
|
|---|
| 543 |
|
|---|
| 544 | 2020-09-11 Jim Meyering <meyering@fb.com>
|
|---|
| 545 |
|
|---|
| 546 | build: update gnulib to latest
|
|---|
| 547 |
|
|---|
| 548 | 2020-09-09 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 549 |
|
|---|
| 550 | grep: fix logic for growing PCRE JIT stack
|
|---|
| 551 | * src/pcresearch.c (jit_exec) [PCRE_EXTRA_MATCH_LIMIT_RECURSION]:
|
|---|
| 552 | When growing the match_limit_recursion limit, do not use the old
|
|---|
| 553 | value if ! (flags & PCRE_EXTRA_MATCH_LIMIT_RECURSION), as it is
|
|---|
| 554 | uninitialized in that case.
|
|---|
| 555 |
|
|---|
| 556 | grep: fix PCRE JIT test when JIT not available
|
|---|
| 557 | Problem reported by Thomas Deutschmann (Bug#29446#23).
|
|---|
| 558 | * src/pcresearch.c (Pexecute): Diagnose PCRE_ERROR_RECURSIONLIMIT.
|
|---|
| 559 | * tests/pcre-jitstack: Treat recursion limit overflow like stack
|
|---|
| 560 | overflow.
|
|---|
| 561 |
|
|---|
| 562 | grep: fix -w bug in UTF-8 locales
|
|---|
| 563 | Problem reported by Mayo Fark (Bug#43225).
|
|---|
| 564 | * src/searchutils.c (wordchar_prev): In a UTF-8 locale, do not
|
|---|
| 565 | assume that an encoding-error byte cannot be part of a word
|
|---|
| 566 | constituent, as this assumption is incorrect for the last byte
|
|---|
| 567 | of a multibyte word constituent.
|
|---|
| 568 | * tests/word-delim-multibyte: Add a test for the bug.
|
|---|
| 569 |
|
|---|
| 570 | Distribute a gzip tarball again
|
|---|
| 571 | Requested by Issam E. Maghni in:
|
|---|
| 572 | https://lists.gnu.org/r/grep-devel/2020-09/msg00000.html
|
|---|
| 573 | * configure.ac (AM_INIT_AUTOMAKE): Remove no-dist-gzip.
|
|---|
| 574 |
|
|---|
| 575 | * README-prereq: Also mention xz.
|
|---|
| 576 |
|
|---|
| 577 | 2020-09-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 578 |
|
|---|
| 579 | Prefer rawmemchr to memchr when it’s easy
|
|---|
| 580 | * bootstrap.conf (gnulib_modules): Add rawmemchr.
|
|---|
| 581 | * src/dfasearch.c (GEAcompile, EGexecute):
|
|---|
| 582 | * src/grep.c (update_patterns, prpending, prtext):
|
|---|
| 583 | * src/kwsearch.c (Fcompile, Fexecute):
|
|---|
| 584 | * src/pcresearch.c (Pcompile, Pexecute):
|
|---|
| 585 | Simplify (and presumably speed up a little) by using rawmemchr
|
|---|
| 586 | with a sentinel, instead of using memchr.
|
|---|
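The sentinel idea in this entry can be sketched as below. To stay portable, the sketch uses a hand-written, hypothetical `scan_to_sentinel` instead of the rawmemchr function itself; like rawmemchr, it takes no length and relies on the caller guaranteeing that the byte is present.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rawmemchr: scan for C with no length
   bound.  The caller must guarantee that C occurs, for example by
   storing a sentinel at the end of the buffer.  */
static char *
scan_to_sentinel (char *p, char c)
{
  while (*p != c)
    p++;
  return p;
}

/* Count newline-terminated lines in BUF[0..LEN).  The final newline
   acts as the sentinel, so the inner scan needs no bounds check.  */
static size_t
count_lines (char *buf, size_t len)
{
  assert (len == 0 || buf[len - 1] == '\n');
  size_t n = 0;
  for (char *p = buf; p < buf + len; p = scan_to_sentinel (p, '\n') + 1)
    n++;
  return n;
}
```

The trade-off is that the invariant (every buffer ends in the sentinel) must hold everywhere, which is the kind of assumption later entries in this log had to restore in try_fgrep_pattern.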
| 587 |
|
|---|
| 588 | Simplify pattern_file_name
|
|---|
| 589 | * src/grep.c (pattern_file_name): Make first argument
|
|---|
| 590 | origin-0, not origin-1, as this simplifies both caller and
|
|---|
| 591 | callee. All uses changed.
|
|---|
| 592 |
|
|---|
| 593 | Simplify regex_compile
|
|---|
| 594 | * src/dfasearch.c (regex_compile): "" suffices; we don’t need "\0".
|
|---|
| 595 | No need to initialize pat_lineno.
|
|---|
| 596 |
|
|---|
| 597 | Omit duplicate regexps
|
|---|
| 598 | Do not pass two copies of the same regexp to the
|
|---|
| 599 | regular-expression engine. Although the engines should
|
|---|
| 600 | perform nearly as well even with the copies, in practice they do not.
|
|---|
| 601 | Problem reported by Luca Borzacchiello (Bug#43040).
|
|---|
| 602 | * bootstrap.conf (gnulib_modules): Add hash.
|
|---|
| 603 | * src/grep.c: Include stdint.h, for SIZE_WIDTH.
|
|---|
| 604 | Include hash.h.
|
|---|
| 605 | (struct patloc, patloc, patlocs_allocated, patlocs_used):
|
|---|
| 606 | Rename from struct FL_pair, fl_pair, n_fl_pair_slots, n_pattern_files,
|
|---|
| 607 | respectively, since the data type is no longer a pair.
|
|---|
| 608 | All uses changed.
|
|---|
| 609 | (struct patloc): New member FILELINE. The lineno member is now
|
|---|
| 610 | ptrdiff_t since nowadays we prefer signed types.
|
|---|
| 611 | (pattern_array, patterns_table): New static vars.
|
|---|
| 612 | (count_nl_bytes, fl_add): Remove; no longer used.
|
|---|
| 613 | (hash_pattern, compare_patterns, update_patterns): New functions.
|
|---|
| 614 | update_patterns does what fl_add used to do, plus remove dups.
|
|---|
| 615 | (pattern_file_name): Adjust to change from fl_pair to patloc.
|
|---|
| 616 | (main): Move some variables to inner blocks for clarity.
|
|---|
| 617 | Maintain the patterns_table hash of all patterns.
|
|---|
| 618 | Update pattern_array to match keys, and use update_patterns
|
|---|
| 619 | instead of fl_add to remove duplicate keys.
|
|---|
| 620 | * tests/filename-lineno.pl (invalid-re-2-files)
|
|---|
| 621 | (invalid-re-2-files2, invalid-re-2e): Ensure regexps are unique in
|
|---|
| 622 | tests so that dups aren’t removed in diagnostics.
|
|---|
| 623 | (invalid-re-line-numbers): New test.
|
|---|
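A rough sketch of duplicate removal over newline-terminated patterns is given below. It is not grep's update_patterns: the function names, the open-addressing table, and the in-place compaction are all assumptions made for illustration, whereas the entry above says grep uses Gnulib's hash module.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte-string hash used by the sketch below.  */
static size_t
hash_bytes (const char *s, size_t n)
{
  size_t h = 5381;
  for (size_t i = 0; i < n; i++)
    h = h * 33 + (unsigned char) s[i];
  return h;
}

/* Hypothetical sketch: remove duplicate newline-terminated patterns
   from BUF[0..LEN) in place, keeping the first occurrence of each,
   and return the new length.  */
static size_t
dedup_patterns (char *buf, size_t len)
{
  assert (len == 0 || buf[len - 1] == '\n');

  /* Count patterns so the table can be sized at under half full.  */
  size_t n_patterns = 0;
  for (size_t i = 0; i < len; i++)
    n_patterns += buf[i] == '\n';

  size_t n_slots = 2 * n_patterns + 1;
  struct slot { size_t start, size; int used; } *table
    = calloc (n_slots, sizeof *table);
  if (!table)
    return len;  /* On allocation failure, leave the input unchanged.  */

  size_t out = 0;
  for (size_t in = 0; in < len; )
    {
      char *nl = memchr (buf + in, '\n', len - in);
      size_t size = nl - (buf + in) + 1;

      /* Linear probing: stop at an empty slot or an equal pattern.  */
      size_t i = hash_bytes (buf + in, size) % n_slots;
      while (table[i].used
             && !(table[i].size == size
                  && memcmp (buf + table[i].start, buf + in, size) == 0))
        i = (i + 1) % n_slots;

      if (!table[i].used)
        {
          /* First occurrence: keep it, compacting toward the front.  */
          memmove (buf + out, buf + in, size);
          table[i].start = out;
          table[i].size = size;
          table[i].used = 1;
          out += size;
        }
      in += size;
    }

  free (table);
  return out;
}
```

Applied to the five patterns "foo", "bar", "foo", "baz", "bar", this keeps "foo", "bar", "baz" in their original order.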
| 624 |
|
|---|
| 625 | 2020-08-23 Jim Meyering <meyering@fb.com>
|
|---|
| 626 |
|
|---|
| 627 | build: update gnulib to latest
|
|---|
| 628 | * gnulib: Update submodule to latest.
|
|---|
| 629 | * bootstrap.conf (gnulib_modules): Add explicit dependency on dirname-lgpl.
|
|---|
| 630 | Before, we pulled this in via a dependency.
|
|---|
| 631 | * bootstrap: Update from gnulib.
|
|---|
| 632 |
|
|---|
| 633 | build: require autoconf-2.64
|
|---|
| 634 | * configure.ac: Require autoconf-2.64, up from 2.63, to align with gnulib.
|
|---|
| 635 |
|
|---|
| 636 | 2020-08-22 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 637 |
|
|---|
| 638 | Revert -L exit status change introduced in grep 3.2
|
|---|
| 639 | Problems reported by Antonio Diaz Diaz in:
|
|---|
| 640 | https://bugs.gnu.org/28105#29
|
|---|
| 641 | * NEWS, doc/grep.texi (Exit Status), src/grep.c (usage):
|
|---|
| 642 | Adjust documentation accordingly.
|
|---|
| 643 | * src/grep.c (grepdesc, main): Go back to old behavior.
|
|---|
| 644 | * tests/skip-read: Adjust tests accordingly.
|
|---|
| 645 |
|
|---|
| 646 | 2020-01-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 647 |
|
|---|
| 648 | tests: fix permission issue in previous change
|
|---|
| 649 |
|
|---|
| 650 | tests: work around GCC -fprofile-generate bug
|
|---|
| 651 | * tests/triple-backref: Add a 10 s timeout to work around
|
|---|
| 652 | what appears to be a GCC bug with -fprofile-generate.
|
|---|
| 653 | Problem reported by Martin Liška, with diagnosis by
|
|---|
| 654 | Andreas Schwab (Bug#21513).
|
|---|
| 655 |
|
|---|
| 656 | 2020-01-02 Jim Meyering <meyering@fb.com>
|
|---|
| 657 |
|
|---|
| 658 | maint: post-release administrivia
|
|---|
| 659 | * NEWS: Add header line for next release.
|
|---|
| 660 | * .prev-version: Record previous version.
|
|---|
| 661 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 662 |
|
|---|
| 663 | version 3.4
|
|---|
| 664 | * NEWS: Record release date.
|
|---|
| 665 |
|
|---|
| 666 | build: update gnulib to latest, for mbrtowc-vs-Irix build fix
|
|---|
| 667 |
|
|---|
| 668 | 2020-01-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 669 |
|
|---|
| 670 | doc: mention glibc bug 24269
|
|---|
| 671 | * doc/grep.texi (Known Bugs): Mention glibc bug 24269.
|
|---|
| 672 | Merge formatting/URL changes from Gnulib regex.texi.
|
|---|
| 673 |
|
|---|
| 674 | doc: fix --exclude description in man page
|
|---|
| 675 | Problem reported by Duncan Moore (Bug#37212).
|
|---|
| 676 | * src/grep.c (usage): Fix incorrect statement about --exclude
|
|---|
| 677 | and directories. Standardize on “that match GLOB” instead
|
|---|
| 678 | of “matching GLOB”.
|
|---|
| 679 |
|
|---|
| 680 | doc: fix missing “more” in man page
|
|---|
| 681 | Problem reported by Philippe Schnoebelen (Bug#34078).
|
|---|
| 682 | * doc/grep.in.1: Add missing “more”.
|
|---|
| 683 |
|
|---|
| 684 | 2020-01-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 685 |
|
|---|
| 686 | doc: add [:blank:] to man page
|
|---|
| 687 | * doc/grep.in.1: Mention [:blank:] (Bug#33291).
|
|---|
| 688 |
|
|---|
| 689 | 2020-01-01 Jim Meyering <meyering@fb.com>
|
|---|
| 690 |
|
|---|
| 691 | maint: update all copyright year number ranges
|
|---|
| 692 | Run "make update-copyright" and then...
|
|---|
| 693 | * gnulib: Update to latest with copyright year adjusted.
|
|---|
| 694 | * tests/init.sh: Sync with gnulib to pick up copyright year.
|
|---|
| 695 | * bootstrap: Likewise.
|
|---|
| 696 | * doc/grep.in.1: Use "-" in copyright year ranges, not \en.
|
|---|
| 697 |
|
|---|
| 698 | 2019-12-31 Jim Meyering <meyering@fb.com>
|
|---|
| 699 |
|
|---|
| 700 | tests: avoid unwarranted failure in a netbsd 8.1 VM
|
|---|
| 701 | * tests/mb-non-UTF8-perf-Fw: Run twice, to avoid first-read penalty.
|
|---|
| 702 | Reported by Nelson H.F. Beebe.
|
|---|
| 703 |
|
|---|
| 704 | 2019-12-30 Jim Meyering <meyering@fb.com>
|
|---|
| 705 |
|
|---|
| 706 | build: update gnulib to latest (for localeinfo perf fix)
|
|---|
| 707 |
|
|---|
| 708 | maint: add syntax-check rule to prohibit "backreference" spelling
|
|---|
| 709 | * cfg.mk (sc_prohibit_backref): New rule.
|
|---|
| 710 |
|
|---|
| 711 | 2019-12-30 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 712 |
|
|---|
| 713 | maint: remove too-long line from AUTHORS
|
|---|
| 714 | * AUTHORS: Remove URL that’s too long.
|
|---|
| 715 |
|
|---|
| 716 | maint: update AUTHORS
|
|---|
| 717 | * AUTHORS: Update to better reflect current authorship.
|
|---|
| 718 |
|
|---|
| 719 | 2019-12-30 Jim Meyering <meyering@fb.com>
|
|---|
| 720 |
|
|---|
| 721 | avoid new syntax-check failures
|
|---|
| 722 | * cfg.mk (old_NEWS_hash): Updating old news, we must also update this.
|
|---|
| 723 |
|
|---|
| 724 | 2019-12-30 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 725 |
|
|---|
| 726 | doc: don’t encourage back-references
|
|---|
| 727 | * doc/grep.texi (Usage): Remove palindrome question. Bondioni’s
|
|---|
| 728 | RE makes grep issue a ‘grep: stack overflow’ diagnostic, and we
|
|---|
| 729 | shouldn’t be encouraging fancy back-references anyway, due to all
|
|---|
| 730 | the bugs in this area (Bug#26864). Plus, the allusion to
|
|---|
| 731 | “GNU extensions” doesn't seem to be correct here.
|
|---|
| 732 |
|
|---|
| 733 | doc: robustify some examples
|
|---|
| 734 | Prompted by suggestions by Stephane Chazelas (Bug#38792#20).
|
|---|
| 735 | * doc/grep.texi (Usage): Make examples more robust.
|
|---|
| 736 |
|
|---|
| 737 | doc: fix bug# typo
|
|---|
| 738 |
|
|---|
| 739 | doc: spell "back-reference" more consistently
|
|---|
| 740 |
|
|---|
| 741 | doc: mention back-reference bugs
|
|---|
| 742 | Inspired by Bug#26864.
|
|---|
| 743 | * doc/grep.texi (Known Bugs): New section.
|
|---|
| 744 | Mention back-reference issues.
|
|---|
| 745 |
|
|---|
| 746 | 2019-12-29 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 747 |
|
|---|
| 748 | doc: Add -- to more-complex example
|
|---|
| 749 | Suggested by Stephane Chazelas (Bug#38792).
|
|---|
| 750 | * doc/grep.in.1, doc/grep.texi: Add ‘--’ to recently-added example.
|
|---|
| 751 |
|
|---|
| 752 | doc: improve subsection title (Bug#26132)
|
|---|
| 753 | * doc/grep.in.1: Rename "Matcher Selection" to "Pattern Syntax".
|
|---|
| 754 |
|
|---|
| 755 | doc: fix typo in previous patch
|
|---|
| 756 |
|
|---|
| 757 | doc: document quoting better
|
|---|
| 758 | Problem reported by Martin Simons (Bug#38792).
|
|---|
| 759 | * doc/grep.texi: Fix quoting used in examples. Say that patterns
|
|---|
| 760 | should be quoted, use quoting more consistently in examples, and
|
|---|
| 761 | give an example illustrating the difference between patterns and
|
|---|
| 762 | globbing. Don’t assume zgrep expertise in example.
|
|---|
| 763 | * doc/grep.in.1: Likewise. Also, reorder sections
|
|---|
| 764 | to match GNU/Linux man-pages style.
|
|---|
| 765 |
|
|---|
| 766 | 2019-12-26 Jim Meyering <meyering@fb.com>
|
|---|
| 767 |
|
|---|
| 768 | maint: tweak NEWS wording
|
|---|
| 769 | * NEWS: Minor wording change.
|
|---|
| 770 |
|
|---|
| 771 | build: update gnulib to latest; and sync tests/init.sh
|
|---|
| 772 | * gnulib: update
|
|---|
| 773 | * tests/init.sh: Sync from gnulib (this removes the LC_ALL=C setting).
|
|---|
| 774 |
|
|---|
| 775 | tests: avoid spurious failure due to 1-second timeout
|
|---|
| 776 | * tests/grep-dev-null-out: Use a 10-second timeout, rather than
|
|---|
| 777 | a 1-second one. This avoids false failure on slow systems.
|
|---|
| 778 | Reported by Assaf Gordon in
|
|---|
| 779 | https://lists.gnu.org/r/grep-devel/2019-12/msg00018.html
|
|---|
| 780 |
|
|---|
| 781 | 2019-12-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 782 |
|
|---|
| 783 | build: update gnulib submodule to latest
|
|---|
| 784 |
|
|---|
| 785 | maint: adjust surrogate-pair for 16-bit wchar_t
|
|---|
| 786 | * tests/surrogate-pair: Adjust to match fixed behavior
|
|---|
| 787 | on AIX 7.2, where wchar_t is 16 bits and cannot represent
|
|---|
| 788 | the test case data.
|
|---|
| 789 |
|
|---|
| 790 | 2019-12-25 Jim Meyering <meyering@fb.com>
|
|---|
| 791 |
|
|---|
| 792 | tests: fix typo in name of test file
|
|---|
| 793 | * tests/backslash-s-vs-invalid-multitype: Rename to...
|
|---|
| 794 | * tests/backslash-s-vs-invalid-multibyte: ...this.
|
|---|
| 795 | * tests/Makefile.am (TESTS): Reflect renaming.
|
|---|
| 796 |
|
|---|
| 797 | tests: ensure we use require_timeout_ when needed
|
|---|
| 798 | * cfg.mk (sc_timeout_prereq): New syntax-check rule.
|
|---|
| 799 |
|
|---|
| 800 | tests: require timeout
|
|---|
| 801 | * tests/mb-non-UTF8-perf-Fw: This test uses "timeout",
|
|---|
| 802 | so must first call require_timeout_.
|
|---|
| 803 | This avoids spurious test failure when running with
|
|---|
| 804 | no timeout program. Reported by Bruno Haible in
|
|---|
| 805 | https://lists.gnu.org/r/grep-devel/2019-12/msg00008.html
|
|---|
| 806 |
|
|---|
| 807 | 2019-12-25 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 808 |
|
|---|
| 809 | tests: work around AIX 7.2 sh printf bug
|
|---|
| 810 | AIX 7.2 /bin/sh’s printf command mishandles octal escapes
|
|---|
| 811 | in multibyte locales: it treats them as characters, not bytes.
|
|---|
| 812 | * tests/backslash-s-vs-invalid-multitype, tests/encoding-error:
|
|---|
| 813 | Use the C locale when employing the printf command with an octal
|
|---|
| 814 | escape that AIX 7.2 sh might mishandle.
|
|---|
| 815 | * tests/init.sh (setup_): Use the C locale for tests.
|
|---|
| 816 | This has the side benefit of making them more reproducible.
|
|---|
| 817 |
|
|---|
| 818 | 2019-12-22 Jim Meyering <meyering@fb.com>
|
|---|
| 819 |
|
|---|
| 820 | maint: adjust new comments
|
|---|
| 821 | * src/dfasearch.c (possible_backrefs_in_pattern): Remove a
|
|---|
| 822 | duplicate "a", insert a "be" and a comma, and reformat.
|
|---|
| 823 |
|
|---|
| 824 | build: update gnulib to latest
|
|---|
| 825 | * gnulib: Update submodule to latest.
|
|---|
| 826 | * bootstrap: Copy from gnulib.
|
|---|
| 827 | * tests/init.sh: Likewise.
|
|---|
| 828 |
|
|---|
| 829 | 2019-12-22 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 830 |
|
|---|
| 831 | grep: fix some bugs in pattern-grouping speedup
|
|---|
| 832 | This fixes some bugs in the previous commit,
|
|---|
| 833 | and should finish the fix for Bug#33249.
|
|---|
| 834 | * NEWS: Mention fix for Bug#33249.
|
|---|
| 835 | * src/dfasearch.c (possible_backrefs_in_pattern, regex_compile)
|
|---|
| 836 | (GEAcompile): In new code, prefer ptrdiff_t to size_t when either
|
|---|
| 837 | will do, since ptrdiff_t has better error checking. At some point
|
|---|
| 838 | we should adjust the old code too.
|
|---|
| 839 | (possible_backrefs_in_pattern): Rename from
|
|---|
| 840 | find_backref_in_pattern. New arg BS_SAFE. All uses changed.
|
|---|
| 841 | Fix false negative if a multibyte character ends in a single
|
|---|
| 842 | '\\' byte, followed by the two bytes '\\', '1'.
|
|---|
| 843 | (regex_compile): Simplify.
|
|---|
| 844 | (GEAcompile): Avoid quadratic behavior when reallocating growing
|
|---|
| 845 | buffers. Fix a couple of bugs in copying pattern data involving
|
|---|
| 846 | backreferences. Fix another bug in copying pattern metadata
|
|---|
| 847 | involving backreferences, by removing the need to copy it.
|
|---|
| 848 |
|
|---|
| 849 | 2019-12-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 850 |
|
|---|
| 851 | grep: grouping of a pattern with multiple lines
|
|---|
| 852 | When grep uses regex, it splits a multi-line pattern at each
|
|---|
| 853 | newline character into fragments. Compilation and execution run for
|
|---|
| 854 | each fragment. That causes slowdown. With this change, fragments are
|
|---|
| 855 | divided into groups according to whether they include back references.
|
|---|
| 856 | A fragment with back references constitutes a group, and all fragments
|
|---|
| 857 | that lack back references also constitute a group.
|
|---|
| 858 |
|
|---|
| 859 | This change greatly speeds up the following case:
|
|---|
| 860 |
|
|---|
| 861 | $ seq -f '%040g' 0 9999 | sed '1s/$/\\(0\\)\\1/' >pat
|
|---|
| 862 | $ yes 00000000000000000000000000000000000000000x | head -10000 >in
|
|---|
| 863 | $ time -p env LC_ALL=C src/grep -f pat in
|
|---|
| 864 |
|
|---|
| 865 | * src/dfasearch.c (find_backref_in_pattern, regex_compile):
|
|---|
| 866 | New functions.
|
|---|
| 867 | (GEAcompile): Use the new functions to group fragments
|
|---|
| 868 | as mentioned above.
|
|---|
| 869 |
|
|---|
| 870 | 2019-12-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 871 |
|
|---|
| 872 | maint: add NEWS for Bug#34951 fix
|
|---|
| 873 | * NEWS: Mention Bug#34951.
|
|---|
| 874 |
|
|---|
| 875 | 2019-12-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 876 |
|
|---|
| 877 | dfa: separate parse and compile phase
|
|---|
| 878 | DFAMUST() must be called after parsing and before the token reordering
|
|---|
| 879 | introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98, but both are
|
|---|
| 880 | executed in the compilation phase.
|
|---|
| 881 |
|
|---|
| 882 | * lib/dfa.c (dfaparse): Change it to global function.
|
|---|
| 883 | (dfacomp): If first argument is NULL, skip parse.
|
|---|
| 884 | * lib/dfa.h (dfaparse): Add a prototype.
|
|---|
| 885 |
|
|---|
| 886 | 2019-12-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 887 |
|
|---|
| 888 | build: update gnulib submodule to latest
|
|---|
| 889 |
|
|---|
| 890 | 2019-12-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 891 |
|
|---|
| 892 | grep: speed up multiple word matching
|
|---|
| 893 | grep uses its KWset matcher for multiple word matching, but that is
|
|---|
| 894 | very slow when most of the parts matched to a pattern are not words.
|
|---|
| 895 | So, if the first match to a pattern is not a word, use the grep matcher
|
|---|
| 896 | to match for its line.
|
|---|
| 897 |
|
|---|
| 898 | Note that when START_PTR is set, the grep matcher uses the regex matcher
|
|---|
| 899 | which is very slow to match words. Therefore, we use the grep matcher
|
|---|
| 900 | only when START_PTR is NULL.
|
|---|
| 901 |
|
|---|
| 902 | * src/kwsearch.c (Fexecute): If an initial match is incomplete because
|
|---|
| 903 | not on a word boundary, use the grep matcher to find a matching line.
|
|---|
| 904 |
|
|---|
| 905 | 2019-12-18 Jim Meyering <meyering@fb.com>
|
|---|
| 906 |
|
|---|
| 907 | maint: sort test names
|
|---|
| 908 | * tests/Makefile.am (TESTS): Alphabetize the new addition,
|
|---|
| 909 | mb-non-UTF8-perf-Fw to placate syntax-check's sc_sorted_tests.
|
|---|
| 910 |
|
|---|
| 911 | 2019-12-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 912 |
|
|---|
| 913 | maint: adjust to recent Gnulib change
|
|---|
| 914 | * po/POTFILES.in: Remove lib/xstrtol-error.c.
|
|---|
| 915 |
|
|---|
| 916 | 2019-12-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 917 |
|
|---|
| 918 | grep: do not match invalid UTF-8
|
|---|
| 919 | Update Gnulib to latest. Also:
|
|---|
| 920 | * src/dfasearch.c (EGexecute): Use ptrdiff_t, not size_t,
|
|---|
| 921 | to match new Gnulib API.
|
|---|
| 922 | * tests/Makefile.am (TESTS): Add dfa-invalid-utf8.
|
|---|
| 923 | * tests/dfa-invalid-utf8: New file.
|
|---|
| 924 |
|
|---|
| 925 | 2019-11-30 Jim Meyering <meyering@fb.com>
|
|---|
| 926 |
|
|---|
| 927 | tests: add test that would have detected -Fw perf regression
|
|---|
| 928 | * tests/mb-non-UTF8-perf-Fw: New file. Detect v3.3-22-g090a4db's
|
|---|
| 929 | performance regression.
|
|---|
| 930 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 931 |
|
|---|
| 932 | 2019-11-29 Jim Meyering <meyering@fb.com>
|
|---|
| 933 |
|
|---|
| 934 | maint: fix test comment
|
|---|
| 935 | * tests/mb-non-UTF8-word-boundary: Also correct "introduced-in"
|
|---|
| 936 | version number in a comment here.
|
|---|
| 937 |
|
|---|
| 938 | 2019-11-25 Jim Meyering <meyering@fb.com>
|
|---|
| 939 |
|
|---|
| 940 | maint: correct NEWS blurb
|
|---|
| 941 | * NEWS (Bug fixes): Correction: the -Fw bug was introduced
|
|---|
| 942 | in 2.28, not in 3.0. Reported by Paul Eggert.
|
|---|
| 943 |
|
|---|
| 944 | 2019-11-17 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 945 |
|
|---|
| 946 | grep: improve grep -Fw performance in non-UTF8 multibyte locales
|
|---|
| 947 | * src/searchutils.c (mb_goback): New parameter. All callers changed.
|
|---|
| 948 | * src/search.h (mb_goback): Update prototype.
|
|---|
| 949 | * src/kwsearch.c (Fexecute): Use mb_goback's MBCLEN to detect a
|
|---|
| 950 | word-boundary even more efficiently.
|
|---|
| 951 |
|
|---|
| 952 | grep: fix performance regression with previous patch
|
|---|
| 953 | * src/kwsearch.c (Fexecute): Avoid unnecessary back-up in non-UTF8
|
|---|
| 954 | multibyte locales.
|
|---|
| 955 |
|
|---|
| 956 | 2019-11-16 Jim Meyering <meyering@fb.com>
|
|---|
| 957 |
|
|---|
| 958 | maint: rename a variable: bol -> nl
|
|---|
| 959 | * src/kwsearch.c (Fexecute): Change misleading name: s/bol/nl/
|
|---|
| 960 |
|
|---|
| 961 | build: update gnulib to latest
|
|---|
| 962 |
|
|---|
| 963 | maint: correct and clarify a comment
|
|---|
| 964 | * src/kwsearch.c (Fexecute): Logic was reversed.
|
|---|
| 965 |
|
|---|
| 966 | grep: avoid false -Fw match in non-UTF8 multibyte locales
|
|---|
| 967 | For example, this command would erroneously print its input line:
|
|---|
| 968 | echo ab | LC_CTYPE=ja_JP.eucjp grep -Fw b
|
|---|
| 969 | This arose when the "memrchr" search for a preceding newline failed:
|
|---|
| 970 | in that case, MB_START was not adjusted and was initially the same
|
|---|
| 971 | as BEG, so wordchar_prev mistakenly returned 0.
|
|---|
| 972 | * src/kwsearch.c (Fexecute): Set MB_START also when there is no
|
|---|
| 973 | preceding newline.
|
|---|
| 974 | * NEWS (Bug fixes): Mention it.
|
|---|
| 975 | * tests/mb-non-UTF8-word-boundary: New file. Test for the bug.
|
|---|
| 976 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 977 | Reported by NIDE, Naoyuki in https://bugs.gnu.org/38223.
|
|---|
| 978 |
|
|---|
| 979 | 2019-11-08 Jim Meyering <meyering@fb.com>
|
|---|
| 980 |
|
|---|
| 981 | build: update gnulib to latest
|
|---|
| 982 | * po/POTFILES.in: Add lib/argmatch.h.
|
|---|
| 983 |
|
|---|
| 984 | 2019-11-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 985 |
|
|---|
| 986 | grep: new --no-ignore-case option
|
|---|
| 987 | Suggested by Karl Berry and mostly implemented by Arnold Robbins
|
|---|
| 988 | (Bug#37907).
|
|---|
| 989 | * NEWS:
|
|---|
| 990 | * doc/grep.in.1:
|
|---|
| 991 | * doc/grep.texi (Matching Control):
|
|---|
| 992 | * src/grep.c (usage):
|
|---|
| 993 | Document the new option.
|
|---|
| 994 | * src/grep.c (NO_IGNORE_CASE_OPTION): New constant.
|
|---|
| 995 | (long_options, main): Support new option.
|
|---|
| 996 |
|
|---|
| 997 | grep: simplify previous patch
|
|---|
| 998 | * src/grep.c (main): Use an int rather than an enum for a local
|
|---|
| 999 | var, which is overkill here.
|
|---|
| 1000 |
|
|---|
| 1001 | grep: further simplify out_file handling
|
|---|
| 1002 | * src/grep.c (print_filenames): Make this a local variable instead
|
|---|
| 1003 | of static. Rename it to filename_option, to avoid confusion with
|
|---|
| 1004 | the print_filename function, and rename the enum values for the
|
|---|
| 1005 | same reason. All uses changed.
|
|---|
| 1006 | (out_file): Now -1, 0, 1 to represent unknown, false, true.
|
|---|
| 1007 | All uses changed.
|
|---|
| 1008 | (single_command_line_arg): Remove. This static variable’s
|
|---|
| 1009 | function is now accomplished by a local variable ‘num_operands’.
|
|---|
| 1010 | (grepdesc): Simplify adjustment of out_file accordingly.
|
|---|
| 1011 | (main): Initialize out_file to -1 if not known yet.
|
|---|
| 1012 |
|
|---|
| 1013 | 2019-11-05 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 1014 |
|
|---|
| 1015 | grep: simplify out_file handling
|
|---|
| 1016 | * src/grep.c (print_filenames): New tristate enum (-H, -h, or
|
|---|
| 1017 | neither); supplants with_filenames and no_filenames.
|
|---|
| 1018 | (single_command_line_arg): New variable indicating if grep was run
|
|---|
| 1019 | with a single command-line argument.
|
|---|
| 1020 | (no_filenames): Remove variable.
|
|---|
| 1021 | (grepdirent): Don't twiddle out_file back and forth during recursion.
|
|---|
| 1022 | (grepdesc): Turn off out_file on 'grep -r foo nondirectory'.
|
|---|
| 1023 | (main): Replace with_filenames and no_filenames with print_filenames.
|
|---|
| 1024 | Enable out_file when both -r/-R and multiple arguments are given.
|
|---|
| 1025 |
|
|---|
| 1026 | 2019-10-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1027 |
|
|---|
| 1028 | grep: fix ‘grep -L ... >/dev/null’ bug
|
|---|
| 1029 | Problem reported by Adam Sampson (Bug#37716).
|
|---|
| 1030 | * NEWS: Mention this.
|
|---|
| 1031 | * src/grep.c (grepdesc): Don’t assume that stdout being /dev/null
|
|---|
| 1032 | means list_files == LISTFILES_NONE.
|
|---|
| 1033 | (main): Do not change list_files merely because stdout is /dev/null.
|
|---|
| 1034 | * tests/skip-read: Test for this bug.
|
|---|
| 1035 |
|
|---|
| 1036 | 2019-10-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1037 |
|
|---|
| 1038 | grep: tighten -i doc
|
|---|
| 1039 | * doc/grep.in.1:
|
|---|
| 1040 | * doc/grep.texi (Matching Control):
|
|---|
| 1041 | * src/grep.c (usage):
|
|---|
| 1042 | Make it clearer that -i affects patterns and data, but not
|
|---|
| 1043 | file names (Bug#37604).
|
|---|
| 1044 |
|
|---|
| 1045 | 2019-03-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1046 |
|
|---|
| 1047 | maint: fix “/src/grep: No such file or directory”
|
|---|
| 1048 | Problem reported by Jim Meyering in:
|
|---|
| 1049 | https://lists.gnu.org/r/grep-devel/2019-02/msg00000.html
|
|---|
| 1050 | * NEWS: Mention the change.
|
|---|
| 1051 | * configure.ac (fn_grep): Remove. This old attempt to fix
|
|---|
| 1052 | <https://savannah.gnu.org/bugs/?31646> wasn’t working anyway,
|
|---|
| 1053 | since subprograms didn’t grok fn_grep. People building on Solaris
|
|---|
| 1054 | will need a working grep, which is reasonably standard nowadays.
|
|---|
| 1055 | (GREP, EGREP): Do not override. This way, we test the
|
|---|
| 1056 | newly-built grep only when running ‘make test’ and suchlike.
|
|---|
| 1057 | Instead, output a hopefully-helpful diagnostic if the
|
|---|
| 1058 | system 'grep' does not work.
|
|---|
| 1059 |
|
|---|
| 1060 | 2019-02-18 Jim Meyering <meyering@fb.com>
|
|---|
| 1061 |
|
|---|
| 1062 | tests: avoid false positive upon stack overflow
|
|---|
| 1063 | * tests/pcre-jitstack: Don't let a stack overflow evoke a false
|
|---|
| 1064 | failure. This test is to ensure there is no internal PCRE error.
|
|---|
| 1065 | Reported by Andreas Schwab in http://bugs.gnu.org/34370
|
|---|
| 1066 |
|
|---|
| 1067 | 2019-02-16 Jim Meyering <meyering@fb.com>
|
|---|
| 1068 |
|
|---|
| 1069 | build: avoid build failure with --enable-gcc-warnings
|
|---|
| 1070 | * src/kwset.c (bmexec_trans): Define with _GL_ATTRIBUTE_PURE,
|
|---|
| 1071 | per suggestion from recent gcc snapshot.
|
|---|
| 1072 |
|
|---|
| 1073 | 2019-02-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1074 |
|
|---|
| 1075 | doc: clarify --exclude globbing
|
|---|
| 1076 | Problem reported by Paul Jackson.
|
|---|
| 1077 | * doc/grep.in.1:
|
|---|
| 1078 | * doc/grep.texi (File and Directory Selection):
|
|---|
| 1079 | Clarify how --exclude globbing works.
|
|---|
| 1080 |
|
|---|
| 1081 | grep: parse --color arg independent of locale
|
|---|
| 1082 | This is a better fix for Bug#34285.
|
|---|
| 1083 | * bootstrap.conf (gnulib_modules): Add c-strcase.
|
|---|
| 1084 | * src/grep.c: Include c-strcase.h, not strings.h.
|
|---|
| 1085 | (main): Use c_strcasecmp, not strcasecmp.
|
|---|
| 1086 |
|
|---|
| 1087 | 2019-02-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1088 |
|
|---|
| 1089 | grep: fix grep.c includes
|
|---|
| 1090 | * src/grep.c: Include strings.h; problem reported by David
|
|---|
| 1091 | Monniaux (Bug#34285). Do not include fcntl.h, as system.h does
|
|---|
| 1092 | that for us.
|
|---|
| 1093 |
|
|---|
| 1094 | build: update gnulib submodule to latest
|
|---|
| 1095 |
|
|---|
| 1096 | 2019-01-20 Jim Meyering <meyering@fb.com>
|
|---|
| 1097 |
|
|---|
| 1098 | build: ensure no VLA is used
|
|---|
| 1099 | Cause developer builds to fail for any use of a VLA.
|
|---|
| 1100 | VLAs (variable length arrays) limit portability.
|
|---|
| 1101 | * configure.ac (nw): Remove -Wvla from the list of disabled warnings,
|
|---|
| 1102 | thus enabling the warning when configured with --enable-gcc-warnings.
|
|---|
| 1103 | (GNULIB_NO_VLA): Define, disabling use of VLAs in gnulib. This commit
|
|---|
| 1104 | is functionally equivalent to coreutils' v8.30-44-gd26dece5d.
|
|---|
| 1105 |
|
|---|
| 1106 | build: update gnulib to latest
|
|---|
| 1107 |
|
|---|
| 1108 | 2019-01-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1109 |
|
|---|
| 1110 | doc: --binary-files update in man page
|
|---|
| 1111 | * doc/grep.in.1: Adjust --binary-files description to match that
|
|---|
| 1112 | in doc/grep.texi. When I updated the documentation in
|
|---|
| 1113 | 2016-09-09T01:33:14!eggert@cs.ucla.edu I forgot to update the man
|
|---|
| 1114 | page accordingly (Bug#33898).
|
|---|
| 1115 |
|
|---|
| 1116 | grep: simplify pcresearch.c ifdefs
|
|---|
| 1117 | This fixes a warning if PCRE is not used (Bug#34054).
|
|---|
| 1118 | * configure.ac (USE_PCRE): New conditional.
|
|---|
| 1119 | * src/Makefile.am (grep_SOURCES) [!USE_PCRE]: Omit pcresearch.c.
|
|---|
| 1120 | * src/grep.c (matchers) [!HAVE_LIBPCRE]: Omit perl matcher.
|
|---|
| 1121 | (setmatcher) [!HAVE_LIBPCRE]: If helpful, mention
|
|---|
| 1122 | --disable-perl-regexp in diagnostic.
|
|---|
| 1123 | * src/pcresearch.c: Simplify by assuming HAVE_LIBPCRE.
|
|---|
| 1124 |
|
|---|
| 1125 | 2019-01-01 Jim Meyering <meyering@fb.com>
|
|---|
| 1126 |
|
|---|
| 1127 | maint: update all copyright dates via "make update-copyright"
|
|---|
| 1128 | * gnulib: Also update submodule for its copyright updates.
|
|---|
| 1129 |
|
|---|
| 1130 | 2018-12-20 Jim Meyering <meyering@fb.com>
|
|---|
| 1131 |
|
|---|
| 1132 | doc: fix the bug-introduced version in 3.3's announcement
|
|---|
| 1133 | * NEWS: Correct bug-introduced version (s/2.3/3.2/).
|
|---|
| 1134 | * cfg.mk (old_NEWS_hash): Updating old news, we must also update this.
|
|---|
| 1135 |
|
|---|
| 1136 | maint: post-release administrivia
|
|---|
| 1137 | * NEWS: Add header line for next release.
|
|---|
| 1138 | * .prev-version: Record previous version.
|
|---|
| 1139 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 1140 |
|
|---|
| 1141 | version 3.3
|
|---|
| 1142 | * NEWS: Record release date.
|
|---|
| 1143 |
|
|---|
| 1144 | grep: fix \b DFA-bug in C locale
|
|---|
| 1145 | Under some conditions, \b would mistakenly fail to match, e.g.
|
|---|
| 1146 | echo 123-x|LC_ALL=C grep '.\bx'
|
|---|
| 1147 | * NEWS (Bug fixes): Mention it.
|
|---|
| 1148 | * gnulib: Update to latest, for DFA regression fix.
|
|---|
| 1149 | * tests/word-delim-multibyte: Add a test for the dfa.c regression.
|
|---|
| 1150 |
|
|---|
| 1151 | 2018-12-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1152 |
|
|---|
| 1153 | grep: fit --version authorship into 80
|
|---|
| 1154 | * src/grep.c (AUTHORS): Remove.
|
|---|
| 1155 | (main): Output the authorship info ourselves instead of having
|
|---|
| 1156 | version_etc do it. This is better for i18n anyway.
|
|---|
| 1157 |
|
|---|
| 1158 | build: update gnulib submodule to latest
|
|---|
| 1159 |
|
|---|
| 1160 | 2018-12-20 Jim Meyering <meyering@fb.com>
|
|---|
| 1161 |
|
|---|
| 1162 | maint: post-release administrivia
|
|---|
| 1163 | * NEWS: Add header line for next release.
|
|---|
| 1164 | * .prev-version: Record previous version.
|
|---|
| 1165 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 1166 |
|
|---|
| 1167 | version 3.2
|
|---|
| 1168 | * NEWS: Record release date.
|
|---|
| 1169 |
|
|---|
| 1170 | 2018-12-18 Jim Meyering <meyering@fb.com>
|
|---|
| 1171 |
|
|---|
| 1172 | build: update gnulib for c-stack fix
|
|---|
| 1173 |
|
|---|
| 1174 | 2018-12-17 Bruno Haible <bruno@clisp.org>
|
|---|
| 1175 |
|
|---|
| 1176 | tests: stack-overflow: avoid unwarranted test failure on some hosts
|
|---|
| 1177 | * tests/stack-overflow: Use ulimit to limit stack size. Otherwise,
|
|---|
| 1178 | at least on gcc113, grep would fail to overflow its stack, so this
|
|---|
| 1179 | test would fail to find the required diagnostic and would fail.
|
|---|
| 1180 |
|
|---|
| 1181 | 2018-12-16 Jim Meyering <meyering@fb.com>
|
|---|
| 1182 |
|
|---|
| 1183 | tests: reenable the surrogate-pair test
|
|---|
| 1184 | This reverts commit bdb98cec2e7bf255e1d00eaf8be16299f7bf571e,
|
|---|
| 1185 | while adding the comment changes suggested by Bruno Haible in
|
|---|
| 1186 | https://lists.gnu.org/r/grep-devel/2018-12/msg00037.html
|
|---|
| 1187 | * tests/surrogate-pair: New file.
|
|---|
| 1188 | * tests/Makefile.am (TESTS): List it.
|
|---|
| 1189 |
|
|---|
| 1190 | 2018-12-16 Bruno Haible <bruno@clisp.org>
|
|---|
| 1191 |
|
|---|
| 1192 | tests: stack-overflow: fix test failure on HardenedBSD 11
|
|---|
| 1193 | * tests/stack-overflow: Try up to 10 million opening parentheses.
|
|---|
| 1194 |
|
|---|
| 1195 | 2018-12-16 Jim Meyering <meyering@fb.com>
|
|---|
| 1196 |
|
|---|
| 1197 | tests: remove stale surrogate-pair test
|
|---|
| 1198 | The cygwin-specific code for surrogate pairs was first disconnected
|
|---|
| 1199 | via v2.21-62-g936c904 and later removed as part of a then-unused
|
|---|
| 1200 | function via v2.24-12-g704de87. So now I'm removing the test, too.
|
|---|
| 1201 | If someone thinks it important and would like to revive it, please do.
|
|---|
| 1202 | * tests/surrogate-pair: Remove file.
|
|---|
| 1203 | * tests/Makefile.am (TESTS): Remove it.
|
|---|
| 1204 |
|
|---|
| 1205 | 2018-12-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1206 |
|
|---|
| 1207 | build: update gnulib submodule to latest
|
|---|
| 1208 |
|
|---|
| 1209 | 2018-12-15 Jim Meyering <meyering@fb.com>
|
|---|
| 1210 |
|
|---|
| 1211 | tests: stack-overflow: handle the case of success without the diagnostic
|
|---|
| 1212 | * tests/stack-overflow: Do not always require a stack
|
|---|
| 1213 | overflow diagnostic.
|
|---|
| 1214 |
|
|---|
| 1215 | build: update gnulib to latest
|
|---|
| 1216 | * gnulib: Update to latest, to pull in code that now compensates for
|
|---|
| 1217 | a bug in glibc-2.27 and prior.
|
|---|
| 1218 |
|
|---|
| 1219 | build: make the autoconf-2.63 requirement explicit
|
|---|
| 1220 | * configure.ac: AC_PREREQ: Require 2.63, not 2.59. And quote properly.
|
|---|
| 1221 | Autoconf-2.63 has been required for some time via gnulib.
|
|---|
| 1222 | This merely makes it explicit.
|
|---|
| 1223 |
|
|---|
| 1224 | 2018-12-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1225 |
|
|---|
| 1226 | tests: fix diagnostic typo
|
|---|
| 1227 | Fix by Bruno Haible in:
|
|---|
| 1228 | https://lists.gnu.org/r/grep-devel/2018-12/msg00003.html
|
|---|
| 1229 | * tests/init.cfg (envvar_check_fail): Fix diagnostic.
|
|---|
| 1230 |
|
|---|
| 1231 | 2018-11-24 Jim Meyering <meyering@fb.com>
|
|---|
| 1232 |
|
|---|
| 1233 | tests: stack-overflow: avoid false failure
|
|---|
| 1234 | * tests/stack-overflow: This test would fail to elicit a stack overflow
|
|---|
| 1235 | diagnostic on some OS X systems. Rewrite to iterate, gradually increasing
|
|---|
| 1236 | the size of the input regex, stopping when grep emits the desired diagnostic
|
|---|
| 1237 | or the size reaches a reasonable limit.
|
|---|
| 1238 |
|
|---|
| 1239 | 2018-10-16 Jim Meyering <meyering@fb.com>
|
|---|
| 1240 |
|
|---|
| 1241 | tests: reduce the sole failing test
|
|---|
| 1242 | * tests/backref-alt: Significantly reduce abort-inducing input.
|
|---|
| 1243 |
|
|---|
| 1244 | build: update gnulib to latest; also update bootstrap and init.sh
|
|---|
| 1245 |
|
|---|
| 1246 | 2018-10-13 Jim Meyering <meyering@fb.com>
|
|---|
| 1247 |
|
|---|
| 1248 | doc: NEWS: mention performance improvements
|
|---|
| 1249 | * NEWS (Improvements): Mention them.
|
|---|
| 1250 |
|
|---|
| 1251 | 2018-10-13 Jim Meyering <meyering@fb.com>
|
|---|
| 1252 |
|
|---|
| 1253 | grep: triple initial buffer size: 32k->96k
|
|---|
| 1254 | Changing 32k to 96k gives a 3-23% performance improvement.
|
|---|
| 1255 | All timings ran with this diff on top of commit v3.1-39-g7179b21:
|
|---|
| 1256 |
|
|---|
| 1257 | for n in 32 64 96 128; do
|
|---|
| 1258 | echo n=$n
|
|---|
| 1259 | perl -pi -e 's/(INITIAL_BUFSIZE =) \d+/$1 '$n/ src/grep.c &&
|
|---|
| 1260 | make AM_CFLAGS=-O3 WERROR_CFLAGS= >& makerr-$n &&
|
|---|
| 1261 | for needle in 1f2 1f298lkjskjhahjklkj34; do
|
|---|
| 1262 | echo " needle=$needle"
|
|---|
| 1263 | for i in $(seq 10); do
|
|---|
| 1264 | env MALLOC_PERTURB_= time -qf%e src/grep $needle w2000
|
|---|
| 1265 | done 2>&1 |sort -g | tee >(head -1|sed 's/^/ /') > .time-${n}KB-$needle
|
|---|
| 1266 | done
|
|---|
| 1267 | done
|
|---|
| 1268 |
|
|---|
| 1269 | Tested searches: search for a short literal pattern that is not
|
|---|
| 1270 | present in a 9.3GB file containing 2000 copies of /usr/share/dict/words
|
|---|
| 1271 | created via this:
|
|---|
| 1272 | ln -s /usr/share/dict/words k && cat $(yes k|head -2000) > w2000
|
|---|
| 1273 | I ran this command:
|
|---|
| 1274 | env MALLOC_PERTURB_= time src/grep 1f2 w2000
|
|---|
| 1275 | old(32k) vs new elapsed time, best of 10 trials (gcc-9.0.0 20180831, -O3):
|
|---|
| 1276 | 32k 64k 96k(%incr) 128k CPU
|
|---|
| 1277 | 1.25 1.18 1.16( 7.2) 1.20 i7-4770S@3.10GHz cache=8MB
|
|---|
| 1278 | 1.21 1.16 1.17( 3.3) 1.19 Xeon(R) E3-1505M v5 @ 2.80GHz cache=8MB
|
|---|
| 1279 | 2.36 2.29 2.29( 3.0) 2.36 Xeon(R) E5-2680 v4 @ 2.40GHz cache=32MB
|
|---|
| 1280 | 1.40 1.32 1.31( 6.4) 1.33 i5-6260U @ 1.80GHz cache=4MB
|
|---|
| 1281 | 1.31 1.26 1.24( 5.3) 1.23 AMD FX(tm)-4100 cache=2MB (with only 1000 copies)
|
|---|
| 1282 |
|
|---|
| 1283 | Searching for a longer string: 1f298lkjskjhahjklkj34
|
|---|
| 1284 | 2.03 1.76 1.61(20.7) 1.53 i7-4770S@3.10GHz cache=8MB
|
|---|
| 1285 | 1.95 1.70 1.56(20.0) 1.51 Xeon(R) E3-1505M v5 @ 2.80GHz
|
|---|
| 1286 | 3.27 2.98 2.84(13.1) 3.02 Xeon(R) E5-2680 v4 @ 2.40GHz
|
|---|
| 1287 | 2.48 2.12 1.91(23.0) 1.80 i5-6260U @ 1.80GHz cache=4MB
|
|---|
| 1288 | 1.72 1.54 1.46(15.1) 1.41 AMD FX(tm)-4100 cache=2MB
|
|---|
| 1289 |
|
|---|
| 1290 | * src/grep.c (INITIAL_BUFSIZE): Triple it: 32kB -> 96kB
|
|---|
| 1291 |
|
|---|
| 1292 | 2018-09-28 Barret Rhoden <brho@cs.berkeley.edu> (tiny change)
|
|---|
| 1293 |
|
|---|
| 1294 | maint: fix cross-compiling problem
|
|---|
| 1295 | * cfg.mk (PATH): Omit if cross-compiling (Bug#32866).
|
|---|
| 1296 |
|
|---|
| 1297 | 2018-09-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1298 |
|
|---|
| 1299 | build: update gnulib submodule to latest
|
|---|
| 1300 |
|
|---|
| 1301 | grep: fix usage 80-column glitch
|
|---|
| 1302 | * src/grep.c (usage): Do not go over 80 columns in the source
|
|---|
| 1303 | code, to pacify "make dist".
|
|---|
| 1304 |
|
|---|
| 1305 | 2018-09-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1306 |
|
|---|
| 1307 | maint: update bootstrap
|
|---|
| 1308 | * bootstrap: Copy from Gnulib.
|
|---|
| 1309 |
|
|---|
| 1310 | maint: fix build failure
|
|---|
| 1311 | Problem found by OpenCSW buildbot; the bug also occurs on GNU/Linux
|
|---|
| 1312 | build platforms. The symptom is “system.h:26:24: fatal error:
|
|---|
| 1313 | configmake.h: No such file or directory”. See:
|
|---|
| 1314 | https://buildfarm.opencsw.org/buildbot/builders/ggrep-solaris10-sparc/builds/107
|
|---|
| 1315 | * bootstrap.conf: Add configmake, a dependency that was formerly brought
|
|---|
| 1316 | in only by accident.
|
|---|
| 1317 |
|
|---|
| 1318 | 2018-09-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1319 |
|
|---|
| 1320 | build: update gnulib submodule to latest
|
|---|
| 1321 |
|
|---|
| 1322 | 2018-08-09 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1323 |
|
|---|
| 1324 | tests: fix comment
|
|---|
| 1325 |
|
|---|
| 1326 | tests: backref-alt works with glibc 2.28
|
|---|
| 1327 | Problem reported by Jaroslav Skarvada (Bug#32409).
|
|---|
| 1328 | * tests/Makefile.am (XFAIL_TESTS) [!USE_INCLUDED_REGEX]:
|
|---|
| 1329 | Don’t add backref-alt, since this bug is fixed in glibc 2.28.
|
|---|
| 1330 |
|
|---|
| 1331 | 2018-05-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1332 |
|
|---|
| 1333 | doc: “pattern” vs “patterns”
|
|---|
| 1334 | * doc/grep.in.1, doc/grep.texi, src/grep.c (usage): Be more
|
|---|
| 1335 | careful about saying that an argument or option specifies one or
|
|---|
| 1336 | more patterns, not just a single pattern. Problem reported by Kaz
|
|---|
| 1337 | Kylheku (Bug#31400).
|
|---|
| 1338 |
|
|---|
| 1339 | build: update gnulib submodule to latest
|
|---|
| 1340 |
|
|---|
| 1341 | 2018-04-21 Jim Meyering <meyering@fb.com>
|
|---|
| 1342 |
|
|---|
| 1343 | maint: fix new syntax-check (sc_long_lines) failure
|
|---|
| 1344 | * HACKING: Shorten line by one byte to fit in 80 columns.
|
|---|
| 1345 |
|
|---|
| 1346 | build: update gnulib to latest
|
|---|
| 1347 |
|
|---|
| 1348 | 2018-04-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1349 |
|
|---|
| 1350 | doc: fix font typo
|
|---|
| 1351 |
|
|---|
| 1352 | maint: update URLs
|
|---|
| 1353 | Mostly this is just changing http: to https:.
|
|---|
| 1354 | In one or two places it removes no-longer-useful URLs.
|
|---|
| 1355 |
|
|---|
| 1356 | doc: man-page format fixes
|
|---|
| 1357 | * doc/grep.in.1: Fix minor formatting glitches, e.g., extra
|
|---|
| 1358 | space after [...] because groff thought it was a sentence end.
|
|---|
| 1359 | Problem reported by Ingo Schwarze (Bug#31228#11).
|
|---|
| 1360 |
|
|---|
| 1361 | 2018-04-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1362 |
|
|---|
| 1363 | doc: mention encoding errors
|
|---|
| 1364 | This attempts to document the encoding-error problem more
|
|---|
| 1365 | precisely (Bug#30326).
|
|---|
| 1366 | * doc/grep.in.1, doc/grep.texi: Mention that the behavior of
|
|---|
| 1367 | patterns like ‘.’ is not specified on encoding errors.
|
|---|
| 1368 |
|
|---|
| 1369 | doc: port better to mandoc
|
|---|
| 1370 | * doc/grep.in.1: Check for groff and its macro packages
|
|---|
| 1371 | independently, as groff can be used with non-groff macro packages.
|
|---|
| 1372 | Use an-ext style macros rather than www.tmac style, as this should
|
|---|
| 1373 | be more portable to mandoc. Problem reported by Laura Morales and
|
|---|
| 1374 | Ingo Schwarze (Bug#31228).
|
|---|
| 1375 |
|
|---|
| 1376 | 2018-02-16 Jim Meyering <meyering@fb.com>
|
|---|
| 1377 |
|
|---|
| 1378 | maint: avoid new syntax-check failure
|
|---|
| 1379 | * cfg.mk (old_NEWS_hash): Update, to accommodate v3.1-20-g63d4174's
|
|---|
| 1380 | typo fix.
|
|---|
| 1381 |
|
|---|
| 1382 | doc: clarify that PCRE support is here to stay
|
|---|
| 1383 | * doc/grep.texi (grep Programs): Clarify: it's not PCRE support
|
|---|
| 1384 | that is experimental, but its combination with --null-data (-z).
|
|---|
| 1385 |
|
|---|
| 1386 | 2018-02-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1387 |
|
|---|
| 1388 | maint: fix typo
|
|---|
| 1389 |
|
|---|
| 1390 | 2018-01-06 Jim Meyering <meyering@fb.com>
|
|---|
| 1391 |
|
|---|
| 1392 | maint: update gnulib and copyright dates for 2018
|
|---|
| 1393 | * gnulib: Update to latest.
|
|---|
| 1394 | * all files: Run "make update-copyright".
|
|---|
| 1395 | * bootstrap: Update from gnulib.
|
|---|
| 1396 |
|
|---|
| 1397 | 2017-12-17 Jim Meyering <meyering@fb.com>
|
|---|
| 1398 |
|
|---|
| 1399 | build: link with -lsigsegv, when c-stack module requires it
|
|---|
| 1400 | * src/Makefile.am (grep_LDADD): Add $(LIBCSTACK).
|
|---|
| 1401 | Otherwise, on at least Debian and Arch-based systems, linking would
|
|---|
| 1402 | fail with diagnostics like these:
|
|---|
| 1403 | c-stack.c:207: undefined reference to `stackoverflow_install_handler'
|
|---|
| 1404 | c-stack.c:216: undefined reference to `sigsegv_install_handler'
|
|---|
| 1405 | Reported by Jeremy Feusi.
|
|---|
| 1406 |
|
|---|
| 1407 | build: suppress sig-handler.h's -Wcast-function-type warning
|
|---|
| 1408 | * configure.ac (WERROR_CFLAGS): Add -Wno-cast-function-type
|
|---|
| 1409 | to suppress warning about sig-handler.h's sa_handler_t cast:
|
|---|
| 1410 | sig-handler.h: In function 'get_handler':
|
|---|
| 1411 | sig-handler.h:47:12: error: cast between incompatible function\
|
|---|
| 1412 | types from 'void (* const)(int, siginfo_t *, void *)'\
|
|---|
| 1413 | {aka 'void (* const)(int, struct <anonymous> *, void *)'}\
|
|---|
| 1414 | to 'void (*)(int)' [-Werror=cast-function-type]
|
|---|
| 1415 | return (sa_handler_t) a->sa_sigaction;
|
|---|
| 1416 |
|
|---|
| 1417 | 2017-12-16 Jim Meyering <meyering@fb.com>
|
|---|
| 1418 |
|
|---|
| 1419 | grep: diagnose stack overflow rather than segfaulting
|
|---|
| 1420 | * bootstrap.conf (gnulib_modules): Add c-stack.
|
|---|
| 1421 | * src/grep.c: Include "c-stack.h".
|
|---|
| 1422 | (main): Call c_stack_action (NULL);
|
|---|
| 1423 | * tests/stack-overflow: New file.
|
|---|
| 1424 | * tests/Makefile.am (TESTS): Add name of new file.
|
|---|
| 1425 | * NEWS (Improvements): Mention it.
|
|---|
| 1426 | Interestingly, this bug does not afflict grep-2.5.4 or prior,
|
|---|
| 1427 | so it appeared to have been introduced with grep-2.6. However,
|
|---|
| 1428 | the origin is in glibc's regexp compiler, and I tracked it to
|
|---|
| 1429 | stack-aware parsing that was removed from glibc's regexp in 2002.
|
|---|
| 1430 | However, grep-2.5.4 was released in 2009. That version worked
|
|---|
| 1431 | (and still works, now) because it included and (by default) used
|
|---|
| 1432 | an old copy of glibc's regexp code.
|
|---|
| 1433 | Jeremy Feusi reported the grep segfault in https://bugs.gnu.org/29666.
|
|---|
| 1434 | I reported the glibc regexp bug in
|
|---|
| 1435 | https://sourceware.org/bugzilla/show_bug.cgi?id=22620
|
|---|
| 1436 |
|
|---|
| 1437 | 2017-11-26 Stephan T. Lavavej <stl@nuwen.net>
|
|---|
| 1438 |
|
|---|
| 1439 | grep: fix directory recursion on MS-Windows
|
|---|
| 1440 | gnulib recently gained a module, windows-stat-inodes, that fixes
|
|---|
| 1441 | directory recursion on MS-Windows. No changes to grep's C sources are
|
|---|
| 1442 | required; grep simply needs to request the module during configuration.
|
|---|
| 1443 |
|
|---|
| 1444 | When grep requests this module, its configure script will gain the
|
|---|
| 1445 | behavior that was implemented in windows-stat-inodes.m4. This detects
|
|---|
| 1446 | mingw and sets WINDOWS_STAT_INODES=1. All other platforms are
|
|---|
| 1447 | unaffected, setting WINDOWS_STAT_INODES=0 (which is what's happening
|
|---|
| 1448 | in the absence of this patch).
|
|---|
| 1449 |
|
|---|
| 1450 | * bootstrap.conf (gnulib_modules): Add windows-stat-inodes.
|
|---|
| 1451 | * NEWS (Bug fixes): Mention it.
|
|---|
| 1452 | Thanks to Pär Björklund who diagnosed the problem as involving inodes,
|
|---|
| 1453 | and thanks to Václav Haisman who provided the bootstrap.conf patch.
|
|---|
| 1454 |
|
|---|
| 1455 | 2017-11-25 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1456 |
|
|---|
| 1457 | grep: port better to Adélie GNU/Linux 64-bit ppc
|
|---|
| 1458 | Problem reported by A. Wilcox (Bug#29446).
|
|---|
| 1459 | * src/pcresearch.c (PCRE_EXTRA_MATCH_LIMIT_RECURSION)
|
|---|
| 1460 | (PCRE_STUDY_EXTRA_NEEDED): Default to 0.
|
|---|
| 1461 | (jit_exec): If we run up against the recursion limit,
|
|---|
| 1462 | double it (if possible) and try again.
|
|---|
| 1463 | (Pcompile): Also specify PCRE_STUDY_EXTRA_NEEDED so that
|
|---|
| 1464 | pc->extra is not null.
|
|---|
| 1465 |
|
|---|
| 1466 | 2017-11-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1467 |
|
|---|
| 1468 | grep: omit a dup 'const'
|
|---|
| 1469 | * src/grep.c (matchers): Omit duplicate 'const'.
|
|---|
| 1470 |
|
|---|
| 1471 | 2017-10-13 Bernhard Voelker <mail@bernhard-voelker.de>
|
|---|
| 1472 |
|
|---|
| 1473 | doc: document the option delimiter '--'
|
|---|
| 1474 | * doc/grep.texi (Other options): Do the above.
|
|---|
| 1475 | Reported in https://lists.opensuse.org/opensuse/2017-03/msg00411.html
|
|---|
| 1476 | This addresses http://bugs.gnu.org/26139
|
|---|
| 1477 |
|
|---|
| 1478 | 2017-08-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1479 |
|
|---|
| 1480 | build: update gnulib submodule to latest
|
|---|
| 1481 |
|
|---|
| 1482 | Pacify GCC 5.4
|
|---|
| 1483 | * src/grep.c (grepdesc): Rework to pacify GCC 5.4 warning
|
|---|
| 1484 | about logical not.
|
|---|
| 1485 |
|
|---|
| 1486 | 2017-08-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1487 |
|
|---|
| 1488 | build: update gnulib submodule to latest
|
|---|
| 1489 |
|
|---|
| 1490 | 2017-08-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1491 |
|
|---|
| 1492 | grep: -L exits with status 0 if a file is selected
|
|---|
| 1493 | Problem reported by Anthony Sottile (Bug#28105).
|
|---|
| 1494 | * NEWS, doc/grep.texi (Exit Status), src/grep.c (usage): Document this.
|
|---|
| 1495 | * src/grep.c (grepdesc): Implement it.
|
|---|
| 1496 | * tests/skip-read: Test it.
|
|---|
| 1497 |
|
|---|
| 1498 | build: update gnulib submodule to latest
|
|---|
| 1499 |
|
|---|
| 1500 | 2017-08-13 Jim Meyering <meyering@fb.com>
|
|---|
| 1501 |
|
|---|
| 1502 | maint: avoid newly-introduced syntax-check failure
|
|---|
| 1503 | * src/grep.c (usage): Shorten --help line to 80, so
|
|---|
| 1504 | "make syntax-check" passes once again.
|
|---|
| 1505 |
|
|---|
| 1506 | 2017-08-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1507 |
|
|---|
| 1508 | doc: improve -o help
|
|---|
| 1509 | * src/grep.c (usage): Document that -o outputs only nonempty
|
|---|
| 1510 | matches (Bug#27931).
|
|---|
| 1511 |
|
|---|
| 1512 | 2017-07-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1513 |
|
|---|
| 1514 | tests: add Bug#27838 test case
|
|---|
| 1515 | * tests/backref-alt: New test case from a fuzzer.
|
|---|
| 1516 |
|
|---|
| 1517 | 2017-07-25 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1518 |
|
|---|
| 1519 | doc: distinguish -w from \<...\>
|
|---|
| 1520 | * doc/grep.texi (Matching Control):
|
|---|
| 1521 | Give example of why -w differs from \<...\> (Bug#27813).
|
|---|
| 1522 |
|
|---|
| 1523 | 2017-07-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1524 |
|
|---|
| 1525 | doc: define Dt string in man page
|
|---|
| 1526 | Problem reported by Bjarni I. Gislason via Santiago R.R. (Bug#27651).
|
|---|
| 1527 | * doc/grep.in.1 (dT): New macro.
|
|---|
| 1528 | (Dt): Define this string.
|
|---|
| 1529 |
|
|---|
| 1530 | 2017-07-02 Jim Meyering <meyering@fb.com>
|
|---|
| 1531 |
|
|---|
| 1532 | maint: post-release administrivia
|
|---|
| 1533 | * NEWS: Add header line for next release.
|
|---|
| 1534 | * .prev-version: Record previous version.
|
|---|
| 1535 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 1536 |
|
|---|
| 1537 | version 3.1
|
|---|
| 1538 | * NEWS: Record release date.
|
|---|
| 1539 |
|
|---|
| 1540 | 2017-07-01 Jim Meyering <meyering@fb.com>
|
|---|
| 1541 |
|
|---|
| 1542 | tests: avoid false failures when run in qemu user mode
|
|---|
| 1543 | * tests/filename-lineno.pl: Derive the program name that grep
|
|---|
| 1544 | will use in diagnostics, based on a suggestion from Assaf Gordon.
|
|---|
| 1545 | * tests/in-eq-out-infloop: Similar: accept an arbitrary "command_name: "
|
|---|
| 1546 | prefix on checked diagnostics, rather than requiring "grep: ".
|
|---|
| 1547 | * tests/reversed-range-endpoints: Likewise.
|
|---|
| 1548 | * tests/write-error-msg: Likewise.
|
|---|
| 1549 | Reported by Bruno Haible in http://bugs.gnu.org/27532
|
|---|
| 1550 |
|
|---|
| 1551 | 2017-06-25 Jim Meyering <meyering@fb.com>
|
|---|
| 1552 |
|
|---|
| 1553 | gnulib: update to latest
|
|---|
| 1554 | * gnulib: Update to latest for these portability fixes:
|
|---|
| 1555 | - stat: port to xlc 12.01
|
|---|
| 1556 | - xalloc-oversized: port to icc
|
|---|
| 1557 |
|
|---|
| 1558 | doc: fix another typo
|
|---|
| 1559 | * doc/grep.texi (File and Directory Selection): Fix typo: s/afer/after/
|
|---|
| 1560 |
|
|---|
| 1561 | 2017-06-24 Jim Meyering <meyering@fb.com>
|
|---|
| 1562 |
|
|---|
| 1563 | doc: stop calling --perl-regexp (-P) "highly" experimental
|
|---|
| 1564 | Use wording that is less likely to make readers think that
|
|---|
| 1565 | support for -P may be removed.
|
|---|
| 1566 | * doc/grep.in.1: s/highly experimental/experimental/
|
|---|
| 1567 | * doc/grep.texi: Likewise.
|
|---|
| 1568 | Suggested by Evan Sheahan.
|
|---|
| 1569 |
|
|---|
| 1570 | 2017-06-21 Jim Meyering <meyering@fb.com>
|
|---|
| 1571 |
|
|---|
| 1572 | doc: correct typo
|
|---|
| 1573 | * doc/grep.texi (Performance): s/suprisingly/surprisingly/
|
|---|
| 1574 |
|
|---|
| 1575 | gnulib: update to latest
|
|---|
| 1576 |
|
|---|
| 1577 | 2017-06-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1578 |
|
|---|
| 1579 | grep: -m no longer cuts off trailing context
|
|---|
| 1580 | Problem reported by Markus Jochim (Bug#26254).
|
|---|
| 1581 | * NEWS, doc/grep.texi (General Output Control): Document this.
|
|---|
| 1582 | * src/grep.c (prpending): Selected lines no longer cut off context.
|
|---|
| 1583 | (usage): Say "selected" instead of "matching", where appropriate.
|
|---|
| 1584 | * tests/foad1, tests/max-count-vs-context, tests/yesno:
|
|---|
| 1585 | Adjust to match new behavior.
|
|---|
| 1586 |
|
|---|
| 1587 | 2017-05-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1588 |
|
|---|
| 1589 | Document grep performance
|
|---|
| 1590 | * doc/grep.texi (Performance): New section.
|
|---|
| 1591 |
|
|---|
| 1592 | build: update gnulib submodule to latest
|
|---|
| 1593 |
|
|---|
| 1594 | 2017-05-21 Jim Meyering <meyering@fb.com>
|
|---|
| 1595 |
|
|---|
| 1596 | maint: make the announcement template Cc the devel- list
|
|---|
| 1597 | * cfg.mk (announcement_Cc_): Define.
|
|---|
| 1598 |
|
|---|
| 1599 | gnulib: update to latest; and update tests/init.sh
|
|---|
| 1600 |
|
|---|
| 1601 | maint: accommodate GCC7's -Werror=duplicated-branches
|
|---|
| 1602 | * src/system.h (IGNORE_DUPLICATE_BRANCH_WARNING): Define.
|
|---|
| 1603 | * src/grep.c (grepfile): Use it.
|
|---|
| 1604 | * src/kwset.c (bmexec, acexec): Use it.
|
|---|
| 1605 |
|
|---|
| 1606 | maint: update to work with GCC7's -Werror=implicit-fallthrough=
|
|---|
| 1607 | * src/system.h (FALLTHROUGH): Define.
|
|---|
| 1608 | * src/grep.c (context_length_arg): Use new FALLTHROUGH macro in place
|
|---|
| 1609 | of comments.
|
|---|
| 1610 | (fgrep_to_grep_pattern, try_fgrep_pattern, main): Likewise.
|
|---|
| 1611 |
|
|---|
| 1612 | 2017-05-13 Jim Meyering <meyering@fb.com>
|
|---|
| 1613 |
|
|---|
| 1614 | gnulib: update to latest and adapt src/kwset.c
|
|---|
| 1615 | * gnulib: Update to latest.
|
|---|
| 1616 | * src/kwset.c: Include "verify.h" for use of assume.
|
|---|
| 1617 |
|
|---|
| 1618 | 2017-03-22 Jim Meyering <meyering@fb.com>
|
|---|
| 1619 |
|
|---|
| 1620 | gnulib: update to latest for dfa [0-9] performance improvement
|
|---|
| 1621 | This pulls in the following change that is very relevant to grep:
|
|---|
| 1622 |
|
|---|
| 1623 | commit 6afba02d7869d39ed7f61981045ddbdcb2814101
|
|---|
| 1624 | Author: Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1625 | dfa: make [0-9] faster in non-C locales
|
|---|
| 1626 |
|
|---|
| 1627 | * gnulib: Update to latest.
|
|---|
| 1628 | * NEWS (Improvements): Describe the effect on grep.
|
|---|
| 1629 |
|
|---|
| 1630 | 2017-03-05 Jim Meyering <meyering@fb.com>
|
|---|
| 1631 |
|
|---|
| 1632 | build: use $(builddir), not $(srcdir)
|
|---|
| 1633 | * cfg.mk (PATH): Use $(builddir), so this also takes effect
|
|---|
| 1634 | in a non-srcdir build. Also, switch ${PATH} syntax to $(PATH).
|
|---|
| 1635 |
|
|---|
| 1636 | 2017-03-05 Juan Manuel Guerrero <juan.guerrero@gmx.de>
|
|---|
| 1637 |
|
|---|
| 1638 | build: use $(PATH_SEPARATOR), not ":" to augment PATH
|
|---|
| 1639 | * cfg.mk (PATH): Use $(PATH_SEPARATOR), for those systems that
|
|---|
| 1640 | use something other than ":".
|
|---|
| 1641 | * THANKS.in: Remove name, to avoid syntax-check failure due to
|
|---|
| 1642 | the duplicate, now that there is this commit.
|
|---|
| 1643 |
|
|---|
| 1644 | 2017-02-17 Jim Meyering <meyering@fb.com>
|
|---|
| 1645 |
|
|---|
| 1646 | maint: fix distcheck failure: remove stale dosbuf.c reference
|
|---|
| 1647 | * src/Makefile.am (EXTRA_DIST): Do not attempt to distribute
|
|---|
| 1648 | the recently deleted file, dosbuf.c.
|
|---|
| 1649 |
|
|---|
| 1650 | maint: fix new syntax-check errors
|
|---|
| 1651 | * po/POTFILES.in: Add lib/xbinary-io.c.
|
|---|
| 1652 | * cfg.mk (FILTER_LONG_LINES): Add TODO to the list of exempt files.
|
|---|
| 1653 |
|
|---|
| 1654 | 2017-02-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1655 |
|
|---|
| 1656 | Fix up recent -U patches
|
|---|
| 1657 | Inspired by a suggestion by Eric Blake (Bug#25707#17).
|
|---|
| 1658 | * bootstrap.conf (gnulib_modules): Add xbinary-io,
|
|---|
| 1659 | and remove binary-io and xfreopen.
|
|---|
| 1660 | * doc/grep.texi (Other Options):
|
|---|
| 1661 | Fix typo and reword to be a bit more general.
|
|---|
| 1662 | * src/grep.c: Include xbinary-io.h instead of xfreopen.h.
|
|---|
| 1663 | (grepfile): Open with O_BINARY if binary.
|
|---|
| 1664 | (grepdesc): No need for set_binary_mode now.
|
|---|
| 1665 | (grep_command_line_arg, main): Set stdin to binary mode if binary.
|
|---|
| 1666 | (main): Avoid unnecessary test of stdin == NULL.
|
|---|
| 1667 | Use xsetmode instead of xfreopen.
|
|---|
| 1668 | * src/system.h: Do not include binary-io.h.
|
|---|
| 1669 |
|
|---|
| 1670 | build: update gnulib submodule to latest
|
|---|
| 1671 |
|
|---|
| 1672 | Simplify -U on MS-Windows by removing guesswork
|
|---|
| 1673 | Suggested by Eric Blake (Bug#25707#11).
|
|---|
| 1674 | * NEWS, doc/grep.texi: Document this.
|
|---|
| 1675 | * src/dosbuf.c: Remove.
|
|---|
| 1676 | * bootstrap.conf (gnulib_modules): Add xfreopen.
|
|---|
| 1677 | * src/grep.c: Include xfreopen.h, not dosbuf.c.
|
|---|
| 1678 | (fillbuf, print_line_head): Do not undossify input.
|
|---|
| 1679 | (binary): New static var.
|
|---|
| 1680 | (grepdesc): Apply BINARY to input file.
|
|---|
| 1681 | (usage): Remove -u help.
|
|---|
| 1682 | (main): Set BINARY if -U, and apply it to stdout. Do nothing if -u.
|
|---|
| 1683 | With -f, apply BINARY to input file.
|
|---|
| 1684 |
|
|---|
| 1685 | 2017-02-16 Eric Blake <eblake@redhat.com>
|
|---|
| 1686 |
|
|---|
| 1687 | grep: don't forcefully strip carriage returns
|
|---|
| 1688 | Commit 5c92a54 made the mistaken assumption that using fopen("rt")
|
|---|
| 1689 | on platforms where O_TEXT is non-zero makes sense. However, POSIX
|
|---|
| 1690 | already requires fopen("r") to open a file in text mode, vs.
|
|---|
| 1691 | fopen("rb") when binary mode is wanted, and at least on Cygwin,
|
|---|
| 1692 | where it is possible to control whether a mount point is binary
|
|---|
| 1693 | or text by default (using just "r"), the use of fopen("rt") actively
|
|---|
| 1694 | breaks assumptions on a binary mount by silently corrupting any
|
|---|
| 1695 | carriage returns that are supposed to be preserved.
|
|---|
| 1696 |
|
|---|
| 1697 | * src/grep.c (main): Never use fopen("rt") (Bug#25707).
|
|---|
| 1698 |
|
|---|
| 1699 | 2017-02-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1700 |
|
|---|
| 1701 | Update TODO and doc
|
|---|
| 1702 | * TODO: Bring up-to-date and fix formatting glitches.
|
|---|
| 1703 | * doc/grep.in.1, doc/grep.texi: Fix minor glitches.
|
|---|
| 1704 | The above patches should address the same problems that recent
|
|---|
| 1705 | Debian doc patches address, albeit in a different way.
|
|---|
| 1706 |
|
|---|
| 1707 | 2017-02-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1708 |
|
|---|
| 1709 | doc: clarify default input (Bug#25651)
|
|---|
| 1710 | * doc/grep.in.1:
|
|---|
| 1711 | * src/grep.c (usage): Clarify default input when -r.
|
|---|
| 1712 | * src/grep.c (usage): Do not bother documenting egrep and fgrep;
|
|---|
| 1713 | the manual is enough.
|
|---|
| 1714 |
|
|---|
| 1715 | 2017-02-09 Jim Meyering <meyering@fb.com>
|
|---|
| 1716 |
|
|---|
| 1717 | maint: post-release administrivia
|
|---|
| 1718 | * NEWS: Add header line for next release.
|
|---|
| 1719 | * .prev-version: Record previous version.
|
|---|
| 1720 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 1721 |
|
|---|
| 1722 | version 3.0
|
|---|
| 1723 | * NEWS: Record release date.
|
|---|
| 1724 |
|
|---|
| 1725 | 2017-02-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1726 |
|
|---|
| 1727 | grep: do not mishandle \. in multiple patterns
|
|---|
| 1728 | Problem reported by Lars Wendler (Bug#25655).
|
|---|
| 1729 | * NEWS: Document this.
|
|---|
| 1730 | * src/grep.c (try_fgrep_pattern): Fix typo that prevented
|
|---|
| 1731 | keys from being properly updated.
|
|---|
| 1732 | * tests/foad1: Test for the bug.
|
|---|
| 1733 |
|
|---|
| 1734 | 2017-02-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1735 |
|
|---|
| 1736 | Do not assume PCRE 8.20 or later
|
|---|
| 1737 | Problem reported by Zube (Bug#25647).
|
|---|
| 1738 | * NEWS: Document this.
|
|---|
| 1739 | * src/pcresearch.c (struct pcre.com.jit_stack):
|
|---|
| 1740 | Declare only if PCRE_STUDY_JIT_COMPILE.
|
|---|
| 1741 |
|
|---|
| 1742 | 2017-02-06 Jim Meyering <meyering@fb.com>
|
|---|
| 1743 |
|
|---|
| 1744 | maint: post-release administrivia
|
|---|
| 1745 | * NEWS: Add header line for next release.
|
|---|
| 1746 | * .prev-version: Record previous version.
|
|---|
| 1747 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 1748 |
|
|---|
| 1749 | version 2.28
|
|---|
| 1750 | * NEWS: Record release date.
|
|---|
| 1751 |
|
|---|
| 1752 | 2017-02-02 Jim Meyering <meyering@fb.com>
|
|---|
| 1753 |
|
|---|
| 1754 | gnulib: update to latest
|
|---|
| 1755 |
|
|---|
| 1756 | 2017-02-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1757 |
|
|---|
| 1758 | grep: tune to avoid memchr2 sometimes
|
|---|
| 1759 | Problem noted by Norihiro Tanaka in:
|
|---|
| 1760 | http://lists.gnu.org/archive/html/grep-devel/2017-01/msg00027.html
|
|---|
| 1761 | Although not enough to restore all the previous performance in the
|
|---|
| 1762 | case he noted, it helps significantly.
|
|---|
| 1763 | * src/kwset.c (memchr_kwset): Bring back small_heuristic,
|
|---|
| 1764 | in a somewhat different form.
|
|---|
| 1765 |
|
|---|
| 1766 | 2017-01-29 Jim Meyering <meyering@fb.com>
|
|---|
| 1767 |
|
|---|
| 1768 | gnulib: update to latest
|
|---|
| 1769 |
|
|---|
| 1770 | 2017-01-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1771 |
|
|---|
| 1772 | grep: simplify recent kwset change
|
|---|
| 1773 | * src/kwset.c (acexec_trans): Simplify.
|
|---|
| 1774 |
|
|---|
| 1775 | 2017-01-23 Jim Meyering <meyering@fb.com>
|
|---|
| 1776 |
|
|---|
| 1777 | tests: really add the new test name
|
|---|
| 1778 | * tests/Makefile.am (TESTS): Add fgrep-longest.
|
|---|
| 1779 |
|
|---|
| 1780 | 2017-01-21 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 1781 |
|
|---|
| 1782 | grep -Fo could report a match that is not the longest
|
|---|
| 1783 | * src/kwset.c (acexec): Fix it.
|
|---|
| 1784 | * tests/fgrep-longest: New test.
|
|---|
| 1785 | * tests/Makefile.am: Add the test.
|
|---|
| 1786 | * NEWS: Mention it.
|
|---|
| 1787 |
|
|---|
| 1788 | 2017-01-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1789 |
|
|---|
| 1790 | grep: speed up Aho-Corasick when at most 2 bytes
|
|---|
| 1791 | When using Aho-Corasick and all matched strings either begin with
|
|---|
| 1792 | the same byte, or begin with one of at most two bytes, use memchr2
|
|---|
| 1793 | to search for these matching bytes and apply the Aho-Corasick
|
|---|
| 1794 | algorithm only when a memchr2 match is found. On my platform,
|
|---|
| 1795 | this speeds up 'grep -F -e aa -e ba in' by a factor of 7, where
|
|---|
| 1796 | the file 'in' was created by 'seq -f %040.0f 10000000 >in'.
|
|---|
| 1797 | * src/kwset.c (struct kwset.gc1): Now int, not char.
|
|---|
| 1798 | If negative, there is no single terminal byte. All uses changed.
|
|---|
| 1799 | (struct kwset.gc1help): Now int, not char.
|
|---|
| 1800 | If negative, memchr2 cannot be used.
|
|---|
| 1801 | (kwsprep): Set up gc1 and gc1help from kwset->next, with
|
|---|
| 1802 | the new (slightly changed) interpretation.
|
|---|
| 1803 | (memchr_kwset): Use memchr2 if possible.
|
|---|
| 1804 | Adjust to match new meaning of gc1, gc1help.
|
|---|
| 1805 | (memoff2_kwset): Remove; no longer needed.
|
|---|
| 1806 | (acexec_trans): Use memchr_kwset when possible, for speed.
|
|---|
| 1807 | It now supersedes memoff2_kwset.
|
|---|
| 1808 |
|
|---|
| 1809 | grep: remove Commentz-Walter code
|
|---|
| 1810 | This code was not being used, and complicated maintenance.
|
|---|
| 1811 | We can bring it back from the repository if it turns out
|
|---|
| 1812 | to be useful later.
|
|---|
| 1813 | * src/kwset.c (struct kwset.reverse): Remove. All uses of
|
|---|
| 1814 | FOO->reverse replaced by (FOO->kwsexec == bmexec).
|
|---|
| 1815 | (kwsalloc): Remove 'reverse' arg, as callers outside this
|
|---|
| 1816 | module do not care about algorithm choice. All callers changed.
|
|---|
| 1817 | (kwsprep): When deciding whether to use Boyer-Moore, do not worry
|
|---|
| 1818 | about being called twice on the same kwset, as that is not allowed.
|
|---|
| 1819 | (cwexec): Remove; it was never called. All uses removed.
|
|---|
| 1820 |
|
|---|
| 1821 | 2017-01-17 Jim Meyering <meyering@fb.com>
|
|---|
| 1822 |
|
|---|
| 1823 | maint: avoid new syntax-check failures
|
|---|
| 1824 | * src/kwset.c (struct kwset): Split a line longer than 80.
|
|---|
| 1825 | * bootstrap: Update from gnulib. This fixes a new syntax-check
|
|---|
| 1826 | failure due to its use of "time stamp".
|
|---|
| 1827 |
|
|---|
| 1828 | 2017-01-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1829 |
|
|---|
| 1830 | * NEWS: Fix typo.
|
|---|
| 1831 |
|
|---|
| 1832 | * src/kwset.c: Fix comment typo.
|
|---|
| 1833 |
|
|---|
| 1834 | Improve -i performance in typical UTF-8 searches
|
|---|
| 1835 | Currently ‘grep -i i’ is slow in a UTF-8 locale, because ‘i’ in
|
|---|
| 1836 | the pattern matches the two-byte character 'ı' (U+0131, LATIN
|
|---|
| 1837 | SMALL LETTER DOTLESS I) in data, and kwset handles only
|
|---|
| 1838 | single-byte character translations, so grep falls back on a slower
|
|---|
| 1839 | DFA-based search for all searches. Improve -i performance in the
|
|---|
| 1840 | typical case by using kwset when data are free of troublesome
|
|---|
| 1841 | characters like 'ı', falling back on the DFA only when data
|
|---|
| 1842 | contain troublesome characters.
|
|---|
| 1843 | * src/dfasearch.c (GEAcompile):
|
|---|
| 1844 | * src/grep.c (compile_fp_t):
|
|---|
| 1845 | * src/kwsearch.c (Fcompile):
|
|---|
| 1846 | * src/pcresearch.c (Pcompile):
|
|---|
| 1847 | Pattern arg is now char *, not char const *, since Fcompile
|
|---|
| 1848 | now reallocates it sometimes.
|
|---|
| 1849 | * src/grep.c (all_single_byte_after_folding): Remove.
|
|---|
| 1850 | All callers removed.
|
|---|
| 1851 | (fgrep_icase_charlen): New function.
|
|---|
| 1852 | (fgrep_icase_available, try_fgrep_pattern):
|
|---|
| 1853 | Use it, for more-generous semantics.
|
|---|
| 1854 | (fgrep_to_grep_pattern): Now extern.
|
|---|
| 1855 | (main): Do not free keys, since Fexecute may use them.
|
|---|
| 1856 | * src/kwsearch.c (struct kwsearch): New struct.
|
|---|
| 1857 | (Fcompile): Return it. If -i, be more generous about patterns.
|
|---|
| 1858 | (Fexecute): Use it. Fall back on DFA when the data contain
|
|---|
| 1859 | troublesome characters; this should be rare in practice.
|
|---|
| 1860 | * src/kwset.c, src/kwset.h (kwswords): New function.
|
|---|
| 1861 |
|
|---|
| 1862 | build: update gnulib submodule to latest
|
|---|
| 1863 |
|
|---|
| 1864 | 2017-01-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1865 |
|
|---|
| 1866 | dfa: prefer ptrdiff_t to size_t
|
|---|
| 1867 | The code already cannot handle objects with size greater than
|
|---|
| 1868 | SIZE_MAX / 2, so be more honest about it and use ptrdiff_t instead
|
|---|
| 1869 | of size_t. ptrdiff_t arithmetic is signed, which allows for more
|
|---|
| 1870 | checking via -fsanitize=undefined. It also makes the code a tad
|
|---|
| 1871 | smaller on x86-64, since it can test for < 0 rather than for ==
|
|---|
| 1872 | SIZE_MAX.
|
|---|
| 1873 | * src/dfasearch.c (struct dfa_comp.kwset_exact_matches):
|
|---|
| 1874 | (kwsmusts, EGexecute):
|
|---|
| 1875 | * src/kwsearch.c (Fcompile, Fexecute):
|
|---|
| 1876 | * src/kwset.c (struct kwset.kwsexec, kwsincr, memchr_kwset)
|
|---|
| 1877 | (memoff2_kwset, bmexec_trans, bmexec, cwexec, acexec_trans)
|
|---|
| 1878 | (acexec, kwsexec):
|
|---|
| 1879 | * src/kwset.h (struct kwsmatch.index, .offset, .size):
|
|---|
| 1880 | Prefer ptrdiff_t to size_t where either will do.
|
|---|
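The "test for < 0 rather than for == SIZE_MAX" point can be illustrated with a minimal sketch. The helper `find_byte` is hypothetical and is not part of kwset or dfasearch; it only shows how a signed offset lets "no match" be reported as a negative value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from grep: return the offset of the first
   occurrence of C in BUF[0..SIZE-1], or -1 if there is none.
   SIZE is assumed to be nonnegative.  */
static ptrdiff_t
find_byte (char const *buf, ptrdiff_t size, char c)
{
  char const *p = memchr (buf, c, size);
  /* A signed result lets callers write "if (off < 0)" instead of
     comparing against (size_t) -1.  */
  return p ? p - buf : -1;
}
```

A caller would then check `find_byte (buf, size, '\n') < 0` to detect the no-match case.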
| 1881 |
|
|---|
| 1882 | 2017-01-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1883 |
|
|---|
| 1884 | grep: improve comments, mostly in kwset
|
|---|
| 1885 | Remove kwset.h comments that are obsolete and seemingly not
|
|---|
| 1886 | maintained anyway; people can look in kwset.c instead.
|
|---|
| 1887 | Update comments to reflect current behavior better.
|
|---|
| 1888 | Cite Faro & Lecroq 2013. Use GNU style for end-of-sentence.
|
|---|
| 1889 |
|
|---|
| 1890 | 2017-01-01 Jim Meyering <meyering@fb.com>
|
|---|
| 1891 |
|
|---|
| 1892 | maint: update gnulib and copyright dates for 2017
|
|---|
| 1893 | * gnulib: Update to latest.
|
|---|
| 1894 | * all files: Run "make update-copyright".
|
|---|
| 1895 |
|
|---|
| 1896 | 2016-12-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1897 |
|
|---|
| 1898 | grep: speed up -x with many patterns
|
|---|
| 1899 | * src/kwsearch.c (Fcompile): Improve buffer allocation overhead
|
|---|
| 1900 | with -x and multiple patterns. In the common case where '\n' is
|
|---|
| 1901 | the end-of-line byte, avoid copying other than the first and last
|
|---|
| 1902 | patterns.
|
|---|
| 1903 |
|
|---|
| 1904 | 2016-12-31 Jim Meyering <meyering@fb.com>
|
|---|
| 1905 |
|
|---|
| 1906 | gnulib: update to latest, fixing a parallel getopt test failure
|
|---|
| 1907 |
|
|---|
| 1908 | 2016-12-29 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1909 |
|
|---|
| 1910 | maint: space before paren
|
|---|
| 1911 |
|
|---|
| 1912 | grep: int cleanup in kwset.c
|
|---|
| 1913 | This should affect only theoretical bugs with very large inputs.
|
|---|
| 1914 | On my platform, this patch shrinks the grep text by 136 bytes.
|
|---|
| 1915 | * src/kwset.c: Include intprops.h, for INT_MULTIPLY_WRAPV.
|
|---|
| 1916 | (struct trie, struct kwset, kwsalloc, kwsincr, treedelta, kwsprep)
|
|---|
| 1917 | (bm_delta2_search, bmexec_trans, cwexec): Prefer ptrdiff_t to int
|
|---|
| 1918 | when counts can exceed INT_MAX in large inputs, at least in theory.
|
|---|
| 1919 | (hasevery): Use bool for booleans.
|
|---|
| 1920 | (bmexec_trans): Avoid undefined behavior on integer overflow.
|
|---|
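As a rough illustration of checking a multiplication before it can overflow, here is a portable sketch. The function `mul_overflows` is a hypothetical stand-in; it is not gnulib's INT_MULTIPLY_WRAPV, and unlike that macro it stores nothing when it reports overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an overflow-checked multiply.
   If A * B fits in ptrdiff_t, store it in *R and return false.
   Otherwise return true and leave *R untouched.
   Negative operands are rejected to keep the sketch short.  */
static bool
mul_overflows (ptrdiff_t a, ptrdiff_t b, ptrdiff_t *r)
{
  if (a < 0 || b < 0)
    return true;
  if (b != 0 && a > PTRDIFF_MAX / b)
    return true;
  *r = a * b;
  return false;
}
```

The division-based test runs before the multiplication, so the signed product is computed only when it is known to be representable.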
| 1921 |
|
|---|
| 1922 | 2016-12-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 1923 |
|
|---|
| 1924 | grep: improve performance with multiple patterns
|
|---|
| 1925 | * src/grep.c (main): Avoid fgrep-to-grep conversion for word matching
|
|---|
| 1926 | with multiple patterns in single byte locales.
|
|---|
| 1927 |
|
|---|
| 1928 | 2016-12-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1929 |
|
|---|
| 1930 | * NEWS: Fix typo.
|
|---|
| 1931 |
|
|---|
| 1932 | grep: fix bug with '... | grep pat >> /dev/null'
|
|---|
| 1933 | Problem reported by Benno Fünfstück (Bug#25283).
|
|---|
| 1934 | * NEWS: Document this.
|
|---|
| 1935 | * src/grep.c (drain_input) [SPLICE_F_MOVE]:
|
|---|
| 1936 | Don't assume /dev/null is always acceptable output to splice.
|
|---|
| 1937 | * tests/grep-dev-null-out: Test for the bug.
|
|---|
| 1938 |
|
|---|
| 1939 | 2016-12-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 1940 |
|
|---|
| 1941 | grep: minor performance tweak for pure functions
|
|---|
| 1942 | * src/search.h (wordchars_size, wordchar_next, wordchar_prev):
|
|---|
| 1943 | Declare to be pure.
|
|---|
| 1944 |
|
|---|
| 1945 | 2016-12-25 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 1946 |
|
|---|
| 1947 | grep: move localeinfo to grep.c
|
|---|
| 1948 | It's not really dfasearch-specific, and grep.c initializes it, so it
|
|---|
| 1949 | seems like the most appropriate "owner".
|
|---|
| 1950 |
|
|---|
| 1951 | * src/dfasearch.c (localeinfo): Remove.
|
|---|
| 1952 | * src/grep.c (localeinfo): Add.
|
|---|
| 1953 | * src/search.h (localeinfo): Move to new commented section.
|
|---|
| 1954 |
|
|---|
| 1955 | 2016-12-25 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 1956 |
|
|---|
| 1957 | pcresearch: thread safety
|
|---|
| 1958 | * src/pcresearch.c (pcre_comp): New struct to hold previously-global
|
|---|
| 1959 | state.
|
|---|
| 1960 | (jit_exec): Operate on a pcre_comp parameter instead of global state.
|
|---|
| 1961 | (Pcompile): Allocate and return a pcre_comp instead of setting global
|
|---|
| 1962 | variables.
|
|---|
| 1963 | (Pexecute): Operate on a pcre_comp parameter instead of global state.
|
|---|
| 1964 |
|
|---|
| 1965 | kwsearch: thread safety
|
|---|
| 1966 | * src/kwsearch.c (Fcompile): Return a kwset_t instead of setting a
|
|---|
| 1967 | global variable.
|
|---|
| 1968 | (Fexecute): Use a passed-in kwset_t instead of a global variable.
|
|---|
| 1969 | (kwset): Remove global variable.
|
|---|
| 1970 |
|
|---|
| 1971 | dfasearch: thread safety
|
|---|
| 1972 | * src/dfasearch.c (struct dfa_comp): New struct to hold
|
|---|
| 1973 | previously-global variables.
|
|---|
| 1974 | (dfawarn): Remove static variable.
|
|---|
| 1975 | (kwsmusts): Operate on a dfa_comp parameter instead of global
|
|---|
| 1976 | variables.
|
|---|
| 1977 | (GEAcompile): Allocate and return a dfa_comp struct instead of setting
|
|---|
| 1978 | global variables.
|
|---|
| 1979 | (EGexecute): Operate on a dfa_comp parameter instead of global
|
|---|
| 1980 | variables.
|
|---|
| 1981 | * src/searchutils.c (kwsinit): Replace a static array with a
|
|---|
| 1982 | dynamically-allocated one.
|
|---|
| 1983 |
|
|---|
| 1984 | 2016-12-25 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 1985 |
|
|---|
| 1986 | grep: prepare search backends for thread-safety
|
|---|
| 1987 | To facilitate removing mutable global state from search backends,
|
|---|
| 1988 | compile() functions will return an opaque pointer to backend-specific
|
|---|
| 1989 | data, which must then be passed back into the corresponding execute()
|
|---|
| 1990 | function. This is merely a preparatory step changing function
|
|---|
| 1991 | signatures and call sites, so the pointers passed & returned are
|
|---|
| 1992 | dummies for now and not (yet) actually used.
|
|---|
| 1993 |
|
|---|
| 1994 | * src/grep.c (compile_fp_t): Now returns an opaque pointer (the
|
|---|
| 1995 | compiled pattern).
|
|---|
| 1996 | (execute_fp_t): Now passed the pointer returned by a compile_fp_t.
|
|---|
| 1997 | All call sites updated accordingly.
|
|---|
| 1998 | (compiled_pattern): New static variable.
|
|---|
| 1999 | * src/dfasearch.c (GEAcompile): Return a void pointer (dummy NULL).
|
|---|
| 2000 | (EGexecute): Receive a void pointer argument (unused).
|
|---|
| 2001 | * src/kwsearch.c (Fcompile): Return a void pointer (dummy NULL).
|
|---|
| 2002 | (Fexecute): Receive a void pointer argument (unused).
|
|---|
| 2003 | * src/pcresearch.c (Pcompile): Return a void pointer (dummy NULL).
|
|---|
| 2004 | (Pexecute): Receive a void pointer argument (unused).
|
|---|
| 2005 | * src/search.h: Update compile/execute function prototypes.
|
|---|
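The compile/execute handoff described in this entry might look roughly like the sketch below. The signatures are simplified assumptions (the real compile_fp_t and execute_fp_t take more arguments), and `toy_comp`, `toy_compile`, `toy_execute` and `toy_free` are hypothetical names used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified, assumed signatures: compile returns an opaque pointer
   that the caller later hands back to the matching execute.  */
typedef void *(*compile_fp_t) (char const *pattern, size_t size);
typedef size_t (*execute_fp_t) (void *compiled, char const *buf,
                                size_t size);

/* Hypothetical backend state that would otherwise be global.  */
struct toy_comp
{
  char *needle;
  size_t len;
};

static void *
toy_compile (char const *pattern, size_t size)
{
  struct toy_comp *tc = malloc (sizeof *tc);
  if (!tc)
    return NULL;
  tc->needle = malloc (size ? size : 1);
  if (!tc->needle)
    {
      free (tc);
      return NULL;
    }
  memcpy (tc->needle, pattern, size);
  tc->len = size;
  return tc;
}

/* Return the offset of the first match, or (size_t) -1 if none.  */
static size_t
toy_execute (void *compiled, char const *buf, size_t size)
{
  struct toy_comp const *tc = compiled;
  if (tc->len <= size)
    for (size_t i = 0; i <= size - tc->len; i++)
      if (memcmp (buf + i, tc->needle, tc->len) == 0)
        return i;
  return (size_t) -1;
}

static void
toy_free (void *compiled)
{
  struct toy_comp *tc = compiled;
  if (tc)
    free (tc->needle);
  free (tc);
}
```

Because each compiled pattern carries its own state, two patterns can coexist without sharing any mutable globals.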
| 2006 |
|
|---|
| 2007 | 2016-12-24 Jim Meyering <meyering@fb.com>
|
|---|
| 2008 |
|
|---|
| 2009 | maint: fix "syntax-check" failure
|
|---|
| 2010 | * src/grep.c (SEP_STR_GROUP): Declare "static".
|
|---|
| 2011 |
|
|---|
| 2012 | 2016-12-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2013 |
|
|---|
| 2014 | grep: fix comment in searchutils.c
|
|---|
| 2015 |
|
|---|
| 2016 | grep: improve word checking with UTF-8
|
|---|
| 2017 | * src/searchutils.c: Do not include <verify.h>.
|
|---|
| 2018 | (word_start): Remove, replacing with ...
|
|---|
| 2019 | (sbwordchar): New static var. All uses changed.
|
|---|
| 2020 | (wordchar_prev): Return size_t, not bool, as this generates
|
|---|
| 2021 | slightly better code. Go back faster if UTF-8.
|
|---|
| 2022 |
|
|---|
| 2023 | grep: standardize on localeinfo.multibyte
|
|---|
| 2024 | * src/dfasearch.c (EGexecute):
|
|---|
| 2025 | * src/grep.c (main):
|
|---|
| 2026 | * src/kwsearch.c (Fexecute):
|
|---|
| 2027 | * src/pcresearch.c (Pcompile):
|
|---|
| 2028 | Prefer localeinfo.multibyte to (MB_CUR_MAX > 1).
|
|---|
| 2029 |
|
|---|
| 2030 | grep: speed up -wf in C locale
|
|---|
| 2031 | Problem reported by Norihiro Tanaka (Bug#22357#100).
|
|---|
| 2032 | This patch improves the performance on that benchmark on my
|
|---|
| 2033 | platform so that grep is now only about 2x slower than grep 2.26,
|
|---|
| 2034 | which means it is considerably faster than grep 2.25 and earlier.
|
|---|
| 2035 | * src/kwsearch.c (Fexecute):
|
|---|
| 2036 | Use wordchars_size to boost performance for this case.
|
|---|
| 2037 | * src/search.h, src/searchutils.c (wordchars_size): New function.
|
|---|
| 2038 |
|
|---|
| 2039 | grep: specialize word-finding functions
|
|---|
| 2040 | This improves performance a bit.
|
|---|
| 2041 | * src/dfasearch.c, src/kwsearch.c (wordchar):
|
|---|
| 2042 | Remove; now in searchutils.c.
|
|---|
| 2043 | * src/grep.c (main): Call wordinit if -w.
|
|---|
| 2044 | * src/search.h: Adjust.
|
|---|
| 2045 | * src/searchutils.c: Include verify.h.
|
|---|
| 2046 | (word_start): New static var.
|
|---|
| 2047 | (wordchar): Move here from dfasearch.c and kwsearch.c.
|
|---|
| 2048 | (wordinit, wordchars_count, wordchar_next, wordchar_prev):
|
|---|
| 2049 | New functions.
|
|---|
| 2050 | (mb_prev_wc, mb_next_wc): Remove.
|
|---|
| 2051 | All callers changed to use the new functions instead.
|
|---|
| 2052 |
|
|---|
| 2053 | grep: simplify Fexecute
|
|---|
| 2054 | * src/kwsearch.c (Fexecute): Avoid the need for a 'try' local or
|
|---|
| 2055 | for a 'goto success'. Update mb_start to reflect newline found.
|
|---|
| 2056 |
|
|---|
| 2057 | grep: remove C label
|
|---|
| 2058 | * src/kwsearch.c (Fexecute): Remove label.
|
|---|
| 2059 |
|
|---|
| 2060 | maint: rewrite to avoid some macros
|
|---|
| 2061 | These days, the dangerous powers of C macros are not needed if
|
|---|
| 2062 | constants or functions will do just as well.
|
|---|
| 2063 | * src/grep.c (SEP_CHAR_SELECTED, SEP_CHAR_REJECTED, SEP_STR_GROUP)
|
|---|
| 2064 | (INITIAL_BUFSIZE):
|
|---|
| 2065 | * src/kwset.c (DEPTH_SIZE):
|
|---|
| 2066 | Now constants, not macros.
|
|---|
| 2067 | * src/kwset.c (link): Remove macro. Instead, rename local vars
|
|---|
| 2068 | from 'link' to 'cur'.
|
|---|
| 2069 | (malloc) [GREP]: Remove macro. All uses of malloc changed to xmalloc.
|
|---|
| 2070 | Omit double-inclusion of xalloc.h. Do not depend on 'GREP'.
|
|---|
| 2071 | (U): Now a function, not a macro.
|
|---|
| 2072 | * src/kwset.c, src/searchutils.c (NCHAR): Move this macro to ...
|
|---|
| 2073 | * src/system.h: ... here, and make it a constant.
|
|---|
| 2074 |
|
|---|
| 2075 | 2016-12-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2076 |
|
|---|
| 2077 | grep: fix performance with multiple patterns
|
|---|
| 2078 | Problem reported by Jaroslav Skarvada (Bug#22357).
|
|---|
| 2079 | * NEWS: Document this and other recent performance fixes.
|
|---|
| 2080 | * src/grep.c (E_MATCHER_INDEX): New constant.
|
|---|
| 2081 | (all_single_byte_after_folding):
|
|---|
| 2082 | New function, split out from fgrep_icase_available.
|
|---|
| 2083 | (fgrep_icase_available): Use it.
|
|---|
| 2084 | (try_fgrep_pattern): New function, which also uses it.
|
|---|
| 2085 | (main): With two or more patterns, use try_fgrep_pattern to fix
|
|---|
| 2086 | performance regression. The number "two" here is just a heuristic.
|
|---|
| 2087 |
|
|---|
| 2088 | grep: simplify matcher configuration
|
|---|
| 2089 | * src/grep.c (matcher, compile): Remove static vars.
|
|---|
| 2090 | (compile_fp_t): Now takes a 3rd syntax argument.
|
|---|
| 2091 | (Gcompile, Ecompile, Acompile, GAcompile, PAcompile): Remove.
|
|---|
| 2092 | (struct matcher): Now nameless, since it is used only once.
|
|---|
| 2093 | Make 'name' a bit shorter. New member 'syntax'.
|
|---|
| 2094 | (matchers): Initialize it, and change removed functions to GEAcompile.
|
|---|
| 2095 | (F_MATCHER_INDEX, G_MATCHER_INDEX): New constants.
|
|---|
| 2096 | (setmatcher): New arg MATCHER, and return new matcher index.
|
|---|
| 2097 | Avoid unnecessary call to strcmp.
|
|---|
| 2098 | (main): Keep matcher as a local int, not a global pointer.
|
|---|
| 2099 | * src/kwsearch.c (Fcompile):
|
|---|
| 2100 | * src/pcresearch.c (Pcompile): Ignore the 3rd syntax argument.
|
|---|
| 2101 |
|
|---|
| 2102 | grep: simplify line counting in patterns
|
|---|
| 2103 | * src/grep.c (n_patterns): Rename from patfile_lineno,
|
|---|
| 2104 | as it is now origin-zero. Now size_t, not uintmax_t.
|
|---|
| 2105 | (count_nl_bytes, fl_add): Simplify to just buffer and size.
|
|---|
| 2106 | All callers changed.
|
|---|
| 2107 |
|
|---|
| 2108 | 2016-12-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2109 |
|
|---|
| 2110 | build: update gnulib submodule to latest
|
|---|
| 2111 |
|
|---|
| 2112 | 2016-12-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2113 |
|
|---|
| 2114 | build: update gnulib submodule to latest
|
|---|
| 2115 |
|
|---|
| 2116 | build: update gnulib submodule to latest
|
|---|
| 2117 |
|
|---|
| 2118 | 2016-12-13 Jim Meyering <meyering@fb.com>
|
|---|
| 2119 |
|
|---|
| 2120 | tests: use just-built grep in more places
|
|---|
| 2121 | * cfg.mk (PATH): Prepend $(srcdir)/src, so that we use the just-
|
|---|
| 2122 | built grep also when running commands like those of "make distcheck".
|
|---|
| 2123 | This would have avoided the recently-luckily-noticed infloop bug.
|
|---|
| 2124 | Tested by running this in a just-built directory:
|
|---|
| 2125 | f=src/grep; printf '%s\n' '#!/bin/sh' 'sleep 9h' > $f; chmod a+x $f
|
|---|
| 2126 | and then verifying that nearly every "make syntax-check" rule hangs.
|
|---|
| 2127 |
|
|---|
| 2128 | maint: tell "syntax-check" not to worry about the NEWS update
|
|---|
| 2129 | Whenever we change "old" NEWS, we have to update this checksum.
|
|---|
| 2130 | Otherwise, a "make syntax-check" test that guards against a class
|
|---|
| 2131 | of logical merge conflicts will fail.
|
|---|
| 2132 | * cfg.mk (old_NEWS_hash): Update this hash to accommodate the
|
|---|
| 2133 | recent clarification of a 2.27 NEWS entry.
|
|---|
| 2134 |
|
|---|
| 2135 | 2016-12-13 Arnold D. Robbins <arnold@skeeve.com>
|
|---|
| 2136 |
|
|---|
| 2137 | build: update gnulib submodule to latest
|
|---|
| 2138 | * src/dfasearch.c (GEAcompile): Remove use of flag, RE_ICASE covers it.
|
|---|
| 2139 |
|
|---|
| 2140 | 2016-12-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2141 |
|
|---|
| 2142 | grep: work around proc lseek glitch
|
|---|
| 2143 | Problem reported by Andreas Schwab (Bug#25180).
|
|---|
| 2144 | * NEWS: Document this.
|
|---|
| 2145 | * src/grep.c (finalize_input): Ignore EINVAL lseek failures.
|
|---|
| 2146 | * tests/Makefile.am (TESTS): Add proc.
|
|---|
| 2147 | * tests/proc: New file.
|
|---|
| 2148 |
|
|---|
| 2149 | 2016-12-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2150 |
|
|---|
| 2151 | grep: simplify finalize_input
|
|---|
| 2152 | * src/grep.c (finalize_input): Simplify without changing behavior.
|
|---|
| 2153 | It's still a bit of a rat's-nest, but it's a cozier rat's-nest.
|
|---|
| 2154 |
|
|---|
| 2155 | maint: clarify early-exit news for 2.27
|
|---|
| 2156 | * NEWS: Mention early-exit options to avoid confusion. See:
|
|---|
| 2157 | http://lists.gnu.org/archive/html/grep-devel/2016-12/msg00007.html
|
|---|
| 2158 |
|
|---|
| 2159 | 2016-12-06 Jim Meyering <meyering@fb.com>
|
|---|
| 2160 |
|
|---|
| 2161 | maint: post-release administrivia
|
|---|
| 2162 | * NEWS: Add header line for next release.
|
|---|
| 2163 | * .prev-version: Record previous version.
|
|---|
| 2164 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 2165 |
|
|---|
| 2166 | version 2.27
|
|---|
| 2167 | * NEWS: Record release date.
|
|---|
| 2168 |
|
|---|
| 2169 | 2016-11-29 Jim Meyering <meyering@fb.com>
|
|---|
| 2170 |
|
|---|
| 2171 | grep: fix DFA-induced infloop
|
|---|
| 2172 | * gnulib: Update to latest, for the DFA infloop fix.
|
|---|
| 2173 | * tests/dfa-infloop: New test, to trigger an infinite loop
|
|---|
| 2174 | in the DFA matcher.
|
|---|
| 2175 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 2176 |
|
|---|
| 2177 | 2016-11-28 Jim Meyering <meyering@fb.com>
|
|---|
| 2178 |
|
|---|
| 2179 | tests: use "returns_ N env VAR=val ..."
|
|---|
| 2180 | rather than "VAR=val returns_ N ..."
|
|---|
| 2181 | Some shells do not propagate envvar settings through our use
|
|---|
| 2182 | of the "returns_" function, so set any envvar via use of "env".
|
|---|
| 2183 | This was an issue at least on Ubuntu and Debian-based systems,
|
|---|
| 2184 | presumably due to their common use of "dash" as /bin/sh.
|
|---|
| 2185 | Reported by Assaf Gordon.
|
|---|
| 2186 | * tests/char-class-multibyte: As above.
|
|---|
| 2187 | * tests/euc-mb: Likewise.
|
|---|
| 2188 | * tests/false-match-mb-non-utf8: Likewise.
|
|---|
| 2189 | * tests/pcre-infloop: Likewise.
|
|---|
| 2190 | * tests/pcre-jitstack: Likewise.
|
|---|
| 2191 | * tests/sjis-mb: Likewise.
|
|---|
| 2192 | * tests/warn-char-classes: Likewise.
|
|---|
| 2193 |
|
|---|
| 2194 | 2016-11-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2195 |
|
|---|
| 2196 | tests: revert check for unibyte French range bug
|
|---|
| 2197 | The test wasn't portable, as it assumed that rational ranges
|
|---|
| 2198 | were not in effect. Problem reported by Eric Blake (Bug#25048#8).
|
|---|
| 2199 | There doesn't seem to be a portable way to do the test, so omit it.
|
|---|
| 2200 | * tests/init.cfg, tests/unibyte-bracket-expr:
|
|---|
| 2201 | Revert previous change.
|
|---|
| 2202 |
|
|---|
| 2203 | build: update gnulib submodule to latest
|
|---|
| 2204 |
|
|---|
| 2205 | 2016-11-27 Jim Meyering <meyering@fb.com>
|
|---|
| 2206 |
|
|---|
| 2207 | grep: avoid false matches in non-UTF8 multibyte locales
|
|---|
| 2208 | * gnulib: Update to latest, for the dfa.c fix.
|
|---|
| 2209 | * NEWS (Bug fixes): Mention it.
|
|---|
| 2210 | * tests/false-match-mb-non-utf8: New file, with tests for this.
|
|---|
| 2211 | Based on tests from Stephane Chazelas.
|
|---|
| 2212 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 2213 | Introduced by commit v2.18-54-g3ef4c8e, a change that made grep use
|
|---|
| 2214 | its DFA matcher more aggressively. The malfunction arises only with
|
|---|
| 2215 | the DFA matcher, not with regex.
|
|---|
| 2216 | Reported by Stephane Chazelas in https://bugs.gnu.org/24975
|
|---|
| 2217 |
|
|---|
| 2218 | 2016-11-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2219 |
|
|---|
| 2220 | tests: check for unibyte French range bug
|
|---|
| 2221 | Problem reported by Stephane Chazelas (Bug#24973).
|
|---|
| 2222 | This bug was fixed in Gnulib.
|
|---|
| 2223 | * NEWS: Document the fix.
|
|---|
| 2224 | * tests/init.cfg (require_ru_RU_koi8_r): Remove.
|
|---|
| 2225 | * tests/unibyte-bracket-expr: Add a test for the bug.
|
|---|
| 2226 | Call get-mb-cur-max directly instead of bothering with
|
|---|
| 2227 | require_ru_RU_koi8_r.
|
|---|
| 2228 |
|
|---|
| 2229 | build: update gnulib submodule to latest
|
|---|
| 2230 |
|
|---|
| 2231 | 2016-11-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2232 |
|
|---|
| 2233 | grep: further -P performance fix
|
|---|
| 2234 | Problem reported by Stephane Chazelas in:
|
|---|
| 2235 | http://bugs.gnu.org/22655#103
|
|---|
| 2236 | * src/pcresearch.c (Pexecute): Set the subject to the start of
|
|---|
| 2237 | each line as it is found.
|
|---|
| 2238 |
|
|---|
| 2239 | grep: -P no longer uses PCRE_MULTILINE
|
|---|
| 2240 | This reverts commit f6603c4e1e04dbb87a7232c4b44acc6afdf65fef,
|
|---|
| 2241 | as the extra performance is not worth the trouble for PCRE users.
|
|---|
| 2242 | Problem reported by Stephane Chazelas in:
|
|---|
| 2243 | http://bugs.gnu.org/22655#103
|
|---|
| 2244 | * NEWS: Document this and the next patch.
|
|---|
| 2245 | * src/dfasearch.c (EGexecute):
|
|---|
| 2246 | * src/grep.c (execute_fp_t):
|
|---|
| 2247 | * src/kwsearch.c (Fexecute):
|
|---|
| 2248 | * src/pcresearch.c (Pexecute):
|
|---|
| 2249 | First arg is now a const pointer again.
|
|---|
| 2250 | * src/grep.c (buf_has_encoding_errors): Now static.
|
|---|
| 2251 | * src/grep.h (buf_has_encoding_errors): Remove decl.
|
|---|
| 2252 | * src/search.h: Adjust decls.
|
|---|
| 2253 | * src/pcresearch.c (reflags): Remove. All uses removed.
|
|---|
| 2254 | (Pcompile, Pexecute): Do not use PCRE_MULTILINE.
|
|---|
| 2255 |
|
|---|
| 2256 | 2016-11-19 Jim Meyering <meyering@fb.com>
|
|---|
| 2257 |
|
|---|
| 2258 | doc: fix a doubled "the"
|
|---|
| 2259 | * doc/grep.texi (--perl-regexp): s/the\nthe/the/
|
|---|
| 2260 |
|
|---|
| 2261 | 2016-11-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2262 |
|
|---|
| 2263 | grep: fix -zxP bug
|
|---|
| 2264 | * NEWS: Document this.
|
|---|
| 2265 | * src/pcresearch.c (Pcompile): Search a line at a time if -x is
|
|---|
| 2266 | used, since -x uses ^ and $.
|
|---|
| 2267 | * tests/pcre: Test this.
|
|---|
| 2268 |
|
|---|
| 2269 | grep: simplify by using PRIuMAX
|
|---|
| 2270 | * configure.ac (HAVE_PRINTF_C99_SIZES): Remove; no longer needed.
|
|---|
| 2271 | * src/grep.c (print_offset): Simplify (Bug#24451).
|
|---|
| 2272 |
|
|---|
| 2273 | grep: -T now adjusts number widths for worst case
|
|---|
| 2274 | * NEWS, doc/grep.texi (Output Line Prefix Control):
|
|---|
| 2275 | Document this (Bug#24451).
|
|---|
| 2276 | * src/grep.c (offset_width): New static var.
|
|---|
| 2277 | (print_offset): Use it instead of arg. All callers changed.
|
|---|
| 2278 | (grep): Set it.
|
|---|
| 2279 | * tests/initial-tab: Test this.
|
|---|
| 2280 |
|
|---|
| 2281 | grep: -T no longer outputs BS
|
|---|
| 2282 | * NEWS: Document this (Bug#24451).
|
|---|
| 2283 | * src/grep.c (print_line_head): Do not attempt to backspace output.
|
|---|
| 2284 | * tests/initial-tab: New test.
|
|---|
| 2285 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 2286 |
|
|---|
| 2287 | grep: document -oz better
|
|---|
| 2288 | * doc/grep.texi (General Output Control, Usage): Tweak (Bug#24961).
|
|---|
| 2289 |
|
|---|
| 2290 | grep: fix performance typo with -P
|
|---|
| 2291 | Reported by Zev Weiss in: http://bugs.gnu.org/22655#88
|
|---|
| 2292 | * src/pcresearch.c (Pcompile): Initialize reflags.
|
|---|
| 2293 |
|
|---|
| 2294 | tests: use "returns_" rather than "$?"
|
|---|
| 2295 | * tests/grep-dev-null-out: Use "returns_ 124" rather than testing
|
|---|
| 2296 | $? = 124.
|
|---|
| 2297 |
|
|---|
| 2298 | grep -f /dev/null -L PAT FILE outputs FILE
|
|---|
| 2299 | * NEWS: Document this.
|
|---|
| 2300 | * src/grep.c (main): Do not exit right away with -L.
|
|---|
| 2301 | * tests/skip-read: Test for the fix.
|
|---|
| 2302 |
|
|---|
| 2303 | grep: tune -f /dev/null
|
|---|
| 2304 | * src/grep.c (main): Do the -f /dev/null early-exit checks before
|
|---|
| 2305 | more-expensive tests that involve syscalls.
|
|---|
| 2306 |
|
|---|
| 2307 | grep: treat -f /dev/null like -m0
|
|---|
| 2308 | * NEWS: Document this.
|
|---|
| 2309 | * src/grep.c (main): With -f /dev/null, don't bother to read the
|
|---|
| 2310 | input. This is what FreeBSD grep does.
|
|---|
| 2311 | * tests/Makefile.am (TESTS): Add skip-read.
|
|---|
| 2312 | * tests/skip-read: New file.
|
|---|
| 2313 |
|
|---|
| 2314 | grep: avoid O(N**2) buffer reallocation
|
|---|
| 2315 | * src/grep.c (main): Use x2realloc to avoid O(N**2) performance as
|
|---|
| 2316 | pattern buffers grow.
|
|---|
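The idea behind avoiding O(N**2) growth can be sketched as follows. This is not the gnulib x2realloc code; `struct patbuf` and `patbuf_append` are hypothetical, and the sketch simply doubles the capacity so that the total copying cost stays linear in the final size.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; USED <= ALLOC always holds.  */
struct patbuf
{
  char *data;
  size_t used;
  size_t alloc;
};

/* Append S[0..N-1] to B, doubling the capacity when needed.
   Return 0 on success, -1 on allocation failure or overflow.  */
static int
patbuf_append (struct patbuf *b, char const *s, size_t n)
{
  if (n == 0)
    return 0;
  if (b->alloc - b->used < n)
    {
      size_t want = b->alloc ? b->alloc : 64;
      while (want - b->used < n)
        {
          if (want > SIZE_MAX / 2)
            return -1;
          want *= 2;
        }
      char *p = realloc (b->data, want);
      if (!p)
        return -1;
      b->data = p;
      b->alloc = want;
    }
  memcpy (b->data + b->used, s, n);
  b->used += n;
  return 0;
}
```

Growing by a fixed increment instead would reallocate on nearly every append once the buffer is large, which is where the quadratic behavior comes from.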
| 2317 |
|
|---|
| 2318 | grep: avoid unnecessary gettext call
|
|---|
| 2319 | Translate "(standard input)" lazily.
|
|---|
| 2320 | * src/grep.c (input_filename): New function.
|
|---|
| 2321 | (suppressible_error): Remove 1st arg, since it is always
|
|---|
| 2322 | input_filename (). All callers changed.
|
|---|
| 2323 | (suppressible_error, print_filename, grep, grepdesc): Use it.
|
|---|
| 2324 | (grep_command_line_arg): Set filename to NULL if standard
|
|---|
| 2325 | input has no label. Often, this avoids all calls to gettext,
|
|---|
| 2326 | which can be a win as the first call can be expensive.
|
|---|
| 2327 |
|
|---|
| 2328 | grep: drain the input pipe faster
|
|---|
| 2329 | * src/grep.c (dev_null_output): Now static.
|
|---|
| 2330 | (drain_input): New function, using 'splice' if that makes sense.
|
|---|
| 2331 | (finalize_input): Use it.
|
|---|
| 2332 | (main): Omit now-unnecessary initialization.
|
|---|
| 2333 |
|
|---|
| 2334 | grep: scale back /dev/null speedup
|
|---|
| 2335 | The performance improvement when output is /dev/null (commit
|
|---|
| 2336 | af6af288eac28951b5eee1eaaf373e22b2193b7b dated 2016-05-01)
|
|---|
| 2337 | breaks scripts that run "PROGRAM | grep PATTERN >/dev/null"
|
|---|
| 2338 | where PROGRAM dies when writing into a broken pipe.
|
|---|
| 2339 | Suppress the improvement if standard input is not seekable.
|
|---|
| 2340 | Problem reported by Gary Johnson (Bug#24941).
|
|---|
| 2341 | * NEWS: Document this.
|
|---|
| 2342 | * src/grep.c (seek_failed): New static var.
|
|---|
| 2343 | (seek_data_failed): Move decl earlier, to be next to seek_failed.
|
|---|
| 2344 | (file_must_have_nulls): Skip useless syscalls if seek_failed.
|
|---|
| 2345 | Lessen source-code nesting.
|
|---|
| 2346 | (reset): Set seek_failed and seek_data_failed.
|
|---|
| 2347 | Try lseek even on non-regular files.
|
|---|
| 2348 | (grep): New arg INEOF. All callers changed.
|
|---|
| 2349 | Do not clear seek_data_failed here, since 'reset' now does this.
|
|---|
| 2350 | (finalize_input): New static function.
|
|---|
| 2351 | (grepdesc): Use it.
|
|---|
| 2352 | (main): Do not exit on first match merely because output is
|
|---|
| 2353 | /dev/null.
|
|---|
| 2354 | * tests/grep-dev-null-out: Adjust to new behavior.
|
|---|
| 2355 |
|
|---|
| 2356 | grep: improve diagnostic on lseek failure
|
|---|
| 2357 | * src/grep.c (reset): Mention the file name in the (unlikely)
|
|---|
| 2358 | chance of an lseek failure.
|
|---|
| 2359 |
|
|---|
| 2360 | grep: avoid unnecessary isatty calls
|
|---|
| 2361 | This fixes an inefficiency that was mistakenly introduced a while
|
|---|
| 2362 | back, when the macro SET_BINARY became defined on all platforms.
|
|---|
| 2363 | * src/grep.c (grepdesc, main): Do not unnecessarily call isatty on
|
|---|
| 2364 | POSIXish platforms.
|
|---|
| 2365 |
|
|---|
| 2366 | grep: -Pz no longer rejects ^, $
|
|---|
| 2367 | Problem reported by Stephane Chazelas (Bug#22655).
|
|---|
| 2368 | * NEWS: Document this.
|
|---|
| 2369 | * doc/grep.texi (grep Programs): Warn about -Pz.
|
|---|
| 2370 | * src/pcresearch.c (reflags): New static var.
|
|---|
| 2371 | (multibyte_locale): Remove static var; now local to Pcompile.
|
|---|
| 2372 | (Pcompile): Check for (? and (* too. Set reflags instead of
|
|---|
| 2373 | dying when problematic operators are found.
|
|---|
| 2374 | (Pexecute): Use reflags to decide whether searches should
|
|---|
| 2375 | be multiline.
|
|---|
| 2376 | * tests/pcre: Test new behavior.
|
|---|
| 2377 |
|
|---|
| 2378 | 2016-11-14 Jim Meyering <meyering@fb.com>
|
|---|
| 2379 |
|
|---|
| 2380 | tests: use "returns_" rather than explicit comparison with "$?"
|
|---|
| 2381 | * tests/sjis-mb (encode): Rearrange to emit desired input into
|
|---|
| 2382 | a file, rather than piping directly into grep. That permits
|
|---|
| 2383 | the use of returns_ 1 to verify timeout's exit status.
|
|---|
| 2384 | * tests/euc-mb: Use "returns_ 1" rather than testing $? = 1.
|
|---|
| 2385 | * tests/char-class-multibyte: Likewise.
|
|---|
| 2386 | * tests/dfa-heap-overrun: Likewise.
|
|---|
| 2387 | * tests/encoding-error: Likewise.
|
|---|
| 2388 | * tests/fedora: Likewise.
|
|---|
| 2389 | * tests/grep-dev-null: Likewise.
|
|---|
| 2390 | * tests/init.cfg (envvar_check_fail): Likewise.
|
|---|
| 2391 | * tests/kwset-abuse: Likewise.
|
|---|
| 2392 | * tests/mb-non-UTF8-overrun: Likewise.
|
|---|
| 2393 | * tests/multibyte-white-space: Likewise.
|
|---|
| 2394 | * tests/pcre-infloop: Likewise.
|
|---|
| 2395 | * tests/surrogate-pair: Likewise.
|
|---|
| 2396 | * tests/warn-char-classes: Likewise.
|
|---|
| 2397 | Do the same for other values:
|
|---|
| 2398 | * tests/backref-multibyte-slow: Likewise.
|
|---|
| 2399 | * tests/euc-mb: Likewise.
|
|---|
| 2400 | * tests/pcre-abort: Likewise.
|
|---|
| 2401 | * tests/pcre-jitstack: Likewise.
|
|---|
| 2402 | * tests/repetition-overflow: Likewise.
|
|---|
| 2403 | * tests/reversed-range-endpoints: Likewise.
|
|---|
| 2404 | * tests/warn-char-classes: Likewise.
|
|---|
| 2405 |
|
|---|
| 2406 | 2016-10-26 Jim Meyering <meyering@fb.com>
|
|---|
| 2407 |
|
|---|
| 2408 | doc: grep builds on HP-UX once again
|
|---|
| 2409 | * NEWS (Bug fixes): Mention the HP-UX fix.
|
|---|
| 2410 |
|
|---|
| 2411 | gnulib: update to latest, for getprogname HPUX port
|
|---|
| 2412 |
|
|---|
| 2413 | 2016-10-22 Mark Veltzer <mark.veltzer@gmail.com>
|
|---|
| 2414 |
|
|---|
| 2415 | ignore coverage generated files
|
|---|
| 2416 |
|
|---|
| 2417 | ignore ar-lib in build-aux
|
|---|
| 2418 |
|
|---|
| 2419 | 2016-10-20 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 2420 |
|
|---|
| 2421 | grep: use 'j' intmax_t printf length modifier if supported
|
|---|
| 2422 | * configure.ac: Use gl_PRINTF_SIZES_C99 to test printf and
|
|---|
| 2423 | (conditionally) define HAVE_PRINTF_C99_SIZES.
|
|---|
| 2424 | * src/grep.c (print_offset): Use printf("%j...") for printing
|
|---|
| 2425 | [u]intmax_t if HAVE_PRINTF_C99_SIZES is defined; otherwise continue
|
|---|
| 2426 | using the existing hand-rolled loop.
|
|---|
| 2427 |
|
|---|
| 2428 | 2016-10-15 Jim Meyering <meyering@fb.com>
|
|---|
| 2429 |
|
|---|
| 2430 | build: distribute new file, die.h, so "make distcheck" passes
|
|---|
| 2431 | * src/Makefile.am (grep_SOURCES): Add die.h.
|
|---|
| 2432 | Also, sort these file names.
|
|---|
| 2433 |
|
|---|
| 2434 | 2016-10-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2435 |
|
|---|
| 2436 | build: update gnulib submodule to latest
|
|---|
| 2437 |
|
|---|
| 2438 | 2016-10-09 Jim Meyering <meyering@fb.com>
|
|---|
| 2439 |
|
|---|
| 2440 | maint: die.h: add the "#define ..." part of double inclusion guard
|
|---|
| 2441 | * src/die.h (DIE_H): Define to 1.
|
|---|
| 2442 |
|
|---|
| 2443 | 2016-10-04 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2444 |
|
|---|
| 2445 | grep: don't assume stdbool.h before die call
|
|---|
| 2446 | * src/die.h: Include stdbool.h, since 'die' uses 'false'.
|
|---|
| 2447 |
|
|---|
| 2448 | grep: die more systematically
|
|---|
| 2449 | * src/die.h: New file.
|
|---|
| 2450 | * src/dfasearch.c, src/grep.c, src/pcresearch.c: Include die.h.
|
|---|
| 2451 | * src/dfasearch.c (dfaerror):
|
|---|
| 2452 | * src/grep.c (context_length_arg, add_count, prline, setmatcher, main):
|
|---|
| 2453 | * src/pcresearch.c (jit_exec, Pcompile, Pexecute):
|
|---|
| 2454 | Use 'die' instead of 'error' when exiting.
|
|---|
| 2455 | * src/pcresearch.c: Do not include verify.h.
|
|---|
| 2456 | (die): Remove; now in die.h.
|
|---|
| 2457 | * src/search.h: Do not include error.h here, since this file does
|
|---|
| 2458 | not use anything defined in error.h. Instead, dfasearch.c, which
|
|---|
| 2459 | uses error.h's symbols, now includes error.h directly.
|
|---|
| 2460 |
|
|---|
| 2461 | 2016-10-02 Jim Meyering <meyering@fb.com>
|
|---|
| 2462 |
|
|---|
| 2463 | maint: post-release administrivia
|
|---|
| 2464 | * NEWS: Add header line for next release.
|
|---|
| 2465 | * .prev-version: Record previous version.
|
|---|
| 2466 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 2467 |
|
|---|
| 2468 | version 2.26
|
|---|
| 2469 | * NEWS: Record release date.
|
|---|
| 2470 |
|
|---|
| 2471 | 2016-10-01 Jim Meyering <meyering@fb.com>
|
|---|
| 2472 |
|
|---|
| 2473 | gnulib: update to latest; for getprogname fix
|
|---|
| 2474 |
|
|---|
| 2475 | 2016-10-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2476 |
|
|---|
| 2477 | tests/grep-dir: port to Solaris 10
|
|---|
| 2478 | * tests/grep-dir: Port to Solaris 10 'cat', which
|
|---|
| 2479 | exits with status 0 even after 'read' fails from a directory.
|
|---|
| 2480 |
|
|---|
| 2481 | 2016-09-28 Jim Meyering <meyering@fb.com>
|
|---|
| 2482 |
|
|---|
| 2483 | build: placate GCC 7's -Wimplicit-fallthrough
|
|---|
| 2484 | * src/pcresearch.c (die): New macro.
|
|---|
| 2485 | (Pexecute): Use it in place of offending uses of error,
|
|---|
| 2486 | to placate GCC 7's -Wimplicit-fallthrough.
|
|---|
| 2487 | Include verify.h. Since this is grep's first explicit use of this
|
|---|
| 2488 | gnulib module, ...
|
|---|
| 2489 | * bootstrap.conf (gnulib_modules): Add verify.
|
|---|
| 2490 |
|
|---|
| 2491 | gnulib: update to latest; for ...
|
|---|
| 2492 | This includes the following:
|
|---|
| 2493 | - a getprogname-vs-openbsd-5.1 portability fix
|
|---|
| 2494 | - "fallthru" comment-adding changes for dfa and unistr/u8-uctomb-aux.c
|
|---|
| 2495 | - another getprogname fix to avoid breaking newer glibc
|
|---|
| 2496 |
|
|---|
| 2497 | 2016-09-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2498 |
|
|---|
| 2499 | build: reword .git old-GCC warning
|
|---|
| 2500 | * configure.ac (gl_gcc_warnings): Reword diagnostic.
|
|---|
| 2501 | Suggested by Assaf Gordon in:
|
|---|
| 2502 | http://lists.gnu.org/archive/html/grep-devel/2016-09/msg00024.html
|
|---|
| 2503 |
|
|---|
| 2504 | build: port .git builds to newer GCC
|
|---|
| 2505 | * configure.ac (gl_gcc_warnings): Omit duplicate copy of 'main'.
|
|---|
| 2506 | Problem reported by Assaf Gordon in:
|
|---|
| 2507 | http://lists.gnu.org/archive/html/grep-devel/2016-09/msg00024.html
|
|---|
| 2508 |
|
|---|
| 2509 | build: port .git builds to older GCC
|
|---|
| 2510 | Problem reported by Dagobert Michelsen in:
|
|---|
| 2511 | http://lists.gnu.org/archive/html/grep-devel/2016-09/msg00018.html
|
|---|
| 2512 | * configure.ac (gl_gcc_warnings): Default to false if .git
|
|---|
| 2513 | exists but GCC is too old.
|
|---|
| 2514 |
|
|---|
| 2515 | 2016-09-27 Jim Meyering <meyering@fb.com>
|
|---|
| 2516 |
|
|---|
| 2517 | tests/long-pattern-perf: avoid false-failure due to cache speed
|
|---|
| 2518 | * tests/long-pattern-perf: This test would fail semi-consistently
|
|---|
| 2519 | on some systems, probably because the smaller regexp fit well
|
|---|
| 2520 | within cache, yet the larger one did not. In that case, there
|
|---|
| 2521 | was a relative speed difference greater than 20x and the test
|
|---|
| 2522 | would fail. Quadruple the sizes, to make that less likely.
|
|---|
| 2523 | Also, construct the 10x larger regexp directly from the smaller,
|
|---|
| 2524 | rather than relying on seq with endpoints to induce that
|
|---|
| 2525 | approximate size ratio. Reported by Bruce Dubbs in
|
|---|
| 2526 | https://lists.gnu.org/archive/html/grep-devel/2016-09/msg00013.html
|
|---|
| 2527 |
|
|---|
| 2528 | 2016-09-24 Jim Meyering <meyering@fb.com>
|
|---|
| 2529 |
|
|---|
| 2530 | build: avoid "./configure && make dist" missing-dep. failure
|
|---|
| 2531 | * Makefile.am (run-syntax-check): Depend on "all", to avoid a
|
|---|
| 2532 | parallel build failure due to a missing dependency. Reported by
|
|---|
| 2533 | Paul Eggert in https://bugs.gnu.org/24256#50
|
|---|
| 2534 |
|
|---|
| 2535 | 2016-09-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2536 |
|
|---|
| 2537 | build: update gnulib submodule to latest
|
|---|
| 2538 |
|
|---|
| 2539 | 2016-09-24 Jim Meyering <meyering@fb.com>
|
|---|
| 2540 |
|
|---|
| 2541 | tests/fmbtest: avoid false-failure due to reliance on MB-correct sed
|
|---|
| 2542 | * tests/fmbtest: Several of these tests would mistakenly fail due to
|
|---|
| 2543 | postprocessing with a combination of sed and locale support that failed
|
|---|
| 2544 | to handle some multibyte characters in the cs_CZ.UTF-8 locale. Instead
|
|---|
| 2545 | of relying on sed's multibyte support or anything locale-related to
|
|---|
| 2546 | perform this simple filtering, just use this: tr -cs '0-9' '[ *]'
|
|---|
| 2547 | Also, rather than exporting LC_ALL, just set it for each command.
|
|---|
| 2548 | Reported by Nelson H. F. Beebe.
|
|---|
| 2549 | https://bugs.gnu.org/24534
|
|---|
| 2550 |
|
|---|
| 2551 | tests: revamp multibyte-white-space test to be more permissive
|
|---|
| 2552 | This test elicits too many failures. Whether a system has accurate
|
|---|
| 2553 | Unicode "whitespace" attributes should not influence whether grep's
|
|---|
| 2554 | test suite passes. In many cases, now you will see a warning that
|
|---|
| 2555 | some multibyte characters do not pass whitespace-related tests, but
|
|---|
| 2556 | this test no longer fails. However, if you run this test on a modern
|
|---|
| 2557 | enough system, it does require that \s and \S do work properly with
|
|---|
| 2558 | most of the listed characters.
|
|---|
| 2559 | * tests/multibyte-white-space: Confirm that Fedora 24's locale
|
|---|
| 2560 | tables still declare those four Unicode code points *not* whitespace.
|
|---|
| 2561 | Honor a new column telling how to handle failure. Provide more
|
|---|
| 2562 | information in each diagnostic.
|
|---|
| 2563 | Reported by Nelson H. F. Beebe.
|
|---|
| 2564 | https://bugs.gnu.org/24530
|
|---|
| 2565 |
|
|---|
| 2566 | tests: avoid erroneous failure of pcre-jitstack test
|
|---|
| 2567 | On some systems (*BSD), 'ulimit -s unlimited' would fail, yet the
|
|---|
| 2568 | test for that mistakenly masked the failure, so the following grep
|
|---|
| 2569 | command ended up failing with a segfault.
|
|---|
| 2570 | * tests/pcre-jitstack: Don't mask the ulimit failure.
|
|---|
| 2571 | Reported privately by Nelson H. F. Beebe.
|
|---|
| 2572 | https://bugs.gnu.org/24524
|
|---|
| 2573 |
|
|---|
| 2574 | 2016-09-23 Jim Meyering <meyering@fb.com>
|
|---|
| 2575 |
|
|---|
| 2576 | grep: avoid unwarranted "input file 'F' is also the output" on *BSD
|
|---|
| 2577 | On *BSD systems, any command like "echo y | grep x", where grep reads
|
|---|
| 2578 | from a pipe and writes to standard output, would mistakenly emit this:
|
|---|
| 2579 | grep: input file '(standard input)' is also the output
|
|---|
| 2580 | * src/grep.c (grepdesc): Ensure that the file descriptor we're
|
|---|
| 2581 | reading is a regular one before using SAME_INODE to test whether
|
|---|
| 2582 | it is the same as the descriptor open on standard output.
|
|---|
| 2583 | Nelson Beebe reported privately that the foad1 tests failed on many
|
|---|
| 2584 | BSD systems. Exposed by commit v2.25-2-gaf6af28.
|
|---|
| 2585 | https://bugs.gnu.org/24522
|
|---|
| 2586 |
|
|---|
| 2587 | tests: avoid backref-multibyte-slow false failure
|
|---|
| 2588 | * tests/backref-multibyte-slow (max_seconds): If we calculate
|
|---|
| 2589 | a max duration of 1 second, use 5. Otherwise, on high-latency
|
|---|
| 2590 | systems, it would be way too easy for the duration of the final
|
|---|
| 2591 | test run to exceed that limit. Reported by Nelson H. F. Beebe.
|
|---|
| 2592 | http://bugs.gnu.org/24516
|
|---|
| 2593 |
|
|---|
| 2594 | 2016-09-22 Jim Meyering <meyering@fb.com>
|
|---|
| 2595 |
|
|---|
| 2596 | gnulib: update to latest; for getprogname-vs-AIX fix
|
|---|
| 2597 |
|
|---|
| 2598 | 2016-09-18 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2599 |
|
|---|
| 2600 | grep: add news entry for fix to bug#24233
|
|---|
| 2601 | * NEWS (Bug fixes): Add an entry describing bug#24233.
|
|---|
| 2602 | The bug was fixed by commit v2.25-77-gad468bb, by chance.
|
|---|
| 2603 |
|
|---|
| 2604 | 2016-09-15 Jim Meyering <meyering@fb.com>
|
|---|
| 2605 |
|
|---|
| 2606 | gnulib: update to latest
|
|---|
| 2607 |
|
|---|
| 2608 | 2016-09-10 Jim Meyering <meyering@fb.com>
|
|---|
| 2609 |
|
|---|
| 2610 | dfa: reflect move of grep's DFA code to gnulib
|
|---|
| 2611 | Now that the core DFA code and tests reside in gnulib,
|
|---|
| 2612 | remove the copies here and use what gnulib provides.
|
|---|
| 2613 | * bootstrap.conf: Use the dfa module.
|
|---|
| 2614 | * cfg.mk: Remove settings involving files that have moved.
|
|---|
| 2615 | (_gl_TS_unmarked_extern_functions): Add dfaerror and dfawarn.
|
|---|
| 2616 | It is wrong/ugly to have to define these global symbols to use
|
|---|
| 2617 | the dfa module, but we'll adjust that separately.
|
|---|
| 2618 | * po/POTFILES.in: Apply s/src/lib/ to src/dfa.c.
|
|---|
| 2619 | * src/Makefile.am: Remove mention of dfa.[ch] and localeinfo.[ch].
|
|---|
| 2620 | * tests/Makefile.am: Remove mention of the tests that we have
|
|---|
| 2621 | moved to the gnulib module.
|
|---|
| 2622 | * src/dfa.c: Remove file.
|
|---|
| 2623 | * src/dfa.h: Likewise.
|
|---|
| 2624 | * src/localeinfo.c: Likewise.
|
|---|
| 2625 | * src/localeinfo.h: Likewise.
|
|---|
| 2626 | * tests/dfa-match: Likewise.
|
|---|
| 2627 | * tests/dfa-match-aux.c: Likewise.
|
|---|
| 2628 | * tests/invalid-char-class: Likewise.
|
|---|
| 2629 |
|
|---|
| 2630 | gnulib: update to latest, for new dfa module
|
|---|
| 2631 |
|
|---|
| 2632 | 2016-09-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2633 |
|
|---|
| 2634 | grep: encoding errors suppress just their line
|
|---|
| 2635 | From a suggestion by Marcello Perathoner (Bug#22838).
|
|---|
| 2636 | * NEWS, doc/grep.texi (File and Directory Selection): Document this.
|
|---|
| 2637 | * src/grep.c (print_line_head): Do not suppress later output lines
|
|---|
| 2638 | merely because an earlier output line would have had an encoding error.
|
|---|
| 2639 | * tests/encoding-error: Test for the new behavior.
|
|---|
| 2640 |
|
|---|
| 2641 | 2016-09-08 Jim Meyering <meyering@fb.com>
|
|---|
| 2642 |
|
|---|
| 2643 | gnulib: update to latest, for getprogname fixes
|
|---|
| 2644 |
|
|---|
| 2645 | 2016-09-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2646 |
|
|---|
| 2647 | dfa: additional change new option for anchored searches
|
|---|
| 2648 | * src/dfa.c (dfaexec_main): Do it.
|
|---|
| 2649 |
|
|---|
| 2650 | 2016-09-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2651 |
|
|---|
| 2652 | doc: define "context lines"
|
|---|
| 2653 | Reported by Igor Bogomazov via Santiago Ruano Rincón (Bug#24024).
|
|---|
| 2654 | * doc/grep.texi (Context Line Control): Define "context lines".
|
|---|
| 2655 |
|
|---|
| 2656 | build: update gnulib submodule to latest
|
|---|
| 2657 |
|
|---|
| 2658 | 2016-09-05 Jim Meyering <meyering@fb.com>
|
|---|
| 2659 |
|
|---|
| 2660 | maint: switch from gnulib's progname to getprogname module
|
|---|
| 2661 | * gnulib: Update to latest, for its new getprogname module.
|
|---|
| 2662 | * bootstrap.conf (avoided_gnulib_modules): Include the getprogname
|
|---|
| 2663 | module rather than the now-obsolescent progname.
|
|---|
| 2664 | * src/grep.c: Include "getprogname.h" rather than "progname.h"
|
|---|
| 2665 | and remove any use of set_program_name.
|
|---|
| 2666 | * tests/dfa-match-aux.c (main): Likewise.
|
|---|
| 2667 | * tests/get-mb-cur-max.c (main): Likewise.
|
|---|
| 2668 | * src/grep.c (usage, main): Use getprogname() in place of program_name.
|
|---|
| 2669 |
|
|---|
| 2670 | 2016-09-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2671 |
|
|---|
| 2672 | dfa: minor cleanup of previous change
|
|---|
| 2673 | * src/dfa.c (dfaexec_main): Omit redundant code and reindent.
|
|---|
| 2674 |
|
|---|
| 2675 | 2016-09-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2676 |
|
|---|
| 2677 | dfa: additional change new option for anchored searches
|
|---|
| 2678 | * src/dfa.c (dfaexec_main): Do it.
|
|---|
| 2679 |
|
|---|
| 2680 | dfa: use single-byte algorithm even in non-UTF-8
|
|---|
| 2681 | * src/dfa.c (dfaexec_main): Do it. (This was inadvertently
|
|---|
| 2682 | omitted in a recent patch.)
|
|---|
| 2683 |
|
|---|
| 2684 | 2016-09-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2685 |
|
|---|
| 2686 | dfa: merge xalloc.h changes from Gawk
|
|---|
| 2687 | * src/dfa.h (_GL_ATTRIBUTE_MALLOC): Define here, as other
|
|---|
| 2688 | Gnulib .h files do. This is more consistent with Gawk.
|
|---|
| 2689 | * src/dfa.c: Include xalloc.h, since dfa.h no longer does so.
|
|---|
| 2690 | Include localeinfo.h later; we don't care about order, but Gawk does.
|
|---|
| 2691 |
|
|---|
| 2692 | 2016-09-02 Arnold Robbins <arnold@skeeve.com>
|
|---|
| 2693 |
|
|---|
| 2694 | dfa: port to C90
|
|---|
| 2695 | * src/dfa.c (dfamust): Avoid declarations after statement (Bug#21486).
|
|---|
| 2696 |
|
|---|
| 2697 | 2016-09-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2698 |
|
|---|
| 2699 | dfa: new option for anchored searches
|
|---|
| 2700 | This follows up on a suggestion by Norihiro Tanaka (Bug#24262).
|
|---|
| 2701 | * src/dfa.c (struct regex_syntax): New member 'anchor'.
|
|---|
| 2702 | (char_context): Use it.
|
|---|
| 2703 | (dfasyntax): Change signature to specify it, along with the old
|
|---|
| 2704 | FOLD and EOL args, as a single DFAOPTS arg. All uses changed.
|
|---|
| 2705 | * src/dfa.h (DFA_ANCHOR, DFA_CASE_FOLD, DFA_EOL_NUL): New constants
|
|---|
| 2706 | for dfasyntax new last arg.
|
|---|
| 2707 |
|
|---|
| 2708 | 2016-09-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2709 |
|
|---|
| 2710 | dfa: simplify and optimize at initial state in execution
|
|---|
| 2711 | * src/dfa.c (skip_remains_mb): Remove argument *pwc. Update caller.
|
|---|
| 2712 | (dfaexec_main): Simplify and optimize at initial state (Bug#24261).
|
|---|
| 2713 |
|
|---|
| 2714 | dfa: simplify finding the state index for state 0
|
|---|
| 2715 | * src/dfa.c (dfastate): Simplify finding the state index for state 0.
|
|---|
| 2716 |
|
|---|
| 2717 | 2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2718 |
|
|---|
| 2719 | tests: add a new test for SJIS locale
|
|---|
| 2720 | * tests/sjis-mb: Add a new test. It fails in grep-2.25 or prior.
|
|---|
| 2721 |
|
|---|
| 2722 | 2016-09-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2723 |
|
|---|
| 2724 | grep: update NEWS
|
|---|
| 2725 | * NEWS: Describe previous change.
|
|---|
| 2726 |
|
|---|
| 2727 | grep: use regex fastmap unless -i
|
|---|
| 2728 | This builds on a suggestion by Norihiro Tanaka (Bug#24009).
|
|---|
| 2729 | * src/dfasearch.c (GEAcompile): Use a fastmap unless -i.
|
|---|
| 2730 | This improves performance 20x for me using the first benchmark
|
|---|
| 2731 | given in Bug#24009.
|
|---|
| 2732 |
|
|---|
| 2733 | grep: improve dfasearch storage management
|
|---|
| 2734 | This patch is mostly refactoring, with a bit of performance tweaking.
|
|---|
| 2735 | It is done in preparation for a fix for Bug#24009.
|
|---|
| 2736 | * src/dfasearch.c (patterns): Now of type struct re_pattern_buffer *
|
|---|
| 2737 | instead of an anonymous struct pointer, since there is no longer
|
|---|
| 2738 | any need to keep regs here. All uses changed.
|
|---|
| 2739 | (GEAcompile): Use patlim instead of a hard-to-follow "total".
|
|---|
| 2740 | Use x2nrealloc to avoid potential O(N**2) reallocation algorithm.
|
|---|
| 2741 | Initialize just the pattern members that need clearing.
|
|---|
| 2742 | (EGexecute): Put regs into a static variable, as this code did
|
|---|
| 2743 | before 2001-02-18, as there is no need to have a separate set of
|
|---|
| 2744 | regs for each pattern. Explain the "Q@#%!#" comment better.
|
|---|
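
The GEAcompile note above about avoiding a potential O(N**2) reallocation pattern can be illustrated with a minimal sketch. It assumes plain realloc with capacity doubling; it is not gnulib's x2nrealloc, and the names int_vec and int_vec_push are hypothetical. Growing by one slot per append copies the array on the order of N times, whereas doubling keeps the total copying linear in N.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of ints (not a grep or gnulib type).  */
struct int_vec
{
  int *data;     /* allocated storage, or NULL when alloc == 0 */
  size_t used;   /* number of elements stored */
  size_t alloc;  /* number of elements allocated */
};

/* Append VALUE to V.  Return 0 on success, -1 on allocation failure
   or size overflow.  Capacity doubles when full, so N appends cause
   only O(log N) reallocations.  */
int
int_vec_push (struct int_vec *v, int value)
{
  assert (v->used <= v->alloc);
  if (v->used == v->alloc)
    {
      size_t new_alloc = 4;  /* arbitrary initial capacity */
      if (v->alloc)
        {
          /* Refuse to grow if the doubled byte count would overflow.  */
          if (v->alloc > SIZE_MAX / 2 / sizeof *v->data)
            return -1;
          new_alloc = v->alloc * 2;
        }
      int *p = realloc (v->data, new_alloc * sizeof *p);
      if (!p)
        return -1;
      v->data = p;
      v->alloc = new_alloc;
    }
  v->data[v->used++] = value;
  return 0;
}
```

A caller would start from a zero-initialized struct int_vec, call int_vec_push repeatedly, and free the data member when done.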
| 2745 |
|
|---|
| 2746 | 2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2747 |
|
|---|
| 2748 | dfa: remove separation by context in transition in non-UTF8 multibyte locales
|
|---|
| 2749 | * src/dfa.c (struct dfa): Remove member curr_dependent. All uses
|
|---|
| 2750 | removed.
|
|---|
| 2751 |
|
|---|
| 2752 | 2016-09-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2753 |
|
|---|
| 2754 | dfa: document previous change
|
|---|
| 2755 | * NEWS: Adjust to match previous change.
|
|---|
| 2756 |
|
|---|
| 2757 | 2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2758 |
|
|---|
| 2759 | dfa: avoid invalid character matching period
|
|---|
| 2760 | * src/dfa.c (transit_state): Avoid invalid character matching period.
|
|---|
| 2761 |
|
|---|
| 2762 | dfa: use single-byte algorithm even in non-UTF-8
|
|---|
| 2763 | Even in non-UTF8 locales, if the current input character
|
|---|
| 2764 | is single byte, we can use CSET to match ANYCHAR.
|
|---|
| 2765 | * src/dfa.c (struct dfa): New member canychar.
|
|---|
| 2766 | Cache index of CSET for ANYCHAR.
|
|---|
| 2767 | (lex): Make CSET for ANYCHAR.
|
|---|
| 2768 | (state_index): Simplify.
|
|---|
| 2769 | (dfastate): Consider CSET for ANYCHAR.
|
|---|
| 2770 | (transit_state_singlebyte, transit_state): Remove handling for eolbyte,
|
|---|
| 2771 | as we assume that eolbyte does not appear at current position.
|
|---|
| 2772 | (dfaexec_main): Always use the single-byte algorithm for any
|
|---|
| 2773 | single-byte character in the input text.
|
|---|
| 2774 | (dfasyntax): Initialize canychar.
|
|---|
| 2775 |
|
|---|
| 2776 | 2016-09-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2777 |
|
|---|
| 2778 | grep: avoid code duplication with -iF
|
|---|
| 2779 | This follows up on the -iF performance improvement (Bug#23752).
|
|---|
| 2780 | * NEWS: Simplify description of -iF improvement.
|
|---|
| 2781 | * src/dfa.c: Do not include wctype.h.
|
|---|
| 2782 | (lonesome_lower, case_folded_counterparts): Move to localeinfo.c.
|
|---|
| 2783 | (CASE_FOLDED_BUFSIZE): Move to localeinfo.h.
|
|---|
| 2784 | * src/grep.c: Do not include wctype.h.
|
|---|
| 2785 | (lonesome_lower): Remove.
|
|---|
| 2786 | (fgrep_icase_available): Use case_folded_counterparts instead.
|
|---|
| 2787 | Do not call it for the same character twice.
|
|---|
| 2788 | Return false on wcrtomb failures (which should never happen).
|
|---|
| 2789 | (fgrep_to_grep_pattern, main): Simplify. Let fgrep_to_grep’s
|
|---|
| 2790 | caller fiddle with the global variables.
|
|---|
| 2791 | * src/localeinfo.c: Include <wctype.h>.
|
|---|
| 2792 | (lonesome_lower, case_folded_counterparts):
|
|---|
| 2793 | Move here from src/dfa.c. Return int, not unsigned int.
|
|---|
| 2794 | Verify that CASE_FOLDED_BUFSIZE is big enough.
|
|---|
| 2795 | * src/localeinfo.h (CASE_FOLDED_BUFSIZE): Now 32, so that
|
|---|
| 2796 | we don’t expose lonesome_lower’s size.
|
|---|
| 2797 | * src/searchutils.c (kwsinit): Return new kwset instead of
|
|---|
| 2798 | storing it via a pointer. All callers changed. Simplify a bit.
|
|---|
| 2799 |
|
|---|
| 2800 | 2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 2801 |
|
|---|
| 2802 | grep: speed up -iF in multibyte locales
|
|---|
| 2803 | In a multibyte locale, if a pattern is composed of only single byte
|
|---|
| 2804 | characters and all their counterparts are also single byte characters
|
|---|
| 2805 | and the pattern does not have invalid sequences, grep -iF uses the
|
|---|
| 2806 | fgrep matcher, the same as in a single byte locale (Bug#23752).
|
|---|
| 2807 | * NEWS: Mention it.
|
|---|
| 2808 | * src/grep.c (lonesome_lower): New constant.
|
|---|
| 2809 | (fgrep_icase_available): New function.
|
|---|
| 2810 | (fgrep_to_grep_pattern): Simplify it.
|
|---|
| 2811 | (main): Use them.
|
|---|
| 2812 | * src/searchutils.c (kwsinit): New arg MB_TRANS; all uses changed.
|
|---|
| 2813 | Try the fgrep matcher for case-insensitive matching by grep -F in a
|
|---|
| 2814 | multibyte locale.
|
|---|
| 2815 |
|
|---|
| 2816 | 2016-08-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2817 |
|
|---|
| 2818 | build: update gnulib submodule to latest
|
|---|
| 2819 |
|
|---|
| 2820 | 2016-08-31 Jim Meyering <meyering@fb.com>
|
|---|
| 2821 |
|
|---|
| 2822 | maint: avoid new 'make syntax-check' failure
|
|---|
| 2823 | * src/dfa.c (using_simple_locale): Prefer STREQ(a,b) over
|
|---|
| 2824 | strcmp(a,b) == 0.
|
|---|
| 2825 |
|
|---|
| 2826 | gnulib: update to latest
|
|---|
| 2827 |
|
|---|
| 2828 | 2016-08-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2829 |
|
|---|
| 2830 | dfa: make dfa.c fully thread-safe
|
|---|
| 2831 | This follows up on Zev Weiss’s recent patches to make the DFA code
|
|---|
| 2832 | thread-safe (Bug#24249). It removes the remaining static
|
|---|
| 2833 | variables used by dfa.c. These variables are locale-dependent, so
|
|---|
| 2834 | they would cause problems in multithreaded code where different
|
|---|
| 2835 | threads are in different locales (e.g., via uselocale). I
|
|---|
| 2836 | abstracted most of the variables into a new localeinfo module.
|
|---|
| 2837 | * src/Makefile.am (grep_SOURCES): Add localeinfo.c.
|
|---|
| 2838 | (noinst_HEADERS): Add localeinfo.h.
|
|---|
| 2839 | * src/dfa.c: Include localeinfo.h.
|
|---|
| 2840 | (struct dfa): Remove multibyte member, as it is now part of
|
|---|
| 2841 | localeinfo. New members simple_locale and localeinfo.
|
|---|
| 2842 | Put locale-related members at the end.
|
|---|
| 2843 | (mbrtowc_cache): Remove; now part of dfa->localeinfo.
|
|---|
| 2844 | (charclass_index): Rename back from dfa_charclass_index,
|
|---|
| 2845 | since it's private.
|
|---|
| 2846 | (unibyte_word_constituent): New arg DFA; use its sbctowc member.
|
|---|
| 2847 | (using_utf8, dfa_using_utf8, init_mbrtowc_cache, check_utf8):
|
|---|
| 2848 | Remove; now done by localeinfo members. All uses changed.
|
|---|
| 2849 | (dfasyntax): New localeinfo arg. Move to end to avoid forward decls.
|
|---|
| 2850 | Initialize the entire DFA.
|
|---|
| 2851 | (unibyte_c, check_unibyte_c): Remove; now in simple_locale member.
|
|---|
| 2852 | (using_simple_locale): Now takes bool instead of DFA.
|
|---|
| 2853 | Do the locale check here, rather than in the caller,
|
|---|
| 2854 | as the result is now cached in dfa->simple_locale.
|
|---|
| 2855 | (dfaalloc): Just allocate the DFA. dfasyntax now initializes it.
|
|---|
| 2856 | * src/dfa.h: Add forward decl of struct localeinfo.
|
|---|
| 2857 | Adjust to new dfa.c API.
|
|---|
| 2858 | * src/dfasearch.c (localeinfo): New var, replacing former static
|
|---|
| 2859 | vars like mbrtowc_cache.
|
|---|
| 2860 | * src/localeinfo.c, src/localeinfo.h: New files.
|
|---|
| 2861 | * src/search.h: Include localeinfo.h.
|
|---|
| 2862 | (localeinfo): New decl.
|
|---|
| 2863 | * src/searchutils.c (mbclen_cache, build_mbclen_cache):
|
|---|
| 2864 | Remove. All uses changed to localeinfo.
|
|---|
| 2865 | * tests/Makefile.am (dfa_match_aux_LDADD): Add localeinfo.o.
|
|---|
| 2866 | * tests/dfa-match-aux.c: Include localeinfo.h.
|
|---|
| 2867 | (main): Adjust to changes in DFA API.
|
|---|
| 2868 |
|
|---|
| 2869 | 2016-08-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2870 |
|
|---|
| 2871 | build: update gnulib submodule to latest
|
|---|
| 2872 | This should fix Bug#24323 reported by Dennis Clarke, where grep
|
|---|
| 2873 | does not build on Solaris 10 when compiled with Solaris Studio 12.4.
|
|---|
| 2874 |
|
|---|
| 2875 | 2016-08-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2876 |
|
|---|
| 2877 | dfa: minor thread-safety cleanups
|
|---|
| 2878 | * src/dfa.c (struct lexer_state): Rename lexptr to ptr and lexleft
|
|---|
| 2879 | to left, for brevity. All uses changed.
|
|---|
| 2880 | (struct dfa): Rename lexstate to lex and parsestate to parse,
|
|---|
| 2881 | for brevity. All uses changed.
|
|---|
| 2882 | (using_simple_locale): Simplify boolean expression.
|
|---|
| 2883 | (FETCH_WC): Parenthesize uses of dfa macro arg.
|
|---|
| 2884 | (FETCH_WC, parse_bracket_exp, addtok_mb): Prefer suffix operators
|
|---|
| 2885 | on structure members when possible, for clarity.
|
|---|
| 2886 | (parse_bracket_exp): Check for buffer exhaustion before
|
|---|
| 2887 | dereferencing buffer pointer.
|
|---|
| 2888 | (struct lexptr): New type.
|
|---|
| 2889 | (push_lex_state, pop_lex_state): Use it. Change from macros
|
|---|
| 2890 | PUSH_LEX_STATE and POP_LEX_STATE to static functions, and add
|
|---|
| 2891 | parameters to make them proper C functions. All uses changed.
|
|---|
| 2892 | (lex): Simplify tests for \) and \|. Avoid some string
|
|---|
| 2893 | duplication by using &"^..."[boolean].
|
|---|
| 2894 | (dfaalloc): Use xzalloc, not xcalloc with 1.
|
|---|
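
The &"^..."[boolean] idiom mentioned for lex above can be sketched as follows. This is an illustration only: the literal "^xyz" and the function name literal_tail are hypothetical, not the strings or names used in dfa.c. Indexing a string literal by 0 or 1 and taking the address yields either the whole literal or the literal minus its first character, so a single literal serves both cases.

```c
#include <assert.h>

/* Return a pointer into one string literal: the whole literal "^xyz"
   when SKIP_FIRST is 0, or "xyz" (the same storage, one byte further
   in) when SKIP_FIRST is 1.  */
const char *
literal_tail (int skip_first)
{
  /* Any other index would point at the wrong suffix.  */
  assert (skip_first == 0 || skip_first == 1);
  return &"^xyz"[skip_first];
}
```

Both results point into the same array, which is how the idiom avoids spelling out two nearly identical strings.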
| 2895 |
|
|---|
| 2896 | 2016-08-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2897 |
|
|---|
| 2898 | grep: minor tweaks of initial buffer alloc
|
|---|
| 2899 | * src/grep.c (main): Allocate input buffer only when about
|
|---|
| 2900 | to do I/O. Avoid int overflow on systems with 2 GiB pages.
|
|---|
| 2901 | Fix size_t overflow check.
|
|---|
| 2902 |
|
|---|
| 2903 | 2016-08-20 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 2904 |
|
|---|
| 2905 | dfa: constify some function parameters
|
|---|
| 2906 | * src/dfa.c (char_context): Mark dfa parameter const.
|
|---|
| 2907 | (charclass_context): Likewise.
|
|---|
| 2908 |
|
|---|
| 2909 | dfa: thread-safety: initialize mbrtowc_cache in dfa_init
|
|---|
| 2910 | * src/dfa.c (dfasyntax): Remove initialization of mbrtowc_cache.
|
|---|
| 2911 | (init_mbrtowc_cache): New function.
|
|---|
| 2912 | (dfa_init): Call it.
|
|---|
| 2913 | http://bugs.gnu.org/24259
|
|---|
| 2914 |
|
|---|
| 2915 | dfa: thread-safety: eliminate static local variables
|
|---|
| 2916 | * src/dfa.c: Replace utf8 and unibyte_c static local variables with
|
|---|
| 2917 | static globals initialized by a new function dfa_init() which must be
|
|---|
| 2918 | called before any other dfa*() functions.
|
|---|
| 2919 | (dfa_using_utf8): Rename using_utf8() to dfa_using_utf8() for
|
|---|
| 2920 | consistency with other exported functions.
|
|---|
| 2921 | * src/dfa.h (dfa_using_utf8): Rename using_utf8() to dfa_using_utf8();
|
|---|
| 2922 | also add _GL_ATTRIBUTE_PURE.
|
|---|
| 2923 | (dfa_init): New function.
|
|---|
| 2924 | * src/grep.c (main), tests/dfa-match-aux.c (main): Call dfa_init().
|
|---|
| 2925 | * src/dfasearch.c (EGexecute): Replace using_utf8 with dfa_using_utf8.
|
|---|
| 2926 | * src/kwsearch.c (Fexecute): Likewise.
|
|---|
| 2927 | * src/pcresearch.c (Pcompile): Likewise.
|
|---|
| 2928 | http://bugs.gnu.org/24259
|
|---|
| 2929 |
|
|---|
| 2930 | dfa: thread-safety: move regex syntax configuration into struct dfa
|
|---|
| 2931 | * src/dfa.c: Move global variables holding regex syntax configuration
|
|---|
| 2932 | into a new struct (`struct regex_syntax') and add an instance of it to
|
|---|
| 2933 | struct dfa. All references to the globals are replaced with
|
|---|
| 2934 | references to the dfa struct's new member. As a side effect, a
|
|---|
| 2935 | `struct dfa' must be allocated with dfaalloc() and passed to
|
|---|
| 2936 | dfasyntax().
|
|---|
| 2937 | * src/dfa.h (dfasyntax): Add new struct dfa* parameter.
|
|---|
| 2938 | * src/dfasearch.c (GEAcompile): Allocate `dfa' earlier and pass it to
|
|---|
| 2939 | dfasyntax().
|
|---|
| 2940 | * tests/dfa-match-aux.c (main): Pass `dfa' to dfasyntax().
|
|---|
| 2941 | http://bugs.gnu.org/24259
|
|---|
| 2942 |
|
|---|
| 2943 | dfa: thread-safety: move parser state into struct dfa
|
|---|
| 2944 | * src/dfa.c: Move global variables holding parser state (`tok' and
|
|---|
| 2945 | `depth') into a new struct (`struct parser_state') and add an instance
|
|---|
| 2946 | of it to struct dfa. All references to the globals are replaced by
|
|---|
| 2947 | references to the dfa struct's new member.
|
|---|
| 2948 | http://bugs.gnu.org/24259
|
|---|
| 2949 |
|
|---|
| 2950 | dfa: thread-safety: move lexer state into struct dfa
|
|---|
| 2951 | * src/dfa.c: Move global variables holding lexer state into a new
|
|---|
| 2952 | struct (`struct lexer_state') and add an instance of this struct to
|
|---|
| 2953 | struct dfa. All references to the globals are replaced with
|
|---|
| 2954 | references to the dfa struct's new member.
|
|---|
| 2955 | http://bugs.gnu.org/24259
|
|---|
| 2956 |
|
|---|
| 2957 | 2016-08-19 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 2958 |
|
|---|
| 2959 | dfa: thread-safety: remove dfa.c's "dfa" global
|
|---|
| 2960 | Remove the global dfa struct. Instead, add a struct dfa pointer
|
|---|
| 2961 | parameter to each function that had been using the global.
|
|---|
| 2962 | * src/dfa.c (dfa): Remove file-scoped global.
|
|---|
| 2963 | (charclass_index): Remove now-unnecessary function.
|
|---|
| 2964 | (using_simple_locale): Add a dfa parameter and update all callers.
|
|---|
| 2965 | (FETCH_WC, parse_bracket_exp, lex, addtok_mb, addtok): Likewise.
|
|---|
| 2966 | (addtok_wc, add_utf8_anychar, atom, nsubtoks, copytoks): Likewise.
|
|---|
| 2967 | (closure, branch, regexp): Likewise.
|
|---|
| 2968 | (dfaparse): No longer set the global.
|
|---|
| 2969 | http://bugs.gnu.org/24260
|
|---|
| 2970 |
|
|---|
| 2971 | 2016-08-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2972 |
|
|---|
| 2973 | grep: tune list_files conversion to enum
|
|---|
| 2974 | * src/grep.c (grepdesc): Use a slightly more-efficient way to test
|
|---|
| 2975 | list_files.
|
|---|
| 2976 |
|
|---|
| 2977 | grep: prefer bitwise to short-circuit when shorter
|
|---|
| 2978 | * src/grep.c (skip_devices, initialize_unibyte_mask, fillbuf, main):
|
|---|
| 2979 | * src/kwsearch.c (Fexecute): Prefer bitwise to short-circuit ops
|
|---|
| 2980 | when they are logically equivalent and the bitwise ops generate
|
|---|
| 2981 | shorter code on GCC 6.1 x86-64.
|
|---|
| 2982 | * src/grep.c (get_nondigit_option, parse_grep_colors):
|
|---|
| 2983 | Use c_isdigit instead of spelling it out with a short-circuit op.
|
|---|
| 2984 |
|
|---|
| 2985 | 2016-08-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 2986 |
|
|---|
| 2987 | dfa: use 64-bit when ulong is at least that wide
|
|---|
| 2988 | * src/dfa.c (charclass_word): Now unsigned long instead of unsigned.
|
|---|
| 2989 | (CHARCLASS_WORD_BITS): Now 64 on 64-bit platforms.
|
|---|
| 2990 | (CHARCLASS_PAIR, CHARCLASS_INIT): New macros.
|
|---|
| 2991 | (CHARCLASS_WORD_MASK): Now a static const, since it no longer
|
|---|
| 2992 | needs to be a macro.
|
|---|
| 2993 | (equal): Open-code rather than calling memcmp.
|
|---|
| 2994 | (add_utf8_anychar): Use CHARCLASS_INIT.
|
|---|
| 2995 |
|
|---|
| 2996 | dfa: avoid uninitialized constants
|
|---|
| 2997 | Some compilers warn about 'static int const x;' on the grounds
|
|---|
| 2998 | that X should have an initializer. Instead of worrying about
|
|---|
| 2999 | this, rewrite to avoid this sort of thing.
|
|---|
| 3000 | * src/dfa.c (emptyset): New function.
|
|---|
| 3001 | (parse_bracket_exp): Use it instead of 'equal' and a zero constant.
|
|---|
| 3002 | * src/dfasearch.c (struct patterns): Remove tag 'patterns'.
|
|---|
| 3003 | (patterns0): Remove zero constant.
|
|---|
| 3004 | (GEAcompile): Use memset instead of the zero constant.
|
|---|
| 3005 |
|
|---|
| 3006 | 2016-08-17 Jim Meyering <meyering@fb.com>
|
|---|
| 3007 |
|
|---|
| 3008 | maint: avoid new "make syntax-check" failure
|
|---|
| 3009 | * src/dfa.c: Adjust comment not to go past column 80.
|
|---|
| 3010 |
|
|---|
| 3011 | tests: pcre-jitstack: avoid false failure without base64 -d support
|
|---|
| 3012 | * tests/pcre-jitstack: Try harder to find a base64 decoder:
|
|---|
| 3013 | try 'base64 -d', 'base64 -D', 'openssl base64 -d' and perl's
|
|---|
| 3014 | MIME::Base64 decode_base64. The old code would fail at least on
|
|---|
| 3015 | OS X, for which base64 expects -D or --decode.
|
|---|
| 3016 | Reported by Jack Howarth in http://bugs.gnu.org/24243.
|
|---|
| 3017 |
|
|---|
| 3018 | 2016-08-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3019 |
|
|---|
| 3020 | dfa: minor refactoring and doc fixes
|
|---|
| 3021 | * NEWS: Improve description of recent change.
|
|---|
| 3022 | * src/dfa.c: Improve commentary. Indent new code (and some
|
|---|
| 3023 | long-existing howlers) more in GNU style.
|
|---|
| 3024 | (dfa_state): Reorder members to make struct smaller on x86.
|
|---|
| 3025 | mb_trindex member is now state_num, not size_t, so that -1 is more
|
|---|
| 3026 | natural; all uses changed.
|
|---|
| 3027 | (struct dfa): Similarly for mb_trcount member.
|
|---|
| 3028 | (state_index): Compute values for new state components before
|
|---|
| 3029 | allocating the state, to make the code easier to understand.
|
|---|
| 3030 | (state_index, dfastate): Prefer A & ~B to other forms like (A & B)
|
|---|
| 3031 | != A.
|
|---|
| 3032 | (dfastate, build_state, transit_state): In new code, prefer i++ to
|
|---|
| 3033 | ++i in for-loop control.
|
|---|
| 3034 | (build_state, transit_state): In new code, prefer < to >.
|
|---|
| 3035 | (transit_state): Add to *PP in one assignment, rather than in a
|
|---|
| 3036 | loop. Prefer !x to x == NULL. Use xmalloc instead of xnmalloc,
|
|---|
| 3037 | since the size is a constant. Do the size calculation as a signed
|
|---|
| 3038 | integer constant expression, so that the compiler diagnoses any
|
|---|
| 3039 | overflow.
|
|---|
| 3040 | (transit_state, free_mbdata): Tune by looping from -1 to N - 1,
|
|---|
| 3041 | rather than from 0 to N - 1 with a separate instance for -1.
|
|---|
| 3042 | (dfaexec_main): Rewrite to avoid side effects in if-part.
|
|---|
| 3043 | (free_mbdata): Simplify.
|
|---|
| 3044 |
|
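The preference for "A & ~B" over "(A & B) != A" noted in the entry above can be illustrated with a minimal sketch. The function names are hypothetical and do not come from src/dfa.c; both forms answer the same question, namely whether A has a bit set that B lacks.

```c
#include <stdbool.h>

/* Older form: A has a bit outside B when masking with B changes A.  */
bool
has_extra_bits_old (unsigned int a, unsigned int b)
{
  return (a & b) != a;
}

/* Preferred form: keep only the bits of A that B lacks.  */
bool
has_extra_bits (unsigned int a, unsigned int b)
{
  return (a & ~b) != 0;
}
```

The second form names the interesting bits directly, which is presumably why the entry prefers it.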
|---|
| 3045 | dfa: port to C90
|
|---|
| 3046 | * src/dfa.c (transit_state, dfa_supported, dfamust):
|
|---|
| 3047 | Don't use declarations after statements.
|
|---|
| 3048 | If I recall correctly, gawk still wants to port to C90.
|
|---|
| 3049 |
|
|---|
| 3050 | dfa: fix context newline confusion
|
|---|
| 3051 | * src/dfa.c (transit_state): Fix "... & ~0" that was evidently
|
|---|
| 3052 | intended to be "... & ~1". Do index calculation in a simpler way,
|
|---|
| 3053 | that uses just addition (Bug#21486).
|
|---|
| 3054 |
|
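To see why "... & ~0" in the entry above was a bug, consider this small sketch. The function names are hypothetical and the code is not taken from transit_state; it only shows that masking with ~0 changes nothing, whereas masking with ~1 clears the lowest bit.

```c
/* "& ~0" keeps every bit, so the mask does nothing.  */
unsigned int
mask_with_not_zero (unsigned int x)
{
  return x & ~0u;
}

/* "& ~1" clears only the lowest bit.  */
unsigned int
mask_with_not_one (unsigned int x)
{
  return x & ~1u;
}
```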
|---|
| 3055 | 2016-08-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3056 |
|
|---|
| 3057 | dfa: improve leading "." with non-UTF8 multibyte
|
|---|
| 3058 | In non-UTF8 multibyte locales, matching the dot expression is very
|
|---|
| 3059 | slow, as the next state is calculated on demand. This change caches
|
|---|
| 3060 | the result for the typical case (Bug#21486).
|
|---|
| 3061 |
|
|---|
| 3062 | Compare the run times of this command before and after this change,
|
|---|
| 3063 | on an i5-4570 CPU @ 3.20GHz using rawhide (~fedora 22) and compiled
|
|---|
| 3064 | with gcc 5.1.1 20150618:
|
|---|
| 3065 | yes "$(printf 'a%38db\n' 0)" | head -1000000 >in
|
|---|
| 3066 | env LC_ALL=ja_JP.eucJP time -p \
|
|---|
| 3067 | src/grep .......................................... in
|
|---|
| 3068 | Before: 19.10
|
|---|
| 3069 | After : 0.55
|
|---|
| 3070 |
|
|---|
| 3071 | * NEWS: Document this.
|
|---|
| 3072 | * src/dfa.c: (struct dfa_state): New members curr_dependent, mb_trindex.
|
|---|
| 3073 | (MAX_TRCOUNT): New constant.
|
|---|
| 3074 | (struct dfa): New members mb_trans, mb_trcount.
|
|---|
| 3075 | (state_index): Initialize new members of struct dfa_state and calculate
|
|---|
| 3076 | dependency on context of next character for positions for dot.
|
|---|
| 3077 | (dfastate): Calculate follows positions for dot if enabled.
|
|---|
| 3078 | (realloc_trans_if_necessary): Allocate transition tables.
|
|---|
| 3079 | (build_state): Use new constant and reset transition tables.
|
|---|
| 3080 | (transit_state): Use cache for transition from a state with the dot
|
|---|
| 3081 | expression.
|
|---|
| 3082 | (free_mbdata): Deallocate transition tables.
|
|---|
| 3083 |
|
|---|
| 3084 | 2016-08-06 Jim Meyering <meyering@fb.com>
|
|---|
| 3085 |
|
|---|
| 3086 | tests: standardize on 10-second timeouts to avoid rare false failure
|
|---|
| 3087 | In a parallel test run, it is not unusual to exceed a timeout of
|
|---|
| 3088 | 1-3 seconds. Increase several from 3 or fewer to 10 seconds.
|
|---|
| 3089 | * tests/skip-device: Increase timeout from 2 to 10 seconds.
|
|---|
| 3090 | * tests/grep-dev-null-out: Likewise, but s/1/10/.
|
|---|
| 3091 | * tests/pcre-invalid-utf8-input: Likewise, but s/3/10/.
|
|---|
| 3092 | * tests/dfa-match: Likewise.
|
|---|
| 3093 | * tests/pcre-invalid-utf8-infloop: Likewise.
|
|---|
| 3094 | * tests/pcre-infloop: Likewise.
|
|---|
| 3095 | * tests/max-count-overread: Likewise.
|
|---|
| 3096 | * tests/invalid-multibyte-infloop: Likewise.
|
|---|
| 3097 | Prompted by http://bugs.gnu.org/24159.
|
|---|
| 3098 |
|
|---|
| 3099 | tests/backref-multibyte-slow: avoid false positive
|
|---|
| 3100 | * tests/backref-multibyte-slow: When redirecting the "fast" LC_ALL=C
|
|---|
| 3101 | run's output to /dev/null, we got an artificially low timing (of 0),
|
|---|
| 3102 | due to grep's own stdout-vs-/dev/null optimization. With an initial
|
|---|
| 3103 | timing of 0 on that first run, the derived timeout for the UTF-8 run
|
|---|
| 3104 | (which redirects to a file) would be a mere 1 second. The fix: also
|
|---|
| 3105 | redirect that first run's output to a file, not to /dev/null.
|
|---|
| 3106 |
|
|---|
| 3107 | 2016-08-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3108 |
|
|---|
| 3109 | dfa: minor fix for whether dfa is "fast"
|
|---|
| 3110 | * src/dfa.c (dfaoptimize): When a UTF-8 optimization succeeds for
|
|---|
| 3111 | a DFA (it can use single-byte code paths), record that by setting
|
|---|
| 3112 | its ->fast flag.
|
|---|
| 3113 |
|
|---|
| 3114 | 2016-07-25 Jim Meyering <meyering@fb.com>
|
|---|
| 3115 |
|
|---|
| 3116 | grep: print "filename:lineno:" in invalid-regex diagnostic
|
|---|
| 3117 | Determining the file name and line number is a little tricky because
|
|---|
| 3118 | of the way the regular expressions are all concatenated onto a newline-
|
|---|
| 3119 | separated list. By the time grep would compile regular expressions,
|
|---|
| 3120 | the <filename,lineno> origin of each regexp was no longer available.
|
|---|
| 3121 | This patch adds a list of filename,first_lineno pairs, one per input
|
|---|
| 3122 | source, by which we can then map the ordinal regexp number to a
|
|---|
| 3123 | filename,lineno pair for the diagnostic.
|
|---|
| 3124 |
|
|---|
| 3125 | * src/dfasearch.c (GEAcompile): When diagnosing an invalid regexp
|
|---|
| 3126 | specified via -f FILE, include the "FILENAME:LINENO: " prefix.
|
|---|
| 3127 | Also, when there are two or more lines with compilation failures,
|
|---|
| 3128 | diagnose all of them, rather than stopping after the first.
|
|---|
| 3129 | * src/grep.h (pattern_file_name): Declare it.
|
|---|
| 3130 | * src/grep.c: (struct FL_pair): Define type.
|
|---|
| 3131 | (fl_pair, n_fl_pair_slots, n_pattern_files, patfile_lineno):
|
|---|
| 3132 | Define globals.
|
|---|
| 3133 | (fl_add, pattern_file_name): Define functions.
|
|---|
| 3134 | (main): Call fl_add for each of the following: -e argument,
|
|---|
| 3135 | -f argument, command-line-specified (without -e) regexp.
|
|---|
| 3136 | * tests/filename-lineno.pl: New file.
|
|---|
| 3137 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3138 | * NEWS (Improvements): Mention this.
|
|---|
| 3139 | Initially reported by Gunnar Wolf in https://bugs.debian.org/525214
|
|---|
| 3140 | Forwarded to grep's bug list by Santiago Ruano Rincón as
|
|---|
| 3141 | http://debbugs.gnu.org/23965
|
|---|
| 3142 |
|
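The mapping described in the entry above, from an ordinal regexp number to a filename,lineno pair, might look roughly like the following sketch. The struct layout, the function name and its signature are assumptions made for illustration; the entry does not show the real definitions of struct FL_pair or pattern_file_name. The sketch assumes at least one pair, pairs sorted by first_lineno, and an ordinal no smaller than the first pair's first_lineno.

```c
#include <stddef.h>

/* Hypothetical stand-in for one input source of patterns.  */
struct fl_pair_sketch
{
  char const *filename;  /* where the patterns came from */
  size_t first_lineno;   /* ordinal number of its first pattern */
};

/* Return the source of pattern number ORDINAL and store the line
   number within that source into *LINENO.  */
char const *
pattern_origin_sketch (struct fl_pair_sketch const *pairs, size_t n_pairs,
                       size_t ordinal, size_t *lineno)
{
  size_t i = 0;
  /* Advance to the last source that starts at or before ORDINAL.  */
  while (i + 1 < n_pairs && pairs[i + 1].first_lineno <= ordinal)
    i++;
  *lineno = ordinal - pairs[i].first_lineno + 1;
  return pairs[i].filename;
}
```

Storing one pair per input source, rather than one per pattern, keeps the table small even when a pattern file has many lines.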
|---|
| 3143 | 2016-07-24 Jim Meyering <meyering@fb.com>
|
|---|
| 3144 |
|
|---|
| 3145 | tests: add coreutils' perl-driven test framework
|
|---|
| 3146 | * configure.ac: Set the AM_CONDITIONAL variable, HAVE_PERL.
|
|---|
| 3147 | * tests/Coreutils.pm: New file.
|
|---|
| 3148 | * tests/CuSkip.pm: New file.
|
|---|
| 3149 | * tests/CuTmpdir.pm: New file.
|
|---|
| 3150 | * tests/no-perl: New file.
|
|---|
| 3151 | * tests/Makefile.am: Set up to use .pl tests:
|
|---|
| 3152 | (TEST_EXTENSIONS, TESTSUITE_PERL, TESTSUITE_PERL_OPTIONS): Define.
|
|---|
| 3153 | (SH_LOG_COMPILER, PL_LOG_COMPILER): Define.
|
|---|
| 3154 | (EXTRA_DIST): Add the four new file names.
|
|---|
| 3155 |
|
|---|
| 3156 | doc: omit an excess word in HACKING
|
|---|
| 3157 |
|
|---|
| 3158 | 2016-07-21 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3159 |
|
|---|
| 3160 | grep: always match single line only with DFA superset
|
|---|
| 3161 | \n cannot occur inside a multibyte character. So an input always
|
|---|
| 3162 | matches single line only with DFA superset.
|
|---|
| 3163 |
|
|---|
| 3164 | * src/dfasearch.c (EGexecute): Simplify it with above.
|
|---|
| 3165 |
|
|---|
| 3166 | 2016-07-15 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3167 |
|
|---|
| 3168 | dfa: fix whitespace problems
|
|---|
| 3169 | * src/dfa.c: Use GNU style for pointer decls.
|
|---|
| 3170 |
|
|---|
| 3171 | 2016-07-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3172 |
|
|---|
| 3173 | maint: modernize HACKING a bit
|
|---|
| 3174 | * HACKING: Remove some ancient history to simplify maintenance.
|
|---|
| 3175 |
|
|---|
| 3176 | 2016-07-14 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3177 |
|
|---|
| 3178 | grep: minor style changes for -F crash fix
|
|---|
| 3179 | * src/kwset.c (memoff2_kwset): Use ?: instead of if-else.
|
|---|
| 3180 |
|
|---|
| 3181 | 2016-07-14 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3182 |
|
|---|
| 3183 | grep: fix -F crash when alternating duplicates
|
|---|
| 3184 | grep -F crashes with a pattern like 0\n0.
|
|---|
| 3185 | This bug was introduced in 966f6586fbce3081ce6e5e2f9b55301b0ec3d2b4.
|
|---|
| 3186 |
|
|---|
| 3187 | * src/kwset.c (memoff2_kwset): If two characters are the same,
|
|---|
| 3188 | use memchr instead of memchr2.
|
|---|
| 3189 | * tests/two-chars: New test.
|
|---|
| 3190 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3191 |
|
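The fix in the entry above, falling back to memchr when both bytes are equal, can be sketched as follows. This is not the code in src/kwset.c: the function name is hypothetical, and a plain loop stands in for Gnulib's memchr2 so that the example depends only on the standard library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first byte in S[0..N-1] that equals B or C,
   or SIZE_MAX if there is none.  */
size_t
memoff2_sketch (char const *s, char b, char c, size_t n)
{
  char const *p = NULL;
  if (b == c)
    /* Only one distinct byte: a single-byte search suffices.  */
    p = memchr (s, (unsigned char) b, n);
  else
    /* Stand-in for memchr2: scan for either byte.  */
    for (size_t i = 0; i < n; i++)
      if (s[i] == b || s[i] == c)
        {
          p = s + i;
          break;
        }
  return p ? (size_t) (p - s) : SIZE_MAX;
}
```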
|---|
| 3192 | 2016-07-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3193 |
|
|---|
| 3194 | dfa: fix comments to match code better
|
|---|
| 3195 | * src/dfa.c: Fix comments.
|
|---|
| 3196 |
|
|---|
| 3197 | 2016-07-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3198 |
|
|---|
| 3199 | dfa: don't treat null bytes specially
|
|---|
| 3200 | * src/dfa.c (transit_state): Do not treat null byte specially
|
|---|
| 3201 | when eolbyte == '\n'.
|
|---|
| 3202 |
|
|---|
| 3203 | 2016-07-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3204 |
|
|---|
| 3205 | dfa: don't distinguish letters in non-POSIX locales
|
|---|
| 3206 | For non-POSIX locales, dfa does not support word delimiters,
|
|---|
| 3207 | so remove the distinction between letters and non-letters.
|
|---|
| 3208 | * src/dfa.c (struct dfa): Remove members initstate_letter,
|
|---|
| 3209 | initstate_others. All uses removed. New member initstate_notbol.
|
|---|
| 3210 | (dfaanalyze, dfaexec_main): Replace old members with new member.
|
|---|
| 3211 | (wchar_context): Remove. Update callers.
|
|---|
| 3212 |
|
|---|
| 3213 | 2016-07-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3214 |
|
|---|
| 3215 | dfa: minor cleanups for non-POSIX simplification
|
|---|
| 3216 | * src/dfa.c (transit_state_singlebyte): Remove unnecessary 'const'
|
|---|
| 3217 | from arg; we usually don't bother with 'const' on locals.
|
|---|
| 3218 | (transit_state_singlebyte): Omit '!= NULL' in boolean context.
|
|---|
| 3219 | Use assert rather than abort.
|
|---|
| 3220 |
|
|---|
| 3221 | 2016-07-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3222 |
|
|---|
| 3223 | dfa: simplify for non-POSIX locales
|
|---|
| 3224 | Simplify the dfa code, since it no longer supports ranges,
|
|---|
| 3225 | collating elements, and equivalence classes in non-POSIX locales.
|
|---|
| 3226 | * src/dfa.c (struct dfa): Remove mb_match_lens.
|
|---|
| 3227 | (enum status_transit_state, match_anychar)
|
|---|
| 3228 | (check_matching_with_multibyte_ops, transit_state_consume_1char):
|
|---|
| 3229 | (State_transition): Remove.
|
|---|
| 3230 | (transit_state_singlebyte): Accept a pointer-to-pointer position
|
|---|
| 3231 | instead of a pointer, and no longer accept a pointer to the next state.
|
|---|
| 3232 | Return next state instead of status_transit_state. All callers
|
|---|
| 3233 | changed.
|
|---|
| 3234 | (transit_state_singlebyte, transit_state): Simplify.
|
|---|
| 3235 | (dfaexec_main): transit_state is now called only when the next
|
|---|
| 3236 | character matches ANYCHAR.
|
|---|
| 3237 |
|
|---|
| 3238 | 2016-06-14 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3239 |
|
|---|
| 3240 | doc: propagate more changes from grep.texi
|
|---|
| 3241 | Problem reported by Björn Voigt in: http://bugs.gnu.org/23763#27
|
|---|
| 3242 | * doc/grep.in.1: Fix more inconsistencies with grep.texi.
|
|---|
| 3243 |
|
|---|
| 3244 | 2016-06-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3245 |
|
|---|
| 3246 | doc: remove obsolete MS-DOS mention
|
|---|
| 3247 | * doc/grep.in.1: Remove obsolete discussion of MS-DOS heuristics.
|
|---|
| 3248 | Problem reported by Björn Voigt in: http://bugs.gnu.org/23763
|
|---|
| 3249 |
|
|---|
| 3250 | 2016-06-09 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 3251 |
|
|---|
| 3252 | grep: do pagesize initialization and buffer allocation earlier
|
|---|
| 3253 | * src/grep.c (reset, main): We're going to need pagesize and buffer
|
|---|
| 3254 | initialized anyway, so we might as well do so unconditionally early on
|
|---|
| 3255 | rather than checking on every call to reset().
|
|---|
| 3256 | http://bugs.gnu.org/23717
|
|---|
| 3257 |
|
|---|
| 3258 | grep: remove unnecessary dirdesc variable.
|
|---|
| 3259 | * src/grep.c (grepdirent): Remove dirdesc variable and just use
|
|---|
| 3260 | fts_cwd_fd directly, since the fts_options test was guaranteed to
|
|---|
| 3261 | succeed (and fts_cwd_fd was already being used directly in fstatat()
|
|---|
| 3262 | anyway). http://bugs.gnu.org/23716
|
|---|
| 3263 |
|
|---|
| 3264 | grep: convert list_files to an enum
|
|---|
| 3265 | * src/grep.c: Make list_files a tristate enum instead of an int.
|
|---|
| 3266 | http://bugs.gnu.org/23715
|
|---|
| 3267 |
|
|---|
| 3268 | grep: correct a stale comment and remove dead code
|
|---|
| 3269 | * src/grep.c (grepdesc): The `grep()' function no longer has
|
|---|
| 3270 | special-case negative return values, since it no longer handles
|
|---|
| 3271 | directories, so don't bother checking for them.
|
|---|
| 3272 | http://bugs.gnu.org/23714
|
|---|
| 3273 |
|
|---|
| 3274 | maint: replace bitwise with logical OR
|
|---|
| 3275 | * src/grep.c (main): replace bitwise ORs with logical ORs where it
|
|---|
| 3276 | makes sense (when dealing with boolean conditions as opposed to
|
|---|
| 3277 | bitmasks). http://bugs.gnu.org/23713
|
|---|
| 3278 |
|
|---|
| 3279 | maint: mark a couple of static variables const
|
|---|
| 3280 | * src/dfa.c (parse_bracket_exp): mark zeroclass const.
|
|---|
| 3281 | * src/dfasearch.c: mark patterns0 const.
|
|---|
| 3282 | http://bugs.gnu.org/23712
|
|---|
| 3283 |
|
|---|
| 3284 | 2016-06-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3285 |
|
|---|
| 3286 | tests: fix similar bug in exit status test
|
|---|
| 3287 | * tests/grep-dir (status_range): New shell function.
|
|---|
| 3288 | Use it to fix bug where $? was not saved properly.
|
|---|
| 3289 |
|
|---|
| 3290 | 2016-06-03 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 3291 |
|
|---|
| 3292 | tests: fix bug in exit status test
|
|---|
| 3293 | When checking $? against multiple values, save its value in another
|
|---|
| 3294 | variable and check that so as to avoid tests beyond the first seeing a
|
|---|
| 3295 | $? clobbered by earlier ones.
|
|---|
| 3296 |
|
|---|
| 3297 | * tests/status: save $? in a temporary variable before testing it.
|
|---|
| 3298 |
|
|---|
| 3299 | 2016-06-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3300 |
|
|---|
| 3301 | dfa: more simplification of dfaexec_main
|
|---|
| 3302 | * src/dfa.c (dfaexec_main): After a newline, failure at an acceptable
|
|---|
| 3303 | position or a demand to build a state is unlikely, so go to the next
|
|---|
| 3304 | loop iteration without checking for them. No semantic change.
|
|---|
| 3305 |
|
|---|
| 3306 | 2016-06-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3307 |
|
|---|
| 3308 | maint: correct attribution
|
|---|
| 3309 | * build-aux/git-log-fix: Fix attribution of primary Aho-Corasick patch.
|
|---|
| 3310 |
|
|---|
| 3311 | 2016-06-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3312 |
|
|---|
| 3313 | grep: simplify -F Aho-Corasick a bit
|
|---|
| 3314 | This removes some tuning that complicates the code without providing
|
|---|
| 3315 | performance benefits that I could measure (GCC 6.1, x86-64).
|
|---|
| 3316 | (acexec_trans): Do not hand-unroll. Unduplicate the code for a
|
|---|
| 3317 | transition step.
|
|---|
| 3318 |
|
|---|
| 3319 | * src/kwset.c (struct kwset.kwsexec, bmexec, acexec_trans, acexec)
|
|---|
| 3320 |
|
|---|
| 3321 | 2016-06-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3322 |
|
|---|
| 3323 | grep: minor cleanups for -F Aho-Corasick
|
|---|
| 3324 | * NEWS: Don't claim 7x, as the value seems to be system-dependent.
|
|---|
| 3325 | * src/kwset.c (struct kwset.kwsexec, bmexec, acexec_trans, acexec):
|
|---|
| 3326 | * src/kwset.c, src/kwset.h (kwsalloc, kwsexec):
|
|---|
| 3327 | Don't put 'const' into the declaration when that is irrelevant to
|
|---|
| 3328 | the API. More generally, don't bother with 'const' when it's only
|
|---|
| 3329 | a local so it is reasonably obvious to a reader that it is 'const'
|
|---|
| 3330 | anyway. It would be overkill to add 'const' to all locals that
|
|---|
| 3331 | never change.
|
|---|
| 3332 | * src/kwset.c (U): Avoid unnecessary parens.
|
|---|
| 3333 | (treefails, memoff2_kwset, bmexec_trans, bmexec, cwexec, acexec_trans):
|
|---|
| 3334 | Prefer SIZE_MAX to (size_t) -1.
|
|---|
| 3335 | (bmexec_trans, cwexec, acexec_trans):
|
|---|
| 3336 | Remove attributes for static functions that no longer seem needed.
|
|---|
| 3337 | (memoff2_kwset): Rename from memchr2_kwset, since it returns
|
|---|
| 3338 | an offset, not a pointer. All uses changed.
|
|---|
| 3339 | (cwexec, acexec_trans) [lint]: Remove initialization that is no
|
|---|
| 3340 | longer needed; at least, GCC 6.1 x86-64 does not need it.
|
|---|
| 3341 | (acexec_trans): Clarify code by using nesting rather than 'continue'.
|
|---|
| 3342 |
|
|---|
| 3343 | 2016-06-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3344 |
|
|---|
| 3345 | grep: use memchr2 for two patterns of a character
|
|---|
| 3346 | * src/kwset.c (memchr2_kwset): New function. grep uses memchr2 to
|
|---|
| 3347 | search for just two letters.
|
|---|
| 3348 | (cwexec, acexec_trans): Use it.
|
|---|
| 3349 |
|
|---|
| 3350 | grep: -F multiword longest match not always needed
|
|---|
| 3351 | When searching for multiple fixed words, grep now returns immediately
|
|---|
| 3352 | if the longest match is not needed. Without this change, grep looks for
|
|---|
| 3353 | the longest match among multiple words even when it is not needed.
|
|---|
| 3354 | * src/kwset.c (kwsexec, acexec, cwexec, bmexec): Add a bool argument
|
|---|
| 3355 | for whether longest match is needed. All callers changed.
|
|---|
| 3356 | * src/kwset.h (kwsexec): Update prototype.
|
|---|
| 3357 |
|
|---|
| 3358 | 2016-06-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3359 |
|
|---|
| 3360 | grep: use Aho-Corasick algorithm to search multiple fixed words
|
|---|
| 3361 | When searching for multiple fixed words, grep used the Commentz-Walter
|
|---|
| 3362 | algorithm, but this was O(m*n) and was very slow in the worst case.
|
|---|
| 3363 | For example:
|
|---|
| 3364 |
|
|---|
| 3365 | - input: yes `printf %040d` | head -10000000
|
|---|
| 3366 | - word1: x0000000000000000000
|
|---|
| 3367 | - word2: x
|
|---|
| 3368 |
|
|---|
| 3369 | This change instead uses the Aho-Corasick algorithm to search multiple
|
|---|
| 3370 | fixed words. It uses a high-quality trie-building function that is
|
|---|
| 3371 | already defined for Commentz-Walter in kwset.c.
|
|---|
| 3372 |
|
|---|
| 3373 | With this change I see a 7x speed-up even for a typical case on
|
|---|
| 3374 | Fedora 21 with a 3.2GHz i5. Using best-of-5 trials for the benchmark:
|
|---|
| 3375 |
|
|---|
| 3376 | find /usr/share/doc/ -type f |
|
|---|
| 3377 | LC_ALL=C time -p xargs.sh src/grep -Ff /usr/share/dict/linux.words >/dev/null
|
|---|
| 3378 |
|
|---|
| 3379 | The results were:
|
|---|
| 3380 |
|
|---|
| 3381 | real 11.37 user 11.03 sys 0.24 [without the change]
|
|---|
| 3382 | real 1.49 user 1.31 sys 0.15 [with the change]
|
|---|
| 3383 |
|
|---|
| 3384 | * src/kwset.c (struct kwset): Add a new member 'mode'.
|
|---|
| 3385 | (kwsalloc): Use it.
|
|---|
| 3386 | All callers are changed.
|
|---|
| 3387 | (kwsincr): With Aho-Corasick, build the trie in normal order.
|
|---|
| 3388 | (acexec_trans, acexec): New functions.
|
|---|
| 3389 | (kwsexec): Use it.
|
|---|
| 3390 | * src/kwset.h (kwsalloc): Update a prototype.
|
|---|
| 3391 | * NEWS (Improvements): Mention it.
|
|---|
| 3392 |
|
|---|
| 3393 | 2016-05-13 Jim Meyering <meyering@fb.com>
|
|---|
| 3394 |
|
|---|
| 3395 | maint: do not let a LANGUAGE envvar setting perturb tests
|
|---|
| 3396 | E.g., running "LANGUAGE=eo make check" would provoke a failure
|
|---|
| 3397 | of the encoding-error test, on systems that mistakenly let that
|
|---|
| 3398 | envvar trump the setting of LC_ALL.
|
|---|
| 3399 | * tests/envvar-check: New file, copied from coreutils.
|
|---|
| 3400 | * tests/Makefile.am (EXTRA_DIST): Add it.
|
|---|
| 3401 | (TESTS_ENVIRONMENT): Source it.
|
|---|
| 3402 | Also select TMPDIR as we do for coreutils tests.
|
|---|
| 3403 | Reported by Benno Schulenberg in http://bugs.gnu.org/23527.
|
|---|
| 3404 |
|
|---|
| 3405 | 2016-05-02 Jim Meyering <meyering@fb.com>
|
|---|
| 3406 |
|
|---|
| 3407 | maint: avoid NEWS syntax-check failure
|
|---|
| 3408 | * NEWS: Move the mention of the /dev/null speed-up from the
|
|---|
| 3409 | block for 2.25 into the current, in-preparation block.
|
|---|
| 3410 |
|
|---|
| 3411 | 2016-05-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3412 |
|
|---|
| 3413 | dfa: prefer bool for boolean
|
|---|
| 3414 | * src/dfa.c (syntax_bits_set, dfasyntax, using_utf8, FETCH_WC)
|
|---|
| 3415 | (POP_LEX_STATE, State_transition):
|
|---|
| 3416 | * src/dfa.h (using_utf_8):
|
|---|
| 3417 | Use bool for boolean.
|
|---|
| 3418 |
|
|---|
| 3419 | 2016-05-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 3420 |
|
|---|
| 3421 | dfa: stop exporting internal functions
|
|---|
| 3422 | * src/dfa.c, src/dfa.h (dfaparse, dfaanalyze, dfastate, dfainit):
|
|---|
| 3423 | Now static.
|
|---|
| 3424 |
|
|---|
| 3425 | dfa: prefer bool at DFA interfaces
|
|---|
| 3426 | * src/dfa.c (struct dfa, dfasyntax, dfaanalyze, dfaexec_main)
|
|---|
| 3427 | (dfaexec_mb, dfaexec_sb, dfaexec_noop, dfaexec, dfacomp):
|
|---|
| 3428 | * src/dfa.h (dfasyntax, dfacomp, dfaexec, dfaanalyze):
|
|---|
| 3429 | * src/dfasearch.c (EGexecute):
|
|---|
| 3430 | Use bool for boolean.
|
|---|
| 3431 |
|
|---|
| 3432 | 2016-05-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3433 |
|
|---|
| 3434 | dfa: speed up checking for character boundary
|
|---|
| 3435 | This should help performance with gawk; not so much with grep.
|
|---|
| 3436 | Suggested by Norihiro Tanaka in: http://bugs.gnu.org/18777
|
|---|
| 3437 | * src/dfa.c (never_trail): New static var.
|
|---|
| 3438 | (dfasyntax): Initialize it.
|
|---|
| 3439 | (skip_remains_mb): Use it to speed up a common case in Gawk.
|
|---|
| 3440 |
|
|---|
| 3441 | grep: /dev/null output speedup
|
|---|
| 3442 | This sped up 'seq 10000000000 | grep . >/dev/null' by a factor of
|
|---|
| 3443 | 380,000 on my platform (Fedora 23, x86-64, AMD Phenom II X4 910e,
|
|---|
| 3444 | en_US.UTF-8 locale).
|
|---|
| 3445 | * NEWS: Document this.
|
|---|
| 3446 | * src/grep.c (grepbuf): exit_on_match no longer implies that -q
|
|---|
| 3447 | was specified, so when a match is found, exit with exit_failure if
|
|---|
| 3448 | an error was also found.
|
|---|
| 3449 | (grepdesc): Omit unnecessary S_ISREG and st_ino checks.
|
|---|
| 3450 | out_stat.st_ino is zero if stdout is not a regular file,
|
|---|
| 3451 | and this cannot possibly equal st->st_ino.
|
|---|
| 3452 | (main): Omit duplicate initialization of exit_failure. Do not
|
|---|
| 3453 | bother with isatty unless -q is not used and stdout is a character
|
|---|
| 3454 | special file and --color=auto and TERM says colorization is
|
|---|
| 3455 | possible. Most importantly, set exit_on_match if the output is
|
|---|
| 3456 | /dev/null.
|
|---|
| 3457 | * tests/grep-dev-null-out: New test.
|
|---|
| 3458 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3459 | * tests/status: Do not require grep to actually read all the input
|
|---|
| 3460 | files when the output is /dev/null and a matching line has been
|
|---|
| 3461 | found.
|
|---|
| 3462 |
|
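The entry above says that main sets exit_on_match when the output is /dev/null, but it does not say how that is detected. One plausible approach, sketched here purely as an assumption, is to compare the device and inode numbers of the output descriptor with those of /dev/null. The function name is hypothetical.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Return true if FD refers to the same character special file as
   /dev/null.  Any stat failure is treated as "not /dev/null".  */
bool
fd_is_dev_null_sketch (int fd)
{
  struct stat out, null;
  return (fstat (fd, &out) == 0
          && S_ISCHR (out.st_mode)
          && stat ("/dev/null", &null) == 0
          && out.st_dev == null.st_dev
          && out.st_ino == null.st_ino);
}
```

Once a caller knows that the output is discarded, it can stop at the first match, since only the exit status remains observable.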
|---|
| 3463 | 2016-04-21 Jim Meyering <meyering@fb.com>
|
|---|
| 3464 |
|
|---|
| 3465 | maint: post-release administrivia
|
|---|
| 3466 | * NEWS: Add header line for next release.
|
|---|
| 3467 | * .prev-version: Record previous version.
|
|---|
| 3468 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 3469 |
|
|---|
| 3470 | version 2.25
|
|---|
| 3471 | * NEWS: Record release date.
|
|---|
| 3472 |
|
|---|
| 3473 | 2016-04-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3474 |
|
|---|
| 3475 | dfa: remove dependency on btowc
|
|---|
| 3476 | MirOS BSD btowc is a macro that (when GCC is being used) hardcodes
|
|---|
| 3477 | btowc (0x80) == WEOF regardless of locale, which contradicts
|
|---|
| 3478 | future POSIX in the C locale. Instead of bothering to develop a
|
|---|
| 3479 | Gnulib workaround for the btowc incompatibility, use mbrtowc,
|
|---|
| 3480 | which we are using elsewhere and fixing anyway, and are caching so
|
|---|
| 3481 | it is fast here. Problem reported by Nelson H. F. Beebe via Jim
|
|---|
| 3482 | Meyering in: http://bugs.gnu.org/23269#14
|
|---|
| 3483 | * bootstrap.conf (gnulib_modules): Remove btowc.
|
|---|
| 3484 | * src/dfa.c (struct dfa): Remove mbrtowc_cache member, replacing with ...
|
|---|
| 3485 | (mbrtowc_cache): ... this new static var. All uses changed.
|
|---|
| 3486 | (dfambcache): Remove; now done by setsyntax. Call removed.
|
|---|
| 3487 | (is_valid_unibyte_character): Remove.
|
|---|
| 3488 | (IS_WORD_CONSTITUENT): Remove this macro, replacing it with ...
|
|---|
| 3489 | (unibyte_word_constituent): ... this new function. It uses
|
|---|
| 3490 | mbrtowc_cache rather than btowc.
|
|---|
| 3491 | (dfasyntax): Initialize mbrtowc_cache before using it.
|
|---|
| 3492 |
|
|---|
| 3493 | 2016-04-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3494 |
|
|---|
| 3495 | grep: minor doc tweaks inspired by Debian
|
|---|
| 3496 | Problem reported by Santiago Ruano Rincón in: http://bugs.gnu.org/22911
|
|---|
| 3497 | * doc/grep.in.1:
|
|---|
| 3498 | * doc/grep.texi (Matching Control, grep Programs)
|
|---|
| 3499 | (Regular Expressions):
|
|---|
| 3500 | Document -e, -f, and PCRE more carefully.
|
|---|
| 3501 |
|
|---|
| 3502 | 2016-04-10 Jim Meyering <meyering@fb.com>
|
|---|
| 3503 |
|
|---|
| 3504 | maint: remove unused mbtoupper function
|
|---|
| 3505 | * src/searchutils.c (mbtoupper): Remove now-unused function.
|
|---|
| 3506 | Also remove inclusion of <assert.h>, since this change removed
|
|---|
| 3507 | the final use of assert.
|
|---|
| 3508 | * src/search.h (mbtoupper): Remove declaration.
|
|---|
| 3509 |
|
|---|
| 3510 | 2016-04-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3511 |
|
|---|
| 3512 | grep: in C locale, all bytes are valid characters
|
|---|
| 3513 | This works around glibc bug 19932:
|
|---|
| 3514 | https://sourceware.org/bugzilla/show_bug.cgi?id=19932
|
|---|
| 3515 | The actual bug fix was the update to the current version of Gnulib.
|
|---|
| 3516 | grep problem reported by Björn Jacke in: http://bugs.gnu.org/23234
|
|---|
| 3517 | * NEWS: Mention this.
|
|---|
| 3518 | * doc/grep.texi (File and Directory Selection): Crossref to LC_*
|
|---|
| 3519 | section. Suggest why -a or LC_ALL=C might be useful.
|
|---|
| 3520 | (Environment Variables): Mention 'locale -a'.
|
|---|
| 3521 | Say that LC_CTYPE also specifies encoding, and that every
|
|---|
| 3522 | byte is a valid character in the C or POSIX locale.
|
|---|
| 3523 | * tests/c-locale: New test.
|
|---|
| 3524 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3525 |
|
|---|
| 3526 | build: update gnulib submodule to latest
|
|---|
| 3527 |
|
|---|
| 3528 | 2016-04-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3529 |
|
|---|
| 3530 | Give another example of binary file processing
|
|---|
| 3531 | Problem reported by Shlomi Fish
|
|---|
| 3532 | * doc/grep.texi (File and Directory Selection):
|
|---|
| 3533 | Document that 'q$' might match 'q' followed by a NUL
|
|---|
| 3534 | if --binary-files=binary is in effect.
|
|---|
| 3535 |
|
|---|
| 3536 | 2016-04-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3537 |
|
|---|
| 3538 | tests: test egrep/fgrep help only if our grep
|
|---|
| 3539 | Problem reported by Christian Weisgerber in: http://bugs.gnu.org/23146
|
|---|
| 3540 | * tests/Makefile.am (TESTS_ENVIRONMENT):
|
|---|
| 3541 | Test egrep and fgrep only if they use our grep.
|
|---|
| 3542 |
|
|---|
| 3543 | 2016-03-29 Jim Meyering <meyering@fb.com>
|
|---|
| 3544 |
|
|---|
| 3545 | tests: remove spurious test of egrep
|
|---|
| 3546 | * tests/reversed-range-endpoints: Do not test egrep here.
|
|---|
| 3547 | There is already a test of grep -E.
|
|---|
| 3548 | Prompted by http://bugs.gnu.org/23146
|
|---|
| 3549 |
|
|---|
| 3550 | 2016-03-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3551 |
|
|---|
| 3552 | grep: -Pz no longer misdiagnoses [^a]
|
|---|
| 3553 | Problem reported by Michael Jess.
|
|---|
| 3554 | * NEWS: Document this.
|
|---|
| 3555 | * src/pcresearch.c (Pcompile): Do not diagnose [^ when [ is unescaped.
|
|---|
| 3556 | * tests/pcre: Test for the bug.
|
|---|
| 3557 |
|
|---|
| 3558 | 2016-03-22 Jim Meyering <meyering@fb.com>
|
|---|
| 3559 |
|
|---|
| 3560 | maint: move new 'Improvements' blurb into proper section
|
|---|
| 3561 | * NEWS (Improvements): Move this new section from within the block
|
|---|
| 3562 | for the already-released 2.24 into the proper "next-release" block.
|
|---|
| 3563 | Also, retain the 2-blank-line separator between blocks.
|
|---|
| 3564 |
|
|---|
| 3565 | 2016-03-18 Jim Meyering <meyering@fb.com>
|
|---|
| 3566 |
|
|---|
| 3567 | maint: avoid spurious "binary file ... matches" in generated THANKS
|
|---|
| 3568 | * Makefile.am (THANKS): Don't apply grep to a stream containing
|
|---|
| 3569 | NUL bytes. Sync this rule from the one in coreutils: it was missing
|
|---|
| 3570 | some improvements.
|
|---|
| 3571 | Reported by Bailes Magio in http://bugs.gnu.org/22899
|
|---|
| 3572 |
|
|---|
| 3573 | 2016-03-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3574 |
|
|---|
| 3575 | grep: -oz now outputs null bytes, not newlines
|
|---|
| 3576 | * NEWS: Document this.
|
|---|
| 3577 | * doc/grep.texi (Other Options): Clarify that -z affects output
|
|---|
| 3578 | as well as input data.
|
|---|
| 3579 | * src/grep.c (print_line_middle): Output eolbyte, not newline, if -o.
|
|---|
| 3580 | * tests/null-byte: Test -o too.
|
|---|
| 3581 | * tests/pcre-context: Adjust test to match new behavior.
|
|---|
| 3582 |
|
|---|
| 3583 | 2016-03-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3584 |
|
|---|
| 3585 | grep: use errno consistently in write diagnostics
|
|---|
| 3586 | Feature request and initial version reported by Assaf Gordon in:
|
|---|
| 3587 | http://bugs.gnu.org/23031
|
|---|
| 3588 | * NEWS: Document this.
|
|---|
| 3589 | * src/grep.c: Include <stdarg.h>.
|
|---|
| 3590 | (stdout_errno): New static var.
|
|---|
| 3591 | (write_error_seen): Remove; superseded by stdout_errno.
|
|---|
| 3592 | All uses changed.
|
|---|
| 3593 | (putchar_errno, fputs_errno, printf_errno, fwrite_errno)
|
|---|
| 3594 | (fflush_errno): New static functions.
|
|---|
| 3595 | (print_filename, print_sep, print_offset, print_line_head)
|
|---|
| 3596 | (print_line_middle, print_line_tail, prline, prtext, grep)
|
|---|
| 3597 | (grepdesc): Use them.
|
|---|
| 3598 | * tests/write-error-msg: New file.
|
|---|
| 3599 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3600 |
|
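The wrappers named in the entry above (putchar_errno, fputs_errno and so on) record errno when a write fails. A minimal sketch of the idea follows. The names carry a _sketch suffix because the real functions' signatures are not shown in the entry, and the choice to take a FILE * argument and to keep only the first error is an assumption made here.

```c
#include <errno.h>
#include <stdio.h>

/* Nonzero once a write has failed; holds the errno of that failure.  */
int stdout_errno_sketch;

/* Like fputs, but remember errno on failure instead of returning it.  */
void
fputs_errno_sketch (char const *s, FILE *fp)
{
  if (fputs (s, fp) < 0 && stdout_errno_sketch == 0)
    stdout_errno_sketch = errno;
}

/* Like fflush, with the same error bookkeeping.  */
void
fflush_errno_sketch (FILE *fp)
{
  if (fflush (fp) != 0 && stdout_errno_sketch == 0)
    stdout_errno_sketch = errno;
}
```

A caller can then report the failure once, with a message derived from the saved errno value, rather than checking the result of every output call.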
|---|
| 3601 | 2016-03-10 Jim Meyering <meyering@fb.com>
|
|---|
| 3602 |
|
|---|
| 3603 | maint: post-release administrivia
|
|---|
| 3604 | * NEWS: Add header line for next release.
|
|---|
| 3605 | * .prev-version: Record previous version.
|
|---|
| 3606 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 3607 |
|
|---|
| 3608 | version 2.24
|
|---|
| 3609 | * NEWS: Record release date.
|
|---|
| 3610 |
|
|---|
| 3611 | 2016-02-28 Jim Meyering <meyering@fb.com>
|
|---|
| 3612 |
|
|---|
| 3613 | maint: add dist-check.mk
|
|---|
| 3614 | This file augments "make distcheck" rules.
|
|---|
| 3615 | * dist-check.mk: New file, from coreutils via gzip.
|
|---|
| 3616 | * Makefile.am (EXTRA_DIST): Add it.
|
|---|
| 3617 | * cfg.mk: Include it.
|
|---|
| 3618 |
|
|---|
| 3619 | 2016-02-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3620 |
|
|---|
| 3621 | grep: -Pz is incompatible with ^ and $
|
|---|
| 3622 | Problem reported by Sergei Trofimovich in: http://bugs.gnu.org/22655
|
|---|
| 3623 | * NEWS: Document this.
|
|---|
| 3624 | * src/pcresearch.c (Pcompile): Warn with -Pz and anchors.
|
|---|
| 3625 | * tests/pcre: Test new behavior.
|
|---|
| 3626 |
|
|---|
| 3627 | 2016-02-21 Jim Meyering <meyering@fb.com>
|
|---|
| 3628 |
|
|---|
| 3629 | tests: test cleanup
|
|---|
| 3630 | * tests/z-anchor-newline: Remove test artifact that would write
|
|---|
| 3631 | to /t/x.
|
|---|
| 3632 |
|
|---|
| 3633 | 2016-02-20 Jim Meyering <meyering@fb.com>
|
|---|
| 3634 |
|
|---|
| 3635 | grep -z: avoid erroneous match with regexp anchor and \n in text
|
|---|
| 3636 | * src/dfasearch.c (EGexecute): Clear the newline_anchor bit when
|
|---|
| 3637 | eolbyte is not '\n'.
|
|---|
| 3638 | * tests/z-anchor-newline: New file.
|
|---|
| 3639 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3640 | * NEWS (Bug fixes): Describe it.
|
|---|
| 3641 | Originally reported by Ulrich Mueller in
|
|---|
| 3642 | https://bugs.gentoo.org/show_bug.cgi?id=574662
|
|---|
| 3643 | Reported to us by Sergei Trofimovich as http://debbugs.gnu.org/22655
|
|---|
| 3644 |
|
|---|
| 3645 | tests: convert "cmd && fail=1" to "returns_ 1 cmd || fail=1"
|
|---|
| 3646 | The latter is robust, while the former can silently ignore
|
|---|
| 3647 | failure due to signals.
|
|---|
| 3648 | * cfg.mk (sc_prohibit_and_fail_1): New rule, copied from coreutils.
|
|---|
| 3649 | * tests/long-pattern-perf: Perform the above substitution.
|
|---|
| 3650 | * tests/mb-non-UTF8-performance: Likewise.
|
|---|
| 3651 | * tests/help-version: Merge from coreutils.
|
|---|
| 3652 |
|
|---|
| 3653 | 2016-02-09 Jim Meyering <meyering@fb.com>
|
|---|
| 3654 |
|
|---|
| 3655 | maint: add a check-very-expensive target
|
|---|
| 3656 | * Makefile.am (check-very-expensive): New convenience rule,
|
|---|
| 3657 | currently merely equivalent to check-expensive.
|
|---|
| 3658 |
|
|---|
| 3659 | 2016-02-04 Jim Meyering <meyering@fb.com>
|
|---|
| 3660 |
|
|---|
| 3661 | maint: post-release administrivia
|
|---|
| 3662 | * NEWS: Add header line for next release.
|
|---|
| 3663 | * .prev-version: Record previous version.
|
|---|
| 3664 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 3665 |
|
|---|
| 3666 | version 2.23
|
|---|
| 3667 | * NEWS: Record release date.
|
|---|
| 3668 |
|
|---|
| 3669 | 2016-02-02 Jim Meyering <meyering@fb.com>
|
|---|
| 3670 |
|
|---|
| 3671 | gnulib: update to latest
|
|---|
| 3672 | Update for this "make distcheck"-fixing change:
|
|---|
| 3673 | > verify-tests: also remove stray test-verify.Tpo
|
|---|
| 3674 |
|
|---|
| 3675 | 2016-02-01 Jim Meyering <meyering@fb.com>
|
|---|
| 3676 |
|
|---|
| 3677 | tests/null-byte: test another code path
|
|---|
| 3678 | * tests/null-byte: Also exercise the case in which there is
|
|---|
| 3679 | a match in the block along with the NUL byte.
|
|---|
| 3680 |
|
|---|
| 3681 | 2016-01-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3682 |
|
|---|
| 3683 | Omit excess "Binary file ... matches"
|
|---|
| 3684 | Problem reported in: http://bugs.gnu.org/22461
|
|---|
| 3685 | * src/grep.c (grep): Don't report "Binary file ... matches"
|
|---|
| 3686 | merely because the file contained both matches and binary data.
|
|---|
| 3687 | Insist that the binary data contained a match.
|
|---|
| 3688 | * tests/null-byte: Add a test for this.
|
|---|
| 3689 |
|
|---|
| 3690 | 2016-01-28 Jim Meyering <meyering@fb.com>
|
|---|
| 3691 |
|
|---|
| 3692 | gnulib: update to latest
|
|---|
| 3693 |
|
|---|
| 3694 | 2016-01-23 Jim Meyering <meyering@fb.com>
|
|---|
| 3695 |
|
|---|
| 3696 | gnulib: update to latest
|
|---|
| 3697 |
|
|---|
| 3698 | maint: fix typo in NEWS: s/a/an/
|
|---|
| 3699 |
|
|---|
| 3700 | 2016-01-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3701 |
|
|---|
| 3702 | grep: -x now supersedes -w more consistently
|
|---|
| 3703 | * NEWS, doc/grep.texi (Matching Control): Mention this.
|
|---|
| 3704 | * src/dfasearch.c (EGexecute):
|
|---|
| 3705 | * src/pcresearch.c (Pcompile):
|
|---|
| 3706 | Don't get confused by -w if -x is also present.
|
|---|
| 3707 | * src/pcresearch.c (Pcompile): Remove misleading comment about
|
|---|
| 3708 | non-UTF-8 multibyte locales, as PCRE doesn't support them.
|
|---|
| 3709 | Calculate buffer sizes more carefully; the old method
|
|---|
| 3710 | allocated a buffer slightly too big, seemingly due to luck.
|
|---|
| 3711 | * tests/backref-word, tests/pcre: Add tests for this bug.
|
|---|
| 3712 |
|
|---|
| 3713 | tests: omit update-copyright-tests
|
|---|
| 3714 | This test does not check how 'grep' itself operates, so it is
|
|---|
| 3715 | out of place for grep's 'make check'. Problem reported by Sam Razavi in:
|
|---|
| 3716 | http://bugs.gnu.org/22376
|
|---|
| 3717 | * bootstrap.conf (avoided_gnulib_modules): Add update-copyright-tests.
|
|---|
| 3718 |
|
|---|
| 3719 | 2016-01-11 Jim Meyering <meyering@fb.com>
|
|---|
| 3720 |
|
|---|
| 3721 | tests: do use "yes" but via an AWK replacement
|
|---|
| 3722 | Also, use sed Nq in place of head -N
|
|---|
| 3723 | * tests/init.cfg (yes): Define.
|
|---|
| 3724 | Thanks to Paul Eggert for this definition.
|
|---|
| 3725 | * tests/max-count-overread: Revert to using "yes".
|
|---|
| 3726 | * tests/mb-non-UTF8-performance: Likewise, and use
|
|---|
| 3727 | "sed Nq" in place of head -N.
|
|---|
| 3728 |
|
|---|
| 3729 | 2016-01-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3730 |
|
|---|
| 3731 | * tests/pcre-count: Don't assume the page size is 32kB.
|
|---|
| 3732 |
|
|---|
| 3733 | 2016-01-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3734 |
|
|---|
| 3735 | tests: port to other POSIXish platforms
|
|---|
| 3736 | I tested this on Solaris 10 and AIX 7.1.
|
|---|
| 3737 | * tests/max-count-overread:
|
|---|
| 3738 | * tests/mb-non-UTF8-performance:
|
|---|
| 3739 | Don't assume 'yes' exists, as 'yes' is not in POSIX.
|
|---|
| 3740 | * tests/mb-non-UTF8-performance:
|
|---|
| 3741 | Don't rely on 'head -1000', as that option syntax is not POSIX.
|
|---|
| 3742 | * tests/pcre-count: Don't rely on "printf '\x0'".
|
|---|
| 3743 | * tests/unibyte-binary: Don't assume \200 is an encoding error
|
|---|
| 3744 | in every unibyte locale.
|
|---|
| 3745 |
|
|---|
| 3746 | 2016-01-08 Jim Meyering <meyering@fb.com>
|
|---|
| 3747 |
|
|---|
| 3748 | tests: fix encoding-error test failure due to use of printf '\xHH'
|
|---|
| 3749 | * tests/encoding-error: Don't rely on printf having support for \xHH
|
|---|
| 3750 | hexadecimal. That is not portable. Use \OOO octal, instead.
|
|---|
| 3751 |
|
|---|
| 3752 | maint: fix typo in NEWS: s/a/an/
|
|---|
| 3753 |
|
|---|
| 3754 | 2016-01-07 Jim Meyering <meyering@fb.com>
|
|---|
| 3755 |
|
|---|
| 3756 | mb-non-UTF8-performance: avoid FP test failure on fast hardware
|
|---|
| 3757 | * tests/mb-non-UTF8-performance: Don't use a fixed size.
|
|---|
| 3758 | Otherwise, on a fast system, the fixed-size unibyte test
|
|---|
| 3759 | would complete in a nominal 0 ms, which might well be
|
|---|
| 3760 | smaller than 1/30 of the multibyte duration, provoking
|
|---|
| 3761 | a false positive test failure. Instead, increase the
|
|---|
| 3762 | size of the input until we obtain a unibyte duration of
|
|---|
| 3763 | at least 10ms.
|
|---|
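
The calibration idea in this entry (grow the input until the fast run takes a measurable amount of time) can be sketched in C as below. The test itself is a shell script; calibrate_size, timer_fn and linear_cost_ms are hypothetical names, and the doubling step is an assumption, since the entry does not say how the size is increased.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: milliseconds needed to process N input lines.  */
typedef double (*timer_fn) (size_t n);

/* Stand-in timer for demonstration: pretends each line costs 1 microsecond.  */
static double
linear_cost_ms (size_t n)
{
  return n / 1000.0;
}

/* Starting from START, double the input size until the measured time
   reaches MIN_MS or doubling again would exceed MAX_N.
   Return the size reached.  */
static size_t
calibrate_size (timer_fn time_ms, size_t start, double min_ms, size_t max_n)
{
  assert (start > 0);           /* doubling 0 would never make progress */
  size_t n = start;
  while (time_ms (n) < min_ms && n <= max_n / 2)
    n *= 2;
  return n;
}
```

Comparing durations only after the baseline is at least 10 ms keeps the ratio check from dividing by a measurement that rounds to zero.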
| 3764 |
|
|---|
| 3765 | 2016-01-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3766 |
|
|---|
| 3767 | doc: mention unibyte encoding fix
|
|---|
| 3768 | * NEWS: Document recent fix for encoding errors in unibyte locales.
|
|---|
| 3769 |
|
|---|
| 3770 | grep: improve unibyte -P performance
|
|---|
| 3771 | This is a follow-on to the recent changes prompted by Bug#20526.
|
|---|
| 3772 | In <http://bugs.gnu.org/bug=20526#86> Norihiro Tanaka pointed out
|
|---|
| 3773 | that grep mistakenly assumed that unibyte locales cannot have
|
|---|
| 3774 | encoding errors. Here, the mistake hurt performance significantly.
|
|---|
| 3775 | On Fedora 23 x86-64 in the C locale, this patch improved grep's
|
|---|
| 3776 | performance by a factor of 7 when run as "grep -P 'z.*a'" on the
|
|---|
| 3777 | output of "yes $(printf '\200\n') | head -n 1000000000".
|
|---|
| 3778 | * src/pcresearch.c (multibyte_locale) [HAVE_LIBPCRE]: New static var.
|
|---|
| 3779 | (Pcompile): Set it.
|
|---|
| 3780 | (Pexecute): Use it to avoid the need to call
|
|---|
| 3781 | buf_has_encoding_errors in unibyte locales.
|
|---|
| 3782 |
|
|---|
| 3783 | 2016-01-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3784 |
|
|---|
| 3785 | Improve on fix for Bug#22181
|
|---|
| 3786 | * src/pcresearch.c (Pexecute): Update subject when skipping past
|
|---|
| 3787 | easily-determined encoding errors, as this is faster than letting
|
|---|
| 3788 | pcre_exec skip them. On my platform this improves performance
|
|---|
| 3789 | 4.7x on a benchmark created via "yes $(printf '\200\200\200\200
|
|---|
| 3790 | \200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200x\n')
|
|---|
| 3791 | | head -n 1000000 >j; grep -oP y j" in a UTF-8 locale. Rework
|
|---|
| 3792 | code that deals with PCRE_ERROR_BADUTF8 return, to avoid an
|
|---|
| 3793 | incorrect (albeit currently harmless) 'bol = false' assignment.
|
|---|
| 3794 |
|
|---|
| 3795 | grep: restore -P optimization (followup fix)
|
|---|
| 3796 | * src/search.h (EGexecute, Fexecute, Pexecute):
|
|---|
| 3797 | Change decls to match new implementations.
|
|---|
| 3798 | I forgot to add this file to the previous commit.
|
|---|
| 3799 |
|
|---|
| 3800 | grep: restore -P PCRE_NO_UTF8_CHECK optimization
|
|---|
| 3801 | On my platform in the en_US.utf8 locale, this makes 'grep -P "z.*a" k'
|
|---|
| 3802 | 220x faster, where k is created by the shell command:
|
|---|
| 3803 | yes 'abcdefg hijklmn opqrstu vwxyz' | head -n 10000000 >k
|
|---|
| 3804 | * src/dfasearch.c (EGexecute):
|
|---|
| 3805 | * src/grep.c (execute_fp_t):
|
|---|
| 3806 | * src/kwsearch.c (Fexecute):
|
|---|
| 3807 | * src/pcresearch.c (Pexecute):
|
|---|
| 3808 | First arg is now char *, not char const *, since Pexecute now
|
|---|
| 3809 | temporarily modifies this argument.
|
|---|
| 3810 | * src/grep.c, src/grep.h (buf_has_encoding_errors): Now extern.
|
|---|
| 3811 | * src/pcresearch.c (Pexecute): Use it. If the input is free of
|
|---|
| 3812 | encoding errors, use a multiline search and the PCRE_NO_UTF8_CHECK
|
|---|
| 3813 | option, as this is typically way faster. This restores an
|
|---|
| 3814 | optimization that was removed with the recent changes for binary
|
|---|
| 3815 | file detection.
|
|---|
| 3816 |
|
|---|
| 3817 | 2016-01-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3818 |
|
|---|
| 3819 | Fix calculation of unibyte_mask
|
|---|
| 3820 | * src/grep.c (initialize_unibyte_mask): The old method worked for
|
|---|
| 3821 | UTF-8 and other typical encodings, but did not work for weird
|
|---|
| 3822 | encodings, e.g., one where all bytes other than 0x7f and 0x80 are
|
|---|
| 3823 | unibyte characters.
|
|---|
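
One way to build such a mask for an arbitrary unibyte/multibyte split is sketched below in C. The table is_unibyte and the helper names byte_mask and word_mask are hypothetical; this is an illustration of the idea under the assumption of 8-bit bytes, not the code of initialize_unibyte_mask.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Return a mask such that (b & mask) != 0 for every nonzero byte B
   with !IS_UNIBYTE[B].  The mask may also flag some valid bytes;
   callers are expected to confirm flagged bytes with a slower check.  */
static unsigned char
byte_mask (bool const is_unibyte[UCHAR_MAX + 1])
{
  unsigned char mask = 0;
  int ms1b = 1;                 /* most significant 1 bit of the current I */
  for (int i = 1; i <= UCHAR_MAX; i++)
    if (!is_unibyte[i] && !(mask & i))
      {
        /* I is not yet detected: add its most significant 1 bit.  */
        while (ms1b * 2 <= i)
          ms1b *= 2;
        mask |= ms1b;
      }
  return mask;
}

/* Repeat MASK in every byte of a word so that many bytes can be
   tested with a single AND.  */
static uintmax_t
word_mask (unsigned char mask)
{
  return UINTMAX_MAX / UCHAR_MAX * mask;
}
```

For an ASCII-compatible encoding where exactly the bytes below 0x80 are single-byte characters, this yields 0x80, the old high-bit test. For the unusual encoding mentioned in the entry, where only 0x7f and 0x80 are not unibyte, it yields 0xC0. A result of zero means no byte can be an encoding error.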
| 3824 |
|
|---|
| 3825 | 2016-01-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3826 |
|
|---|
| 3827 | grep: fix bug with invalid unibyte sequence
|
|---|
| 3828 | This was introduced by the recent binary-data-detection changes.
|
|---|
| 3829 | Problem reported by Norihiro Tanaka in: http://bugs.gnu.org/20526#86
|
|---|
| 3830 | * src/grep.c (HIBYTE, easy_encoding, init_easy_encoding): Remove,
|
|---|
| 3831 | replacing with ...
|
|---|
| 3832 | (uword_max, unibyte_mask, initialize_unibyte_mask): ... this new
|
|---|
| 3833 | constant, static var, and function. All uses changed. The
|
|---|
| 3834 | unibyte_mask var generalizes the old local var hibyte_mask, which
|
|---|
| 3835 | worked only for encodings where every byte with 0x80 turned off is
|
|---|
| 3836 | a single-byte character.
|
|---|
| 3837 | (buf_has_encoding_errors): Return false immediately if
|
|---|
| 3838 | unibyte_mask is zero, not whether the current encoding is unibyte.
|
|---|
| 3839 | The old test was incorrect in unibyte locales in which some bytes
|
|---|
| 3840 | were encoding errors.
|
|---|
| 3841 | * tests/pcre-z: Require UTF-8 locale, since the grep -z . test now
|
|---|
| 3842 | needs this. Use printf \0 rather than tr. Port the 'grep -z .'
|
|---|
| 3843 | test to platforms where the C locale says '\200' is an encoding
|
|---|
| 3844 | error. Use cmp rather than compare, as the file is binary and
|
|---|
| 3845 | so non-GNU diff might not work.
|
|---|
| 3846 | * tests/unibyte-binary: New file.
|
|---|
| 3847 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3848 |
|
|---|
| 3849 | 2016-01-01 Jim Meyering <meyering@fb.com>
|
|---|
| 3850 |
|
|---|
| 3851 | maint: update copyright year, bootstrap, init.sh
|
|---|
| 3852 | Run "make update-copyright" and then...
|
|---|
| 3853 |
|
|---|
| 3854 | * gnulib: Update to latest.
|
|---|
| 3855 | * tests/init.sh: Update from gnulib.
|
|---|
| 3856 | * bootstrap: Likewise.
|
|---|
| 3857 |
|
|---|
| 3858 | 2015-12-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3859 |
|
|---|
| 3860 | doc: clarify text vs binary match output
|
|---|
| 3861 | * NEWS:
|
|---|
| 3862 | * doc/grep.texi (File and Directory Selection):
|
|---|
| 3863 | Make it clearer that grep can now output matching text before
|
|---|
| 3864 | reporting a binary match. Problem reported by Norihiro Tanaka in:
|
|---|
| 3865 | http://bugs.gnu.org/20526#83
|
|---|
| 3866 |
|
|---|
| 3867 | doc: minor clarifications
|
|---|
| 3868 | * doc/grep.in.1, doc/grep.texi: Minor clarifications suggested by
|
|---|
| 3869 | Debian documentation patches. Problem reported by Santiago Ruano
|
|---|
| 3870 | Rincón in: http://bugs.gnu.org/18651
|
|---|
| 3871 |
|
|---|
| 3872 | grep: fix -l --line-buffered bug
|
|---|
| 3873 | Problem reported by Louis Sautier in: http://bugs.gnu.org/18750
|
|---|
| 3874 | * NEWS: Document this.
|
|---|
| 3875 | * src/grep.c (grep, grepdesc): If --line-buffered, flush
|
|---|
| 3876 | stdout after outputting newline (or null byte, if applicable).
|
|---|
| 3877 |
|
|---|
| 3878 | 2015-12-30 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3879 |
|
|---|
| 3880 | grep: remove duplicate init
|
|---|
| 3881 | * src/grep.c (print_line_middle): Remove duplicate initialization.
|
|---|
| 3882 |
|
|---|
| 3883 | grep: report line-buffered write error right away
|
|---|
| 3884 | * src/grep.c (prline): When line buffered, if there is a write
|
|---|
| 3885 | error, report it immediately rather than waiting until the next
|
|---|
| 3886 | line of output.
|
|---|
| 3887 |
|
|---|
| 3888 | grep: -c should keep counting after binary data
|
|---|
| 3889 | Problem and fix reported by Jaroslav Škarvada, and test case
|
|---|
| 3890 | reported by Norihiro Tanaka, in: http://bugs.gnu.org/22028
|
|---|
| 3891 | * NEWS: Document this.
|
|---|
| 3892 | * src/grep.c (grep): Don't stop counting merely because nulls seen.
|
|---|
| 3893 | * tests/pcre-count: New file.
|
|---|
| 3894 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3895 |
|
|---|
| 3896 | dfa: port to tinycc
|
|---|
| 3897 | * src/dfa.c (add_utf8_anychar): Put 'const' after type.
|
|---|
| 3898 | Problem reported by Aharon Robbins in:
|
|---|
| 3899 | http://bugs.gnu.org/22260
|
|---|
| 3900 |
|
|---|
| 3901 | grep: be less picky about encoding errors
|
|---|
| 3902 | This fixes a longstanding problem introduced in grep 2.21,
|
|---|
| 3903 | which is overly picky about binary files.
|
|---|
| 3904 | * NEWS:
|
|---|
| 3905 | * doc/grep.texi (File and Directory Selection): Document this.
|
|---|
| 3906 | * src/grep.c (input_textbin, textbin_is_binary, buffer_textbin)
|
|---|
| 3907 | (file_textbin):
|
|---|
| 3908 | Remove. All uses removed.
|
|---|
| 3909 | (encoding_error_output): New static var.
|
|---|
| 3910 | (buf_has_encoding_errors, buf_has_nulls, file_must_have_nulls):
|
|---|
| 3911 | New functions, which reuse bits
|
|---|
| 3912 | and pieces of the removed functions.
|
|---|
| 3913 | (lastout, print_line_head, print_line_middle, print_line_tail, prline)
|
|---|
| 3914 | (prpending, prtext, grepbuf):
|
|---|
| 3915 | Avoid use of const, now that we have
|
|---|
| 3916 | functions that require modifying a sentinel.
|
|---|
| 3917 | (print_line_head): New arg LEN. All uses changed.
|
|---|
| 3918 | (print_line_head, print_line_tail):
|
|---|
| 3919 | Return indicator whether the output line was printed.
|
|---|
| 3920 | All uses changed.
|
|---|
| 3921 | (print_line_middle): Exit early on encoding error.
|
|---|
| 3922 | (grep): Use new method for determining whether file is binary.
|
|---|
| 3923 | * src/grep.h (enum textbin, TEXTBIN_BINARY, TEXTBIN_UNKNOWN)
|
|---|
| 3924 | (TEXTBIN_TEXT, input_textbin): Remove decls. All uses removed.
|
|---|
| 3925 | * src/pcresearch.c (Pexecute): Remove multiline optimization,
|
|---|
| 3926 | since the main program no longer checks for encoding errors on input.
|
|---|
| 3927 | * tests/encoding-error: New file.
|
|---|
| 3928 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3929 |
|
|---|
| 3930 | 2015-12-29 Jim Meyering <meyering@fb.com>
|
|---|
| 3931 |
|
|---|
| 3932 | maint: correct (make sorted) order of test file names
|
|---|
| 3933 | * tests/Makefile.am (TESTS): Insert new test name in sorted order.
|
|---|
| 3934 |
|
|---|
| 3935 | 2015-12-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 3936 |
|
|---|
| 3937 | grep: --exclude matches trailing parts of args
|
|---|
| 3938 | Problem reported by Vincent Lefevre in:
|
|---|
| 3939 | http://bugs.gnu.org/22144
|
|---|
| 3940 | * NEWS:
|
|---|
| 3941 | * doc/grep.texi (File and Directory Selection): Document this.
|
|---|
| 3942 | * src/grep.c (excluded_patterns, excluded_directory_patterns):
|
|---|
| 3943 | Now 2-element arrays, with one element for subfiles and another
|
|---|
| 3944 | for command-line args. All uses changed. This implements the change.
|
|---|
| 3945 | (exclude_options): New function.
|
|---|
| 3946 | * tests/include-exclude: Test the change.
|
|---|
| 3947 |
|
|---|
| 3948 | 2015-12-18 Jim Meyering <meyering@fb.com>
|
|---|
| 3949 |
|
|---|
| 3950 | grep -oP: don't infloop when processing invalid UTF8 preceding a match
|
|---|
| 3951 | * src/pcresearch.c (Pexecute): When advancing SUBJECT past an
|
|---|
| 3952 | encoding error, don't blindly set P to that new value, since we
|
|---|
| 3953 | will soon compute SEARCH_OFFSET = P - SUBJECT, and mistakenly
|
|---|
| 3954 | making that difference too small would allow us to match some
|
|---|
| 3955 | previously-processed text, resulting in an infinite loop.
|
|---|
| 3956 | * NEWS (Bug fixes): Mention it.
|
|---|
| 3957 | * THANKS.in: Add Christian's name and email address.
|
|---|
| 3958 | * tests/pcre-invalid-utf8-infloop: New file.
|
|---|
| 3959 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 3960 | Reported by Christian Boltz in http://debbugs.gnu.org/22181
|
|---|
| 3961 | Introduced by commit v2.21-37-g14f8e48.
|
|---|
| 3962 |
|
|---|
| 3963 | 2015-11-04 Jim Meyering <meyering@fb.com>
|
|---|
| 3964 |
|
|---|
| 3965 | tests: mark performance-related tests as expensive
|
|---|
| 3966 | These performance-related tests are slightly failure prone due to
|
|---|
| 3967 | varying system load during the two runs.
|
|---|
| 3968 | Marking these tests as "expensive" makes it so they are no longer run
|
|---|
| 3969 | via "make check". You can still run them via "make check-expensive".
|
|---|
| 3970 | This makes them less likely to be run by regular users.
|
|---|
| 3971 | * tests/long-pattern-perf: Use expensive_.
|
|---|
| 3972 | * tests/mb-non-UTF8-performance: Likewise.
|
|---|
| 3973 | Reported by Jaroslav Skarvada in http://debbugs.gnu.org/21826
|
|---|
| 3974 | and by Andreas Schwab in http://debbugs.gnu.org/21812.
|
|---|
| 3975 |
|
|---|
| 3976 | 2015-11-01 Jim Meyering <meyering@fb.com>
|
|---|
| 3977 |
|
|---|
| 3978 | maint: post-release administrivia
|
|---|
| 3979 | * NEWS: Add header line for next release.
|
|---|
| 3980 | * .prev-version: Record previous version.
|
|---|
| 3981 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 3982 |
|
|---|
| 3983 | version 2.22
|
|---|
| 3984 | * NEWS: Record release date.
|
|---|
| 3985 |
|
|---|
| 3986 | tests: pcre-jitstack: upon failure, retry with no stack size limit
|
|---|
| 3987 | * tests/pcre-jitstack: Don't let an example that provokes inordinate
|
|---|
| 3988 | stack space use cause a test failure. Thanks to reports from and
|
|---|
| 3989 | analysis by Bruce Dubbs; see http://debbugs.gnu.org/21755
|
|---|
| 3990 |
|
|---|
| 3991 | 2015-10-27 Jim Meyering <meyering@fb.com>
|
|---|
| 3992 |
|
|---|
| 3993 | maint: update THANKS.in
|
|---|
| 3994 | * THANKS.in: Add name+email of those who found and reported
|
|---|
| 3995 | the bug that made grep -E '^x|x$' match any "x".
|
|---|
| 3996 |
|
|---|
| 3997 | 2015-10-25 Zev Weiss <zev@bewilderbeest.net>
|
|---|
| 3998 |
|
|---|
| 3999 | dfa: plug a memory leak in dfamust
|
|---|
| 4000 | * src/dfa.c (dfamust): Ensure MP is freed, by refraining
|
|---|
| 4001 | from returning early when, at "done:" *RESULT is NULL.
|
|---|
| 4002 |
|
|---|
| 4003 | 2015-10-25 Jim Meyering <meyering@fb.com>
|
|---|
| 4004 |
|
|---|
| 4005 | gnulib: update to latest
|
|---|
| 4006 | * gnulib: Pull in one more portability fix:
|
|---|
| 4007 | stdalign: port to Sun C 5.9
|
|---|
| 4008 |
|
|---|
| 4009 | 2015-10-24 Jim Meyering <meyering@fb.com>
|
|---|
| 4010 |
|
|---|
| 4011 | gnulib: update to latest, for portability fixes
|
|---|
| 4012 | * gnulib: Pull in changes like these:
|
|---|
| 4013 | fts: port to C11 alignof
|
|---|
| 4014 | stdalign: work around pre-4.9 GCC x86 bug
|
|---|
| 4015 |
|
|---|
| 4016 | maint: NEWS: correct/amend
|
|---|
| 4017 | * NEWS: Move the long-regexp-performance-improvement from
|
|---|
| 4018 | "Bug fixes" to "Improvements." Say more and include an example.
|
|---|
| 4019 | The -Fw degradation was introduced in commit v2.18-125-g94555dd
|
|---|
| 4020 |
|
|---|
| 4021 | tests: avoid spurious failure on OpenBSD 5.8
|
|---|
| 4022 | * tests/fedora: Don't rely on "diff - FILE" reading from stdin.
|
|---|
| 4023 | Reported privately by Nelson Beebe.
|
|---|
| 4024 |
|
|---|
| 4025 | 2015-10-17 Jim Meyering <meyering@fb.com>
|
|---|
| 4026 |
|
|---|
| 4027 | gnulib: update to latest; also bootstrap and tests/init.sh
|
|---|
| 4028 | * bootstrap: Update from gnulib.
|
|---|
| 4029 | * tests/init.sh: Likewise.
|
|---|
| 4030 | * gnulib: Update submodule to latest.
|
|---|
| 4031 |
|
|---|
| 4032 | build: avoid spurious bootstrap failure involving pkg.m4
|
|---|
| 4033 | Running ./bootstrap could fail mistakenly at the very end in
|
|---|
| 4034 | its attempt to obtain a copy of pkg.m4. It would search only
|
|---|
| 4035 | $(aclocal --print-ac-dir) and some other directories, but not
|
|---|
| 4036 | those listed in $(aclocal --print-ac-dir)/dirlist.
|
|---|
| 4037 | * bootstrap.conf (bootstrap_post_import_hook): Also search the
|
|---|
| 4038 | directories named in $(aclocal --print-ac-dir)/dirlist when that
|
|---|
| 4039 | file exists with nonzero size.
|
|---|
| 4040 |
|
|---|
| 4041 | 2015-10-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4042 |
|
|---|
| 4043 | maint: add news item
|
|---|
| 4044 | * NEWS: Document grep -Fw speedup.
|
|---|
| 4045 |
|
|---|
| 4046 | grep: simplify previous change
|
|---|
| 4047 | * src/grep.c (main): Simplify recently-changed grep -Fw test.
|
|---|
| 4048 |
|
|---|
| 4049 | 2015-10-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4050 |
|
|---|
| 4051 | grep: use grep matcher for grep -Fw when unibyte
|
|---|
| 4052 | In single byte locales with grep -Fw, prefer the grep matcher to the
|
|---|
| 4053 | kwset matcher, as the former uses KWset and a DFA, whereas the latter
|
|---|
| 4054 | calls kwsexec many times until it matches a word.
|
|---|
| 4055 | * src/grep.c (main): For grep -Fw in single-byte locales, convert the
|
|---|
| 4056 | fgrep pattern into a grep pattern.
|
|---|
| 4057 |
|
|---|
| 4058 | 2015-10-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4059 |
|
|---|
| 4060 | grep: use memchr/memrchr
|
|---|
| 4061 | * src/kwsearch.c (Fexecute): Prefer memchr and memrchr to doing it
|
|---|
| 4062 | by hand.
|
|---|
| 4063 |
|
|---|
| 4064 | 2015-10-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4065 |
|
|---|
| 4066 | grep: improve performance of grep -Fw
|
|---|
| 4067 | * src/kwsearch.c (Fexecute): To decide whether the character before
|
|---|
| 4068 | a match is a word character, grep -Fw used to rescan from the head of
|
|---|
| 4069 | the buffer, which was extremely slow. Now, on finding a potential match,
|
|---|
| 4070 | it looks for the previous newline and examines from there.
|
|---|
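
The "look for the previous newline" step can be sketched in C as follows. The helper line_start is a hypothetical name, and the backward loop is written by hand only to keep the sketch portable; where the GNU memrchr extension is available it could replace the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the start of the line containing MATCH, that is,
   the byte just after the nearest EOL byte before MATCH, or BUF if
   there is no earlier EOL byte.  BUF must not be after MATCH.  */
static char const *
line_start (char const *buf, char const *match, char eol)
{
  assert (buf <= match);
  char const *p = match;
  while (buf < p && p[-1] != eol)
    p--;
  return p;
}
```

Any examination of the characters preceding a candidate match then only needs to cover the current line, which is usually far shorter than the distance back to the head of the buffer.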
| 4071 |
|
|---|
| 4072 | 2015-10-13 Jim Meyering <meyering@fb.com>
|
|---|
| 4073 |
|
|---|
| 4074 | maint: use single quote rather than UTF-8 multi-byte version
|
|---|
| 4075 | * tests/backref-alt: Translate unnecessary non-ASCII in comment.
|
|---|
| 4076 |
|
|---|
| 4077 | 2015-10-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4078 |
|
|---|
| 4079 | dfa: make the executable a bit smaller
|
|---|
| 4080 | * src/dfa.c (dfamust): Hoist MB_CUR_MAX calculation out of loops.
|
|---|
| 4081 |
|
|---|
| 4082 | 2015-10-13 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4083 |
|
|---|
| 4084 | dfa: fix bug in alternate of sub-patterns that differ only in constraints
|
|---|
| 4085 | Fix a bug where a line incorrectly matches alternates of sub-patterns
|
|---|
| 4086 | that differ only in the constraints, e.g., the ERE '^a|a$'.
|
|---|
| 4087 | Reported by Greg Boyd in: http://debbugs.gnu.org/21670
|
|---|
| 4088 | * src/dfa.c (dfamust): For a pattern with constraints, check that it is
|
|---|
| 4089 | matched including the constraints, to judge whether it is exact.
|
|---|
| 4090 |
|
|---|
| 4091 | dfa: fix off-by-one error
|
|---|
| 4092 | * src/dfa.c (dfamust): Fix off-by-one error in computing 'must' length,
|
|---|
| 4093 | which caused the 'must' to be too short. See:
|
|---|
| 4094 | http://bugs.gnu.org/21670#28
|
|---|
| 4095 |
|
|---|
| 4096 | 2015-10-12 Jim Meyering <meyering@fb.com>
|
|---|
| 4097 |
|
|---|
| 4098 | doc: NEWS: mention a bug fix
|
|---|
| 4099 | * NEWS (Bug fixes): Describe it.
|
|---|
| 4100 | This bug was introduced by commit v2.18-85-g2c94326
|
|---|
| 4101 | and fixed by commit v2.21-51-g256a4b4.
|
|---|
| 4102 |
|
|---|
| 4103 | 2015-10-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4104 |
|
|---|
| 4105 | tests: add test case for Bug#21670
|
|---|
| 4106 | * tests/options: Add test #4 to catch Bug#21670.
|
|---|
| 4107 | Also, do not overescape # in shell strings.
|
|---|
| 4108 |
|
|---|
| 4109 | 2015-09-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4110 |
|
|---|
| 4111 | Add test for pop_fail_stack bug
|
|---|
| 4112 | Problem reported by Hanno Böck in: http://bugs.gnu.org/21513
|
|---|
| 4113 | If you use --with-included-regex the bug fix is in gnulib, here:
|
|---|
| 4114 | http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=5513b40999149090987a0341c018d05d3eea1272
|
|---|
| 4115 | If you use glibc, the bug fix has not been installed yet.
|
|---|
| 4116 | * tests/Makefile.am (XFAIL_TESTS): Add backref-alt if system matcher.
|
|---|
| 4117 | (TESTS): Add backref-alt.
|
|---|
| 4118 | * tests/backref-alt: New file.
|
|---|
| 4119 | * tests/triple-backref: Remove unused var.
|
|---|
| 4120 | Don't skip if tested with glibc, as Makefile.am now handles this.
|
|---|
| 4121 |
|
|---|
| 4122 | build: update gnulib submodule to latest
|
|---|
| 4123 |
|
|---|
| 4124 | 2015-08-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4125 |
|
|---|
| 4126 | grep: avoid use of uninitialized variable
|
|---|
| 4127 | EGexecute would use "backref" uninitialized.
|
|---|
| 4128 | While that could have no bearing on correctness, it could
|
|---|
| 4129 | impact performance, via an unnecessary use of regexp.
|
|---|
| 4130 | * src/dfasearch.c (EGexecute): Initialize backref.
|
|---|
| 4131 | Reported as http://debbugs.gnu.org/21273
|
|---|
| 4132 | Introduced by commit v2.21-55-gea0ebaa.
|
|---|
| 4133 |
|
|---|
| 4134 | 2015-08-12 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4135 |
|
|---|
| 4136 | grep: remove fgrep code for case insensitive match
|
|---|
| 4137 | The fgrep matcher is no longer called in case insensitive matching,
|
|---|
| 4138 | so remove the code to support it.
|
|---|
| 4139 | * src/kwsearch.c (mb_case_map_apply): Remove function.
|
|---|
| 4140 | (Fexecute): Remove now-unused code.
|
|---|
| 4141 |
|
|---|
| 4142 | 2015-08-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4143 |
|
|---|
| 4144 | dfa: optimize [x-x]
|
|---|
| 4145 | * src/dfa.c (parse_bracket_exp): Treat [x-x] as if it were [x].
|
|---|
| 4146 | This also pacifies GCC, which otherwise complains about wc2
|
|---|
| 4147 | being set but not used.
|
|---|
| 4148 |
|
|---|
| 4149 | 2015-08-12 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4150 |
|
|---|
| 4151 | dfa: remove unused multibyte support
|
|---|
| 4152 | Now regex should be used for ranges, collating elements, and equivalence
|
|---|
| 4153 | classes in non-POSIX locales, so remove the code that supported these features.
|
|---|
| 4154 | * src/dfa.c (struct mb_char_classes): Remove members ch_classes,
|
|---|
| 4155 | nch_classes, ranges, nranges, equivs, nequivs, coll_elems, ncoll_elems.
|
|---|
| 4156 | All uses removed.
|
|---|
| 4157 | (match_mb_charset): Remove function.
|
|---|
| 4158 |
|
|---|
| 4159 | 2015-08-01 Jim Meyering <meyering@fb.com>
|
|---|
| 4160 |
|
|---|
| 4161 | tests: mb-non-UTF8-performance: use new function
|
|---|
| 4162 | * tests/mb-non-UTF8-performance: Rewrite to use
|
|---|
| 4163 | the user-time measuring function in init.cfg.
|
|---|
| 4164 |
|
|---|
| 4165 | tests: long-pattern-perf: measure user time, not elapsed
|
|---|
| 4166 | Measuring user time makes this test less prone to false
|
|---|
| 4167 | positive failure, and also lets us use a tighter bound.
|
|---|
| 4168 | * tests/long-pattern-perf: Measure elapsed user time rather than
|
|---|
| 4169 | wall-clock time, to permit a tighter bound on the ratio of
|
|---|
| 4170 | N-to-10N timings. Suggested by Giuseppe Ottaviano.
|
|---|
| 4171 | Also, use regexps built from mostly 5-digit numbers, so that the 10:1
|
|---|
| 4172 | ratio applies to lines of "seq" output as well as to total bytes.
|
|---|
| 4173 |
|
|---|
| 4174 | tests: new function to measure elapsed user time
|
|---|
| 4175 | * tests/init.cfg (user_time_): New function.
|
|---|
| 4176 |
|
|---|
| 4177 | 2015-07-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4178 |
|
|---|
| 4179 | dfa: remove word delimiter support for multibyte locales
|
|---|
| 4180 | DFA supports word delimiter expressions, but it does not behave
|
|---|
| 4181 | correctly for multibyte locales. Even if it were to be fixed,
|
|---|
| 4182 | the DFA matcher's performance would be no better than that of regex.
|
|---|
| 4183 | Thus, this change removes DFA support for word delimiter expressions
|
|---|
| 4184 | in multibyte locales.
|
|---|
| 4185 |
|
|---|
| 4186 | * src/dfa.c (dfa_supported): Return false also when a pattern uses any
|
|---|
| 4187 | word delimiter expression in a multibyte locale.
|
|---|
| 4188 |
|
|---|
| 4189 | 2015-07-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4190 |
|
|---|
| 4191 | dfa: avoid execution for a pattern including an unsupported expression
|
|---|
| 4192 | If a pattern includes a construct unsupported by the DFA matcher,
|
|---|
| 4193 | the DFA search would fail in most cases. Make dfaexec immediately
|
|---|
| 4194 | return for any such pattern.
|
|---|
| 4195 |
|
|---|
| 4196 | * src/dfa.c (struct dfa_state) [has_backref, has_mbcset]: Remove members
|
|---|
| 4197 | and all uses.
|
|---|
| 4198 | (dfaexec_main): Remove 'backref' parameter. Update callers.
|
|---|
| 4199 | (dfaexec_noop): New function.
|
|---|
| 4200 | (dfa_supported): New function.
|
|---|
| 4201 | (dfassbuild): Remove now-unused code.
|
|---|
| 4202 | (dfacomp): When a pattern uses a DFA-unsupported construct, do not
|
|---|
| 4203 | waste time performing any further analysis.
|
|---|
| 4204 |
|
|---|
| 4205 | 2015-07-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4206 |
|
|---|
| 4207 | dfa: DEBUG: print detail of DFA states
|
|---|
| 4208 | When compiled with -DDEBUG, grep outputs tokens etc.
|
|---|
| 4209 | With this change, also print DFA states and transitions.
|
|---|
| 4210 | This change is very useful when debugging those.
|
|---|
| 4211 |
|
|---|
| 4212 | * src/dfa.c (prtok) [DEBUG]: Change `%c' to `%02x' in printf format.
|
|---|
| 4213 | (state_index) [DEBUG]: Print detail of new state.
|
|---|
| 4214 | (dfastate) [DEBUG]: Print detail of DFA states.
|
|---|
| 4215 | Reported as http://debbugs.gnu.org/18707
|
|---|
| 4216 |
|
|---|
| 4217 | 2015-07-18 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4218 |
|
|---|
| 4219 | tests: sjis-mb: accept two more locales
|
|---|
| 4220 | * tests/sjis-mb: Accept the ja_JP.SJIS and ja_JP.PCK locales
|
|---|
| 4221 | as well as ja_JP.SHIFT_JIS, so this test is less likely to
|
|---|
| 4222 | be skipped unnecessarily. Reported as http://bugs.gnu.org/18983
|
|---|
| 4223 |
|
|---|
| 4224 | 2015-07-18 Jim Meyering <meyering@fb.com>
|
|---|
| 4225 |
|
|---|
| 4226 | tests: add a test for the performance fix
|
|---|
| 4227 | * tests/long-pattern-perf: New file.
|
|---|
| 4228 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4229 |
|
|---|
| 4230 | 2015-07-18 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4231 |
|
|---|
| 4232 | dfa: speed up handling of long pattern
|
|---|
| 4233 | DFA tries to find a long sequence of characters that must appear
|
|---|
| 4234 | in any matching line. However, when a pattern is long (length N),
|
|---|
| 4235 | it is very slow, because it makes O(N^2) strstr calls.
|
|---|
| 4236 | This change reduces that to O(N) by processing each sequence of
|
|---|
| 4237 | adjacent "regular" characters as a group.
|
|---|
| 4238 |
|
|---|
| 4239 | Compare the run times of this command before and after this change:
|
|---|
| 4240 | (on an i7-4770S CPU @ 3.10GHz using rawhide (~Fedora 22) and compiled
|
|---|
| 4241 | with gcc 6.0.0 20150627)
|
|---|
| 4242 | : | env time -f %e grep -f <(seq -s '' 9999)
|
|---|
| 4243 | Before: 0.85
|
|---|
| 4244 | After: 0.02
|
|---|
| 4245 |
|
|---|
| 4246 | * src/dfa.c (dfamust): Process each string of concatenated normal
|
|---|
| 4247 | characters as a unit.
|
|---|
| 4248 | * NEWS (Improvement): Mention it.
|
|---|
| 4249 | Prompted by a bug report and patch by Ivan Yanikov
|
|---|
| 4250 | in http://bugs.gnu.org/15191#5
|
|---|
| 4251 |
|
|---|
| 4252 | 2015-07-17 Jim Meyering <meyering@fb.com>
|
|---|
| 4253 |
|
|---|
| 4254 | tests: fix mis-applied patch.
|
|---|
| 4255 | * tests/include-exclude: I applied "|sort" to the wrong creation
|
|---|
| 4256 | of "out", and didn't push the same patch that I'd tested.
|
|---|
| 4257 |
|
|---|
| 4258 | tests: avoid FS-dependent false-positive failure
|
|---|
| 4259 | * tests/include-exclude: Sort file name list, so that this test
|
|---|
| 4260 | is not sensitive to the order in which those names are returned
|
|---|
| 4261 | via readdir. I noticed the failure on a Fedora 21 system using ext4.
|
|---|
| 4262 | Also fix a typo: s/framework_failure+/framework_failure_/
|
|---|
| 4263 |
|
|---|
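A minimal sketch of the sorting technique from the entry above, assuming a hypothetical helper named `list_sorted_` (the real test sorts its own "out" file instead). Names come back from readdir in a file-system-dependent order, so the list is sorted in the C locale before it is compared.

```shell
# Hypothetical helper: list the regular files under a directory in a
# stable, locale-independent order.
list_sorted_() {
  (cd "$1" && find . -type f | LC_ALL=C sort)
}
```

Comparing sorted output keeps the expected result identical on ext4, tmpfs, and other file systems.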
| 4264 | 2015-07-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4265 |
|
|---|
| 4266 | grep: fix bug with --exclude-dir and command line
|
|---|
| 4267 | Reported by Aron Griffis in: http://bugs.gnu.org/21027
|
|---|
| 4268 | * NEWS: Document this.
|
|---|
| 4269 | * src/grep.c (grepdirent): Don't check whether the file is skipped
|
|---|
| 4270 | when on the command line, as that's the caller's responsibility.
|
|---|
| 4271 | (main): Anchor the exclude patterns.
|
|---|
| 4272 | * tests/include-exclude: Adjust test case to match fixed behavior.
|
|---|
| 4273 | Add some more test cases.
|
|---|
| 4274 |
|
|---|
| 4275 | tests: fix $? typo in null-byte
|
|---|
| 4276 | * tests/null-byte: Don't assume $? survives an invocation of 'test'.
|
|---|
| 4277 |
|
|---|
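The `$?` pitfall fixed in tests/null-byte can be illustrated as follows. This is a sketch, not the code from that test; `run_and_report_` is a hypothetical name. The point is to save the exit status before any other command, including 'test', has a chance to overwrite it.

```shell
# Run a command and print its exit status, even though another command
# runs in between and resets $?.
run_and_report_() {
  status=0
  "$@" || status=$?     # capture the status immediately
  test -n "$1"          # from here on, $? is the status of 'test'
  echo "$status"
}
```

For example, `run_and_report_ sh -c 'exit 3'` prints 3.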
| 4278 | 2015-07-05 Jim Meyering <meyering@fb.com>
|
|---|
| 4279 |
|
|---|
| 4280 | maint: dfa: use unsigned types where appropriate
|
|---|
| 4281 | * src/dfa.c (case_folded_counterparts): Return unsigned int, not int.
|
|---|
| 4282 | Change type of two locals to unsigned int, to reflect that their
|
|---|
| 4283 | values are never negative.
|
|---|
| 4284 | (parse_bracket_exp): Adjust type of result at each use, as well
|
|---|
| 4285 | as that of related index variables.
|
|---|
| 4286 |
|
|---|
| 4287 | 2015-07-04 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4288 |
|
|---|
| 4289 | dfa: build struct dfamust on demand
|
|---|
| 4290 | If we won't use KWset, do not build a "struct dfamust".
|
|---|
| 4291 | Now it is built only when needed.
|
|---|
| 4292 | * src/dfa.c (struct dfa) [musts]: Remove member.
|
|---|
| 4293 | (dfacomp): Don't build dfamust here.
|
|---|
| 4294 | (dfamustfree): New function to free a struct dfamust.
|
|---|
| 4295 | (dfamust): Make it a global function, and make it return a pointer
|
|---|
| 4296 | to a malloc'd struct dfamust.
|
|---|
| 4297 | (dfamusts): Remove it.
|
|---|
| 4298 | * src/dfa.h (struct dfamust) [next]: Remove member.
|
|---|
| 4299 | In the implementation preceding this patch, there was
|
|---|
| 4300 | never more than one of these in a given "struct dfa".
|
|---|
| 4301 | (dfamustfree, dfamust): Add prototypes.
|
|---|
| 4302 | (dfamusts): Remove prototype.
|
|---|
| 4303 | (dfaalloc): Declare with _GL_ATTRIBUTE_MALLOC.
|
|---|
| 4304 | To make that symbol usable there, move the inclusion
|
|---|
| 4305 | of "xalloc.h" from dfa.c to this file, dfa.h.
|
|---|
| 4306 | * src/dfasearch.c (kwsmusts): Adapt to use the new interface.
|
|---|
| 4307 | Update the comments to reflect reality.
|
|---|
| 4308 | This addresses http://bugs.gnu.org/17715
|
|---|
| 4309 |
|
|---|
| 4310 | 2015-07-04 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4311 |
|
|---|
| 4312 | grep: use recent gnulib syntax bits
|
|---|
| 4313 | * src/grep.c (Gcompile, Ecompile): Use plain RE_SYNTAX_GREP
|
|---|
| 4314 | and RE_SYNTAX_EGREP, now that we assume a recent-enough gnulib.
|
|---|
| 4315 |
|
|---|
| 4316 | maint: ignore gendocs_template_min
|
|---|
| 4317 | * doc/.gitignore: Add '/gendocs_template_min'.
|
|---|
| 4318 |
|
|---|
| 4319 | build: update gnulib submodule to latest
|
|---|
| 4320 |
|
|---|
| 4321 | dfa: '.' and '[^x]' now consistently match newline
|
|---|
| 4322 | * src/dfa.c (parse_bracket_exp, lex, add_utf8_anychar)
|
|---|
| 4323 | (match_anychar): RE_DOT_NEWLINE and RE_HAT_LISTS_NOT_NEWLINE
|
|---|
| 4324 | are about LF, not about eolbyte. This patch does not affect
|
|---|
| 4325 | 'grep', but may affect other users of dfa.c.
|
|---|
| 4326 |
|
|---|
| 4327 | grep: -z '[^x]' now consistently matches newline
|
|---|
| 4328 | Problem reported by Norihiro Tanaka in: http://bugs.gnu.org/20974#19
|
|---|
| 4329 | * NEWS: Document this.
|
|---|
| 4330 | * src/grep.c (Gcompile, Ecompile): Clear RE_HAT_LISTS_NOT_NEWLINE.
|
|---|
| 4331 | * tests/utf8-bracket: Test this.
|
|---|
| 4332 |
|
|---|
| 4333 | 2015-07-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4334 |
|
|---|
| 4335 | grep: -z '.' now consistently matches newline
|
|---|
| 4336 | Problem reported by Balazs Kezes in: http://bugs.gnu.org/20974
|
|---|
| 4337 | * NEWS: Document this.
|
|---|
| 4338 | * tests/utf8-bracket: New file, to test for this bug.
|
|---|
| 4339 | * src/grep.c (Gcompile, Ecompile): Also specify RE_DOT_NEWLINE.
|
|---|
| 4340 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4341 |
|
|---|
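The behavior described in the entry above can be checked with a short sketch. It assumes GNU grep 2.22 or later is the `grep` on PATH; `count_z_records_` is a hypothetical helper name.

```shell
# Count NUL-terminated records on stdin that match the pattern.
# With -z, a record may contain newlines, and '.' should match them.
count_z_records_() {
  grep -z -c -- "$1"
}
```

With a grep that includes this fix, `printf 'a\nb\0' | count_z_records_ 'a.b'` should report one matching record, because the newline between "a" and "b" is matched by '.'.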
| 4342 | grep: simplify print_line_middle slightly
|
|---|
| 4343 | * src/grep.c (print_line_middle): Simplify.
|
|---|
| 4344 |
|
|---|
| 4345 | grep: don't mishandle left context in -P
|
|---|
| 4346 | http://bugs.gnu.org/20957
|
|---|
| 4347 | * src/pcresearch.c (jit_exec): New arg SEARCH_OFFSET.
|
|---|
| 4348 | Caller changed.
|
|---|
| 4349 | (Pexecute): Pass the left context to pcre_exec, so that PCRE
|
|---|
| 4350 | regular-expression matching can see it.
|
|---|
| 4351 | * tests/pcre-context: New file, to test for this bug.
|
|---|
| 4352 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4353 |
|
|---|
| 4354 | 2015-06-28 Jim Meyering <meyering@fb.com>
|
|---|
| 4355 |
|
|---|
| 4356 | tests/case-fold-backref: factor test
|
|---|
| 4357 |
|
|---|
| 4358 | 2015-06-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4359 |
|
|---|
| 4360 | grep: don't hang on command-line fifo if -D skip
|
|---|
| 4361 | * NEWS: Document this.
|
|---|
| 4362 | * src/grep.c (skip_devices):
|
|---|
| 4363 | New function, with code taken from grepdirent.
|
|---|
| 4364 | (grepdirent): Use it. Avoid an unnecessary initialization.
|
|---|
| 4365 | (grepfile): If skipping devices, open files with O_NONBLOCK.
|
|---|
| 4366 | Throw in O_NOCTTY while we're at it.
|
|---|
| 4367 | (grepdesc): Skip devices here, too. Not only does this fix the
|
|---|
| 4368 | bug, it fixes an unlikely race condition if some other process
|
|---|
| 4369 | renames a device between fstatat and openat.
|
|---|
| 4370 | * tests/skip-device: Add a test for this bug.
|
|---|
| 4371 |
|
|---|
| 4372 | grep: minor tweaks
|
|---|
| 4373 | * src/grep.c (main): Change recently-added static vars to be
|
|---|
| 4374 | constants, which makes them sharable. Prefer 'return' to 'exit'
|
|---|
| 4375 | when returning/exiting from 'main'. Move decl closer to first use
|
|---|
| 4376 | and rename local from 'ok' (which was confusing) to 'status'.
|
|---|
| 4377 | Prefer named constant STDOUT_FILENO to unnamed constant 1.
|
|---|
| 4378 |
|
|---|
| 4379 | 2015-06-26 Jim Meyering <meyering@fb.com>
|
|---|
| 4380 |
|
|---|
| 4381 | maint: unify three argv-processing calls
|
|---|
| 4382 | * src/grep.c (main): Unify three calls to grep_commandline_arg.
|
|---|
| 4383 |
|
|---|
| 4384 | maint: alphabetize anonymous enum member names
|
|---|
| 4385 |
|
|---|
| 4386 | 2015-05-30 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4387 |
|
|---|
| 4388 | test: tighten tests for bracket exprs
|
|---|
| 4389 | * tests/posix-bracket: Test '[a-a[.-.]--]'.
|
|---|
| 4390 | Also, test that failures are with status 1
|
|---|
| 4391 | (nonmatching data), not status 2 (invalid expressions).
|
|---|
| 4392 |
|
|---|
| 4393 | 2015-04-26 Jim Meyering <meyering@fb.com>
|
|---|
| 4394 |
|
|---|
| 4395 | maint: update bootstrap from gnulib
|
|---|
| 4396 | * bootstrap: Update from gnulib.
|
|---|
| 4397 |
|
|---|
| 4398 | maint: reword a diagnostic not to trigger leading capital check
|
|---|
| 4399 | * src/pcresearch.c: Reword diagnostic to avoid "make syntax-check"
|
|---|
| 4400 | failure.
|
|---|
| 4401 |
|
|---|
| 4402 | maint: sort test names in tests/Makefile.am and add syntax-check rule
|
|---|
| 4403 | * cfg.mk (sc_sorted_tests): New rule.
|
|---|
| 4404 | * tests/Makefile.am (TESTS): Alphabetize.
|
|---|
| 4405 |
|
|---|
| 4406 | 2015-04-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4407 |
|
|---|
| 4408 | dfa: make find_pred return NULL for an invalid predicate
|
|---|
| 4409 | This could never happen when invoked via grep, but could have triggered
|
|---|
| 4410 | a bug if dfa.c's find_pred function were invoked by some other program.
|
|---|
| 4411 | * src/dfa.c (find_pred): Return NULL for an invalid predicate.
|
|---|
| 4412 | * tests/invalid-char-class: New file to test for this.
|
|---|
| 4413 | * tests/Makefile.am (TESTS): Add that new file name to the list.
|
|---|
| 4414 | This addresses http://debbugs.gnu.org/18631
|
|---|
| 4415 |
|
|---|
| 4416 | 2015-04-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4417 |
|
|---|
| 4418 | build: improve pkg-config doc and error handling
|
|---|
| 4419 | Error-handling improvement suggested by Mike Frysinger in:
|
|---|
| 4420 | http://bugs.gnu.org/16757#29
|
|---|
| 4421 | * NEWS: Document pkg-config changes.
|
|---|
| 4422 | * README-prereq: pkg-config is now a prereq when building from
|
|---|
| 4423 | repository.
|
|---|
| 4424 | * m4/pcre.m4 (gl_FUNC_PCRE): Report an error if pcre is explicitly
|
|---|
| 4425 | requested but not available. Defer to user-supplied PCRE_CFLAGS
|
|---|
| 4426 | and PCRE_LIBS.
|
|---|
| 4427 |
|
|---|
| 4428 | build: remove typo and don't bother with /usr/include/pcre
|
|---|
| 4429 | Problem reported by Holger Bruenjes.
|
|---|
| 4430 | * m4/pcre.m4: Remove test for /usr/include/libpng (a typo).
|
|---|
| 4431 | Come to think of it, don't bother worrying about
|
|---|
| 4432 | /usr/include/pcre, as hosts with that problem can use pkg-config
|
|---|
| 4433 | or configure with CFLAGS by hand.
|
|---|
| 4434 |
|
|---|
| 4435 | build: use pkg-config (if available) to configure libpcre
|
|---|
| 4436 | Problem reported by Mike Frysinger in: http://bugs.gnu.org/16757
|
|---|
| 4437 | * bootstrap.conf (bootstrap_post_import_hook):
|
|---|
| 4438 | Copy pkg-config's pkg.m4.
|
|---|
| 4439 | * configure.ac: Invoke PKG_PROG_PKG_CONFIG.
|
|---|
| 4440 | * m4/pcre.m4 (gl_FUNC_PCRE): Rewrite to use pkg-config if
|
|---|
| 4441 | available, and to test that pcre_compile can be linked to.
|
|---|
| 4442 | * src/Makefile.am (AM_CFLAGS): Add PCRE_CFLAGS.
|
|---|
| 4443 | (grep_LDADD): Add PCRE_LIBS.
|
|---|
| 4444 | * src/pcresearch.c: Simply include <pcre.h> if HAVE_LIBPCRE,
|
|---|
| 4445 | since 'configure' arranges for the appropriate -I option now.
|
|---|
| 4446 |
|
|---|
| 4447 | 2015-03-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4448 |
|
|---|
| 4449 | grep: output "." file name in diagnostic
|
|---|
| 4450 | This is bug C as reported by David Grayson in:
|
|---|
| 4451 | http://bugs.gnu.org/16444#18
|
|---|
| 4452 | This bug occurs only in obscure circumstances, and I didn't see
|
|---|
| 4453 | how to write a reasonable test case for it.
|
|---|
| 4454 | * src/grep.c (filename_prefix_len): Remove, replacing with ...
|
|---|
| 4455 | (omit_dot_slash): New static var. All uses of the former replaced
|
|---|
| 4456 | with uses of the latter.
|
|---|
| 4457 | (grepdirent): Don't add 2 if the filename is just ".".
|
|---|
| 4458 |
|
|---|
| 4459 | egrep, fgrep: just use what's in PATH
|
|---|
| 4460 | * src/egrep.sh: Don't monkey with PATH; just use whatever 'grep'
|
|---|
| 4461 | is in the path. This is simpler, and lets the user specify
|
|---|
| 4462 | default options with a script for only grep, with no need for
|
|---|
| 4463 | egrep and fgrep scripts.
|
|---|
| 4464 | Fixes: bug#19998
|
|---|
| 4465 |
|
|---|
| 4466 | doc: give a script wrapper example
|
|---|
| 4467 | * doc/grep.texi (Environment Variables): Give an example of a
|
|---|
| 4468 | wrapper script, as an alternative to using GREP_OPTIONS.
|
|---|
| 4469 | Fixes: bug#19998
|
|---|
| 4470 |
|
|---|
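A wrapper of the kind mentioned above might look like the following sketch. It is not the example from doc/grep.texi; the function name `grep_` and the choice of `-i` as the default option are assumptions made for illustration.

```shell
# Hypothetical wrapper: always pass a default option, followed by the
# caller's own arguments, instead of relying on GREP_OPTIONS.
grep_() {
  command grep -i "$@"
}
```

Because egrep and fgrep now run whatever 'grep' is first in PATH, installing such a wrapper as a script named grep would apply the same defaults to them as well.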
| 4471 | doc: clarify how -a matches
|
|---|
| 4472 | * doc/grep.in.1, doc/grep.texi (File and Directory Selection):
|
|---|
| 4473 | Give an example of how non-text bytes affect pattern matching in
|
|---|
| 4474 | binary files.
|
|---|
| 4475 | Fixes: bug#20080
|
|---|
| 4476 |
|
|---|
| 4477 | 2015-02-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4478 |
|
|---|
| 4479 | Cover the non-INSTALL case
|
|---|
| 4480 | * README: Mention what to do if there is no INSTALL file.
|
|---|
| 4481 | Fixes: bug#19928
|
|---|
| 4482 |
|
|---|
| 4483 | 2015-02-11 Jim Meyering <meyering@fb.com>
|
|---|
| 4484 |
|
|---|
| 4485 | maint: use ASAN-poisoning more carefully
|
|---|
| 4486 | The ASAN-poisoning instituted by commit v2.21-14-g1555185 was
|
|---|
| 4487 | incomplete, since the poisoned tail of the read buffer could well
|
|---|
| 4488 | be the target of a legitimate follow-on read. To accommodate that,
|
|---|
| 4489 | we must unpoison each such region just before beginning fillbuf's
|
|---|
| 4490 | read loop.
|
|---|
| 4491 | * src/grep.c [HAVE_ASAN] (asan_poison): Define.
|
|---|
| 4492 | (clear_asan_poison): Define.
|
|---|
| 4493 | (fillbuf): Clear before reading, since we are likely to read
|
|---|
| 4494 | into memory that was poisoned on the preceding iteration.
|
|---|
| 4495 | * tests/two-files: New file, to test for this.
|
|---|
| 4496 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4497 |
|
|---|
| 4498 | 2015-02-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4499 |
|
|---|
| 4500 | Grow the JIT stack if it becomes exhausted
|
|---|
| 4501 | Problem reported by Oliver Freyermuth in: http://bugs.gnu.org/19833
|
|---|
| 4502 | * NEWS: Document the fix.
|
|---|
| 4503 | * tests/Makefile.am (TESTS): Add pcre-jitstack.
|
|---|
| 4504 | * tests/pcre-jitstack: New file.
|
|---|
| 4505 | * src/pcresearch.c (NSUB): Move decl earlier, since it's needed
|
|---|
| 4506 | earlier now.
|
|---|
| 4507 | (jit_stack_size) [PCRE_STUDY_JIT_COMPILE]: New static var.
|
|---|
| 4508 | (jit_exec): New function.
|
|---|
| 4509 | (Pcompile): Initialize jit_stack_size.
|
|---|
| 4510 | (Pexecute): Use new jit_exec function. Report a useful diagnostic
|
|---|
| 4511 | if the error is PCRE_ERROR_JIT_STACKLIMIT.
|
|---|
| 4512 |
|
|---|
| 4513 | 2015-02-01 Jim Meyering <meyering@fb.com>
|
|---|
| 4514 |
|
|---|
| 4515 | maint: reference CVE-2015-1345 from NEWS
|
|---|
| 4516 | * NEWS: Mention the CVE that was addressed by v2.21-13-g83a95bd,
|
|---|
| 4517 | "grep -F: fix a heap buffer (read) overrun".
|
|---|
| 4518 |
|
|---|
| 4519 | 2015-01-18 Jim Meyering <meyering@fb.com>
|
|---|
| 4520 |
|
|---|
| 4521 | maint: convert "goto" to "continue" and remove now-spurious label
|
|---|
| 4522 | * src/kwset.c (bmexec_trans): Using "goto big_advance" here is
|
|---|
| 4523 | equivalent to using "continue". Make that change and remove
|
|---|
| 4524 | the now-unused label.
|
|---|
| 4525 |
|
|---|
| 4526 | 2015-01-10 Jim Meyering <meyering@fb.com>
|
|---|
| 4527 |
|
|---|
| 4528 | tests: add support for ASAN memory poisoning
|
|---|
| 4529 | This lets us reliably detect with ASAN some UMR bugs
|
|---|
| 4530 | that would otherwise be detectable only some of the time
|
|---|
| 4531 | with MSAN. Use __asan_poison_memory_region to mark the unused
|
|---|
| 4532 | portion of a read buffer as inaccessible. Then, with ASAN,
|
|---|
| 4533 | any attempt to access those bytes results in an ASAN abort.
|
|---|
| 4534 | * src/system.h: Include "ignore-value.h".
|
|---|
| 4535 | (__has_feature): Define.
|
|---|
| 4536 | (HAVE_ASAN): Define when address sanitizer is enabled.
|
|---|
| 4537 | [HAVE_ASAN]: Declare these two __asan_* symbols.
|
|---|
| 4538 | [!HAVE_ASAN] (__asan_poison_memory_region): Define stub.
|
|---|
| 4539 | [!HAVE_ASAN] (__asan_unpoison_memory_region): Likewise.
|
|---|
| 4540 | * src/grep.c: Use __asan_poison_memory_region.
|
|---|
| 4541 |
|
|---|
| 4542 | 2015-01-09 Yuliy Pisetsky <ypisetsky@fb.com>
|
|---|
| 4543 |
|
|---|
| 4544 | grep -F: fix a heap buffer (read) overrun
|
|---|
| 4545 | grep's read buffer is often filled to its full size, except when
|
|---|
| 4546 | reading the final buffer of a file. In that case, the number of
|
|---|
| 4547 | bytes read may be far less than the size of the buffer. However, for
|
|---|
| 4548 | certain unusual pattern/text combinations, grep -F would mistakenly
|
|---|
| 4549 | examine bytes in that uninitialized region of memory when searching
|
|---|
| 4550 | for a match. With carefully chosen inputs, one can cause grep -F to
|
|---|
| 4551 | read beyond the end of that buffer altogether. This problem arose via
|
|---|
| 4552 | commit v2.18-90-g73893ff with the introduction of a more efficient
|
|---|
| 4553 | heuristic using what is now the memchr_kwset function. The use of
|
|---|
| 4554 | that function in bmexec_trans could leave TP much larger than EP,
|
|---|
| 4555 | and the subsequent call to bm_delta2_search would mistakenly access
|
|---|
| 4556 | beyond the end of the main input read buffer.
|
|---|
| 4557 |
|
|---|
| 4558 | * src/kwset.c (bmexec_trans): When TP reaches or exceeds EP,
|
|---|
| 4559 | do not call bm_delta2_search.
|
|---|
| 4560 | * tests/kwset-abuse: New file.
|
|---|
| 4561 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4562 | * THANKS.in: Update.
|
|---|
| 4563 | * NEWS (Bug fixes): Mention it.
|
|---|
| 4564 |
|
|---|
| 4565 | Prior to this patch, this command would trigger a UMR:
|
|---|
| 4566 |
|
|---|
| 4567 | printf %0360db 0 | valgrind src/grep -F $(printf %019dXb 0)
|
|---|
| 4568 |
|
|---|
| 4569 | Use of uninitialised value of size 8
|
|---|
| 4570 | at 0x4142BE: bmexec_trans (kwset.c:657)
|
|---|
| 4571 | by 0x4143CA: bmexec (kwset.c:678)
|
|---|
| 4572 | by 0x414973: kwsexec (kwset.c:848)
|
|---|
| 4573 | by 0x414DC4: Fexecute (kwsearch.c:128)
|
|---|
| 4574 | by 0x404E2E: grepbuf (grep.c:1238)
|
|---|
| 4575 | by 0x4054BF: grep (grep.c:1417)
|
|---|
| 4576 | by 0x405CEB: grepdesc (grep.c:1645)
|
|---|
| 4577 | by 0x405EC1: grep_command_line_arg (grep.c:1692)
|
|---|
| 4578 | by 0x4077D4: main (grep.c:2570)
|
|---|
| 4579 |
|
|---|
| 4580 | See the accompanying test for how to trigger the heap buffer overrun.
|
|---|
| 4581 |
|
|---|
| 4582 | Thanks to Nima Aghdaii for testing and finding numerous
|
|---|
| 4583 | ways to break early iterations of this patch.
|
|---|
| 4584 |
|
|---|
| 4585 | 2015-01-08 Jim Meyering <meyering@fb.com>
|
|---|
| 4586 |
|
|---|
| 4587 | grep: avoid false-positive UMR
|
|---|
| 4588 | For some inputs, valgrind would report an uninitialized
|
|---|
| 4589 | memory read error, but it was harmless.
|
|---|
| 4590 | * src/grep.c (fillbuf): Initialize those trailing bytes.
|
|---|
| 4591 |
|
|---|
| 4592 | 2015-01-01 Jim Meyering <meyering@fb.com>
|
|---|
| 4593 |
|
|---|
| 4594 | gnulib: update to latest
|
|---|
| 4595 |
|
|---|
| 4596 | maint: update copyright year ranges to include 2015
|
|---|
| 4597 | Run "make update-copyright". Also, ...
|
|---|
| 4598 | * grep.texi: Update manually, converting each "--" to "-".
|
|---|
| 4599 |
|
|---|
| 4600 | 2014-12-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4601 |
|
|---|
| 4602 | doc: document binary-data heuristic better
|
|---|
| 4603 | Problem reported by Martin Hoch in: http://bugs.gnu.org/19388
|
|---|
| 4604 | * doc/grep.texi (File and Directory Selection):
|
|---|
| 4605 | Document what non-text bytes are.
|
|---|
| 4606 | (Usage): Fix cross reference.
|
|---|
| 4607 |
|
|---|
| 4608 | 2014-12-12 Jim Meyering <meyering@fb.com>
|
|---|
| 4609 |
|
|---|
| 4610 | maint: fix a new "make syntax-check" failure
|
|---|
| 4611 | * tests/dfa-match-aux.c: s/can not/cannot/
|
|---|
| 4612 |
|
|---|
| 4613 | 2014-12-12 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4614 |
|
|---|
| 4615 | build: avoid build failure with --enable-gcc-warnings and no PCRE
|
|---|
| 4616 | * src/pcresearch.c [HAVE_LIBPCRE] (empty_match): Guard the declaration
|
|---|
| 4617 | of this PCRE-only variable.
|
|---|
| 4618 |
|
|---|
| 4619 | 2014-12-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4620 |
|
|---|
| 4621 | tests: port fmbtest to CentOS 6 and earlier
|
|---|
| 4622 | * tests/fmbtest: Port to platforms where the 'sed' pattern
|
|---|
| 4623 | '[^0-9]' does not match every non-digit character. Problem
|
|---|
| 4624 | reported by Norihiro Tanaka in: http://bugs.gnu.org/19293
|
|---|
| 4625 |
|
|---|
| 4626 | 2014-12-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4627 |
|
|---|
| 4628 | dfa: simplify dfaexec
|
|---|
| 4629 | * src/dfa.c (dfaexec): Simplify by rearrangement of IF conditions.
|
|---|
| 4630 | This commit induces no semantic change, and reverts part of commit
|
|---|
| 4631 | v2.5.4-144-gbafa134.
|
|---|
| 4632 |
|
|---|
| 4633 | 2014-12-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4634 |
|
|---|
| 4635 | dfa: avoid invalid match or infinite loop in unused matching mode
|
|---|
| 4636 | Neither grep nor gawk uses this DFA code in its matching mode,
|
|---|
| 4637 | since each always calls dfacomp with a nonzero final argument.
|
|---|
| 4638 | However, when used in that mode, it had a bug:
|
|---|
| 4639 | After failing to match in matching mode, it should return NULL,
|
|---|
| 4640 | but instead would either report a false match or enter an
|
|---|
| 4641 | infinite loop.
|
|---|
| 4642 |
|
|---|
| 4643 | * src/dfa.c (dfaexec_main): After failing to match in matching mode
|
|---|
| 4644 | return NULL, rather than transitioning to the next state.
|
|---|
| 4645 | * tests/dfa-match: Add a new test.
|
|---|
| 4646 | * tests/dfa-match-aux.c: Add a new program to exercise this
|
|---|
| 4647 | otherwise-unused part of dfa.c.
|
|---|
| 4648 | * tests/Makefile.am: Add a rule to build new test.
|
|---|
| 4649 | (check_PROGRAMS): Add dfa-match-aux.
|
|---|
| 4650 | (AM_CPPFLAGS): Add -I$(top_srcdir)/src.
|
|---|
| 4651 | (TESTS): Add dfa-match.
|
|---|
| 4652 | * cfg.mk (exclude_file_name_regexp--sc_bindtextdomain):
|
|---|
| 4653 | (exclude_file_name_regexp--sc_prohibit_atoi_atof):
|
|---|
| 4654 | Exempt the new test file from some syntax-check rules.
|
|---|
| 4655 |
|
|---|
| 4656 | 2014-12-04 Santiago Ruano Rincón <santiago@debian.org>
|
|---|
| 4657 |
|
|---|
| 4658 | doc: document grep-2.11 change in behavior of -r, --recursive
|
|---|
| 4659 | * doc/grep.texi (--recursive, -r): Mention the new behavior
|
|---|
| 4660 | of recursively searching "." when there is no FILE argument.
|
|---|
| 4661 | * doc/grep.in.1: Likewise.
|
|---|
| 4662 | That change first appeared in grep-2.11, released on 2012-03-02.
|
|---|
| 4663 |
|
|---|
| 4664 | 2014-11-24 Jim Meyering <meyering@fb.com>
|
|---|
| 4665 |
|
|---|
| 4666 | maint: correct for four Author: name misspellings
|
|---|
| 4667 | * .mailmap: Correct for a misspelling of Norihiro Tanaka's given name
|
|---|
| 4668 | as listed in four commit Author: fields: s/Norihirio/Norihiro/
|
|---|
| 4669 |
|
|---|
| 4670 | 2014-11-23 Jim Meyering <meyering@fb.com>
|
|---|
| 4671 |
|
|---|
| 4672 | maint: post-release administrivia
|
|---|
| 4673 | * NEWS: Add header line for next release.
|
|---|
| 4674 | * .prev-version: Record previous version.
|
|---|
| 4675 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 4676 |
|
|---|
| 4677 | version 2.21
|
|---|
| 4678 | * NEWS: Record release date.
|
|---|
| 4679 |
|
|---|
| 4680 | 2014-11-21 Jim Meyering <meyering@fb.com>
|
|---|
| 4681 |
|
|---|
| 4682 | tests: sjis-mb: remove now-obsolete and failing sub-tests
|
|---|
| 4683 | * tests/sjis-mb: Commit v2.18-123-geb3292b changed how grep
|
|---|
| 4684 | handles patterns with encoding errors. These SJIS tests are
|
|---|
| 4685 | skipped so often that we didn't notice until now that there were
|
|---|
| 4686 | two tests of that changed behavior, and that on any system with
|
|---|
| 4687 | the ja_JP.SHIFT_JIS locale, they would always fail. Remove those
|
|---|
| 4688 | two tests, since this functionality is well tested separately,
|
|---|
| 4689 | via tests/prefix-of-multibyte.
|
|---|
| 4690 |
|
|---|
| 4691 | 2014-11-20 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4692 |
|
|---|
| 4693 | grep -F could erroneously fail to match in non-UTF8 multibyte locales
|
|---|
| 4694 | This fixes a bug that can strike only when using a non-UTF8 multibyte
|
|---|
| 4695 | locale like ja_JP.SHIFT_JIS.
|
|---|
| 4696 |
|
|---|
| 4697 | Consider this example: it would mistakenly fail to match before
|
|---|
| 4698 | this patch:
|
|---|
| 4699 |
|
|---|
| 4700 | printf '\203AA\n'|LC_ALL=ja_JP.SHIFT_JIS src/grep -F A
|
|---|
| 4701 |
|
|---|
| 4702 | When searching for a single byte that happens to be the latter
|
|---|
| 4703 | byte of a multibyte character, and the target byte also follows
|
|---|
| 4704 | that multibyte character, grep -F would advance an internal pointer
|
|---|
| 4705 | by one byte too many, thus missing the target byte. A test case
|
|---|
| 4706 | for this bug is already included in tests/sjis-mb.
|
|---|
| 4707 |
|
|---|
| 4708 | * src/kwsearch.c (Fexecute): Skip one byte less after matching the middle of a
|
|---|
| 4709 | multibyte character. The bug was introduced by commit v2.18-119-gfb7d538.
|
|---|
| 4710 |
|
|---|
| 4711 | 2014-11-17 Jim Meyering <meyering@fb.com>
|
|---|
| 4712 |
|
|---|
| 4713 | tests: big-match: disable OOM-provoking subtest
|
|---|
| 4714 | * tests/big-match: Our application of this regexp '^.*x\(\)\1'
|
|---|
| 4715 | to a file containing a single matching line of length 2GiB+2
|
|---|
| 4716 | would cause inordinate memory consumption (over 100GB) via
|
|---|
| 4717 | regexec.c, but no leak. That would cause disruption on most
|
|---|
| 4718 | systems, so remove this subtest. Reported by Assaf Gordon.
|
|---|
| 4719 |
|
|---|
| 4720 | 2014-11-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4721 |
|
|---|
| 4722 | dfa: avoid undefined behavior
|
|---|
| 4723 | * src/dfa.c (dfassbuild): Don't call memcpy with a second
|
|---|
| 4724 | argument of NULL, even when the size (3rd argument) is 0.
|
|---|
| 4725 |
|
|---|
| 4726 | 2014-11-14 Jim Meyering <meyering@fb.com>
|
|---|
| 4727 |
|
|---|
| 4728 | gnulib: update to latest
|
|---|
| 4729 |
|
|---|
| 4730 | 2014-11-14 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4731 |
|
|---|
| 4732 | grep -F -x -o PAT would print an extra newline for each match
|
|---|
| 4733 | * src/kwsearch.c (Fexecute): Correctly compute the length of a match
|
|---|
| 4734 | by subtracting 2 (not 1) when match_lines is set. With -x, we augment
|
|---|
| 4735 | the "line" by both prepending and appending an EOLBYTE to the search
|
|---|
| 4736 | pattern. Here, we must correct for that. However, to compensate,
|
|---|
| 4737 | when we are using -x (--line-regexp) and start_ptr is NULL, we have
|
|---|
| 4738 | to add 1 to the length so that we still print the trailing EOLBYTE.
|
|---|
| 4739 | Introduced by commit v2.18-85-g2c94326.
|
|---|
| 4740 | * tests/match-lines: Add a new test.
|
|---|
| 4741 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4742 | * NEWS (Bug fixes): Mention it.
|
|---|
| 4743 |
|
|---|
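The symptom fixed above is easy to observe by counting output lines. The helper below is a hypothetical sketch (`count_exact_matches_` is not part of tests/match-lines): with a fixed grep, each whole-line match of a fixed string produces exactly one output line.

```shell
# Count the lines printed by grep -F -x -o for a fixed-string pattern
# read against stdin.
count_exact_matches_() {
  grep -F -x -o -- "$1" | wc -l | tr -d ' '
}
```

For input containing two lines equal to "a", the count is 2. A grep with the bug would have printed an extra newline per match and so reported more lines.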
| 4744 | 2014-11-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4745 |
|
|---|
| 4746 | tests: port to Darwin
|
|---|
| 4747 | The 'sed' command 's/.//' does not delete all bytes in the C locale.
|
|---|
| 4748 | Problem reported by Nelson H. F. Beebe.
|
|---|
| 4749 | * tests/fmbtest: Don't assume that sed treats bytes with the
|
|---|
| 4750 | top bit set as valid characters in the C locale, as this is not
|
|---|
| 4751 | true for Darwin. Use the cs_CZ.UTF-8 locale instead, and
|
|---|
| 4752 | simplify the sed script.
|
|---|
| 4753 |
|
|---|
| 4754 | tests: fix recently-introduced stray output
|
|---|
| 4755 | * tests/init.cfg (require_pcre_): Remove stray debugging output.
|
|---|
| 4756 |
|
|---|
| 4757 | build: port to GCC 4.6.4 + glibc 2.5
|
|---|
| 4758 | On platforms this old, building with _FORTIFY_SOURCE equal to 2
|
|---|
| 4759 | results in duplicate definitions of standard library functions.
|
|---|
| 4760 | Problem reported by Nelson H. F. Beebe.
|
|---|
| 4761 | * configure.ac (_FORTIFY_SOURCE): Sort after GNULIB_PORTCHECK.
|
|---|
| 4762 | By default, do not enable this unless GNULIB_PORTCHECK is defined.
|
|---|
| 4763 | This better matches the original intent, which as I recall was to
|
|---|
| 4764 | enable these extra checks only with --enable-gcc-warnings.
|
|---|
| 4765 |
|
|---|
| 4766 | tests: port to libpcre sans UTF-8 support
|
|---|
| 4767 | Problem reported by Nelson H. F. Beebe.
|
|---|
| 4768 | * tests/pcre-infloop, tests/pcre-invalid-utf8-input, tests/pcre-utf8:
|
|---|
| 4769 | Skip the test unless PCRE works in an en_US.UTF-8 locale.
|
|---|
| 4770 |
|
|---|
| 4771 | 2014-11-09 Jim Meyering <meyering@fb.com>
|
|---|
| 4772 |
|
|---|
| 4773 | tests: do not fail when the zh_CN.UTF-8 locale is not installed
|
|---|
| 4774 | * tests/word-multibyte: This test would fail on a system with
|
|---|
| 4775 | no zh_CN.UTF-8 locale. Use it only if it is installed.
|
|---|
| 4776 |
|
|---|
| 4777 | tests: avoid hex_printf_ portability problems
|
|---|
| 4778 | * tests/init.cfg (hex_printf_): Spell out a-f and A-F, for
|
|---|
| 4779 | non-C locales, ensure that the input to sed is newline-terminated,
|
|---|
| 4780 | and quote the final octal format string.
|
|---|
| 4781 | Suggestions from Paul Eggert.
|
|---|
| 4782 |
|
|---|
| 4783 | 2014-11-08 Jim Meyering <meyering@fb.com>
|
|---|
| 4784 |
|
|---|
| 4785 | tests: avoid a multibyte tr portability problem
|
|---|
| 4786 | * tests/init.cfg (tr): New wrapper function.
|
|---|
| 4787 | See comments for details. Reported by Norihiro Tanaka
|
|---|
| 4788 | in http://debbugs.gnu.org/18991
|
|---|
| 4789 |
|
|---|
| 4790 | maint: remove spurious LC_ALL setting from one test
|
|---|
| 4791 | * tests/word-multibyte: Remove unnecessary setting of LC_ALL.
|
|---|
| 4792 |
|
|---|
| 4793 | tests: fix typo in previous change
|
|---|
| 4794 | * tests/init.cfg (hex_printf_): Fix typo s/A-f/A-F/.
|
|---|
| 4795 | For the record, I introduced that error, not Norihiro.
|
|---|
| 4796 |
|
|---|
| 4797 | 2014-11-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4798 |
|
|---|
| 4799 | tests: avoid awk+printf+\xHH portability trap
|
|---|
| 4800 | * tests/init.cfg (hex_printf_): Rewrite in terms of printf and sed.
|
|---|
| 4801 | Using awk's printf with \xHH in the format string was not portable
|
|---|
| 4802 | to the awk of Solaris 10, AIX 7 or HP-UX 11.23, as reported in
|
|---|
| 4803 | http://debbugs.gnu.org/18987.
|
|---|
| 4804 | * tests/word-multibyte: Use printf rather than hex_printf_,
|
|---|
| 4805 | and give the character we're printing a name: e_acute (rather
|
|---|
| 4806 | than A-grave), since that is used in other tests.
|
|---|
| 4807 | a trailing \n in the format string, adjust by removing it, and
|
|---|
| 4808 | instead invoking echo.
|
|---|
| 4809 | * tests/multibyte-white-space: Simply remove each trailing \n.
|
|---|
| 4810 | They were not needed.
|
|---|
| 4811 |
|
|---|
| 4812 | 2014-11-07 Jim Meyering <meyering@fb.com>
|
|---|
| 4813 |
|
|---|
| 4814 | tests: avoid printf+\xHH portability trap
|
|---|
| 4815 | * tests/word-multibyte: Using the Bourne shell's printf function
|
|---|
| 4816 | with strings like "\xHH\xHH" happens to work for most interactive
|
|---|
| 4817 | shells, but not for dash. That is not portable. Use our hex_printf_
|
|---|
| 4818 | awk wrapper instead. Without this change, this test would fail on
|
|---|
| 4819 | a Debian system for which /bin/sh is configured to be "dash".
|
|---|
| 4820 |
|
|---|
| 4821 | maint: move helper function, hex_printf to init.cfg
|
|---|
| 4822 | * tests/init.cfg (hex_printf_): New function, from ...
|
|---|
| 4823 | * tests/multibyte-white-space: ... here. Reflect the
|
|---|
| 4824 | s/hex_print/hex_printf_/ renaming.
|
|---|
| 4825 |
|
|---|
| 4826 | 2014-11-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4827 |
|
|---|
| 4828 | grep: port O_NOFOLLOW errno checking to NetBSD
|
|---|
| 4829 | Problem reported by Assaf Gordon in: http://bugs.gnu.org/18892
|
|---|
| 4830 | * NEWS: Document it.
|
|---|
| 4831 | * src/grep.c (open_symlink_nofollow_error):
|
|---|
| 4832 | New function, which does the right thing on NetBSD.
|
|---|
| 4833 | (grepfile): Use it.
|
|---|
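A minimal C sketch of the errno check this entry describes. It assumes that opening a symbolic link with O_NOFOLLOW fails with ELOOP on POSIX systems, EMLINK on FreeBSD, and EFTYPE on NetBSD. The function names are hypothetical, and this is an illustration rather than the code in src/grep.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Return true if ERR is an errno value that open() with O_NOFOLLOW may
   report when the final path component is a symbolic link.  */
static bool
is_nofollow_symlink_errno (int err)
{
  /* POSIX specifies ELOOP; FreeBSD reports EMLINK.  */
  if (err == ELOOP || err == EMLINK)
    return true;
#ifdef EFTYPE
  /* NetBSD reports EFTYPE; many other systems do not define it.  */
  if (err == EFTYPE)
    return true;
#endif
  return false;
}

/* Quick self-check that a test driver can call.  */
static void
check_nofollow_errnos (void)
{
  assert (is_nofollow_symlink_errno (ELOOP));
  assert (!is_nofollow_symlink_errno (ENOENT));
}
```

A caller would test errno with this predicate right after a failed open, and treat a true result as "this is a symlink, skip it" rather than as an error worth reporting.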
| 4834 |
|
|---|
| 4835 | 2014-10-31 Jim Meyering <meyering@fb.com>
|
|---|
| 4836 |
|
|---|
| 4837 | build: generate man pages even when existing targets are read-only
|
|---|
| 4838 | * doc/Makefile.am (grep.1): Use mv -f to move temporary to target,
|
|---|
| 4839 | in case the target is read-only. Also, always make the generated
|
|---|
| 4840 | files read-only.
|
|---|
| 4841 | (egrep.1 fgrep.1): Likewise.
|
|---|
| 4842 | This avoids a build failure reported by Eric Blake in
|
|---|
| 4843 | http://lists.gnu.org/archive/html/bug-grep/2014-10/msg00112.html
|
|---|
| 4844 |
|
|---|
| 4845 | 2014-10-30 Jim Meyering <meyering@fb.com>
|
|---|
| 4846 |
|
|---|
| 4847 | tests: avoid false-positive failure due to some zh_CN.* locales
|
|---|
| 4848 | On some systems, and for some zh_CN.* locales (e.g., OpenBSD 5.5), the
|
|---|
| 4849 | E-acute pair of bytes does not qualify as a word-constituent character.
|
|---|
| 4850 | * tests/word-multibyte: Use zh_CN.UTF-8, rather than "zh_CN".
|
|---|
| 4851 | Reported by Assaf Gordon and Bruce Dubbs in
|
|---|
| 4852 | http://debbugs.gnu.org/18892
|
|---|
| 4853 |
|
|---|
| 4854 | 2014-10-29 Jim Meyering <meyering@fb.com>
|
|---|
| 4855 |
|
|---|
| 4856 | gnulib: update to latest; bootstrap, too
|
|---|
| 4857 | * gnulib: Update to latest.
|
|---|
| 4858 | * bootstrap: Copy latest from gnulib.
|
|---|
| 4859 |
|
|---|
| 4860 | 2014-10-28 Jim Meyering <meyering@fb.com>
|
|---|
| 4861 |
|
|---|
| 4862 | tests: make new test script executable
|
|---|
| 4863 | * tests/word-multibyte: Make this file executable.
|
|---|
| 4864 |
|
|---|
| 4865 | 2014-10-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4866 |
|
|---|
| 4867 | dfa: make \w and \W work in multibyte locales
|
|---|
| 4868 | Reported by Jaroslav Skarvada in: http://bugs.gnu.org/18817
|
|---|
| 4869 | Now, \w and \W are supported not only in single-byte locales but
|
|---|
| 4870 | also in multibyte locales.
|
|---|
| 4871 |
|
|---|
| 4872 | * src/dfa.c (PUSH_LEX_STATE, POP_LEX_STATE): Move definitions "up",
|
|---|
| 4873 | so they are not within the function.
|
|---|
| 4874 | (lex): Make \w and \W work in a multibyte locale, the same way
|
|---|
| 4875 | we made \s and \S work.
|
|---|
| 4876 | * tests/word-multibyte: New test for this change.
|
|---|
| 4877 | * tests/Makefile.am: Add a rule to build new test.
|
|---|
| 4878 | * NEWS (Bug fixes): Mention it.
|
|---|
| 4879 |
|
|---|
| 4880 | 2014-10-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4881 |
|
|---|
| 4882 | dfa: avoid false match in a non-UTF8 multibyte locale
|
|---|
| 4883 | This command should print nothing:
|
|---|
| 4884 |
|
|---|
| 4885 | printf '\263\244\263\244\n' \
|
|---|
| 4886 | | LC_ALL=ja_JP.eucJP grep -E "$(printf '^x|\244\263')"
|
|---|
| 4887 |
|
|---|
| 4888 | Before this patch, it would print its sole input line.
|
|---|
| 4889 | * src/dfa.c (struct dfa): Add new members: min_trcount,
|
|---|
| 4890 | initstate_letter, initstate_others.
|
|---|
| 4891 | (dfaanalyze): Build states not only for a newline context but also for others.
|
|---|
| 4892 | (build_state): Don't release initial states.
|
|---|
| 4893 | (skip_remains_mb): Add a parameter.
|
|---|
| 4894 | Add a comment describing all parameters.
|
|---|
| 4895 | (dfaexec_main): When there are multiple start states, we are about
|
|---|
| 4896 | to transition from one state to another and the current byte is not
|
|---|
| 4897 | the first byte of a multibyte character, first advance past the
|
|---|
| 4898 | current multibyte character.
|
|---|
| 4899 | * tests/euc-mb: Add a new test.
|
|---|
| 4900 | * NEWS (Bug fixes): Mention it.
|
|---|
| 4901 | This addresses http://debbugs.gnu.org/18685
|
|---|
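The false match arises because the pattern bytes \244\263 do occur in the input, at byte offset 1, but that offset lies in the middle of a two-byte character. The sketch below shows a search that only accepts matches starting on a character boundary. It assumes a simplified EUC-JP model in which any byte in 0xA1..0xFE starts a two-byte character (the 0x8E and 0x8F prefixes are ignored). All names are hypothetical; dfa.c handles the problem inside dfaexec_main, not with a loop like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the character starting at P, in a simplified EUC-JP model:
   a byte in 0xA1..0xFE starts a two-byte character, any other byte is
   a one-byte character.  AVAIL is the number of bytes left at P.  */
static size_t
euc_char_len (const unsigned char *p, size_t avail)
{
  return (0xA1 <= *p && *p <= 0xFE && 2 <= avail) ? 2 : 1;
}

/* Return the offset of the first occurrence of PAT in TEXT that starts
   on a character boundary, or -1 if there is none.  */
static long
find_on_boundary (const char *text, size_t tlen,
                  const char *pat, size_t plen)
{
  const unsigned char *t = (const unsigned char *) text;
  assert (0 < plen);
  /* Step one whole character at a time, never one byte at a time.  */
  for (size_t i = 0; i + plen <= tlen; i += euc_char_len (t + i, tlen - i))
    if (memcmp (t + i, pat, plen) == 0)
      return (long) i;
  return -1;
}
```

With the input and pattern from the command above, a plain byte comparison at offset 1 succeeds, while find_on_boundary visits only offsets 0 and 2 and reports no match.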
| 4902 |
|
|---|
| 4903 | 2014-10-25 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4904 |
|
|---|
| 4905 | tests: work around older libpcre bugs when testing -P and UTF-8
|
|---|
| 4906 | * tests/pcre-invalid-utf8-input: Add require_timeout_ and
|
|---|
| 4907 | require_compiled_in_MB_support. Put a timeout of 3 seconds on
|
|---|
| 4908 | grep, to avoid having this test case loop forever with older
|
|---|
| 4909 | versions of libpcre, such as those found on RHEL 6.5.
|
|---|
| 4910 | Reported by Jim Meyering in: http://bugs.gnu.org/18806#34
|
|---|
| 4911 |
|
|---|
| 4912 | 2014-10-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4913 |
|
|---|
| 4914 | tests: add test for grep -P fix
|
|---|
| 4915 | * tests/pcre-o: New test for this change.
|
|---|
| 4916 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 4917 |
|
|---|
| 4918 | 2014-10-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4919 |
|
|---|
| 4920 | grep: fix grep -P crash
|
|---|
| 4921 | Reported by Shlomi Fish in: http://bugs.gnu.org/18806
|
|---|
| 4922 | Commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5 (2014-09-16) is a
|
|---|
| 4923 | hack that I put in to speed up 'grep -P'. Unfortunately, not only
|
|---|
| 4924 | is it a violation of modularity, it's also a bug magnet, as we have
|
|---|
| 4925 | found out with Bug#18738 and Bug#18806. Remove the optimization
|
|---|
| 4926 | instead of applying more bandaids. Perhaps we can think of a
|
|---|
| 4927 | better way of doing the optimization, or perhaps we can just live
|
|---|
| 4928 | with a slower grep -P (as -P is inherently slower anyway...).
|
|---|
| 4929 | * src/grep.c, src/grep.h (validated_boundary):
|
|---|
| 4930 | Remove. All uses removed.
|
|---|
| 4931 | * src/pcresearch.c (Pexecute): Do not worry about validated_boundary.
|
|---|
| 4932 |
|
|---|
| 4933 | 2014-10-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4934 |
|
|---|
| 4935 | dfa: remove two erroneous clauses from a now-unused function
|
|---|
| 4936 | RE_DOT_NEWLINE and RE_DOT_NOT_NULL apply only to a dot that
|
|---|
| 4937 | matches any character. Do not consider them when matching
|
|---|
| 4938 | with a bracket expression.
|
|---|
| 4939 |
|
|---|
| 4940 | * src/dfa.c (match_mb_charset): Remove tests for RE_DOT_NEWLINE
|
|---|
| 4941 | and RE_DOT_NOT_NULL.
|
|---|
| 4942 |
|
|---|
| 4943 | 2014-10-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4944 |
|
|---|
| 4945 | dfa: process all MBCSET constructs via glibc's matcher
|
|---|
| 4946 | The DFA matcher does not support collating symbols or equivalence
|
|---|
| 4947 | classes, so ensure that any MBCSET reference is handled by the glibc
|
|---|
| 4948 | matcher. dfa.c already handled this in one case, but not the other,
|
|---|
| 4949 | so that a command like "printf '\0' |src/grep -aE '^\s?$'" would
|
|---|
| 4950 | mistakenly end up using dfa.c's match_mb_charset function rather
|
|---|
| 4951 | than glibc's matcher.
|
|---|
| 4952 |
|
|---|
| 4953 | * src/dfa.c (dfaexec_main): Move that code into the
|
|---|
| 4954 | State_transition macro. This renders the match_mb_charset
|
|---|
| 4955 | function unused by grep.
|
|---|
| 4956 | * tests/multibyte-white-space: Add a test to exercise the
|
|---|
| 4957 | just-rendered-inaccessible code path.
|
|---|
| 4958 |
|
|---|
| 4959 | 2014-10-15 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4960 |
|
|---|
| 4961 | grep: initialize validated_boundary properly before use
|
|---|
| 4962 | * src/grep.c (main): Initialize validated_boundary before pre-searching
|
|---|
| 4963 | for an empty line.
|
|---|
| 4964 |
|
|---|
| 4965 | 2014-10-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4966 |
|
|---|
| 4967 | grep: fix off-by-one bug in -P optimization
|
|---|
| 4968 | Reported by Norihiro Tanaka in: http://bugs.gnu.org/18738
|
|---|
| 4969 | * src/pcresearch.c (Pexecute): Fix off-by-one bug with
|
|---|
| 4970 | validated_boundary.
|
|---|
| 4971 | * tests/init.cfg (envvar_check_fail): Catch off-by-one bug.
|
|---|
| 4972 |
|
|---|
| 4973 | 2014-10-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4974 |
|
|---|
| 4975 | dfa: fix a theoretical bug
|
|---|
| 4976 | * src/dfa.c (dfaexec_main): After searching for a match from
|
|---|
| 4977 | the initial state, set the previous state, S1, to 0.
|
|---|
| 4978 | So far, we have found no case in which this fix makes a difference.
|
|---|
| 4979 | See http://debbugs.gnu.org/18645
|
|---|
| 4980 |
|
|---|
| 4981 | 2014-10-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 4982 |
|
|---|
| 4983 | doc: modernize and simplify man page
|
|---|
| 4984 | * doc/grep.in.1 (Tx, Id): Remove. All uses removed.
|
|---|
| 4985 | (MTO, URL): New macros, used for email and URL.
|
|---|
| 4986 | Use them when appropriate.
|
|---|
| 4987 | In main text, omit chatty discussions of other implementations;
|
|---|
| 4988 | the full manual suffices for this sort of thing.
|
|---|
| 4989 |
|
|---|
| 4990 | doc: clarify exit status
|
|---|
| 4991 | Reported by Santiago Ruano Rincón in: http://bugs.gnu.org/18651
|
|---|
| 4992 | * doc/grep.in.1 (EXIT STATUS):
|
|---|
| 4993 | * doc/grep.texi (Exit Status): Clarify.
|
|---|
| 4994 |
|
|---|
| 4995 | 2014-10-07 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 4996 |
|
|---|
| 4997 | dfa: test for just-fixed bug
|
|---|
| 4998 | * tests/mb-dot-newline: New file.
|
|---|
| 4999 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 5000 | * NEWS (Bug fixes): Mention it.
|
|---|
| 5001 | Bisection suggests that the bug was introduced by
|
|---|
| 5002 | commit v2.18-123-geb3292b. Also see
|
|---|
| 5003 | http://debbugs.gnu.org/cgi/bugreport.cgi?msg=17;bug=18580
|
|---|
| 5004 |
|
|---|
| 5005 | 2014-10-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5006 |
|
|---|
| 5007 | dfa: factor out a new nontrivial block of duplicated code
|
|---|
| 5008 | * src/dfa.c (State_transition): New macro.
|
|---|
| 5009 | (dfaexec_main): Use it twice.
|
|---|
| 5010 |
|
|---|
| 5011 | dfa: check end of input buffer after transition in non-UTF8 multibyte locale
|
|---|
| 5012 | * src/dfa.c (dfaexec_main): Check for end of input buffer after each
|
|---|
| 5013 | transition in a non-UTF8 multibyte locale.
|
|---|
| 5014 | * tests/mb-non-UTF8-overrun: New test.
|
|---|
| 5015 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 5016 | * src/grep.c (main): With this fix, we no longer need the fourth
|
|---|
| 5017 | byte of "eolbytes".
|
|---|
| 5018 |
|
|---|
| 5019 | 2014-10-04 Jim Meyering <meyering@fb.com>
|
|---|
| 5020 |
|
|---|
| 5021 | grep: avoid stack buffer read-underrun and overrun
|
|---|
| 5022 | Testing binaries built with -fsanitize=address caused aborts due
|
|---|
| 5023 | to stack underrun and overrun.
|
|---|
| 5024 | * src/grep.c (main): Allocate a larger buffer for eolbytes:
|
|---|
| 5025 | one byte before the beginning and one more after the end.
|
|---|
| 5026 | For details, see http://debbugs.gnu.org/18580#44.
|
|---|
| 5027 |
|
|---|
| 5028 | 2014-10-04 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5029 |
|
|---|
| 5030 | grep: fix subscript error when testing whether empty lines match
|
|---|
| 5031 | * src/grep.c (grep): When testing whether an empty line matches,
|
|---|
| 5032 | make the input buffer one byte longer, as dfaexec uses that
|
|---|
| 5033 | for a sentinel.
|
|---|
| 5034 |
|
|---|
| 5035 | 2014-09-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5036 |
|
|---|
| 5037 | dfa: minor tweaks, mostly to remove __attribute__ ((noinline))
|
|---|
| 5038 | That attribute isn't portable, and I found a way to get similar
|
|---|
| 5039 | performance with standard C features.
|
|---|
| 5040 | * NEWS: Document the recently-installed performance improvement.
|
|---|
| 5041 | * src/dfa.c (struct dfa): New member dfaexec.
|
|---|
| 5042 | (dfaexec_main): Remove unnecessary 'const'.
|
|---|
| 5043 | (dfaexec_mb, dfaexec_sb): Remove __attribute__ ((noinline));
|
|---|
| 5044 | no longer needed.
|
|---|
| 5045 | (dfaexec): Use new dfaexec member.
|
|---|
| 5046 | (dfainit, dfaoptimize, dfassbuild): Initialize it.
|
|---|
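This entry and the "separate dfaexec function" entry below it describe specializing one worker for the multibyte and single-byte cases, then selecting the specialization through a new struct member instead of a non-portable attribute. A hedged sketch of that general pattern follows. Every name here (struct matcher, count_chars_*) is hypothetical, the worker is a toy character counter and not a DFA, and whether dfa.c uses exactly this mechanism is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct matcher;
typedef size_t (*exec_fn) (const struct matcher *, const char *, size_t);

struct matcher
{
  bool multibyte;
  exec_fn exec;   /* selected once, at initialization */
};

/* Shared worker: count characters in BUF[0..LEN).  In multibyte mode,
   UTF-8 continuation bytes (10xxxxxx) are not counted.  MULTIBYTE is a
   constant in each wrapper below, so the compiler can specialize.  */
static inline size_t
count_chars_main (const char *buf, size_t len, bool multibyte)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    if (!multibyte || ((unsigned char) buf[i] & 0xC0) != 0x80)
      n++;
  return n;
}

static size_t
count_chars_mb (const struct matcher *m, const char *buf, size_t len)
{
  (void) m;
  return count_chars_main (buf, len, true);
}

static size_t
count_chars_sb (const struct matcher *m, const char *buf, size_t len)
{
  (void) m;
  return count_chars_main (buf, len, false);
}

/* Pick the specialization once, using only standard C.  */
static void
matcher_init (struct matcher *m, bool multibyte)
{
  m->multibyte = multibyte;
  m->exec = multibyte ? count_chars_mb : count_chars_sb;
}

static size_t
matcher_exec (const struct matcher *m, const char *buf, size_t len)
{
  assert (m->exec != NULL);
  return m->exec (m, buf, len);
}
```

The per-call cost is one indirect call; the multibyte test itself is hoisted out of the inner loop because each wrapper passes a constant.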
| 5047 |
|
|---|
| 5048 | 2014-09-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5049 |
|
|---|
| 5050 | dfa: separate dfaexec function to help optimization by compiler
|
|---|
| 5051 | * src/dfa.c (dfaexec_main): Rename from dfaexec, add inline attribute.
|
|---|
| 5052 | (dfaexec_mb): New function. Run it when d->multibyte is true.
|
|---|
| 5053 | This function must not be inlined.
|
|---|
| 5054 | (dfaexec_sb): New function. Run it when d->multibyte is false.
|
|---|
| 5055 | This function must not be inlined.
|
|---|
| 5056 | (dfaexec): Call dfaexec_mb or dfaexec_sb according to d->multibyte.
|
|---|
| 5057 |
|
|---|
| 5058 | 2014-09-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5059 |
|
|---|
| 5060 | dfa: speed-up at initial state
|
|---|
| 5061 | The DFA state is always 0 until a potential match is found, so speed up
|
|---|
| 5062 | matching there by continuing to use the transition table.
|
|---|
| 5063 |
|
|---|
| 5064 | * src/dfa.c (skip_remains_mb): New function.
|
|---|
| 5065 | (dfaexec): Speed up matching at the initial state.
|
|---|
| 5066 |
|
|---|
| 5067 | 2014-09-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5068 |
|
|---|
| 5069 | maint: generalize the -Wcast-align fix
|
|---|
| 5070 | * src/grep.c (CAST_ALIGNED): New macro.
|
|---|
| 5071 | (skip_easy_bytes): Use it.
|
|---|
| 5072 |
|
|---|
| 5073 | 2014-09-27 Jim Meyering <meyering@fb.com>
|
|---|
| 5074 |
|
|---|
| 5075 | maint: suppress a false-positive -Wcast-align warning
|
|---|
| 5076 | Building with --enable-gcc-warnings and gcc-4.9.1 would provoke this:
|
|---|
| 5077 | grep.c:499:12: error: cast from 'const char *' to 'const uword *'\
|
|---|
| 5078 | (aka 'const unsigned long *') increases required alignment from\
|
|---|
| 5079 | 1 to 8 [-Werror,-Wcast-align]
|
|---|
| 5080 | for (s = (uword const *) p; ! (*s & hibyte_mask); s++)
|
|---|
| 5081 | ^~~~~~~~~~~~~~~~~
|
|---|
| 5082 | * src/grep.c (skip_easy_bytes): Use a pragma to suppress
|
|---|
| 5083 | gcc's false-positive cast-alignment warning.
|
|---|
| 5084 |
|
|---|
| 5085 | 2014-09-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5086 |
|
|---|
| 5087 | grep: don't check extensively for invalid prefix bytes unless -P
|
|---|
| 5088 | Problem reported by Jim Meyering in: http://bugs.gnu.org/18454#56
|
|---|
| 5089 | * src/grep.c (grep): After the first buffer is checked, leave the
|
|---|
| 5090 | file-type checker in TEXTBIN_UNKNOWN state only when -P is used.
|
|---|
| 5091 | Only the -P matcher has performance problems with checking binary
|
|---|
| 5092 | data that make it worthwhile to check every prefix input byte so
|
|---|
| 5093 | the -P matcher's TEXTBIN_UNKNOWN optimizations can come into play.
|
|---|
| 5094 | Other matchers can simply check the data directly, and using
|
|---|
| 5095 | TEXTBIN_UNKNOWN with them slows 'grep' down for no benefit.
|
|---|
| 5096 |
|
|---|
| 5097 | grep: scan for valid multibyte strings more quickly
|
|---|
| 5098 | Scan valid multibyte strings more quickly in the common case of
|
|---|
| 5099 | encodings that are upward compatible with ASCII, such as UTF-8.
|
|---|
| 5100 | You'd think there'd be a fast standard way to do this nowadays,
|
|---|
| 5101 | but nooooo....
|
|---|
| 5102 | Problem reported by Jim Meyering in: http://bugs.gnu.org/18454#56
|
|---|
| 5103 | * src/grep.c (HIBYTE): New constant.
|
|---|
| 5104 | (easy_encoding): New static var.
|
|---|
| 5105 | (init_easy_encoding, skip_easy_bytes): New functions.
|
|---|
| 5106 | (uword): New type.
|
|---|
| 5107 | (buffer_textbin): Skip easy bytes quickly.
|
|---|
| 5108 | Don't bother with mb_clen here, since skip_easy_bytes typically
|
|---|
| 5109 | captures the easy cases; just use mbrlen directly.
|
|---|
| 5110 | (buffer_textbin, file_textbin): First arg is no longer a const
|
|---|
| 5111 | pointer, since the byte past the end is now an overwritten sentinel.
|
|---|
| 5112 | (fillbuf): Make room for a uword after the buffer, for skip_easy_bytes.
|
|---|
| 5113 | (main): Call init_easy_encoding.
|
|---|
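The idea behind skipping "easy" bytes can be sketched as a word-at-a-time scan for any byte with its top bit set. The version below is an illustration under stated assumptions, not grep's skip_easy_bytes: it assumes 8-bit bytes, uses memcpy for the word load in place of an aligned pointer cast, and takes an explicit length where the entry describes a sentinel after the buffer. The names word_t, HIGH_BITS and skip_ascii are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long word_t;   /* hypothetical stand-in for uword */

/* A word with the top bit of every byte set, e.g. 0x8080...80.  */
#define HIGH_BITS ((word_t) -1 / UCHAR_MAX * (UCHAR_MAX / 2 + 1))

/* Return the offset of the first byte in BUF[0..LEN) whose top bit is
   set, or LEN if every byte is plain ASCII.  */
static size_t
skip_ascii (const char *buf, size_t len)
{
  size_t i = 0;
  assert (buf != NULL || len == 0);

  /* Fast path: test sizeof (word_t) bytes per iteration.  memcpy keeps
     the load valid for any alignment.  */
  while (len - i >= sizeof (word_t))
    {
      word_t w;
      memcpy (&w, buf + i, sizeof w);
      if (w & HIGH_BITS)
        break;
      i += sizeof w;
    }

  /* Slow path: find the exact byte, or finish a short tail.  */
  while (i < len && !((unsigned char) buf[i] & 0x80))
    i++;

  return i;
}
```

In an encoding that is upward compatible with ASCII, every byte this function skips is a complete one-byte character, so only the remainder needs a call such as mbrlen.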
| 5114 |
|
|---|
| 5115 | 2014-09-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5116 |
|
|---|
| 5117 | grep: speed up processing of holes before EOF on Solaris
|
|---|
| 5118 | * src/grep.c (fillbuf): If SEEK_DATA fails with errno == ENXIO,
|
|---|
| 5119 | skip over the hole at EOF.
|
|---|
| 5120 |
|
|---|
| 5121 | grep: port to platforms lacking SEEK_DATA
|
|---|
| 5122 | Reported by Norihiro Tanaka in: http://bugs.gnu.org/18454#38
|
|---|
| 5123 | * src/grep.c (SEEK_DATA): Default to SEEK_SET if not defined.
|
|---|
| 5124 | (SEEK_HOLE): Move to top level, and default it to SEEK_SET.
|
|---|
| 5125 | (file_textbin): Adjust to new default.
|
|---|
| 5126 | (fillbuf): Don't bother with SEEK_DATA if it defaults to SEEK_SET.
|
|---|
| 5127 |
|
|---|
| 5128 | grep: skip past holes efficiently
|
|---|
| 5129 | Take advantage of the relaxed rules for treating non-text bytes in
|
|---|
| 5130 | binary data, by efficiently skipping past holes on platforms
|
|---|
| 5131 | supporting lseek's SEEK_DATA flag.
|
|---|
| 5132 | On one test on a circa-2008 Sun Fire V40z running Solaris 11.2,
|
|---|
| 5133 | 'grep x' took 0.009 real-time seconds to scan a holey file of size
|
|---|
| 5134 | 9,223,372,036,854,775,802 bytes, for a nominal scan rate of 1 ZB/s.
|
|---|
| 5135 | grep 2.20's scan rate on this platform was 843 MB/s, so this is a
|
|---|
| 5136 | speedup by a factor of 1.2 trillion. The speedup factor is not
|
|---|
| 5137 | as great on GNU/Linux hosts, due to what appear to be SEEK_DATA
|
|---|
| 5138 | inefficiencies, but presumably this will be cleared up in time.
|
|---|
| 5139 | * NEWS: Document this.
|
|---|
| 5140 | * src/grep.c, src/grep.h (eolbyte): Now char, not unsigned char.
|
|---|
| 5141 | This is for compatibility with the rest of the code.
|
|---|
| 5142 | The old (performance?) reasons for 'unsigned char' are now moot.
|
|---|
| 5143 | * src/grep.c (skip_nuls, skip_empty_lines, seek_data_failed):
|
|---|
| 5144 | New static vars.
|
|---|
| 5145 | (totalnl): Move up, since it's about input, not output, and
|
|---|
| 5146 | fillbuf now uses it.
|
|---|
| 5147 | (add_count): Move up, since fillbuf now uses it.
|
|---|
| 5148 | (all_zeros): New function.
|
|---|
| 5149 | (fillbuf): Use SEEK_DATA to skip past holes efficiently,
|
|---|
| 5150 | on systems that support this.
|
|---|
| 5151 | (grep, main): Set the new static vars.
|
|---|
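A rough sketch of how a reader can step over a hole with lseek and SEEK_DATA, including the ENXIO case handled by the Solaris follow-up entry above. The function name and its return convention are hypothetical, and this is not the logic of fillbuf; it only shows the system call pattern. Note that a successful SEEK_DATA call also moves the file offset.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* asks glibc to declare SEEK_DATA */
#endif
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the offset of the first data byte at or after OFFSET in FD,
   skipping any hole.  Return -1 if lseek fails with ENXIO, meaning
   that nothing but a hole remains before end of file.  Return OFFSET
   unchanged if the platform or file cannot report holes, so that the
   caller simply reads normally.  */
static off_t
next_data_offset (int fd, off_t offset)
{
  assert (0 <= offset);
#ifdef SEEK_DATA
  off_t data = lseek (fd, offset, SEEK_DATA);
  if (0 <= data)
    return data;
  if (errno == ENXIO)
    return -1;
#else
  (void) fd;   /* unused when the platform lacks SEEK_DATA */
#endif
  return offset;
}
```

The fallback path matters as much as the fast path: on a pipe, or on a system without SEEK_DATA, the function degrades to "no hole information" and the caller reads every byte as before.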
| 5152 |
|
|---|
| 5153 | grep: improve -P performance in typical cases
|
|---|
| 5154 | * src/grep.c, src/grep.h (enum textbin): Move to grep.h.
|
|---|
| 5155 | (input_textbin, validated_boundary): New vars.
|
|---|
| 5156 | * src/grep.c (grepbuf, grep): Initialize them.
|
|---|
| 5157 | * src/pcresearch.c (Pexecute): Do a multiline search
|
|---|
| 5158 | when the input is known to be free of encoding errors.
|
|---|
| 5159 | Quickly discard bytes that are obviously encoding errors.
|
|---|
| 5160 | Quickly match empty strings.
|
|---|
| 5161 |
|
|---|
| 5162 | grep: minor -P speedup with jit_stack
|
|---|
| 5163 | * src/pcresearch.c (jit_stack): No longer static.
|
|---|
| 5164 |
|
|---|
| 5165 | grep: non-text bytes in binary data may be treated as line ends
|
|---|
| 5166 | * NEWS, doc/grep.texi (File and Directory Selection):
|
|---|
| 5167 | Document this change.
|
|---|
| 5168 | * src/grep.c (zap_nuls): New function.
|
|---|
| 5169 | (grep): Use it.
|
|---|
| 5170 | * tests/null-byte: Relax to allow new behavior.
|
|---|
| 5171 |
|
|---|
| 5172 | grep: -z no longer considers '\200' to be binary data
|
|---|
| 5173 | This avoids a problem when using grep -z in a Windows-1252 locale.
|
|---|
| 5174 | Plus, it lets 'grep -z' run a bit faster.
|
|---|
| 5175 | * NEWS: Document this.
|
|---|
| 5176 | * src/grep.c (buffer_textbin): Don't look for '\200' if -z.
|
|---|
| 5177 | * tests/pcre-z: Test for new behavior.
|
|---|
| 5178 |
|
|---|
| 5179 | grep: refactor binary-vs-unknown-vs-text flags for clarity
|
|---|
| 5180 | * src/grep.c (enum textbin): New enum.
|
|---|
| 5181 | (textbin_is_binary): New function.
|
|---|
| 5182 | (buffer_textbin, file_textbin, grep): Use them, for clarity.
|
|---|
| 5183 |
|
|---|
| 5184 | 2014-09-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5185 |
|
|---|
| 5186 | grep: fix -P speedup bug with empty match
|
|---|
| 5187 | * src/pcresearch.c (NSUB): New top-level constant, replacing
|
|---|
| 5188 | 'nsub' within Pexecute.
|
|---|
| 5189 | (Pcompile, Pexecute): Use it.
|
|---|
| 5190 | (Pexecute): Don't assume sub[1] is zero after a PCRE_ERROR_BADUTF8
|
|---|
| 5191 | match failure.
|
|---|
| 5192 | * tests/pcre-invalid-utf8-input: Test for this bug.
|
|---|
| 5193 |
|
|---|
| 5194 | grep: port -P speedup to hosts lacking PCRE_STUDY_JIT_COMPILE
|
|---|
| 5195 | * src/pcresearch.c (Pcompile): Do not assume that
|
|---|
| 5196 | PCRE_STUDY_JIT_COMPILE is defined.
|
|---|
| 5197 | (empty_match): Define on all platforms.
|
|---|
| 5198 |
|
|---|
| 5199 | grep: use mbclen cache in one more place
|
|---|
| 5200 | * src/grep.c (fgrep_to_grep_pattern): Use mb_clen here, too.
|
|---|
| 5201 |
|
|---|
| 5202 | grep: avoid false alarms for mb_clen and to_uchar
|
|---|
| 5203 | * cfg.mk (_gl_TS_unmarked_extern_functions): New var,
|
|---|
| 5204 | to bypass the tight_scope false alarms on mb_clen and to_uchar.
|
|---|
| 5205 |
|
|---|
| 5206 | grep: use mbclen cache more effectively
|
|---|
| 5207 | * src/grep.c (buffer_textbin, contains_encoding_error):
|
|---|
| 5208 | Use mb_clen for speed.
|
|---|
| 5209 | (buffer_textbin): Bypass mb_clen in unibyte locales.
|
|---|
| 5210 | (main): Always initialize the cache, since it's sometimes used in
|
|---|
| 5211 | unibyte locales now. Initialize it before contains_encoding_error
|
|---|
| 5212 | might be called.
|
|---|
| 5213 | * src/search.h (SEARCH_INLINE): New macro.
|
|---|
| 5214 | (mbclen_cache): Now extern decl.
|
|---|
| 5215 | (mb_clen): New inline function.
|
|---|
| 5216 | * src/searchutils.c (SEARCH_INLINE, SYSTEM_INLINE): Define.
|
|---|
| 5217 | (mbclen_cache): Now extern.
|
|---|
| 5218 | (build_mbclen_cache): Put 1 into the cache when mbrlen returns 0.
|
|---|
| 5219 | (mb_goback): Use mb_clen for speed, and rely on it returning nonzero.
|
|---|
| 5220 | * src/system.h (SYSTEM_INLINE): New macro.
|
|---|
| 5221 | (to_uchar): Use it.
|
|---|
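The cache described here can be pictured as a 256-entry table holding the mbrlen result for each possible first byte, with 0 replaced by 1 so that callers can always advance. A minimal sketch, assuming the table is built once after the locale is set; the names length_cache, build_length_cache and cached_char_len are hypothetical stand-ins for mbclen_cache, build_mbclen_cache and mb_clen.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* length_cache[b] is the mbrlen result for the single byte b: 1 for a
   complete one-byte character, (size_t) -2 for the first byte of a
   longer character, (size_t) -1 for an invalid byte.  */
static size_t length_cache[UCHAR_MAX + 1];

static void
build_length_cache (void)
{
  for (int i = 0; i <= UCHAR_MAX; i++)
    {
      char c = (char) i;
      mbstate_t mbs;
      memset (&mbs, 0, sizeof mbs);   /* initial conversion state */
      size_t len = mbrlen (&c, 1, &mbs);
      /* mbrlen returns 0 for the NUL byte; store 1 so that a caller
         advancing by the cached length always makes progress.  */
      length_cache[i] = len ? len : 1;
    }
}

/* Fast lookup; a full mbrlen call is needed only when the result is
   not a small positive length.  */
static size_t
cached_char_len (char c)
{
  size_t len = length_cache[(unsigned char) c];
  assert (len != 0);   /* build_length_cache must run first */
  return len;
}
```

Because no entry is ever zero, a loop of the form `p += cached_char_len (*p)` cannot stall on a NUL byte, which is the property mb_goback is said to rely on.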
| 5222 |
|
|---|
| 5223 | grep: improve performance for older glibc
|
|---|
| 5224 | glibc has a bug where mbrlen and mbrtowc mishandle length-0 inputs.
|
|---|
| 5225 | Working around it in gnulib slows grep down, so disable the tests for it
|
|---|
| 5226 | and make sure grep works even if the bug is present.
|
|---|
| 5227 | * bootstrap.conf (avoided_gnulib_modules): Add mbrtowc-tests.
|
|---|
| 5228 | * configure.ac (gl_cv_func_mbrtowc_empty_input): Assume yes.
|
|---|
| 5229 | * src/searchutils.c (mb_next_wc): Don't invoke mbrtowc on empty input.
|
|---|
| 5230 |
|
|---|
| 5231 | grep: treat a file as binary if its prefix contains encoding errors
|
|---|
| 5232 | * NEWS:
|
|---|
| 5233 | * doc/grep.texi (File and Directory Selection):
|
|---|
| 5234 | Document this.
|
|---|
| 5235 | * src/grep.c (buffer_encoding, buffer_textbin): New functions.
|
|---|
| 5236 | (file_textbin): Rename from file_is_binary. Now returns 3-way value.
|
|---|
| 5237 | All callers changed.
|
|---|
| 5238 | (file_textbin, grep): Check the input more carefully for text vs
|
|---|
| 5239 | binary data.
|
|---|
| 5240 | (contains_encoding_error): Remove; use replaced by buffer_encoding.
|
|---|
| 5241 | * tests/backref-multibyte-slow:
|
|---|
| 5242 | * tests/high-bit-range:
|
|---|
| 5243 | * tests/invalid-multibyte-infloop:
|
|---|
| 5244 | Use -a, since the input is now considered to be binary.
|
|---|
| 5245 | * tests/invalid-multibyte-infloop: Add a check for new behavior.
|
|---|
| 5246 |
|
|---|
| 5247 | grep: use bool for boolean in grep.c
|
|---|
| 5248 | * src/grep.c (show_version, suppress_errors, only_matching)
|
|---|
| 5249 | (align_tabs, match_icase, match_words, match_lines, errseen)
|
|---|
| 5250 | (write_error_seen, is_device_mode, usable_st_size)
|
|---|
| 5251 | (file_is_binary, skipped_file, reset, fillbuf, out_quiet)
|
|---|
| 5252 | (out_line, out_byte, count_matches, no_filenames, line_buffered)
|
|---|
| 5253 | (done_on_match, exit_on_match, print_line_head, prline, grep)
|
|---|
| 5254 | (grepdirent, grepfile, grepdesc, grep_command_line_arg)
|
|---|
| 5255 | (get_nondigit_option, main): Use bool for boolean.
|
|---|
| 5256 | (print_line_head, prline): Use char for byte.
|
|---|
| 5257 | * src/grep.h: Include <stdbool.h>, and adjust decls to match
|
|---|
| 5258 | changes in grep.c.
|
|---|
| 5259 |
|
|---|
| 5260 | grep: speed up -P on files containing many multibyte errors
|
|---|
| 5261 | * src/pcresearch.c (empty_match): New var.
|
|---|
| 5262 | (Pcompile): Set it.
|
|---|
| 5263 | (Pexecute): Use it.
|
|---|
| 5264 |
|
|---|
| 5265 | grep: remove/refactor unnecessary code about line splitting
|
|---|
| 5266 | * src/grep.c (do_execute): Remove. Caller now uses 'execute'.
|
|---|
| 5267 | * src/pcresearch.c (Pexecute): Improve comment about this.
|
|---|
| 5268 |
|
|---|
| 5269 | 2014-09-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5270 |
|
|---|
| 5271 | grep: diagnose -P in non-UTF-8 multibyte locale
|
|---|
| 5272 | * src/pcresearch.c (Pcompile):
|
|---|
| 5273 | libpcre supports only unibyte and UTF-8 locales,
|
|---|
| 5274 | so report an error and exit if used in other locales.
|
|---|
| 5275 | * NEWS: Mention this.
|
|---|
| 5276 | * tests/euc-mb: Test this.
|
|---|
| 5277 |
|
|---|
| 5278 | 2014-09-12 Jim Meyering <meyering@fb.com>
|
|---|
| 5279 |
|
|---|
| 5280 | doc: move NEWS note about GREP_OPTIONS into proper section
|
|---|
| 5281 | * NEWS (Changes in behavior): Move the note about GREP_OPTIONS
|
|---|
| 5282 | from the 2.20 section into the section for the upcoming release.
|
|---|
| 5283 |
|
|---|
| 5284 | 2014-09-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5285 |
|
|---|
| 5286 | grep: make GREP_OPTIONS obsolescent
|
|---|
| 5287 | * NEWS:
|
|---|
| 5288 | * doc/grep.in.1 (ENVIRONMENT_VARIABLES):
|
|---|
| 5289 | * doc/grep.texi (Environment Variables):
|
|---|
| 5290 | Document that GREP_OPTIONS is obsolescent now.
|
|---|
| 5291 | * src/grep.c (main): Warn if GREP_OPTIONS is used.
|
|---|
| 5292 | * tests/r-dot, tests/skip-device: Don't use GREP_OPTIONS.
|
|---|
| 5293 |
|
|---|
| 5294 | 2014-09-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5295 |
|
|---|
| 5296 | doc: bug tracker has moved to debbugs.gnu.org
|
|---|
| 5297 | * README (KNOWN BUGS):
|
|---|
| 5298 | * doc/grep.in.1:
|
|---|
| 5299 | * doc/grep.texi (Reporting Bugs): Document this.
|
|---|
| 5300 |
|
|---|
| 5301 | grep: fix false matches with -P '...$' and invalid UTF-8
|
|---|
| 5302 | * tests/pcre-invalid-utf8-input: Add a test for that.
|
|---|
| 5303 |
|
|---|
| 5304 | grep: fix false matches with -P '...$' and invalid UTF-8
|
|---|
| 5305 | * src/pcresearch.c (Pexecute): Use PCRE_NOTEOL when matching
|
|---|
| 5306 | initial substrings of a line.
|
|---|
| 5307 |
|
|---|
| 5308 | 2014-09-10 Jim Meyering <meyering@fb.com>
|
|---|
| 5309 |
|
|---|
| 5310 | tests: add expect-to-fail test for a glibc regexp bug
|
|---|
| 5311 | * tests/triple-backref: New file.
|
|---|
| 5312 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 5313 | (XFAIL_TESTS): List it as a known, always-failing test.
|
|---|
| 5314 | Based on the bug report from Paul Eggert:
|
|---|
| 5315 | https://sourceware.org/bugzilla/show_bug.cgi?id=17356
|
|---|
| 5316 |
|
|---|
| 5317 | maint: avoid distcheck failure
|
|---|
| 5318 | * Makefile.am (EXTRA_DIST): Add .mailmap.
|
|---|
| 5319 |
|
|---|
| 5320 | 2014-09-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5321 |
|
|---|
| 5322 | grep: port recent fix to older pcre version
|
|---|
| 5323 | * src/pcresearch.c (Pexecute): Don't assume that a pcre_exec
|
|---|
| 5324 | that returns PCRE_ERROR_NOMATCH leaves its sub argument alone.
|
|---|
| 5325 | This assumption is false for libpcre-3 version 8.31-2ubuntu2.
|
|---|
| 5326 |
|
|---|
| 5327 | 2014-09-09 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5328 |
|
|---|
| 5329 | grep: -P now treats invalid UTF-8 input as non-matching
|
|---|
| 5330 | Problem reported by Santiago Vila in: http://bugs.gnu.org/18266
|
|---|
| 5331 | * NEWS: Mention this.
|
|---|
| 5332 | * src/pcresearch.c (Pexecute): Treat UTF-8 encoding errors
|
|---|
| 5333 | as non-matching data, instead of exiting 'grep'.
|
|---|
| 5334 | * tests/pcre-infloop: grep now exits with status 1, not 2.
|
|---|
| 5335 | * tests/pcre-invalid-utf8-input: grep now exits with status 0, not 2.
|
|---|
| 5336 |
|
|---|
| 5337 | 2014-08-14 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5338 |
|
|---|
| 5339 | grep: fix integer-width bugs in undossify_input etc.
|
|---|
| 5340 | undossify_input bug reported by Vincent Lefevre in:
|
|---|
| 5341 | http://bugs.gnu.org/18269
|
|---|
| 5342 | * src/dosbuf.c (undossify_input): Return size_t, not int.
|
|---|
| 5343 | * src/grep.c (fillbuf): Work portably even if safe_read returns a
|
|---|
| 5344 | value greater than SSIZE_MAX, e.g., if there's an I/O error.
|
|---|
| 5345 |
|
|---|
| 5346 | 2014-08-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5347 |
|
|---|
| 5348 | doc: document LANGUAGE
|
|---|
| 5349 | Reported by Benno Schulenberg in: http://bugs.gnu.org/18185
|
|---|
| 5350 | * doc/grep.texi (Environment Variables): Document LANGUAGE.
|
|---|
| 5351 |
|
|---|
| 5352 | doc: prefer @env to @code
|
|---|
| 5353 | Reported by Benno Schulenberg in: http://bugs.gnu.org/18184
|
|---|
| 5354 | * doc/grep.texi: Avoid @code in favor of @env, or of nothing at all.
|
|---|
| 5355 |
|
|---|
| 5356 | 2014-07-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5357 |
|
|---|
| 5358 | doc: Document -r vs --exclude more carefully.
|
|---|
| 5359 | Problem reported by Hugues Andreux in: http://bugs.gnu.org/17763
|
|---|
| 5360 | * doc/grep.texi (File and Directory Selection): Be more careful
|
|---|
| 5361 | about documenting the interaction between recursive searching,
|
|---|
| 5362 | --include, --exclude, and --exclude-dir.
|
|---|
| 5363 |
|
|---|
| 5364 | 2014-06-27 Jim Meyering <meyering@fb.com>
|
|---|
| 5365 |
|
|---|
| 5366 | maint: split long lines, and enforce the 80-column limit
|
|---|
| 5367 | * cfg.mk (sc_long_lines): New rule, from coreutils; exempt tests/*
|
|---|
| 5368 | * src/grep.c (usage): Tweak -F wording to shorten a line.
|
|---|
| 5369 | Correct grammar in a comment.
|
|---|
| 5370 | Split the --exclude-file=... description to fit within 80 columns.
|
|---|
| 5371 | Use emit_bug_reporting_address, eliminating another long line.
|
|---|
| 5372 | * src/dfa.c: Split long lines. No semantic change.
|
|---|
| 5373 | * doc/grep.texi: Likewise.
|
|---|
| 5374 | * tests/include-exclude: Split a long line.
|
|---|
| 5375 | * tests/backref: Split long lines.
|
|---|
| 5376 | * tests/empty: Likewise.
|
|---|
| 5377 | * tests/fmbtest: Likewise.
|
|---|
| 5378 |
|
|---|
| 5379 | doc: update HACKING
|
|---|
| 5380 | * HACKING: Update from coreutils.
|
|---|
| 5381 |
|
|---|
| 5382 | maint: generate distributed THANKS from VC'd THANKS.in
|
|---|
| 5383 | * Makefile.am (THANKS): New rule.
|
|---|
| 5384 | * THANKS.in: New file.
|
|---|
| 5385 | * THANKS: Remove. Now it's generated from the combination of
|
|---|
| 5386 | THANKS.in and git logs.
|
|---|
| 5387 | * .mailmap: New file.
|
|---|
| 5388 | * cfg.mk (sc_THANKS_in_duplicates): New syntax-check rule, from
|
|---|
| 5389 | coreutils.
|
|---|
| 5390 | * .gitignore: Add THANKS.
|
|---|
| 5391 | * thanks-gen: New file, from coreutils.
|
|---|
| 5392 |
|
|---|
| 5393 | 2014-06-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5394 |
|
|---|
| 5395 | grep: with -E, unmatched ')' matches itself
|
|---|
| 5396 | Problem reported by Nathan Weeks in: http://bugs.gnu.org/17856
|
|---|
| 5397 | * src/grep.c (Ecompile): Also specify RE_UNMATCHED_RIGHT_PAREN_ORD.
|
|---|
| 5398 | * doc/grep.texi (Fundamental Structure), NEWS: Document this.
|
|---|
| 5399 | * tests/ere.tests: Add a couple of tests for this.
|
|---|
| 5400 | * tests/spencer1.tests: Fix exit status.
|
|---|
| 5401 |
|
|---|
| 5402 | 2014-06-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5403 |
|
|---|
| 5404 | build: avoid -Wstack-protector
|
|---|
| 5405 | This allows the use of --enable-gcc-warnings on Gentoo and Ubuntu.
|
|---|
| 5406 | See: http://bugs.gnu.org/17793
|
|---|
| 5407 | * configure.ac (WERROR_CFLAGS): Avoid -Wstack-protector.
|
|---|
| 5408 |
|
|---|
| 5409 | This can be worked around, but the cure is worse than the disease.
|
|---|
| 5410 |
|
|---|
| 5411 | 2014-06-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5412 |
|
|---|
| 5413 | build: don't make output files read-only
|
|---|
| 5414 | This led to problems, such as the prompt "mv: try to overwrite
|
|---|
| 5415 | 'egrep', overriding mode 0555 (r-xr-xr-x)? " during a build.
|
|---|
| 5416 | It can be worked around, but the cure is worse than the disease;
|
|---|
| 5417 | making output files read-only is more trouble than it's worth.
|
|---|
| 5418 | * doc/Makefile.am (grep.1, egrep.1, fgrep.1):
|
|---|
| 5419 | * lib/Makefile.am (colorize.c):
|
|---|
| 5420 | * src/Makefile.am (egrep fgrep):
|
|---|
| 5421 | Don't make output files read-only. Prefer separate commands to
|
|---|
| 5422 | '&&' when either will do.
|
|---|
| 5423 |
|
|---|
| 5424 | 2014-06-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5425 |
|
|---|
| 5426 | maint: remove grep.spec
|
|---|
| 5427 | * grep.spec: Remove; obsolete and evidently not used.
|
|---|
| 5428 |
|
|---|
| 5429 | 2014-06-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5430 |
|
|---|
| 5431 | doc: use gnulib fdl module
|
|---|
| 5432 | * bootstrap.conf (gnulib_modules): Add fdl.
|
|---|
| 5433 | * doc/fdl.texi: Remove, as this now comes from gnulib.
|
|---|
| 5434 | * doc/.gitignore: Update to match current sources.
|
|---|
| 5435 |
|
|---|
| 5436 | 2014-06-06 Jim Meyering <meyering@fb.com>
|
|---|
| 5437 |
|
|---|
| 5438 | build: improve rule to generate egrep+fgrep scripts
|
|---|
| 5439 | * src/Makefile.am (egrep fgrep): chmod a=rx generated files,
|
|---|
| 5440 | and remove $@-t before attempting to redirect to it, in case it
|
|---|
| 5441 | is read-only.
|
|---|
| 5442 |
|
|---|
| 5443 | build: don't redirect directly to $@
|
|---|
| 5444 | * lib/Makefile.am (colorize.c): Don't redirect directly to target, $@.
|
|---|
| 5445 | Otherwise, we could create a corrupt colorize.c file with a
|
|---|
| 5446 | timestamp that indicates it is up to date.
|
|---|
| 5447 | Also, make the generated file read-only.
|
|---|
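The idiom this entry describes, writing to a temporary file and renaming it only on success, can be sketched in shell as follows. This is an illustration only, not the actual rule from lib/Makefile.am: the function name `generate_atomically` is hypothetical, and the `-t` suffix is borrowed from the `$@-t` name mentioned in the neighboring entry.

```shell
# Hypothetical helper illustrating the idiom; not the rule from lib/Makefile.am.
# Usage: generate_atomically TARGET COMMAND [ARG...]
# Here "$@" is the shell's argument list (the command to run), not make's
# target variable.  COMMAND writes to TARGET-t, and TARGET-t is renamed to
# TARGET only if COMMAND succeeds, so a failed run never leaves a truncated
# TARGET with a fresh timestamp.
generate_atomically () {
  target=$1
  shift
  rm -f "$target-t" &&
  "$@" > "$target-t" &&
  mv "$target-t" "$target"
}
```

For example, `generate_atomically colorize.c cat some-input.c` would leave any existing colorize.c untouched if `cat` failed; the file names here are placeholders.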
| 5448 |
|
|---|
| 5449 | 2014-06-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5450 |
|
|---|
| 5451 | grep: undo part of previous change
|
|---|
| 5452 | * src/dfa.c (enlist): Undo part of previous change that doesn't
|
|---|
| 5453 | look correct and doesn't help performance much anyway.
|
|---|
| 5454 |
|
|---|
| 5455 | grep: use system strstr if available and fast
|
|---|
| 5456 | Problem reported by Norihiro Tanaka in: http://bugs.gnu.org/17700
|
|---|
| 5457 | * NEWS: Document this.
|
|---|
| 5458 | * bootstrap.conf (gnulib_modules): Add strstr.
|
|---|
| 5459 | * src/dfa.c (istrstr): Remove.
|
|---|
| 5460 | (enlist): Use strstr instead. Wait until we need memory before
|
|---|
| 5461 | allocating it; this can save an unnecessary allocate and free.
|
|---|
| 5462 |
|
|---|
| 5463 | build: update gnulib submodule to latest
|
|---|
| 5464 |
|
|---|
| 5465 | 2014-06-03 Jim Meyering <meyering@fb.com>
|
|---|
| 5466 |
|
|---|
| 5467 | maint: post-release administrivia
|
|---|
| 5468 | * NEWS: Add header line for next release.
|
|---|
| 5469 | * .prev-version: Record previous version.
|
|---|
| 5470 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 5471 |
|
|---|
| 5472 | version 2.20
|
|---|
| 5473 | * NEWS: Record release date.
|
|---|
| 5474 |
|
|---|
| 5475 | 2014-05-30 Jim Meyering <meyering@fb.com>
|
|---|
| 5476 |
|
|---|
| 5477 | grep: fix --max-count=N (-m N) to stop reading after Nth match
|
|---|
| 5478 | With --max-count=N (-m N), grep is supposed to stop reading input
|
|---|
| 5479 | after it has found the Nth match. However, a recent context-
|
|---|
| 5480 | related change made it so grep would always read to end of file.
|
|---|
| 5481 | * src/grep.c (prtext): Don't let a negative "out_after" value
|
|---|
| 5482 | make "pending" line count negative.
|
|---|
| 5483 | * tests/max-count-overread: New test, for this.
|
|---|
| 5484 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 5485 | * NEWS (Bug fixes): Mention it.
|
|---|
| 5486 | * THANKS: Add names of two recent bug reporters.
|
|---|
| 5487 | This bug was introduced by commit v2.18-139-g5122195.
|
|---|
| 5488 | Reported by Marc Aldorasi in http://bugs.gnu.org/17640.
|
|---|
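One way to observe the behavior this entry restores is to feed grep an endless stream: with `-m 1`, grep should print the first match and exit rather than keep reading. The following is a minimal sketch, assuming a `yes` utility and a grep that supports `-m`. The function name is hypothetical, and this is not claimed to be the content of tests/max-count-overread.

```shell
# Hypothetical demonstration: 'yes' never stops writing on its own, so this
# pipeline finishes only if grep stops reading after the first match.
first_match_from_endless_input () {
  yes "$1" | grep -m 1 -e "$1"
}
```

Calling `first_match_from_endless_input x` should print a single `x` and return promptly; a grep that reads to end of input would never return.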
| 5489 |
|
|---|
| 5490 | 2014-05-29 Jim Meyering <meyering@fb.com>
|
|---|
| 5491 |
|
|---|
| 5492 | dfa: fix off-by-one under-allocation from recent change
|
|---|
| 5493 | Commit v2.19-10-gc32ff67 mistakenly made this change:
|
|---|
| 5494 | -realloc_trans_if_necessary (d, 1);
|
|---|
| 5495 | +realloc_trans_if_necessary (d, 0);
|
|---|
| 5496 | which led to a heap buffer overflow.
|
|---|
| 5497 | * src/dfa.c (dfaexec): Allocate space for one state, as before.
|
|---|
| 5498 |
|
|---|
| 5499 | 2014-05-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5500 |
|
|---|
| 5501 | dfa: fix bug with regex containing multiple begin/end-line constraints
|
|---|
| 5502 | grep -E 'a(b$|c$)' would mistakenly match "aa".
|
|---|
| 5503 | * src/dfa.c (dfamust): When resetting 'is' in OR, also reset
|
|---|
| 5504 | 'begline' and 'endline' of 'must'.
|
|---|
| 5505 | * NEWS (Bug fixes): Mention it.
|
|---|
| 5506 | This bug was introduced via commit v2.18-85-g2c94326.
|
|---|
| 5507 | Reported by Péter Radics in <http://bugs.gnu.org/17617>.
|
|---|
| 5508 |
|
|---|
| 5509 | 2014-05-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5510 |
|
|---|
| 5511 | dfa: simplify building initial state
|
|---|
| 5512 | build_state_zero doesn't need the struct dfa to be initialized,
|
|---|
| 5513 | so remove the initialization and simplify.
|
|---|
| 5514 | * src/dfa.c (build_state_zero): Remove.
|
|---|
| 5515 | (dfaexec): Call realloc_trans_if_necessary and build_state directly.
|
|---|
| 5516 |
|
|---|
| 5517 | dfa: revert "grep: do not count newline before the start of buffer"
|
|---|
| 5518 | This reverts commit 5dc3af2806d21455b818be3f9da26c372e4a7f8d.
|
|---|
| 5519 | The previous change renders that commit unnecessary.
|
|---|
| 5520 |
|
|---|
| 5521 | dfa: do not clear the first state of a transition table
|
|---|
| 5522 | If number of DFA states reaches 1024, build_state clears transition
|
|---|
| 5523 | tables to save memory. However, the initial state is always used,
|
|---|
| 5524 | so clearing it just wastes time.
|
|---|
| 5525 | * src/dfa.c (build_state): Do not clear the initial state's
|
|---|
| 5526 | transition and failure tables.
|
|---|
| 5527 |
|
|---|
| 5528 | grep: remove unnecessary argument
|
|---|
| 5529 | * src/grep.c (do_execute): Remove argument 'start_ptr'. It's always null.
|
|---|
| 5530 | All uses changed.
|
|---|
| 5531 |
|
|---|
| 5532 | 2014-05-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5533 |
|
|---|
| 5534 | grep: --exclude-dir=FOO/ now ignores the trailing slash
|
|---|
| 5535 | Problem reported by Khaled Ziyaeen; see: http://bugs.gnu.org/17481
|
|---|
| 5536 | * NEWS, doc/grep.texi (File and Directory Selection): Document this.
|
|---|
| 5537 | * src/grep.c (main): Implement this.
|
|---|
| 5538 | * tests/include-exclude: Test this.
|
|---|
| 5539 |
|
|---|
| 5540 | dist: don't distribute lib/colorize.c
|
|---|
| 5541 | 'configure' creates this file, so it shouldn't be distributed; see:
|
|---|
| 5542 | http://bugs.gnu.org/17480
|
|---|
| 5543 | * configure.ac (COLORIZE_SOURCE): New macro.
|
|---|
| 5544 | Don't use AC_CONFIG_LINKS for lib/colorize.c.
|
|---|
| 5545 | * lib/Makefile.am (nodist_libgreputils_a_SOURCES): New macro.
|
|---|
| 5546 | (libgreputils_a_SOURCES): Remove colorize.c.
|
|---|
| 5547 | (CLEANFILES): Add colorize.c
|
|---|
| 5548 | (colorize.c): New rule.
|
|---|
| 5549 |
|
|---|
| 5550 | 2014-05-23 behoffski <behoffski@grouse.com.au>
|
|---|
| 5551 |
|
|---|
| 5552 | maint: uncapitalize first letter of two dfaerror message strings
|
|---|
| 5553 | * dfa.c (lex): Make two message strings consistent with all of
|
|---|
| 5554 | the others: do not capitalize the first letter of the first word.
|
|---|
| 5555 |
|
|---|
| 5556 | 2014-05-23 Jim Meyering <meyering@fb.com>
|
|---|
| 5557 |
|
|---|
| 5558 | maint: revert "grep: port mb_next_wc to RHEL 6.5 x86-64"
|
|---|
| 5559 | This reverts commit v2.18-148-ga6ae68d.
|
|---|
| 5560 | Now that we have gnulib change v0.1-131-g2a045bc, "mbrlen, mbrtowc:
|
|---|
| 5561 | fix bug with empty input", this work-around is no longer needed.
|
|---|
| 5562 |
|
|---|
| 5563 | gnulib: update, for mbrlen/mbrtowc empty input bug fix
|
|---|
| 5564 |
|
|---|
| 5565 | 2014-05-22 Jim Meyering <meyering@fb.com>
|
|---|
| 5566 |
|
|---|
| 5567 | maint: post-release administrivia
|
|---|
| 5568 | * NEWS: Add header line for next release.
|
|---|
| 5569 | * .prev-version: Record previous version.
|
|---|
| 5570 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 5571 |
|
|---|
| 5572 | version 2.19
|
|---|
| 5573 | * NEWS: Record release date.
|
|---|
| 5574 |
|
|---|
| 5575 | 2014-05-21 Jim Meyering <meyering@fb.com>
|
|---|
| 5576 |
|
|---|
| 5577 | maint: avoid new false-positive syntax-check failure
|
|---|
| 5578 | * cfg.mk (exclude_file_name_regexp--sc_prohibit_doubled_word):
|
|---|
| 5579 | Exempt new test file that contains legitimate use of "in in".
|
|---|
| 5580 |
|
|---|
| 5581 | 2014-05-17 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5582 |
|
|---|
| 5583 | tests: add test case for newline-count fix
|
|---|
| 5584 | * tests/count-newline: New test.
|
|---|
| 5585 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 5586 |
|
|---|
| 5587 | 2014-05-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5588 |
|
|---|
| 5589 | grep: do not count newline before the start of buffer
|
|---|
| 5590 | * src/dfa.c (build_state): When checking whether the previous
|
|---|
| 5591 | character was a newline, do not count any newline before the
|
|---|
| 5592 | start of the buffer.
|
|---|
| 5593 |
|
|---|
| 5594 | 2014-05-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5595 |
|
|---|
| 5596 | grep: port mb_next_wc to RHEL 6.5 x86-64
|
|---|
| 5597 | * src/searchutils.c (mb_next_wc): Work around glibc bug 16950; see:
|
|---|
| 5598 | https://sourceware.org/bugzilla/show_bug.cgi?id=16950
|
|---|
| 5599 | This bug was masked in the other GNU/Linux tests I made. It was
|
|---|
| 5600 | exposed on RHEL 6.5 x86-64, where the compiler (GCC Red Hat 4.4.7-4)
|
|---|
| 5601 | happened to use temporaries in a different way.
|
|---|
| 5602 | Also see recent changes to the Gnulib documentation in this area:
|
|---|
| 5603 | http://lists.gnu.org/archive/html/bug-gnulib/2014-05/msg00013.html
|
|---|
| 5604 |
|
|---|
| 5605 | tests: port mb-non-UTF8-performance to RHEL 6.5
|
|---|
| 5606 | * tests/mb-non-UTF8-performance (timeout): Use an integer,
|
|---|
| 5607 | as 'timeout 1.234' doesn't work in EUC locales.
|
|---|
| 5608 |
|
|---|
| 5609 | 2014-05-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5610 |
|
|---|
| 5611 | egrep, fgrep: port to Solaris 10 /bin/sh
|
|---|
| 5612 | This old shell doesn't grok ${0%/*}; see: http://bugs.gnu.org/17471
|
|---|
| 5613 | * src/Makefile.am (egrep fgrep): Don't assume the shell does substrings.
|
|---|
| 5614 | * src/egrep.sh (dir): New var, so that the substring calculation is
|
|---|
| 5615 | done only once (which is a small win even with newer shells),
|
|---|
| 5616 | and so that the calculation is easier to edit on older shells.
|
|---|
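For illustration, the `${0%/*}` expansion mentioned in this entry strips the last slash-separated component from a path. The sketch below performs an equivalent computation with sed, for shells that lack the expansion. It is an example under stated assumptions, not the approach taken in src/Makefile.am or src/egrep.sh, and the function name is hypothetical.

```shell
# Hypothetical helper: mimic ${path%/*} without using that expansion.
# Removes the shortest trailing "/component"; a path with no slash is
# printed unchanged, as with the expansion itself.
strip_last_component () {
  printf '%s\n' "$1" | sed 's,/[^/]*$,,'
}
```

A script could call this once, for instance `dir=$(strip_last_component "$0")`, and reuse `$dir` afterwards.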
| 5617 |
|
|---|
| 5618 | 2014-05-10 Jim Meyering <meyering@fb.com>
|
|---|
| 5619 |
|
|---|
| 5620 | maint: NEWS: adjust wording to reflect move
|
|---|
| 5621 | * NEWS (Improvements): Correct direction-relative wording,
|
|---|
| 5622 | now that the referent is below, not above.
|
|---|
| 5623 |
|
|---|
| 5624 | maint: NEWS: move "Improvements" to the top
|
|---|
| 5625 | * NEWS: Move the small "Improvements" section to precede
|
|---|
| 5626 | the longer "Bug fixes" one.
|
|---|
| 5627 |
|
|---|
| 5628 | gnulib: update submodule to latest, and bootstrap
|
|---|
| 5629 | * gnulib: Update submodule.
|
|---|
| 5630 | * bootstrap: Update from gnulib.
|
|---|
| 5631 |
|
|---|
| 5632 | 2014-05-10 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5633 |
|
|---|
| 5634 | dfa: omit double includes
|
|---|
| 5635 | * src/dfa.c: Don't include stddef.h or stdbool.h, as dfa.h includes
|
|---|
| 5636 | them already, and it's the same module as we are.
|
|---|
| 5637 | Suggested by Aharon Robbins in: http://bugs.gnu.org/17458
|
|---|
| 5638 |
|
|---|
| 5639 | dfa: fix bug with \< etc in multibyte locales
|
|---|
| 5640 | Problem reported by Stephane Chazelas in: http://bugs.gnu.org/16867
|
|---|
| 5641 | * NEWS: Document the fix.
|
|---|
| 5642 | * src/dfa.c (dfaoptimize): Remove any superset if changing from
|
|---|
| 5643 | UTF-8 to unibyte, and if the pattern has no backreferences.
|
|---|
| 5644 | (dfassbuild): In multibyte locales, treat \< \> \b \B as
|
|---|
| 5645 | backreferences in the DFA, since the DFA relies on unibyte
|
|---|
| 5646 | tests to check them.
|
|---|
| 5647 | (dfacomp): Optimize after building the superset, so that
|
|---|
| 5648 | dfassbuild can depend on d->multibyte. A downside is that
|
|---|
| 5649 | dfaoptimize must remove supersets that are likely slower than the
|
|---|
| 5650 | DFA after optimization, but that's been done in the
|
|---|
| 5651 | above-described change.
|
|---|
| 5652 | * tests/Makefile.am (XFAIL_TESTS): Remove word-delim-multibyte,
|
|---|
| 5653 | since the test works now.
|
|---|
| 5654 |
|
|---|
| 5655 | tests: add test case for -C 0 change
|
|---|
| 5656 | * tests/context-0: New test.
|
|---|
| 5657 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 5658 |
|
|---|
| 5659 | grep: -A 0, -B 0, -C 0 now output a separator
|
|---|
| 5660 | Problem reported by Dan Jacobson in: http://bugs.gnu.org/17380
|
|---|
| 5661 | * NEWS:
|
|---|
| 5662 | * doc/grep.texi (Context Line Control): Document this.
|
|---|
| 5663 | * src/grep.c (prtext): Output a separator even if context is zero.
|
|---|
| 5664 | (main): Default context is now -1, not 0.
|
|---|
| 5665 |
|
|---|
| 5666 | 2014-05-09 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5667 |
|
|---|
| 5668 | grep: minor improvements to retry-DFA-superset patch
|
|---|
| 5669 | * src/dfasearch.c (EGexecute): Avoid unnecessary test in a context
|
|---|
| 5670 | where memrchr cannot return a null pointer.
|
|---|
| 5671 |
|
|---|
| 5672 | 2014-05-09 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5673 |
|
|---|
| 5674 | grep: retry DFA superset after matching multiple lines
|
|---|
| 5675 | * src/dfasearch.c (EGexecute): Without this patch, the code reverts
|
|---|
| 5676 | to KWset when the DFA superset matches multiple lines.
|
|---|
| 5677 | However, if the DFA superset matches multiple lines, it most likely
|
|---|
| 5678 | also matches a single line, and reverting to KWset means dfafast
|
|---|
| 5679 | won't work effectively. Change the code so that it retries the DFA
|
|---|
| 5680 | superset immediately after it matches multiple lines. On my platform
|
|---|
| 5681 | this improves the performance of "LC_ALL=C grep '\(ab\)cd\1d' k" from
|
|---|
| 5682 | 3.48 to 2.14 seconds realtime, where k contains the output of
|
|---|
| 5683 | "yes abcdabc | head -50000000".
|
|---|
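The benchmark quoted in this entry can be reproduced along the following lines. This is a sketch only: the function names are hypothetical, and the line count is a parameter so that a smaller input can be tried first. Each input line is `abcdabc`, which the pattern `\(ab\)cd\1d` does not match (a match would need `abcdabd`), so grep is expected to print nothing and exit with status 1 after scanning the whole file.

```shell
# Write N copies of the line "abcdabc" to standard output.
make_input () {
  yes abcdabc | head -n "$1"
}

# Run the search from the entry against FILE in the C locale.
run_search () {
  LC_ALL=C grep '\(ab\)cd\1d' "$1"
}

# Example (timings depend on the machine and the grep version):
#   make_input 50000000 > k
#   time run_search k
```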
| 5684 |
|
|---|
| 5685 | dfa: fix inconsistency in multibyte locales
|
|---|
| 5686 | * src/dfa.c (dfaexec): Use the same exit condition in multibyte
|
|---|
| 5687 | locales as in unibyte.
|
|---|
| 5688 |
|
|---|
| 5689 | 2014-05-08 Jim Meyering <meyering@fb.com>
|
|---|
| 5690 |
|
|---|
| 5691 | maint: mark some breakless cases with /* fallthrough */ comment
|
|---|
| 5692 | * src/dfa.c (addtok_mb, dfaanalyze): Add comment so that it is
|
|---|
| 5693 | clear that the "break" statement is deliberately omitted.
|
|---|
| 5694 |
|
|---|
| 5695 | 2014-05-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5696 |
|
|---|
| 5697 | dfa: assume C89 for CHAR_BIT
|
|---|
| 5698 | * src/dfa.c (CHARBITS): Remove. All uses replaced by CHAR_BIT.
|
|---|
| 5699 | (NOTCHAR): Now an enum, since it need not be a macro.
|
|---|
| 5700 |
|
|---|
| 5701 | dfa: don't assume unsigned int is exactly 32 bits wide
|
|---|
| 5702 | Sun C 5.12 (sparc) warns of the potential unportability.
|
|---|
| 5703 | * src/dfa.c (charclass_word): New type, for clarity.
|
|---|
| 5704 | All relevant uses of 'unsigned' changed.
|
|---|
| 5705 | (CHARCLASS_WORD_BITS): Rename from INTBITS. All uses changed.
|
|---|
| 5706 | Now an enum, since it needn't be a macro.
|
|---|
| 5707 | (CHARCLASS_WORD_MASK): New macro.
|
|---|
| 5708 | (CHARCLASS_WORDS): Rename from CHARCLASS_INTS. All uses changed.
|
|---|
| 5709 | (setbit, clrbit): Cast 1 to charclass_word, for clarity.
|
|---|
| 5710 | (notset, add_utf8_anychar, dfastats):
|
|---|
| 5711 | Don't assume unsigned int is exactly 32 bits wide.
|
|---|
| 5712 | (dfastate): Don't rely on implementation-defined conversion of
|
|---|
| 5713 | greater-than-INT_MAX unsigned to int. Change bit test to resemble
|
|---|
| 5714 | tstbit more.
|
|---|
| 5715 |
|
|---|
| 5716 | maint: fix indenting to pacify 'prohibit_tab_based_indentation'
|
|---|
| 5717 | * src/dfa.c: Use spaces and not tabs to indent some lines.
|
|---|
| 5718 |
|
|---|
| 5719 | grep: simplify and clarify invert-related code
|
|---|
| 5720 | * src/grep.c (out_invert, prtext): Use bool for booleans.
|
|---|
| 5721 | (prline): Remove unnecessary '!!' on a value that is always 0 or 1.
|
|---|
| 5722 | (prtext): Remove last arg NLINESP; use !out_invert instead. All uses
|
|---|
| 5723 | changed. Move decls to nearer uses, since we can assume C99 here.
|
|---|
| 5724 | Update 'outleft' and 'after_last_match' here; it's simpler.
|
|---|
| 5725 | (grepbuf): Compute return value by subtracting new from old 'outleft',
|
|---|
| 5726 | rather than by keeping a separate running total. Avoid code duplication
|
|---|
| 5727 | by arranging for prtext to be called from one place, not three.
|
|---|
| 5728 |
|
|---|
| 5729 | 2014-05-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5730 |
|
|---|
| 5731 | grep: improve performance of -v when combined with -L, -l or -q
|
|---|
| 5732 | Problem reported by Jörn Hees in: http://bugs.gnu.org/17427
|
|---|
| 5733 | * src/grep.c (grepbuf, grep): When -v is combined with -L, -l, or -q,
|
|---|
| 5734 | don't read data unnecessarily after a non-match is found.
|
|---|
| 5735 |
|
|---|
| 5736 | 2014-05-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5737 |
|
|---|
| 5738 | doc: mention performance changes
|
|---|
| 5739 | * NEWS: Discuss recent performance improvements and downgrades.
|
|---|
| 5740 |
|
|---|
| 5741 | dfa: clarify use of "if"
|
|---|
| 5742 | The phrase "Y is true if X" is logically equivalent to "X implies Y",
|
|---|
| 5743 | but often "X if and only if Y" was intended.
|
|---|
| 5744 | * src/dfa.c, src/dfa.h: Reword to avoid the incorrect use of "if".
|
|---|
| 5745 |
|
|---|
| 5746 | dfa: minor performance improvement for previous change
|
|---|
| 5747 | * src/dfa.c (struct dfa): New member 'fast'. Remove 'has_backref'.
|
|---|
| 5748 | All uses changed.
|
|---|
| 5749 |
|
|---|
| 5750 | 2014-05-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5751 |
|
|---|
| 5752 | dfa: speed up 'dfaisfast'
|
|---|
| 5753 | * src/dfa.c (struct dfa): New member 'has_backref'.
|
|---|
| 5754 | (addtok_mb): Set it.
|
|---|
| 5755 | (dfaisfast): Use it.
|
|---|
| 5756 |
|
|---|
| 5757 | 2014-05-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5758 |
|
|---|
| 5759 | grep: fix -w match next to a multibyte letter
|
|---|
| 5760 | * NEWS: Document this.
|
|---|
| 5761 | * src/dfasearch.c, src/kwsearch.c (WCHAR): Remove.
|
|---|
| 5762 | (wordchar): New static function.
|
|---|
| 5763 | * src/dfasearch.c (EGexecute):
|
|---|
| 5764 | * src/kwsearch.c (Fexecute): Use the new functions, so that the
|
|---|
| 5765 | code works correctly if a multibyte character adjacent to the
|
|---|
| 5766 | match has two or more bytes.
|
|---|
| 5767 | * src/search.h, src/searchutils.c (mb_prev_wc, mb_next_wc):
|
|---|
| 5768 | New functions.
|
|---|
| 5769 | * tests/word-delim-multibyte: Add a test for grep -w (which now
|
|---|
| 5770 | passes), and a test for \> (which still fails). The \< test also
|
|---|
| 5771 | still fails.
|
|---|
| 5772 |
|
|---|
| 5773 | grep: improve internal API for multibyte boundary
|
|---|
| 5774 | * src/search.h, src/searchutils.c (mb_goback): Rename from
|
|---|
| 5775 | is_mb_middle. Omit last arg. Return number of bytes to go back,
|
|---|
| 5776 | not just a boolean. All uses changed.
|
|---|
| 5777 | * src/dfasearch.c (EGexecute):
|
|---|
| 5778 | * src/kwsearch.c (Fexecute): Adjust to API change.
|
|---|
| 5779 | * src/kwsearch.c (Fexecute): Eliminate common subexpression.
|
|---|
| 5780 |
|
|---|
| 5781 | grep: fix encoding-error incompatibilities among regex, DFA, KWset
|
|---|
| 5782 | This follows up to http://bugs.gnu.org/17376 and fixes a different
|
|---|
| 5783 | set of incompatibilities, namely between the regex matcher and the
|
|---|
| 5784 | other matchers, when the pattern contains encoding errors.
|
|---|
| 5785 | The GNU regex matcher is not consistent in this area: sometimes
|
|---|
| 5786 | an encoding error matches only itself, and sometimes it
|
|---|
| 5787 | matches part of a multibyte character. There is no documentation
|
|---|
| 5788 | for grep's behavior in this area and users don't seem to care,
|
|---|
| 5789 | and it's simpler to defer to the regex matcher for problematic
|
|---|
| 5790 | cases like these.
|
|---|
| 5791 | * NEWS: Document this.
|
|---|
| 5792 | * src/dfa.c (ctok): Remove. All uses removed.
|
|---|
| 5793 | (parse_bracket_exp, atom): Use BACKREF if a pattern contains
|
|---|
| 5794 | an encoding error, so that the matcher will revert to regex.
|
|---|
| 5795 | * src/dfasearch.c, src/grep.c, src/pcresearch.c, src/searchutils.c:
|
|---|
| 5796 | Don't include dfa.h, since search.h now does that for us.
|
|---|
| 5797 | * src/dfasearch.c (EGexecute):
|
|---|
| 5798 | * src/kwsearch.c (Fexecute): In a UTF-8 locale, there's no need to
|
|---|
| 5799 | worry about matching part of a multibyte character.
|
|---|
| 5800 | * src/grep.c (contains_encoding_error): New static function.
|
|---|
| 5801 | (main): Use it, so that grep -F is consistent with plain fgrep
|
|---|
| 5802 | when the pattern contains an encoding error.
|
|---|
| 5803 | * src/search.h: Include dfa.h, so that kwsearch.c can call using_utf8.
|
|---|
| 5804 | * src/searchutils.c (is_mb_middle): Remove UTF-8-specific code.
|
|---|
| 5805 | Callers now ensure that we are in a non-UTF-8 locale.
|
|---|
| 5806 | The code was clearly wrong, anyway.
|
|---|
| 5807 | * tests/fgrep-infloop, tests/invalid-multibyte-infloop:
|
|---|
| 5808 | * tests/prefix-of-multibyte:
|
|---|
| 5809 | Do not require that grep have a particular behavior for this test.
|
|---|
| 5810 | It's OK to match (exit status 0), not match (exit status 1), or
|
|---|
| 5811 | report an error (exit status 2), since the pattern contains an
|
|---|
| 5812 | encoding error and grep's behavior is not specified for such
|
|---|
| 5813 | patterns. Test only that KWset, DFA, and regex agree.
|
|---|
| 5814 | * tests/prefix-of-multibyte: Add tests for ABCABC and __..._ABCABC___.
|
|---|
| 5815 |
|
|---|
| 5816 | 2014-05-04 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5817 |
|
|---|
| 5818 | dfa: minor simplification
|
|---|
| 5819 | * src/dfa.c (parse_bracket_exp): Use enum, not macro, and move var
|
|---|
| 5820 | to just the scope it's needed.
|
|---|
| 5821 |
|
|---|
| 5822 | grep: simplify and fix problems with KWset-DFA agreement patch
|
|---|
| 5823 | * src/dfa.c (dfambcache, parse_bracket_exp): Simplify.
|
|---|
| 5824 | (mbs_to_wchar, wctok, FETCH_WC, match_anychar, match_mb_charset)
|
|---|
| 5825 | (check_matching_with_multibyte_ops, transit_state_consume_1char)
|
|---|
| 5826 | (transit_state, dfaexec): Use wint_t, not wchar_t, so that
|
|---|
| 5827 | WEOF is treated correctly on platforms where WEOF is not a valid
|
|---|
| 5828 | wchar_t value.
|
|---|
| 5829 | (ctok, lex): Use int, not unsigned int, for characters,
|
|---|
| 5830 | so that EOF is treated more naturally.
|
|---|
| 5831 | (parse_bracket_exp): Use NOTCHAR to mark uninitialized char, since
|
|---|
| 5832 | FETCH_WC can now set the char to EOF.
|
|---|
| 5833 | (lex): Remove unnecessary test for EOF.
|
|---|
| 5834 | (parse_bracket_exp, atom): Swap then and else parts, to put
|
|---|
| 5835 | the small one first; this is more readable here.
|
|---|
| 5836 | * src/searchutils.c (is_mb_middle): Simplify.
|
|---|
| 5837 |
|
|---|
| 5838 | tests: improve coverage for prefix-of-multibyte
|
|---|
| 5839 | * tests/prefix-of-multibyte: Also test the regex version.
|
|---|
| 5840 |
|
|---|
| 5841 | 2014-05-04 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5842 |
|
|---|
| 5843 | grep: make KWset and DFA agree about invalid sequences in patterns
|
|---|
| 5844 | See: http://bugs.gnu.org/17376
|
|---|
| 5845 | * src/dfa.c (dfambcache): Don't cache invalid sequences, because they can't be
|
|---|
| 5846 | represented by wide characters.
|
|---|
| 5847 | (dfambcache, mbs_to_wchar): Return WEOF for invalid sequences.
|
|---|
| 5848 | (ctok): New global variable.
|
|---|
| 5849 | (parse_bracket_exp, atom, match_anychar, match_mb_charset): Don't allow WEOF.
|
|---|
| 5850 | (lex): Set 'ctok'.
|
|---|
| 5851 | * src/kwsearch.c (Fexecute):
|
|---|
| 5852 | * src/searchutils.c (is_mb_middle): Don't check here.
|
|---|
| 5853 | * tests/invalid-multibyte-infloop: Adjust to fixed behavior.
|
|---|
| 5854 | * tests/prefix-of-multibyte: Add test cases for this bug.
|
|---|
| 5855 |
|
|---|
| 5856 | 2014-05-03 Jim Meyering <meyering@fb.com>
|
|---|
| 5857 |
|
|---|
| 5858 | maint: make ChangeLog generation more robust
|
|---|
| 5859 | * Makefile.am (gen-ChangeLog): Sync changes from GNU coreutils,
|
|---|
| 5860 | to ensure exit status is propagated, and to support an optional
|
|---|
| 5861 | git-log-fix file.
|
|---|
| 5862 |
|
|---|
| 5863 | 2014-05-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5864 |
|
|---|
| 5865 | grep: clarify EGexecute slightly
|
|---|
| 5866 | * src/dfasearch.c (EGexecute): Change if-then-else to !if-else-then.
|
|---|
| 5867 |
|
|---|
| 5868 | 2014-05-03 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5869 |
|
|---|
| 5870 | grep: fix the bug in previous patch.
|
|---|
| 5871 | * src/dfasearch.c (EGexecute): Do it.
|
|---|
| 5872 |
|
|---|
| 5873 | 2014-04-30 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5874 |
|
|---|
| 5875 | grep: simplify EGexecute further
|
|---|
| 5876 | * src/dfa.c, src/dfa.h (dfasuperset): Arg is now const pointer.
|
|---|
| 5877 | Now pure.
|
|---|
| 5878 | * src/dfasearch.c (EGexecute): Coalesce some duplicate code.
|
|---|
| 5879 | Don't worry about memrchr returning NULL when that's impossible.
|
|---|
| 5880 |
|
|---|
| 5881 | 2014-04-30 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5882 |
|
|---|
| 5883 | grep: adjust timing back to kwset when dfaisfast is true
|
|---|
| 5884 | * src/dfasearch.c (EGexecute): If DFA fails after kwset succeeds,
|
|---|
| 5885 | the code doesn't return to kwset until it reaches the end of the buffer
|
|---|
| 5886 | or finds a match. Because of this, although some cases speed up,
|
|---|
| 5887 | others slow down.
|
|---|
| 5888 |
|
|---|
| 5889 | Adjust the heuristic for switching to the DFA, so that it
|
|---|
| 5890 | is more likely to switch at the right times.
|
|---|
| 5891 |
|
|---|
| 5892 | 2014-04-30 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5893 |
|
|---|
| 5894 | grep: simplify superset
|
|---|
| 5895 | * src/dfa.h (dfahint): Remove decl.
|
|---|
| 5896 | (dfasuperset): New decl.
|
|---|
| 5897 | * src/dfa.c (dfahint): Remove.
|
|---|
| 5898 | (dfassbuild): Rename from dfasuperset.
|
|---|
| 5899 | (dfasuperset): New function. It returns the superset of D.
|
|---|
| 5900 | * src/dfasearch.c: Use dfasuperset instead of dfahint, and simplify.
|
|---|
| 5901 |
|
|---|
| 5902 | dfa: optimize memory allocation
|
|---|
| 5903 | * src/dfa.c (epsclosure): Get the value of 'visited' from the argument.
|
|---|
| 5904 | (dfaanalyze): Define and allocate variable 'visited'.
|
|---|
| 5905 | (dfastate): Use not 'insert' but 'merge' to insert positions for
|
|---|
| 5906 | state 0 of DFA.
|
|---|
| 5907 |
|
|---|
| 5908 | 2014-04-29 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5909 |
|
|---|
| 5910 | kwset: improve performance by inlining tr
|
|---|
| 5911 | Without this change, older versions of GCC won't inline 'tr', and this
|
|---|
| 5912 | can hurt performance significantly. See: http://bugs.gnu.org/17229#64
|
|---|
| 5913 | * src/kwset.c (tr): Make it inline.
|
|---|
| 5914 |
|
|---|
| 5915 | 2014-04-27 Jim Meyering <meyering@fb.com>
|
|---|
| 5916 |
|
|---|
| 5917 | gnulib: update to latest
|
|---|
| 5918 | * gnulib: This fixes a bug whereby running bootstrap
|
|---|
| 5919 | would remove our build-aux/git-log-fix file.
|
|---|
| 5920 |
|
|---|
| 5921 | 2014-04-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5922 |
|
|---|
| 5923 | kwset: improve performance by inlining more
|
|---|
| 5924 | Problem reported by Norihiro Tanaka in <http://bugs.gnu.org/17229#55>.
|
|---|
| 5925 | * src/kwset.c (bmexec_trans): Rename from bmexec, and make it inline.
|
|---|
| 5926 | (bmexec): New implementation, which calls bmexec_trans. This helps
|
|---|
| 5927 | GCC inline more aggressively with the default optimization, and
|
|---|
| 5928 | improves performance 25% with the reported benchmark on my host.
|
|---|
| 5929 |
|
|---|
| 5930 | 2014-04-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5931 |
|
|---|
| 5932 | kwset: speed up by using memchr2
|
|---|
| 5933 | Idea suggested by Eric Blake in: http://bugs.gnu.org/17229#43
|
|---|
| 5934 | * bootstrap.conf (gnulib_modules): Add memchr2.
|
|---|
| 5935 | * src/kwset.c: Include stdint.h, for uintptr_t. Include memchr2.h.
|
|---|
| 5936 | (struct kwset): New members gc1, gc2, gc1help.
|
|---|
| 5937 | (tr): Move earlier, so it can be used earlier.
|
|---|
| 5938 | (kwsprep): Initialize struct kwset's new members.
|
|---|
| 5939 | (memchr_kwset): Rename from memchr_trans. Combine C and TRANS args into
|
|---|
| 5940 | new arg KWSET. All uses changed. Use memchr2 when appropriate.
|
|---|
| 5941 | (bmexec): Use new members instead of recomputing their values.
|
|---|
| 5942 | Increase advance_heuristic; it's just a guess, but memchr2 probably
|
|---|
| 5943 | makes it reasonable to increase it.
|
|---|
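
The memchr2 idea in the entry above can be sketched as follows. This is a minimal illustration, not the kwset.c code: `find_either` and `next_candidate` are hypothetical names, and a plain loop stands in for gnulib's memchr2 so that the example needs only the standard library. With -i in a unibyte locale, the byte that ends a match can have two spellings (for example 'k' and 'K'), so one scan for either byte replaces a translated byte-by-byte search.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a two-byte memchr: return a pointer to the
   first byte of S[0..N-1] that equals C1 or C2, or NULL if none does.  */
static const char *
find_either (const char *s, char c1, char c2, size_t n)
{
  assert (s != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (s[i] == c1 || s[i] == c2)
      return s + i;
  return NULL;
}

/* Find the next position that could end a case-insensitive match.
   LOWER and UPPER are the two spellings of the key's last byte.  */
static const char *
next_candidate (const char *s, size_t n, char lower, char upper)
{
  if (lower == upper)
    return memchr (s, lower, n);   /* only one spelling, e.g. a digit */
  return find_either (s, lower, upper, n);
}
```

Choosing between the one-byte and two-byte scan once, from values prepared ahead of time, is in the spirit of the entry's "Use new members instead of recomputing their values".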
| 5944 |
|
|---|
| 5945 | kwset: improve performance when large Boyer-Moore key doesn't match
|
|---|
| 5946 | * src/kwset.c (bmexec): As a heuristic, prefer memchr to seeking
|
|---|
| 5947 | by delta1 only when the latter doesn't advance much.
|
|---|
| 5948 |
|
|---|
| 5949 | dfa: fix index bug in previous patch, and simplify
|
|---|
| 5950 | * src/dfa.c, src/dfa.h (dfaisfast): Arg is const pointer.
|
|---|
| 5951 | * src/dfa.c (dfaisfast): Simplify, since supersets never contain BACKREF.
|
|---|
| 5952 | * src/dfa.h (dfaisfast): Declare to be pure.
|
|---|
| 5953 | * src/dfasearch.c (EGexecute): Fix typo that could cause buffer
|
|---|
| 5954 | read overrun when !dfafast. Hoist duplicate computation out
|
|---|
| 5955 | of an if's then and else parts.
|
|---|
| 5956 |
|
|---|
| 5957 | 2014-04-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5958 |
|
|---|
| 5959 | grep: speed up the case where the DFA repeatedly fails after kwset succeeds
|
|---|
| 5960 | A DFA is typically much faster if it is unibyte and does not set BACKREF.
|
|---|
| 5961 | Skip kwset if the DFA is fast. For example:
|
|---|
| 5962 |
|
|---|
| 5963 | yes abcdabc | head -50000000 >k
|
|---|
| 5964 | env LC_ALL=C time -p src/grep -i 'abcd.bd' k
|
|---|
| 5965 |
|
|---|
| 5966 | This improved real-time from 4.86 to 1.34 s.
|
|---|
| 5967 |
|
|---|
| 5968 | * src/dfa.c, src/dfa.h (dfaisfast): New function.
|
|---|
| 5969 | * src/dfasearch.c (EGexecute): Use it.
|
|---|
| 5970 |
|
|---|
| 5971 | 2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5972 |
|
|---|
| 5973 | dfa: fix recently-introduced memory leak
|
|---|
| 5974 | Problem reported by Aharon Robbins in: http://bugs.gnu.org/17341
|
|---|
| 5975 | * src/dfa.c (dfasuperset): free after dfafree.
|
|---|
| 5976 |
|
|---|
| 5977 | misc: fix doc and test bugs re grep -z
|
|---|
| 5978 | Problem reported by Stephane Chazelas in: http://bugs.gnu.org/16871
|
|---|
| 5979 | * doc/grep.texi (Usage): Remove incorrect example with -P.
|
|---|
| 5980 | * tests/pcre: Improve test so that it actually tests whether \s
|
|---|
| 5981 | matches a newline.
|
|---|
| 5982 |
|
|---|
| 5983 | dfa: minor simplification of dfaexec
|
|---|
| 5984 | * src/dfa.c (dfaexec): Streamline updating of returned values.
|
|---|
| 5985 | Don't bother to check d->multibyte before updating mbp.
|
|---|
| 5986 | Avoid duplicate p > end test.
|
|---|
| 5987 |
|
|---|
| 5988 | 2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 5989 |
|
|---|
| 5990 | dfa: simplify and be more consistent about MB_CUR_MAX
|
|---|
| 5991 | * src/dfa.c (struct dfa): New member 'multibyte',
|
|---|
| 5992 | replacing 'mb_cur_max'. All uses changed. Use this new member
|
|---|
| 5993 | consistently, instead of sometimes referring to MB_CUR_MAX directly.
|
|---|
| 5994 |
|
|---|
| 5995 | dfa: fix comment
|
|---|
| 5996 | * src/dfa.c (maybe_realloc): Fix comment to match behavior better.
|
|---|
| 5997 |
|
|---|
| 5998 | 2014-04-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 5999 |
|
|---|
| 6000 | grep: skip the multibyte character boundary check on reaching eolbyte
|
|---|
| 6001 | * src/dfa.c (dfaexec): Skip the multibyte character boundary check
|
|---|
| 6002 | on reaching eolbyte.
|
|---|
| 6003 |
|
|---|
| 6004 | 2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6005 |
|
|---|
| 6006 | dfa: fix incorrect comment that led to heap overrun
|
|---|
| 6007 | * src/dfa.c (maybe_realloc): Fix comment to match behavior.
|
|---|
| 6008 |
|
|---|
| 6009 | dfa: minor tuneup of dfamust memory savings patch
|
|---|
| 6010 | * src/dfa.c (allocmust): Use xmalloc, not xzalloc.
|
|---|
| 6011 | Initialize the must completely, so that the caller need not
|
|---|
| 6012 | invoke resetmust. All callers changed.
|
|---|
| 6013 | (dfamust): Omit asserts that aren't needed on typical machines
|
|---|
| 6014 | where dereferencing NULL dumps core. Don't leak memory if the
|
|---|
| 6015 | pattern contains a NUL byte.
|
|---|
| 6016 |
|
|---|
| 6017 | 2014-04-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6018 |
|
|---|
| 6019 | grep: avoid wasting memory for large patterns in dfamust
|
|---|
| 6020 | * src/dfa.c (struct must): New member 'prev'. It points to the
|
|---|
| 6021 | previous must.
|
|---|
| 6022 | (allocmust): New function.
|
|---|
| 6023 | (freemust): New function.
|
|---|
| 6024 | (dfamust): Use it.
|
|---|
| 6025 |
|
|---|
| 6026 | 2014-04-24 Jim Meyering <meyering@fb.com>
|
|---|
| 6027 |
|
|---|
| 6028 | grep: fix new heap write buffer overrun
|
|---|
| 6029 | * src/dfa.c (parse_bracket_exp): Fix off-by-one allocation error.
|
|---|
| 6030 | Exposed by running the tests with an ASAN-enabled binary (i.e.,
|
|---|
| 6031 | created using gcc's -fsanitize=address option). Introduced by
|
|---|
| 6032 | commit v2.18-70-gd3d9612, "dfa: simplify range char allocation".
|
|---|
| 6033 |
|
|---|
| 6034 | 2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6035 |
|
|---|
| 6036 | build: suppress unsafe-loop-optimizations warnings
|
|---|
| 6037 | I ran into one of these while trying out GCC 4.9.0's new
|
|---|
| 6038 | -fsanitize=undefined option. The warning told me that GCC didn't
|
|---|
| 6039 | do an unsafe optimization, but in 'grep' this is not typically a
|
|---|
| 6040 | symptom of a programming error.
|
|---|
| 6041 | * configure.ac (WERROR_CFLAGS): Suppress -Wunsafe-loop-optimizations.
|
|---|
| 6042 |
|
|---|
| 6043 | 2014-04-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6044 |
|
|---|
| 6045 | dfa: fix memory leak reintroduced by previous patch
|
|---|
| 6046 | Reported by Norihiro Tanaka in <http://bugs.gnu.org/17328#16>.
|
|---|
| 6047 | * src/dfa.c (dfaexec): Allocate mb_match_lens and mb_follows only
|
|---|
| 6048 | if not already allocated.
|
|---|
| 6049 | (free_mbdata): Null out mb_match_lens to mark it as being freed.
|
|---|
| 6050 |
|
|---|
| 6051 | 2014-04-23 Jim Meyering <meyering@fb.com>
|
|---|
| 6052 |
|
|---|
| 6053 | tests: use consistent spelling for locale name, en_US.UTF-8
|
|---|
| 6054 | * tests/pcre-infloop: Spell locale name, en_US.UTF-8, consistently,
|
|---|
| 6055 | converting this one use from "en_US.utf8", which would provoke a
|
|---|
| 6056 | test failure on OS/X.
|
|---|
| 6057 |
|
|---|
| 6058 | 2014-04-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6059 |
|
|---|
| 6060 | dfa: omit static variables that limited dfaexec to one struct dfa
|
|---|
| 6061 | Problem reported by Aharon Robbins in: http://bugs.gnu.org/17328
|
|---|
| 6062 | * src/dfa.c (struct dfa): New member mbs.
|
|---|
| 6063 | mb_follows is now a position_set, not a pointer to one;
|
|---|
| 6064 | this simplifies memory allocation. All uses changed.
|
|---|
| 6065 | (mbs_to_wchar): Put DFA arg at the end, in place of the mbstate_t *arg,
|
|---|
| 6066 | since the DFA now contains an mbstate_t. All uses changed.
|
|---|
| 6067 | (mbs): Remove static variable.
|
|---|
| 6068 | (dfaexec): Remove static bool that attempted to optimize memory
|
|---|
| 6069 | allocation, as this wasn't correct for Gawk. Perhaps we can think
|
|---|
| 6070 | of a better way to optimize memory.
|
|---|
| 6071 |
|
|---|
| 6072 | 2014-04-22 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6073 |
|
|---|
| 6074 | kwset: simplify and speed up Boyer-Moore unibyte -i in some cases
|
|---|
| 6075 | This improves the performance of, for example,
|
|---|
| 6076 | yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 | grep -i jk
|
|---|
| 6077 | in a unibyte locale.
|
|---|
| 6078 | * src/kwset.c (memchr_trans): New function.
|
|---|
| 6079 | (bmexec): Use it. Simplify the code and remove some of the
|
|---|
| 6080 | confusing gotos and breaks and labels. Do not treat glibc memchr
|
|---|
| 6081 | as a special case; if non-glibc memchr is slow, that is lower
|
|---|
| 6082 | priority and I suppose we can try to work around the problem in
|
|---|
| 6083 | gnulib.
|
|---|
| 6084 |
|
|---|
| 6085 | 2014-04-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6086 |
|
|---|
| 6087 | grep: speed-up by using memchr() in Boyer-Moore searching
|
|---|
| 6088 | memchr() of glibc is faster than seeking by delta1 on some platforms.
|
|---|
| 6089 | When no match has been possible for a while, use memchr() on those platforms.
|
|---|
| 6090 | * src/kwset.c (bmexec): Use memchr() in Boyer-Moore searching.
|
|---|
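
A rough sketch of the technique named in the entry above, under stated assumptions: `horspool_memchr` is a hypothetical function, it uses the simpler Horspool variant of Boyer-Moore rather than kwset's code, and it calls memchr() on every mismatch of the last byte, whereas the entry describes doing so only when no match has been possible for a while.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of PAT (length M > 0) in
   TEXT (length N), or (size_t) -1 if there is none.  */
static size_t
horspool_memchr (const char *text, size_t n, const char *pat, size_t m)
{
  size_t delta1[256];
  assert (m > 0);
  if (n < m)
    return (size_t) -1;

  /* delta1[c]: how far to shift when C is aligned with PAT's last byte.  */
  for (size_t i = 0; i < 256; i++)
    delta1[i] = m;
  for (size_t i = 0; i + 1 < m; i++)
    delta1[(unsigned char) pat[i]] = m - 1 - i;

  unsigned char last = (unsigned char) pat[m - 1];
  size_t end = m - 1;   /* index in TEXT aligned with PAT's last byte */
  while (end < n)
    {
      if ((unsigned char) text[end] != last)
        {
          /* No match can end before the next occurrence of LAST,
             so let memchr jump straight to it.  */
          const char *p = memchr (text + end + 1, last, n - end - 1);
          if (!p)
            return (size_t) -1;
          end = (size_t) (p - text);
        }
      if (memcmp (text + end - (m - 1), pat, m) == 0)
        return end - (m - 1);
      end += delta1[(unsigned char) text[end]];
    }
  return (size_t) -1;
}
```

The jump is safe because a match can only end at a byte equal to the pattern's last byte, so every position that memchr skips is one where no match ends.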
| 6091 |
|
|---|
| 6092 | 2014-04-22 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6093 |
|
|---|
| 6094 | kwset: simplify Boyer-Moore with unibyte -i
|
|---|
| 6095 | This change doesn't significantly affect performance on my platform,
|
|---|
| 6096 | and should make the code easier to maintain.
|
|---|
| 6097 | * src/kwset.c (BM_DELTA2_SEARCH, LAST_SHIFT, TRANS):
|
|---|
| 6098 | Remove these macros, in favor of ...
|
|---|
| 6099 | (tr, bm_delta2_search): New functions. All uses changed.
|
|---|
| 6100 | The latter function is inline because this improves code size and
|
|---|
| 6101 | runtime CPU slightly on x86-64 with gcc -O2 (GCC 4.9.0).
|
|---|
| 6102 | (bmexec): Prefer tr when that's simpler.
|
|---|
| 6103 |
|
|---|
| 6104 | 2014-04-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6105 |
|
|---|
| 6106 | grep: may also use Boyer-Moore algorithm for case-insensitive matching
|
|---|
| 6107 | * src/kwset.c (BM_DELTA2_SEARCH, LAST_SHIFT, TRANS): New macros.
|
|---|
| 6108 | (bmexec): Use character translation table.
|
|---|
| 6109 | (kwsexec): Call bmexec for case-insensitive matching.
|
|---|
| 6110 | (kwsprep): Change the `if' condition.
|
|---|
| 6111 |
|
|---|
| 6112 | 2014-04-21 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6113 |
|
|---|
| 6114 | grep: -P now rejects invalid input sequences in UTF-8 locales
|
|---|
| 6115 | See <http://bugs.gnu.org/17245> and <http://bugs.exim.org/1468>.
|
|---|
| 6116 | * NEWS: Document this.
|
|---|
| 6117 | * src/pcresearch.c (Pexecute): Do not use PCRE_NO_UTF8_CHECK,
|
|---|
| 6118 | as this leads to undefined behavior when the input is not UTF-8.
|
|---|
| 6119 | * tests/pcre-infloop, tests/pcre-invalid-utf8-input:
|
|---|
| 6120 | Exit status is now 2, not 1, when grep -P is given invalid UTF-8
|
|---|
| 6121 | data in a UTF-8 locale.
|
|---|
| 6122 |
|
|---|
| 6123 | dfa: minor improvements to previous patch
|
|---|
| 6124 | * src/dfa.c (dfamust): Use &=, not if-then.
|
|---|
| 6125 | * src/dfa.h (struct dfamust):
|
|---|
| 6126 | * src/dfasearch.c (begline, kwsmusts): Use bool for boolean.
|
|---|
| 6127 | * src/dfasearch.c (kwsmusts):
|
|---|
| 6128 | * src/kwsearch.c (Fcompile): Prefer decls after statements.
|
|---|
| 6129 | * src/dfasearch.c (kwsmusts): Avoid conditional branch.
|
|---|
| 6130 | * src/kwsearch.c (Fcompile): Unify the two calls to kwsincr.
|
|---|
| 6131 |
|
|---|
| 6132 | 2014-04-21 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6133 |
|
|---|
| 6134 | grep: speed-up for exact matching with begline and endline constraints.
|
|---|
| 6135 | dfamust turns on the flag when a state exactly matches the proposed one.
|
|---|
| 6136 | However, when the state has begline and/or endline constraints, it turns
|
|---|
| 6137 | the flag off.
|
|---|
| 6138 |
|
|---|
| 6139 | This patch makes it possible to match a state exactly, even if the state has
|
|---|
| 6140 | begline and/or endline constraints. If an exact string has one of these
|
|---|
| 6141 | constraints, the string with eolbyte added at its head and/or foot is passed
|
|---|
| 6142 | to kwsincr(). In addition, if it has a begline constraint, searching starts
|
|---|
| 6143 | from just before the position of the text.
|
|---|
| 6144 |
|
|---|
| 6145 | * src/dfa.c (variable must): New members `begline' and `endline'.
|
|---|
| 6146 | (dfamust): Take begline and endline constraints into account.
|
|---|
| 6147 | * src/dfa.h (struct dfamust): New members `begline' and `endline'.
|
|---|
| 6148 | * src/dfasearch.c (kwsmusts): If an exact string has a begline constraint,
|
|---|
| 6149 | start searching from just before the position of the text.
|
|---|
| 6150 | (EGexecute): Same as above.
|
|---|
| 6151 | * src/kwsearch.c (Fexecute): Same as above.
|
|---|
| 6152 |
|
|---|
| 6153 | 2014-04-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6154 |
|
|---|
| 6155 | dfa: fix bug that caused NUL to be mishandled in patterns
|
|---|
| 6156 | This bug was introduced in the early-2012 patches that fixed some
|
|---|
| 6157 | context-handling bugs. Bisecting found commit
|
|---|
| 6158 | d8951d3f4e1bbd564809aa8e713d8333bda2f802 (2012-02-05 18:00:43 +0100),
|
|---|
| 6159 | but it appears the underlying problem was introduced in commit
|
|---|
| 6160 | 8b47c4cf6556933f59226c234b0fe984f6c77dc7 (2012-01-03 11:22:09 +0100).
|
|---|
| 6161 | * NEWS: Mention bug fix.
|
|---|
| 6162 | * src/dfa.c (char_context): Consider NUL to be a newline only if -z.
|
|---|
| 6163 | * tests/Makefile.am (TESTS): Add null-byte.
|
|---|
| 6164 | * tests/null-byte: New file.
|
|---|
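
The rule stated for char_context in the entry above can be sketched like this. It is only an illustration: the enum names and the explicit `eolbyte` parameter are assumptions made so that the example is self-contained, and they are not claimed to match dfa.c. The point is that a byte counts as a newline only when it equals the end-of-line byte, which is NUL only under -z.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical context codes for this sketch.  */
enum { CONTEXT_NONE = 1, CONTEXT_LETTER = 2, CONTEXT_NEWLINE = 4 };

/* Classify byte C.  EOLBYTE is '\n' normally and '\0' with -z.
   NUL gets no special treatment unless it is the end-of-line byte.  */
static int
char_context_sketch (unsigned char c, unsigned char eolbyte)
{
  assert (eolbyte == '\n' || eolbyte == '\0');
  if (c == eolbyte)
    return CONTEXT_NEWLINE;
  if (isalnum (c) || c == '_')
    return CONTEXT_LETTER;
  return CONTEXT_NONE;
}
```

Without -z, a NUL in a pattern or in the input is then an ordinary non-word byte, so it no longer satisfies line-boundary constraints.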
| 6165 |
|
|---|
| 6166 | 2014-04-19 Jim Meyering <meyering@fb.com>
|
|---|
| 6167 |
|
|---|
| 6168 | build: reenable some compiler warning options
|
|---|
| 6169 |
|
|---|
| 6170 | 2014-04-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6171 |
|
|---|
| 6172 | dfa: fix pointer type conversion bug
|
|---|
| 6173 | The code converted between size_t * and ptrdiff_t *, which wasn't
|
|---|
| 6174 | diagnosed by modern x86-64 GCC but isn't portable. Problem
|
|---|
| 6175 | reported by Norihiro Tanaka in <http://bugs.gnu.org/17136#31>.
|
|---|
| 6176 | * configure.ac (WERROR_CFLAGS): Don't add -Wno-pointer-sign.
|
|---|
| 6177 | We want GCC to diagnose pointer signedness problems, as they
|
|---|
| 6178 | violate the C standard and other compilers no doubt complain too.
|
|---|
| 6179 | * src/dfa.c (struct dfa): Change type of salloc to size_t.
|
|---|
| 6180 | (realloc_trans_if_necessary): Convert signed value to size_t before
|
|---|
| 6181 | passing its address to x2nrealloc. Changing the type of tralloc
|
|---|
| 6182 | to size_t might have led to problems elsewhere.
|
|---|
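
The fix described above, converting the signed value rather than the pointer, might look roughly like the following. The names `grow_array` and `grow_ints` are hypothetical, and `grow_array` is a simplified stand-in for a doubling allocator: unlike gnulib's x2nrealloc, it returns NULL on failure instead of exiting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical doubling allocator: grow P, an array of *PN elements of
   size S, updating *PN.  Return NULL on overflow or allocation failure.  */
static void *
grow_array (void *p, size_t *pn, size_t s)
{
  size_t n = *pn;
  if (n == 0)
    n = 16;
  else if (n > SIZE_MAX / 2)
    return NULL;
  else
    n *= 2;
  if (s != 0 && n > SIZE_MAX / s)
    return NULL;                /* n * s would overflow */
  void *q = realloc (p, n * s);
  if (q)
    *pn = n;
  return q;
}

/* The caller keeps its count in a signed ptrdiff_t.  Casting ALLOC to
   size_t * would not be portable, so copy the value into a size_t,
   pass that object's address, and copy the result back.  */
static int *
grow_ints (int *table, ptrdiff_t *alloc)
{
  assert (*alloc >= 0);
  size_t n = (size_t) *alloc;
  if (n > PTRDIFF_MAX / 2)
    return NULL;                /* doubled count must fit in ptrdiff_t */
  int *q = grow_array (table, &n, sizeof *table);
  if (q)
    *alloc = (ptrdiff_t) n;
  return q;
}
```

Keeping the caller's variable signed and converting at the call site leaves the rest of the code unchanged, which matches the entry's remark that changing the type of tralloc might have caused problems elsewhere.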
| 6183 |
|
|---|
| 6184 | 2014-04-18 Jim Meyering <meyering@fb.com>
|
|---|
| 6185 |
|
|---|
| 6186 | maint: Revert "dfa: avoid new NULL dereference"
|
|---|
| 6187 | This reverts commit 5190041fe515743ef4545abf287d243bc025c701.
|
|---|
| 6188 | It was only a bug if one neglected to update to the latest gnulib.
|
|---|
| 6189 | With the newer x2nrealloc, there is no problem.
|
|---|
| 6190 |
|
|---|
| 6191 | dfa: avoid new NULL dereference
|
|---|
| 6192 | * src/dfa.c (dfa_charclass_index): Restore a "+ 1" mistakenly omitted
|
|---|
| 6193 | during recent improvements. Introduced in v2.18-66-g6a60fd5.
|
|---|
| 6194 |
|
|---|
| 6195 | 2014-04-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6196 |
|
|---|
| 6197 | dfa: minor cleanup
|
|---|
| 6198 | * src/dfa.c (MAX): Remove; no longer used.
|
|---|
| 6199 |
|
|---|
| 6200 | 2014-04-17 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6201 |
|
|---|
| 6202 | dfa: speed up by checking multibyte characters on demand
|
|---|
| 6203 | If dfaexec() runs in non-UTF8 locales, length and wide character
|
|---|
| 6204 | representation are checked for all characters of a line in an input
|
|---|
| 6205 | string. However, if matched early in the line, results for remaining
|
|---|
| 6206 | characters are wasted.
|
|---|
| 6207 |
|
|---|
| 6208 | This patch checks multibyte characters on demand. It should work
|
|---|
| 6209 | faster for early matches, and reduces memory requirements.
|
|---|
| 6210 |
|
|---|
| 6211 | * src/dfa.c (struct dfa): Remove members mblen_buf, nmblen_buf,
|
|---|
| 6212 | inputwcs, ninputwcs. All uses removed.
|
|---|
| 6213 | (buf_begin, buf_end, prepare_wc_buf): Remove. All uses removed.
|
|---|
| 6214 | (SKIP_REMAINS_MB_IF_INITIAL_STATE): Remove. This is now expanded
|
|---|
| 6215 | when used.
|
|---|
| 6216 | (match_anychar, match_mb_charset, check_matching_with_multibyte_ops):
|
|---|
| 6217 | New arg wc, mbclen. Remove arg idx. All uses changed.
|
|---|
| 6218 | (transit_state_consume_1char): New arg wc. All uses changed.
|
|---|
| 6219 | (transit_state): New arg 'end'. All uses changed.
|
|---|
| 6220 |
|
|---|
| 6221 | 2014-04-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6222 |
|
|---|
| 6223 | dfa: trans reallocation microoptimization
|
|---|
| 6224 | * src/dfa.c (realloc_trans_if_necessary):
|
|---|
| 6225 | Help the compiler avoid unnecessary reloads.
|
|---|
| 6226 |
|
|---|
| 6227 | dfa: simplify dfamust initialization
|
|---|
| 6228 | * src/dfa.c (dfamust): Don't initialize musts twice.
|
|---|
| 6229 | Use zcalloc, not xmalloc followed by zeroing.
|
|---|
| 6230 | Make result a const pointer.
|
|---|
| 6231 |
|
|---|
| 6232 | dfa: simplify freelist
|
|---|
| 6233 | * src/dfa.c (freelist): Don't null out array while freeing its
|
|---|
| 6234 | pointers; the caller can do that if needed.
|
|---|
| 6235 | (resetmust): Null out zeroth entry of array.
|
|---|
| 6236 |
|
|---|
| 6237 | dfa: avoid duplicate strlen when allocating memory
|
|---|
| 6238 | * src/dfa.c (dfamust): Use xstrdup, not strlen (twice) + xmemdup.
|
|---|
| 6239 |
|
|---|
| 6240 | dfa: simplify memory allocation
|
|---|
| 6241 | * src/dfa.c (icatalloc, freelist, enlist, comsubs, addlists, inboth)
|
|---|
| 6242 | (dfamust): Don't worry about null arguments or results,
|
|---|
| 6243 | as memory allocators no longer can return null pointers.
|
|---|
| 6244 | (dfamust): Invoke malloc just once when building a concatenated string.
|
|---|
| 6245 |
|
|---|
| 6246 | dfa: simplify position set and element count allocation
|
|---|
| 6247 | * src/dfa.c (dfaanalyze): Allocate position set info all in one go,
|
|---|
| 6248 | and similarly for element count info.
|
|---|
| 6249 |
|
|---|
| 6250 | dfa: simplify multibyte_prop allocation
|
|---|
| 6251 | * src/dfa.c (struct dfa): Simplify by removing nmultibyte_prop;
|
|---|
| 6252 | it should always be the same as talloc. All uses changed.
|
|---|
| 6253 |
|
|---|
| 6254 | dfa: simplify range char allocation
|
|---|
| 6255 | * src/dfa.c (struct dfa): Simplify by allocating one array of ranges
|
|---|
| 6256 | rather than one for range starts and another for range ends.
|
|---|
| 6257 | All uses changed.
|
|---|
| 6258 |
|
|---|
| 6259 | dfa: simplify transition table allocation
|
|---|
| 6260 | * src/dfa.c (struct dfa): Remove member 'realtrans', as it can
|
|---|
| 6261 | be computed from 'trans'. All uses changed.
|
|---|
| 6262 | (realloc_trans_if_necessary): Move earlier, to avoid a forward decl.
|
|---|
| 6263 | Use x2nrealloc to compute new size, rather than doing it by hand,
|
|---|
| 6264 | which omits a check for unlikely overflow.
|
|---|
| 6265 | (realloc_trans_if_necessary, dfafree): Adjust to the fact that
|
|---|
| 6266 | d->trans now might be either NULL, or 1 + the pointer to free.
|
|---|
| 6267 | (build_state, build_state_zero): Use realloc_trans_if_necessary
|
|---|
| 6268 | instead of duplicating its code.
|
|---|
| 6269 |
|
|---|
| 6270 | dfa: better size-overflow check
|
|---|
| 6271 | * src/dfa.c (dfasuperset): Let xnmalloc do the multiplication,
|
|---|
| 6272 | to check for size arithmetic overflow better.
|
|---|
| 6273 |
|
|---|
| 6274 | dfa: avoid unnecessary work and other initialization
|
|---|
| 6275 | * src/dfa.c (dfaanalyze, dfainit):
|
|---|
| 6276 | Don't bother allocating when x2nrealloc will do it for us.
|
|---|
| 6277 | (dfastate): Allocate grps and labels on the stack, as their
|
|---|
| 6278 | size is known at compile time.
|
|---|
| 6279 | (build_state): Use xmalloc, not xnmalloc, since the multiplication
|
|---|
| 6280 | can be done at compile-time.
|
|---|
| 6281 |
|
|---|
| 6282 | dfa: clarify memory allocation and port to IRIX
|
|---|
| 6283 | This change was prompted by a porting problem:
|
|---|
| 6284 | IRIX defines its own MALLOC macro, which clashes with ours.
|
|---|
| 6285 | More generally, the MALLOC etc. macros are confusing, as they
|
|---|
| 6286 | look like functions but do not have C-function semantics.
|
|---|
| 6287 | A functional style makes the code easier to read, and though
|
|---|
| 6288 | it lengthens the code a bit here it'll make other
|
|---|
| 6289 | simplifications easier.
|
|---|
| 6290 | * src/dfa.c (XNMALLOC, XCALLOC, CALLOC, MALLOC, REALLOC): Remove.
|
|---|
| 6291 | All uses replaced by xnmalloc etc.
|
|---|
| 6292 | (REALLOC_IF_NECESSARY): Remove; all uses replaced by ....
|
|---|
| 6293 | (maybe_realloc): New function.
|
|---|
| 6294 | (copy, merge): Free and allocate rather than realloc, as we
|
|---|
| 6295 | needn't save the contents.
|
|---|
| 6296 |
|
|---|
| 6297 | 2014-04-14 Jim Meyering <meyering@fb.com>
|
|---|
| 6298 |
|
|---|
| 6299 | tests: detect an infloop-inducing bug in grep -P (pcre-8.35)
|
|---|
| 6300 | * tests/pcre-infloop: New test.
|
|---|
| 6301 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 6302 |
|
|---|
| 6303 | 2014-04-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6304 |
|
|---|
| 6305 | build: update gnulib submodule to latest
|
|---|
| 6306 |
|
|---|
| 6307 | 2014-04-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6308 |
|
|---|
| 6309 | grep: improvements for the open-CSET patch
|
|---|
| 6310 | * src/dfa.c (dfamust): Simplify by removing some duplicate code.
|
|---|
| 6311 | Optimize patterns like [aaa] even when not case-folding.
|
|---|
| 6312 | Avoid an unnecessary copy of the charclass.
|
|---|
| 6313 |
|
|---|
| 6314 | 2014-04-11 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6315 |
|
|---|
| 6316 | grep: open CSET and transform into uppercase when MB_CUR_MAX == 1
|
|---|
| 6317 | In unibyte locales with -i, kwset matching isn't helpful, because
|
|---|
| 6318 | dfamust doesn't extract the CSET entries. Fix dfamust so that it
|
|---|
| 6319 | does that, and makes it possible to take out a longer fixed string
|
|---|
| 6320 | from tokens.
|
|---|
| 6321 | * src/dfa.c (dfamust): open CSET and transform into uppercase
|
|---|
| 6322 | when MB_CUR_MAX == 1.
|
|---|
| 6323 |
|
|---|
| 6324 | 2014-04-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6325 |
|
|---|
| 6326 | grep: cleanup for HAVE_DOS_FILE_CONTENTS issue
|
|---|
| 6327 | While cleaning up the empty-string fix, I noticed that one part of
|
|---|
| 6328 | the code worried about CRLF in pattern files whereas another part
|
|---|
| 6329 | did not. Fix this by using the same approach in both places,
|
|---|
| 6330 | and make the CRLF code more modular in the process.
|
|---|
| 6331 | * src/dosbuf.c (dos_binary, dos_unix_byte_offsets): New functions.
|
|---|
| 6332 | (undossify_input, dossified_pos): Do nothing if ! O_BINARY.
|
|---|
| 6333 | * src/grep.c: Always include dosbuf.c so that the code is
|
|---|
| 6334 | checked statically even on non-DOS hosts.
|
|---|
| 6335 | (dos_binary, dos_unix_byte_offsets): New decls.
|
|---|
| 6336 | (undossify_input): Declare unconditionally.
|
|---|
| 6337 | * src/grep.c (fillbuf, print_line_head, main):
|
|---|
| 6338 | * src/kwsearch.c (Fcompile):
|
|---|
| 6339 | Simplify by not worrying about HAVE_DOS_FILE_CONTENTS.
|
|---|
| 6340 | * src/grep.c (main): fopen with "rt" if O_TEXT; this is simpler
|
|---|
| 6341 | than worrying about HAVE_DOS_FILE_CONTENTS elsewhere.
|
|---|
| 6342 | * src/system.h (HAVE_DOS_FILE_CONTENTS): Remove.
|
|---|
| 6343 |
|
|---|
| 6344 | grep: cleanup for empty-string fix
|
|---|
| 6345 | * NEWS: Document it.
|
|---|
| 6346 | * src/dfasearch.c (GEAcompile):
|
|---|
| 6347 | * src/kwsearch.c (Fcompile):
|
|---|
| 6348 | Use C99-style decls to simplify. Avoid duplicate code.
|
|---|
| 6349 | * tests/empty-line: Add some more tests like this.
|
|---|
| 6350 |
|
|---|
| 6351 | 2014-04-11 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6352 |
|
|---|
| 6353 | grep: no match for the empty string included in multiple patterns
|
|---|
| 6354 | * src/dfasearch.c (GEAcompile): Fix it.
|
|---|
| 6355 | * src/kwsearch.c (Fcompile): Fix it.
|
|---|
| 6356 |
|
|---|
| 6357 | 2014-04-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6358 |
|
|---|
| 6359 | grep: remove bool_bf
|
|---|
| 6360 | The extra complexity of this microoptimization wasn't ever much help,
|
|---|
| 6361 | and currently it generates bigger code with gcc -O2 (x86-64).
|
|---|
| 6362 | * src/dfa.c (bool_bf): Remove. All uses replaced by plain 'bool',
|
|---|
| 6363 | without a bitfield.
|
|---|
| 6364 |
|
|---|
| 6365 | 2014-04-08 Jim Meyering <meyering@fb.com>
|
|---|
| 6366 |
|
|---|
| 6367 | maint: avoid sc_po_check syntax-check failure (kwset.c)
|
|---|
| 6368 | * po/POTFILES.in: Remove kwset.c from this list, since it
|
|---|
| 6369 | no longer contains a translatable diagnostic.
|
|---|
| 6370 |
|
|---|
| 6371 | 2014-04-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6372 |
|
|---|
| 6373 | grep: port better to hosts with nonstandard nl_langinfo
|
|---|
| 6374 | On some hosts, nl_langinfo returns strings other than "UTF-8" when
|
|---|
| 6375 | UTF-8 is used, and (worse) returns "UTF-8" even if the encoding is
|
|---|
| 6376 | single-byte. Work around these problems by trying a sample
|
|---|
| 6377 | character instead.
|
|---|
| 6378 | * src/dfa.c, src/pcresearch.c, src/searchutils.c:
|
|---|
| 6379 | Don't include <langinfo.h>.
|
|---|
| 6380 | * src/dfa.c (using_utf8): Test for UTF-8 by trying a character
|
|---|
| 6381 | rather than by invoking nl_langinfo (CODESET); this is more
|
|---|
| 6382 | portable in practice, and removes a dependency on
|
|---|
| 6383 | HAVE_LANGINFO_CODESET.
|
|---|
| 6384 | * src/pcresearch.c: Include dfa.h, for using_utf8.
|
|---|
| 6385 | (Pcompile): Use using_utf8 rather than nl_langinfo.
|
|---|
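
Testing for UTF-8 "by trying a character", as the entry above puts it, can be sketched as below. The function name `locale_decodes_utf8` and the choice of U+0100 (bytes C4 80) as the sample character are assumptions for this example, and the comparison against 0x100 assumes that wchar_t holds Unicode code points.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Return true if the current LC_CTYPE locale decodes the two bytes
   C4 80 as the single character U+0100, as a UTF-8 locale does.  */
static bool
locale_decodes_utf8 (void)
{
  wchar_t wc;
  mbstate_t mbs;
  memset (&mbs, 0, sizeof mbs);
  assert (mbsinit (&mbs));      /* a zeroed state is an initial state */
  return mbrtowc (&wc, "\xc4\x80", 2, &mbs) == 2 && wc == 0x100;
}
```

A program would normally call setlocale (LC_ALL, "") before this check. The result depends only on how the locale actually decodes bytes, not on the codeset name it reports, which is the portability gain the entry describes.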
| 6386 |
|
|---|
| 6387 | 2014-04-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6388 |
|
|---|
| 6389 | grep: prefer bool in DFA internals
|
|---|
| 6390 | * src/dfa.c (bool_bf): New type.
|
|---|
| 6391 | (dfa_state): Use it, as this seems to generate slightly better
|
|---|
| 6392 | code with GCC.
|
|---|
| 6393 | (struct mb_char_classes, struct dfa, equal, case_fold, dfasyntax)
|
|---|
| 6394 | (laststart, parse_bracket_exp, lex, dfaparse, dfaanalyze, dfastate)
|
|---|
| 6395 | (match_mb_charset, dfamust):
|
|---|
| 6396 | Use bool for boolean.
|
|---|
| 6397 | (using_utf8) [!HAVE_LANGINFO_CODESET]: Tune.
|
|---|
| 6398 | (dfaanalyze): Prefer & to && and | to || on booleans; it's simpler here.
|
|---|
| 6399 | (dfastate): Simplify charclass nonzero testing. Redo has_mbcset
|
|---|
| 6400 | test so that the compiler's more likely to optimize it.
|
|---|
| 6401 |
|
|---|
| 6402 | 2014-04-07 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6403 |
|
|---|
| 6404 | grep: prefer regex to DFA for ANYCHAR in multibyte locales
|
|---|
| 6405 | * src/dfa.c (dfa_state): New member has_mbcset.
|
|---|
| 6406 | Rename backref to has_backref, and make it of type bool too.
|
|---|
| 6407 | All uses changed.
|
|---|
| 6408 | (state_index, dfastate): Initialize new member.
|
|---|
| 6409 | (dfaexec): Prefer regex to DFA for ANYCHAR in multibyte locales.
|
|---|
| 6410 |
|
|---|
| 6411 | 2014-04-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6412 |
|
|---|
| 6413 | grep: remove trivial_case_ignore
|
|---|
| 6414 | This optimization is no longer needed, given the other
|
|---|
| 6415 | optimizations recently installed. Derived from a patch by
|
|---|
| 6416 | Norihiro Tanaka; see <http://bugs.gnu.org/17019>.
|
|---|
| 6417 | * bootstrap.conf (gnulib_modules): Remove assert-h.
|
|---|
| 6418 | * src/dfa.c (CASE_FOLDED_BUFSIZE): Move here from dfa.h.
|
|---|
| 6419 | Remove now-unnecessary static assert.
|
|---|
| 6420 | (case_folded_counterparts): Now static.
|
|---|
| 6421 | * src/dfa.h (CASE_FOLDED_BUFSIZE, case_folded_counterparts):
|
|---|
| 6422 | Remove decls; no longer public.
|
|---|
| 6423 | * src/dfasearch.c (kwsmusts): Use kwset even if MB_CUR_MAX > 1
|
|---|
| 6424 | and case-insensitive.
|
|---|
| 6425 | * src/grep.c (MBRTOWC, WCRTOMB): Remove.
|
|---|
| 6426 | (fgrep_to_grep_pattern): Use mbrtowc, not MBRTOWC.
|
|---|
| 6427 | (trivial_case_ignore): Remove; this optimization is no longer needed.
|
|---|
| 6428 | All uses removed.
|
|---|
| 6429 |
|
|---|
| 6430 | grep: simplify memory allocation in kwset
|
|---|
| 6431 | * src/kwset.c: Include kwset.h first, to check its prereqs.
|
|---|
| 6432 | Include xalloc.h, for xmalloc.
|
|---|
| 6433 | (kwsalloc): Use xmalloc, not malloc, so that the caller need not
|
|---|
| 6434 | worry about memory allocation failure.
|
|---|
| 6435 | (kwsalloc, kwsincr, kwsprep): Do not worry about obstack_alloc
|
|---|
| 6436 | returning NULL, as that's not possible.
|
|---|
| 6437 | (kwsalloc, kwsincr, kwsprep, bmexec, cwexec, kwsexec, kwsfree):
|
|---|
| 6438 | Omit unnecessary conversion between struct kwset * and kwset_t.
|
|---|
| 6439 | (kwsincr, kwsprep): Return void since memory-allocation failure is
|
|---|
| 6440 | not possible now. All uses changed.
|
|---|
| 6441 | * src/kwset.h: Include <stddef.h>, for size_t, so that this
|
|---|
| 6442 | include file doesn't require other files to be included first.
|
|---|
| 6443 |
|
|---|
| 6444 | grep: minor cleanups for Galil speedups
|
|---|
| 6445 | * src/kwset.c: Update citations.
|
|---|
| 6446 | Include stdbool.h.
|
|---|
| 6447 | (kwsincr, kwsprep): Clarify by using C99 decls after statements.
|
|---|
| 6448 | (kwsprep): Clarify by using MIN. Avoid a couple of buffer copies
|
|---|
| 6449 | when !TRANS.
|
|---|
| 6450 | (bmexec): Use bool for boolean. Prefer "continue;" to ";".
|
|---|
| 6451 |
|
|---|
| 6452 | 2014-04-07 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6453 |
|
|---|
| 6454 | grep: use the Galil rule for Boyer-Moore algorithm in KWSet
|
|---|
| 6455 | The Boyer-Moore algorithm is O(m*n), which means it may be much
|
|---|
| 6456 | slower than the DFA. Its Galil rule variant is O(n) and increases
|
|---|
| 6457 | efficiency in the typical case; it skips sections that are known
|
|---|
| 6458 | to match and does not compare more than once for a position in the text.
|
|---|
| 6459 | To use the Galil rule, look for the delta2 shift at each position
|
|---|
| 6460 | from the trie instead of the 'mind2' value.
|
|---|
| 6461 | * src/kwset.c (struct kwset): Replace member 'mind2' with 'shift'.
|
|---|
| 6462 | (kwsprep): Look for the delta2 shift.
|
|---|
| 6463 | (bmexec): Use it.
|
|---|
| 6464 |
|
|---|
| 6465 | 2014-04-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6466 |
|
|---|
| 6467 | grep: cleanup DFA superset optimization
|
|---|
| 6468 | * src/dfa.c (dfa_charclass_index): New function, with body of
|
|---|
| 6469 | old dfa_charclass but with an extra parameter D.
|
|---|
| 6470 | (charclass_index): Reimplement in terms of dfa_charclass_index.
|
|---|
| 6471 | (dfahint): Clarify.
|
|---|
| 6472 | (dfasuperset): Do not assign to 'dfa' static variable. Instead,
|
|---|
| 6473 | use a local, and use the new dfa_charclass_index function. This
|
|---|
| 6474 | doesn't fix any bugs, but it's clearer. Initialize a few more
|
|---|
| 6475 | members, to simplify dfafree. Copy the charclasses with
|
|---|
| 6476 | just one memcpy call. Don't assign nonnull to D->superset until
|
|---|
| 6477 | it's known to be valid; that's simpler.
|
|---|
| 6478 | (dfafree, dfaalloc): Simplify based on dfasuperset initializations.
|
|---|
| 6479 | * src/dfa.h (dfahint): Add comment.
|
|---|
| 6480 | * src/dfasearch.c (EGexecute): Simplify use of memchr.
|
|---|
| 6481 | Simplify by using memrchr. Fix typo that could cause a buffer
|
|---|
| 6482 | read overrun.
|
|---|
| 6483 |
|
|---|
| 6484 | 2014-04-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6485 |
|
|---|
| 6486 | grep: optimization with the superset of DFA
|
|---|
| 6487 | The superset of a DFA is like the DFA, except that for speed
|
|---|
| 6488 | ANYCHAR, MBCSET and BACKREF are replaced by (CSET full bits) STAR,
|
|---|
| 6489 | and mb_cur_max is 1. For example, for 'a\(b\)c\1':
|
|---|
| 6490 | original: a b CAT c CAT BACKREF CAT
|
|---|
| 6491 | superset: a b CAT c CAT CSET STAR CAT (The CSET has all bits set.)
|
|---|
| 6492 | If a string matches a DFA, it matches the DFA's superset.
|
|---|
| 6493 | Using the superset to filter can dramatically improve performance,
|
|---|
| 6494 | over 200x in some cases. See <http://bugs.gnu.org/16966>.
|
|---|
| 6495 | * src/dfa.c (struct dfa): New member 'superset'.
|
|---|
| 6496 | (dfahint, dfasuperset): New functions.
|
|---|
| 6497 | (dfacomp): Create and analyze the superset.
|
|---|
| 6498 | (dfafree): Free only non-NULL items.
|
|---|
| 6499 | (dfaalloc): Initialize superset member.
|
|---|
| 6500 | (dfaoptimize): If optimization succeeds in a UTF-8 locale, don't use
|
|---|
| 6501 | the superset.
|
|---|
| 6502 | * src/dfa.h (dfahint): New decl.
|
|---|
| 6503 | * src/dfasearch.c (EGexecute): Use dfahint.
|
|---|
| 6504 |
|
|---|
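The filtering idea in the entry above can be sketched in Python. This is only an illustration, not grep's implementation: Python's `re` stands in for both matchers, the superset pattern is written by hand, and the names `FULL`, `SUPERSET`, and `grep_lines` are hypothetical.

```python
import re

# Hypothetical stand-ins. FULL is a\(b\)c\1 written in Python syntax.
# SUPERSET replaces the backreference with "any run of characters",
# playing the role of (CSET with all bits set) STAR.
FULL = re.compile(r"a(b)c\1")
SUPERSET = re.compile(r"abc[\s\S]*")


def grep_lines(lines):
    """Return the lines matching FULL, consulting SUPERSET first."""
    matched = []
    for line in lines:
        # A line rejected by the superset cannot match the full pattern,
        # so the more expensive backreference matcher is skipped.
        if SUPERSET.search(line) is None:
            continue
        if FULL.search(line) is not None:
            matched.append(line)
    return matched
```

Because every match of the full pattern is also a match of the superset, the prefilter can only discard lines that would have been rejected anyway, so the result is the same as running the full matcher on every line.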
| 6505 | 2014-04-06 Jim Meyering <meyering@fb.com>
|
|---|
| 6506 |
|
|---|
| 6507 | build: avoid OS X 10.8.5 build failure due to lack of static_assert
|
|---|
| 6508 | * bootstrap.conf (gnulib_modules): Add assert-h, to accommodate the
|
|---|
| 6509 | new use of static_assert on systems lacking support for that construct.
|
|---|
| 6510 | Without this change, compilation of dfa.c failed on OS X 10.8.5 with
|
|---|
| 6511 | gcc-4.9.0 20140324. We should be using gnulib's assert-h module,
|
|---|
| 6512 | regardless, for its nominal improved portability, since grep includes
|
|---|
| 6513 | assert.h and uses assert.
|
|---|
| 6514 |
|
|---|
| 6515 | 2014-04-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6516 |
|
|---|
| 6517 | grep: fix performance bug with regex in line-by-line mode
|
|---|
| 6518 | * src/dfasearch.c (EGexecute): Match line-by-line with regex.
|
|---|
| 6519 |
|
|---|
| 6520 | 2014-04-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6521 |
|
|---|
| 6522 | grep: minor improvements to previous patch
|
|---|
| 6523 | * src/dfa.c (MAX): New macro.
|
|---|
| 6524 | (match_anychar, match_mb_charset, transit_state_consume_1char):
|
|---|
| 6525 | Use it to simplify assignments.
|
|---|
| 6526 | (SKIP_REMAINS_MB_IF_INITIAL_STATE): Prefer != 0 for unsigned.
|
|---|
| 6527 | (free_mbdata): Omit an unnecessary 'free'.
|
|---|
| 6528 |
|
|---|
| 6529 | 2014-04-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6530 |
|
|---|
| 6531 | grep: reuse multibyte DFA buffers in non-UTF8 locales
|
|---|
| 6532 | * src/dfa.c (struct dfa): New members 'mblen_buf', 'nmblen_buf',
|
|---|
| 6533 | 'inputwcs', 'ninputwcs', 'mb_follows' and 'mb_match_lens'.
|
|---|
| 6534 | (mblen_buf, inputwcs): Remove static vars.
|
|---|
| 6535 | (SKIP_REMAINS_MB_IF_INITIAL_STATE, match_anychar, match_mb_charset)
|
|---|
| 6536 | (transit_state_consume_1char, transit_state, prepare_wc_buf):
|
|---|
| 6537 | Use new members instead of global variables.
|
|---|
| 6538 | (check_matching_with_multibyte_ops): Use new members
|
|---|
| 6539 | instead of new allocation.
|
|---|
| 6540 | (dfaexec): Initialize new members.
|
|---|
| 6541 | (free_mbdata): Free new members.
|
|---|
| 6542 |
|
|---|
| 6543 | 2014-04-05 Paul Eggert <eggert@penguin.cs.ucla.edu>
|
|---|
| 6544 |
|
|---|
| 6545 | grep: simplify dfa.c by having it not include mbsupport.h directly
|
|---|
| 6546 | * src/mbsupport.h: Remove.
|
|---|
| 6547 | * src/Makefile.am (noinst_HEADERS): Remove mbsupport.h.
|
|---|
| 6548 | * src/dfa.c, src/grep.c, src/search.h: Don't include mbsupport.h.
|
|---|
| 6549 | * src/dfa.c: Include wchar.h and wctype.h unconditionally, as
|
|---|
| 6550 | this simplifies the use of dfa.c in grep, and it does no harm
|
|---|
| 6551 | in gawk.
|
|---|
| 6552 | (setlocale, static_assert): Remove gawk-specific hacks, as
|
|---|
| 6553 | gawk now does these itself.
|
|---|
| 6554 | (struct dfa, dfambcache, mbs_to_wchar)
|
|---|
| 6555 | (is_valid_unibyte_character, setbit_wc, using_utf8, FETCH_WC)
|
|---|
| 6556 | (addtok_wc, add_utf8_anychar, atom, state_index, epsclosure)
|
|---|
| 6557 | (dfaanalyze, dfastate, prepare_wc_buf, dfaoptimize, dfafree, dfamust):
|
|---|
| 6558 | * src/dfasearch.c (EGexecute):
|
|---|
| 6559 | * src/grep.c (main):
|
|---|
| 6560 | * src/searchutils.c (mbtoupper):
|
|---|
| 6561 | Assume MBS_SUPPORT.
|
|---|
| 6562 |
|
|---|
| 6563 | 2014-04-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6564 |
|
|---|
| 6565 | dfa: avoid re-building a state built previously
|
|---|
| 6566 | * src/dfa.c (dfaexec): Avoid rebuilding a previously built state.
|
|---|
| 6567 |
|
|---|
| 6568 | 2014-03-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6569 |
|
|---|
| 6570 | dfa: improve port to freestanding DJGPP
|
|---|
| 6571 | Suggested by Aharon Robbins (Bug#17056).
|
|---|
| 6572 | * src/dfa.c (setlocale) [!LC_ALL]: Return NULL, not "C",
|
|---|
| 6573 | reverting part of a recent change.
|
|---|
| 6574 | (using_simple_locale): Return true if setlocale returns null.
|
|---|
| 6575 |
|
|---|
| 6576 | 2014-03-28 Jim Meyering <meyering@fb.com>
|
|---|
| 6577 |
|
|---|
| 6578 | tests: placate "make syntax-check" re compare arg ordering
|
|---|
| 6579 | * tests/euc-mb: Reverse order of arguments to compare.
|
|---|
| 6580 | Be consistent in ordering compare arguments: expected followed
|
|---|
| 6581 | by actual.
|
|---|
| 6582 |
|
|---|
| 6583 | 2014-03-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6584 |
|
|---|
| 6585 | dfa: avoid an indirection and port wint_t usage
|
|---|
| 6586 | * src/dfa.c (struct dfa): Put mbrtowc_cache directly into struct dfa
|
|---|
| 6587 | rather than having a pointer; this saves a malloc and an indirection.
|
|---|
| 6588 | All uses changed.
|
|---|
| 6589 | (dfambcache): Port to hosts where wint_t * can't be cast to wchar_t *.
|
|---|
| 6590 |
|
|---|
| 6591 | 2014-03-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6592 |
|
|---|
| 6593 | grep: take mbrtowc_cache into new member of struct dfa
|
|---|
| 6594 | When more than one struct dfa is in use at the same time, their mbrtowc
|
|---|
| 6595 | caches may conflict. So move mbrtowc_cache into a new member of struct dfa,
|
|---|
| 6596 | so that each struct dfa has its own mbrtowc cache.
|
|---|
| 6597 |
|
|---|
| 6598 | * src/dfa.c (struct dfa): New member `mbrtowc_cache'.
|
|---|
| 6599 | (dfambcache): Rename from build_mbrtowc_cache. Add dependency on struct dfa.
|
|---|
| 6600 | (mbs_to_wchar): Add dependency on struct dfa.
|
|---|
| 6601 | (FETCH_WC): Use it.
|
|---|
| 6602 | (prepare_wc_buf): Use it. Add dependency on struct dfa.
|
|---|
| 6603 | (dfacomp): Call it.
|
|---|
| 6604 | (dfafree): Release it.
|
|---|
| 6605 |
|
|---|
| 6606 | 2014-03-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6607 |
|
|---|
| 6608 | dfa: cache results of mbrtowc for speed
|
|---|
| 6609 | Idea suggested by Norihiro Tanaka in Bug#16842.
|
|---|
| 6610 | * src/dfa.c (mbrtowc_cache): New static var.
|
|---|
| 6611 | (build_mbrtowc_cache, mbs_to_wchar): New functions.
|
|---|
| 6612 | (FETCH_WC) [MBS_SUPPORT]: Speed up by using mbs_to_wchar
|
|---|
| 6613 | instead of mbrtowc and wctob.
|
|---|
| 6614 | (FETCH_WC) [!MBS_SUPPORT]: Rewrite in terms of old FETCH macro.
|
|---|
| 6615 | (FETCH): Remove; no longer used.
|
|---|
| 6616 | (lex): Simplify by avoiding the need for FETCH.
|
|---|
| 6617 | (prepare_wc_buf) [MBS_SUPPORT]: Speed up by using mbs_to_wchar.
|
|---|
| 6618 | Simplify the loop.
|
|---|
| 6619 | (dfacomp): Initialize the cache.
|
|---|
| 6620 |
|
|---|
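A rough Python sketch of the caching idea in the entry above: precompute, for each of the 256 byte values, whether that byte alone is a complete character, and take a slower path only for the rest. The function names are hypothetical, Python's codec machinery stands in for mbrtowc, and the slow path assumes a UTF-8-like encoding with characters of at most 4 bytes.

```python
def build_byte_cache(encoding="utf-8"):
    """Map each byte value 0..255 to its code point, or None.

    None marks bytes that are not a complete character by themselves
    (for example a multibyte lead byte), so the caller must take the
    slow path for them.
    """
    cache = []
    for value in range(256):
        try:
            cache.append(ord(bytes([value]).decode(encoding)))
        except UnicodeDecodeError:
            cache.append(None)
    return cache


def decode_first_char(data, cache, encoding="utf-8"):
    """Return (code_point, byte_length) for the first character of DATA.

    DATA must be a non-empty bytes object.
    """
    cached = cache[data[0]]
    if cached is not None:
        return cached, 1          # fast path: one table lookup
    # Slow path: try progressively longer prefixes (assumed <= 4 bytes).
    for length in range(2, 5):
        try:
            return ord(data[:length].decode(encoding)), length
        except UnicodeDecodeError:
            continue
    return None, 1                # invalid sequence: consume one byte
```

For text that is mostly single-byte characters, nearly every call returns from the table lookup, which is where the speedup would come from.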
| 6621 | 2014-03-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6622 |
|
|---|
| 6623 | grep: perform the kwset-helping DFA match in narrower range
|
|---|
| 6624 | When kwsexec gives us the offset of a potential match, we compute
|
|---|
| 6625 | line begin/end and then run the DFA matcher to see if there really
|
|---|
| 6626 | is a match on that line. When the beginning of the line, BEG, is
|
|---|
| 6627 | not on a multibyte character boundary, advance BEG until it is on such
|
|---|
| 6628 | a boundary, before running the DFA search.
|
|---|
| 6629 | * src/dfasearch.c (EGexecute): As above. Add a comment.
|
|---|
| 6630 | * tests/euc-mb: Add a test case that exercises this code.
|
|---|
| 6631 | This addresses http://debbugs.gnu.org/17095.
|
|---|
| 6632 |
|
|---|
| 6633 | 2014-03-26 Jim Meyering <meyering@fb.com>
|
|---|
| 6634 |
|
|---|
| 6635 | maint: fix "make dist"
|
|---|
| 6636 | * src/Makefile.am (egrep fgrep): Specify egrep.sh via
|
|---|
| 6637 | $(srcdir)/egrep.sh, so non-srcdir builds work once again.
|
|---|
| 6638 |
|
|---|
| 6639 | 2014-03-26 Paul Eggert <eggert@penguin.cs.ucla.edu>
|
|---|
| 6640 |
|
|---|
| 6641 | dfa: improve port to freestanding DJGPP
|
|---|
| 6642 | * src/dfa.c (setlocale) [!LC_ALL]: Return "C", not NULL (Bug#17056).
|
|---|
| 6643 | (using_simple_locale): Store setlocale result in a ptr-to-const.
|
|---|
| 6644 |
|
|---|
| 6645 | egrep, fgrep: improve diagnostics from shell scripts
|
|---|
| 6646 | This should fix Bug#17098.
|
|---|
| 6647 | * src/Makefile.am (EXTRA_DIST): Add egrep.sh.
|
|---|
| 6648 | (egrep fgrep): Depend on egrep.sh and Makefile.
|
|---|
| 6649 | Build from new file egrep.sh, as this makes the build process
|
|---|
| 6650 | easier to follow. Arrange for $0 to look nicer in subgrep.
|
|---|
| 6651 | * src/egrep.sh: New file.
|
|---|
| 6652 |
|
|---|
| 6653 | 2014-03-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6654 |
|
|---|
| 6655 | dfa: avoid undefined behavior
|
|---|
| 6656 | * src/dfa.c (FETCH_WC, addtok_wc): Don't rely on undefined behavior
|
|---|
| 6657 | when converting an out-of-range value to 'int'.
|
|---|
| 6658 | (FETCH_WC, prepare_wc_buf): Don't rely on conversion state after
|
|---|
| 6659 | mbrtowc returns a special value, as it's undefined for (size_t) -1.
|
|---|
| 6660 | (prepare_wc_buf): Simplify test for valid character.
|
|---|
| 6661 |
|
|---|
| 6662 | grep: fix and simplify grep -iF optimization
|
|---|
| 6663 | * src/grep.c (check_any_alphabets): Remove.
|
|---|
| 6664 | (fgrep_to_grep_pattern): Fix problems when mbrtowc returns -1 or -2.
|
|---|
| 6665 | Simplify a bit.
|
|---|
| 6666 | (main): Don't bother optimizing 'grep -iF PAT' when PAT contains no
|
|---|
| 6667 | alphabetics; it's so rare it's not worth the complexity.
|
|---|
| 6668 |
|
|---|
| 6669 | 2014-03-23 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6670 |
|
|---|
| 6671 | grep: optimization for fgrep by changing the matcher to the grep matcher.
|
|---|
| 6672 | The fgrep matcher uses only the kwset engine. However, that is very slow
|
|---|
| 6673 | for case-insensitive matching in multibyte locales.
|
|---|
| 6674 |
|
|---|
| 6675 | So, if the matcher is fgrep, matching is case-insensitive, and the keys include
|
|---|
| 6676 | any alphabetic characters, switch to the grep matcher by escaping the keys.
|
|---|
| 6677 | On the other hand, if the keys include no alphabetic characters, turn match_icase off.
|
|---|
| 6678 |
|
|---|
| 6679 | I prepared the following input to measure the performance:
|
|---|
| 6680 |
|
|---|
| 6681 | yes $(printf '%078dm' 0)| head -1000000 | tr 0 a > in
|
|---|
| 6682 | A=`printf '\xef\xbc\xa1'` # FULLWIDTH LATIN CAPITAL LETTER A
|
|---|
| 6683 |
|
|---|
| 6684 | I ran three tests with this patch (best-of-5 trials):
|
|---|
| 6685 |
|
|---|
| 6686 | env LC_ALL=en_US.UTF-8 time -p src/fgrep -i "$A" in
|
|---|
| 6687 | real 8.54 user 7.13 sys 1.16
|
|---|
| 6688 |
|
|---|
| 6689 | Back out that commit (temporarily), recompile, and rerun the experiment:
|
|---|
| 6690 |
|
|---|
| 6691 | env LC_ALL=en_US.UTF-8 time -p src/fgrep -i "$A" in
|
|---|
| 6692 | real 0.07 user 0.02 sys 0.05
|
|---|
| 6693 |
|
|---|
| 6694 | * src/fgrep.c (Gcompile): New function.
|
|---|
| 6695 | * src/main.c (check_any_alphabets): New function.
|
|---|
| 6696 | (fgrep_to_grep_pattern): New function.
|
|---|
| 6697 | (main): Use them.
|
|---|
| 6698 |
|
|---|
| 6699 | 2014-03-23 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6700 |
|
|---|
| 6701 | egrep, fgrep: go back to shell scripts
|
|---|
| 6702 | Although egrep's and fgrep's switch from shell scripts to
|
|---|
| 6703 | executables may have made sense in 2005, it complicated
|
|---|
| 6704 | maintenance and recently has caused subtle performance bugs.
|
|---|
| 6705 | Go back to the old way of doing things, as it's simpler and more
|
|---|
| 6706 | easily separated from the mainstream implementation. This should
|
|---|
| 6707 | be good enough nowadays, as POSIX has withdrawn egrep/fgrep and
|
|---|
| 6708 | portable applications should be using -E/-F anyway.
|
|---|
| 6709 | * po/POTFILES.in: Remove src/egrep.c, src/fgrep.c, src/main.c.
|
|---|
| 6710 | * src/Makefile.am (bin_PROGRAMS): Remove egrep, fgrep.
|
|---|
| 6711 | (bin_SCRIPTS): New macro.
|
|---|
| 6712 | (grep_SOURCES): Move searchutils.c, dfa.c, dfasearch.c, kwset.c,
|
|---|
| 6713 | kwsearch.c, pcresearch.c here from libgrep_a_SOURCES.
|
|---|
| 6714 | (egrep_SOURCES, fgrep_SOURCES, noinst_LIBRARIES, libgrep_a_SOURCES):
|
|---|
| 6715 | Remove.
|
|---|
| 6716 | (LDADD): Remove libgrep.a.
|
|---|
| 6717 | (egrep, fgrep): New rules.
|
|---|
| 6718 | (CLEANFILES): New macro.
|
|---|
| 6719 | * src/grep.c: Rename from src/main.c.
|
|---|
| 6720 | (usage, setmatcher, main):
|
|---|
| 6721 | Simplify, since there's now just one executable.
|
|---|
| 6722 | (Gcompile, Ecompile, Acompile, GAcompile, PAcompile, matchers):
|
|---|
| 6723 | Move here from the (removed) src/grep.c.
|
|---|
| 6724 | (compile_fp_t, execute_fp_t, struct matcher, matchers):
|
|---|
| 6725 | Move here from src/grep.h, as they no longer need to be public.
|
|---|
| 6726 | (struct matcher.name): Avoid one level of indirection/relocation.
|
|---|
| 6727 | (do_execute, main): Fix a performance bug when it was compiled
|
|---|
| 6728 | as 'fgrep', due to confusion about which matcher was which.
|
|---|
| 6729 | (main): Fix a performance bug with -P, likewise.
|
|---|
| 6730 | * src/grep.h (before_options, after_options): Remove.
|
|---|
| 6731 | * src/egrep.c, src/fgrep.c, src/grep.c: Remove.
|
|---|
| 6732 |
|
|---|
| 6733 | dfa: port to freestanding DJGPP (Bug#17056)
|
|---|
| 6734 | * src/dfa.c (setlocale) [!LC_ALL]: Define a dummy.
|
|---|
| 6735 |
|
|---|
| 6736 | 2014-03-16 Jim Meyering <meyering@fb.com>
|
|---|
| 6737 |
|
|---|
| 6738 | tests: avoid false-positive failure on some AMD CPUs
|
|---|
| 6739 | * tests/mb-non-UTF8-performance: Avoid false-positive failure
|
|---|
| 6740 | when run on certain AMD processors.
|
|---|
| 6741 |
|
|---|
| 6742 | 2014-03-10 Jim Meyering <meyering@fb.com>
|
|---|
| 6743 |
|
|---|
| 6744 | tests: make a performance-measuring test less system-sensitive
|
|---|
| 6745 | Andreas Schwab reported in http://debbugs.gnu.org/16941
|
|---|
| 6746 | that this test would time out and fail on m68k-suse-linux.
|
|---|
| 6747 | Rather than testing absolute duration with a limit tuned
|
|---|
| 6748 | to today's hardware, compare performance of grep with LC_ALL=C
|
|---|
| 6749 | against that same command using LC_ALL=ja_JP.eucJP.
|
|---|
| 6750 | * tests/init.cfg (require_hi_res_time_): New function.
|
|---|
| 6751 | * tests/mb-non-UTF8-performance: Rewrite to use it:
|
|---|
| 6752 | record absolute duration D of the first (normally much faster)
|
|---|
| 6753 | command, and set a timeout of 8*D for the command running in
|
|---|
| 6754 | an affected locale.
|
|---|
| 6755 |
|
|---|
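The relative-timing approach in the entry above can be sketched as follows. This is a minimal Python illustration with hypothetical names (`timed`, `within_budget`). The real test is a shell script that times grep commands, and only the factor of 8 comes from the entry.

```python
import time


def timed(func):
    """Run FUNC once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def within_budget(baseline_seconds, candidate_seconds, factor=8):
    """True if the candidate took at most FACTOR times the baseline."""
    return candidate_seconds <= factor * baseline_seconds
```

Scaling the limit from a baseline measured on the same machine means a slow host raises both numbers together, so the check is less likely to fail merely because the hardware is slow.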
| 6756 | 2014-03-09 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6757 |
|
|---|
| 6758 | maint: pacify 'make dist'
|
|---|
| 6759 | * src/dfa.c (parse_bracket_exp): Reindent with spaces.
|
|---|
| 6760 | * src/dfa.h (case_folded_counterparts): Prefix decl with 'extern'.
|
|---|
| 6761 | * src/main.c: Don't include assert.h.
|
|---|
| 6762 |
|
|---|
| 6763 | 2014-03-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6764 |
|
|---|
| 6765 | fgrep: fix case-fold incompatibility with plain 'grep'
|
|---|
| 6766 | fgrep converted to lowercase, whereas the regex code converted
|
|---|
| 6767 | to uppercase. The resulting behaviors don't agree in offbeat
|
|---|
| 6768 | cases like Greek sigmas and Turkish Is. Fix this by changing
|
|---|
| 6769 | fgrep to agree with the regex code.
|
|---|
| 6770 | * src/kwsearch.c (Fcompile, Fexecute):
|
|---|
| 6771 | * src/searchutils.c (kwsinit, mbtoupper):
|
|---|
| 6772 | Convert to uppercase, not to lowercase, for compatibility with
|
|---|
| 6773 | plain 'grep'.
|
|---|
| 6774 | * src/search.h, src/searchutils.c (mbtoupper):
|
|---|
| 6775 | Rename from mbtolower, since it now converts to uppercase.
|
|---|
| 6776 | All uses changed.
|
|---|
| 6777 | * tests/case-fold-titlecase: Add tests for this.
|
|---|
| 6778 |
|
|---|
| 6779 | grep: fix case-fold mismatches between DFA and regex
|
|---|
| 6780 | The DFA code and the regex code didn't use the same semantics for
|
|---|
| 6781 | case-folding. The regex code says that the data char d matches
|
|---|
| 6782 | the pattern char p if uc (d) == uc (p). POSIX is unclear in this
|
|---|
| 6783 | area; the simplest fix for now is to change the DFA code to agree
|
|---|
| 6784 | with the regex code. See <http://bugs.gnu.org/16919>.
|
|---|
| 6785 | * src/dfa.c (static_assert): New macro, if not already defined.
|
|---|
| 6786 | (setbit_case_fold_c): Assume MB_CUR_MAX is 1 and that case_fold
|
|---|
| 6787 | is nonzero; all callers changed.
|
|---|
| 6788 | (setbit_case_fold_c, parse_bracket_exp, lex, atom):
|
|---|
| 6789 | Case-fold like the regex code does.
|
|---|
| 6790 | (lonesome_lower): New constant.
|
|---|
| 6791 | (case_folded_counterparts): New function.
|
|---|
| 6792 | (parse_bracket_exp): Prefer plain setbit when case-folding is
|
|---|
| 6793 | not needed.
|
|---|
| 6794 | * src/dfa.h (CASE_FOLDED_BUFSIZE): New constant.
|
|---|
| 6795 | (case_folded_counterparts): New function decl.
|
|---|
| 6796 | * src/main.c (trivial_case_ignore): Case-fold like the regex code does.
|
|---|
| 6797 | (main): Try to improve comment re trivial_case_ignore.
|
|---|
| 6798 | * tests/case-fold-titlecase: Add lots more test cases.
|
|---|
| 6799 |
|
|---|
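The rule stated in the entry above, that data char d matches pattern char p if uc (d) == uc (p), can be illustrated in Python. This is a sketch only: `str.upper` and `str.lower` stand in for the C library's case conversions, and the function names are hypothetical.

```python
def matches_by_upper(d, p):
    """Regex-style rule: d matches p if uc(d) == uc(p)."""
    return d.upper() == p.upper()


def matches_by_lower(d, p):
    """Alternative rule that compares lowercase forms instead."""
    return d.lower() == p.lower()


SMALL_SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"        # GREEK SMALL LETTER FINAL SIGMA

# Both sigmas share one uppercase form, so the uppercase rule treats
# them as equal, while the lowercase rule keeps them distinct.
upper_rule = matches_by_upper(FINAL_SIGMA, SMALL_SIGMA)
lower_rule = matches_by_lower(FINAL_SIGMA, SMALL_SIGMA)
```

Here `upper_rule` is true and `lower_rule` is false, which is the kind of disagreement that arises when two matchers fold case in different directions.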
| 6800 | 2014-03-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6801 |
|
|---|
| 6802 | build: update gnulib submodule to latest
|
|---|
| 6803 |
|
|---|
| 6804 | doc: do not overpromise --ignore-case's behavior
|
|---|
| 6805 | * NEWS: Omit vague statement about titlecase that could be
|
|---|
| 6806 | misinterpreted, and is more trouble than it's worth.
|
|---|
| 6807 | * doc/grep.texi: Add @documentencoding. Fix copyright range to
|
|---|
| 6808 | use endash not hyphen.
|
|---|
| 6809 | (Matching Control): Do not overpromise what --ignore-case will do.
|
|---|
| 6810 | Give examples of corner cases where the documentation does not
|
|---|
| 6811 | specify behavior.
|
|---|
| 6812 |
|
|---|
| 6813 | 2014-03-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6814 |
|
|---|
| 6815 | maint: remove differences from gnulib regex code
|
|---|
| 6816 | These don't seem to be needed with GCC 4.8.2, and are making
|
|---|
| 6817 | maintenance harder. If we need to disable warnings with older
|
|---|
| 6818 | compilers, we can add pragmas to the gnulib versions. See
|
|---|
| 6819 | <http://bugs.gnu.org/16911#24>.
|
|---|
| 6820 | * gl/lib/regcomp.c.diff, gl/lib/regex_internal.c.diff:
|
|---|
| 6821 | * gl/lib/regex_internal.h.diff, gl/lib/regexec.c.diff:
|
|---|
| 6822 | Remove.
|
|---|
| 6823 | * cfg.mk (exclude_file_name_regexp--sc_prohibit_tab_based_indentation):
|
|---|
| 6824 | Don't mention gl/* files.
|
|---|
| 6825 |
|
|---|
| 6826 | 2014-03-03 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6827 |
|
|---|
| 6828 | grep: fix comment
|
|---|
| 6829 | * src/main.c (trivial_case_ignore): Fix comment typo.
|
|---|
| 6830 |
|
|---|
| 6831 | 2014-03-03 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6832 |
|
|---|
| 6833 | grep: avoid adding the same character to a bracket expression
|
|---|
| 6834 | * src/main.c (trivial_case_ignore): Add the uppercase and/or lowercase
|
|---|
| 6835 | counterpart to the new pattern only when it differs from the original character.
|
|---|
| 6836 |
|
|---|
| 6837 | 2014-03-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6838 |
|
|---|
| 6839 | grep: fix some unlikely bugs in trivial_case_ignore
|
|---|
| 6840 | * src/main.c (MBRTOWC, WCRTOMB): Reformat as per usual GNU style.
|
|---|
| 6841 | (trivial_case_ignore): Don't overrun buffer in the unusual case
|
|---|
| 6842 | when a character has both lowercase and uppercase counterparts.
|
|---|
| 6843 | Don't rely on undefined behavior when assigning out-of-range value
|
|---|
| 6844 | to an 'int'. Simplify by avoiding unnecessary buffer copies.
|
|---|
| 6845 | Work even with shift encodings, by using mbsinit to
|
|---|
| 6846 | disable the optimization if we are not in the initial state
|
|---|
| 6847 | when we replace B by [BCD].
|
|---|
| 6848 |
|
|---|
| 6849 | 2014-03-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6850 |
|
|---|
| 6851 | grep: revert removal of trivial_case_ignore
|
|---|
| 6852 | Revive trivial_case_ignore function in order to be able to use kwset.
|
|---|
| 6853 |
|
|---|
| 6854 | * src/main.c (MBRTOWC, WCRTOMB): New macros.
|
|---|
| 6855 | (trivial_case_ignore): New function.
|
|---|
| 6856 | (main): Use it.
|
|---|
| 6857 |
|
|---|
| 6858 | 2014-03-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6859 |
|
|---|
| 6860 | grep: optimization of bracket expression for non-UTF8 locales
|
|---|
| 6861 | * src/dfa.c (addtok): Replace an MBCSET with a CSET even in
|
|---|
| 6862 | non-UTF8 locales, and even when it has individual characters.
|
|---|
| 6863 |
|
|---|
| 6864 | 2014-03-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6865 |
|
|---|
| 6866 | doc: describe titlecase fix better
|
|---|
| 6867 | * NEWS: Document behavior on lowercase text too.
|
|---|
| 6868 | Suggested by Eric Blake in <http://bugs.gnu.org/16911#10>.
|
|---|
| 6869 | * doc/grep.texi (Matching Control): Specify behavior of -i
|
|---|
| 6870 | more precisely.
|
|---|
| 6871 |
|
|---|
| 6872 | 2014-02-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6873 |
|
|---|
| 6874 | grep: minor tuning for mb_case_map_apply
|
|---|
| 6875 | * src/kwsearch.c (mb_case_map_apply): Avoid unnecessary widening of
|
|---|
| 6876 | size_t to intmax_t. Avoid unnecessary reinitialization of k.
|
|---|
| 6877 |
|
|---|
| 6878 | grep: avoid 'inline' when it doesn't matter
|
|---|
| 6879 | These days, compilers generally do just fine without advice from
|
|---|
| 6880 | users about 'inline', and there's little need for 'static inline',
|
|---|
| 6881 | just as there's little need for 'register'.
|
|---|
| 6882 | * src/dfa.c (to_uchar):
|
|---|
| 6883 | * src/dosbuf.c (guess_type, undossify_input, dossified_pos):
|
|---|
| 6884 | * src/main.c (undossify_input):
|
|---|
| 6885 | No longer inline.
|
|---|
| 6886 | * src/search.h (mb_case_map_apply): Move from here ...
|
|---|
| 6887 | * src/kwsearch.c (mb_case_map_apply): ... to here, and
|
|---|
| 6888 | make it no longer 'inline'.
|
|---|
| 6889 |
|
|---|
| 6890 | grep: fix bugs with -i and titlecase
|
|---|
| 6891 | * NEWS: Document this.
|
|---|
| 6892 | * src/dfa.c (setbit_wc): Simplify.
|
|---|
| 6893 | (setbit_c): Remove; no longer used.
|
|---|
| 6894 | (setbit_case_fold_c, parse_bracket_exp, atom):
|
|---|
| 6895 | Don't mishandle titlecase. For 'atom', this removes the need for
|
|---|
| 6896 | the refactoring of Bug#16729.
|
|---|
| 6897 | (lex): Use the slower approach only for letters that have a
|
|---|
| 6898 | differing case.
|
|---|
| 6899 | * tests/case-fold-titlecase: New file.
|
|---|
| 6900 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 6901 |
|
|---|
| 6902 | grep: remove lint
|
|---|
| 6903 | * src/main.c (MBRTOWC, WCRTOMB): Remove no-longer-used macros.
|
|---|
| 6904 |
|
|---|
| 6905 | 2014-02-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 6906 |
|
|---|
| 6907 | grep: remove trivial_case_ignore
|
|---|
| 6908 | * src/main.c (trivial_case_ignore): Remove.
|
|---|
| 6909 | (main): Remove its use; this optimization is no longer needed.
|
|---|
| 6910 |
|
|---|
| 6911 | grep: don't match line-by-line for case-insensitive with grep and awk
|
|---|
| 6912 | * src/main.c (matcher): Move decl up.
|
|---|
| 6913 | (do_execute): With the grep or awk matchers,
|
|---|
| 6914 | no need to match line by line.
|
|---|
| 6915 |
|
|---|
| 6916 | 2014-02-27 Jim Meyering <meyering@fb.com>
|
|---|
| 6917 |
|
|---|
| 6918 | maint: dfa: pass NULL, not 0, as 2nd arg to setlocale
|
|---|
| 6919 | * src/dfa.c (using_simple_locale): Use NULL, not 0.
|
|---|
| 6920 |
|
|---|
| 6921 | 2014-02-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 6922 |
|
|---|
| 6923 | * src/dfa.c (prednames): POSIX allows [[:xdigit:]] to match multibyte chars.
|
|---|
| 6924 |
|
|---|
| 6925 | * src/dfa.c (parse_bracket_exp): Parenthesize.
|
|---|
| 6926 |
|
|---|
| 6927 | grep: fix multiple bugs with bracket expressions
|
|---|
| 6928 | * NEWS: Document this.
|
|---|
| 6929 | * src/dfa.c (using_simple_locale): New function.
|
|---|
| 6930 | (parse_bracket_exp): Handle bracket expressions like [a-[.z.]]
|
|---|
| 6931 | correctly. Don't assume that dfaexec handles expressions like
|
|---|
| 6932 | [^a-z] correctly, as they can match multiple characters in some
|
|---|
| 6933 | locales.
|
|---|
| 6934 | * tests/posix-bracket: New file.
|
|---|
| 6935 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 6936 |
|
|---|
| 6937 | 2014-02-25 Stephane Chazelas <stephane.chazelas@gmail.com>
|
|---|
| 6938 |
|
|---|
| 6939 | align grep -Pw with grep -w
|
|---|
| 6940 | For the -w option, with -P, we used to look for the pattern surrounded by
|
|---|
| 6941 | word boundaries. That's different from what grep -w does and what the
|
|---|
| 6942 | documentation describes. Now align with grep -w and the documentation by
|
|---|
| 6943 | using PCRE look-behind and look-ahead operators to match the pattern if
|
|---|
| 6944 | it is not surrounded by word constituents.
|
|---|
| 6945 | * src/pcresearch.c (Pcompile): Use (?<!\w)(?:...)(?!\w) rather than
|
|---|
| 6946 | \b(?:...)\b.
|
|---|
| 6947 | * NEWS (Bug fixes): Mention it.
|
|---|
| 6948 | * tests/pcre-w: New file.
|
|---|
| 6949 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 6950 | This complements the fix for http://debbugs.gnu.org/16865
|
|---|
| 6951 |
|
|---|
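The difference between the two wrappings in the entry above shows up when the pattern begins or ends with a non-word character. The sketch below uses Python's `re` module, not PCRE, and the helper names are hypothetical. Both engines are assumed to treat `\b`, `\w`, and lookarounds the same way for this ASCII input.

```python
import re


def word_boundary_form(pattern):
    """Old -Pw wrapping: pattern surrounded by word boundaries."""
    return r"\b(?:" + pattern + r")\b"


def lookaround_form(pattern):
    """New -Pw wrapping: pattern not adjacent to word constituents."""
    return r"(?<!\w)(?:" + pattern + r")(?!\w)"


line = "@foo bar"
pattern = "@foo"

# \b needs a word character on exactly one side, and neither the start
# of the line nor '@' is one, so the old form misses this line.
old_hit = re.search(word_boundary_form(pattern), line) is not None
new_hit = re.search(lookaround_form(pattern), line) is not None
```

With this input `old_hit` is false and `new_hit` is true. Conversely, on the line `x@foo` the old form matches (there is a boundary between `x` and `@`) while the new form rejects it because `@foo` is preceded by a word constituent.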
| 6952 | 2014-02-24 Stephane Chazelas <stephane.chazelas@gmail.com>
|
|---|
| 6953 |
|
|---|
| 6954 | grep -P: fix it so backreferences now work with -w and -x
|
|---|
| 6955 | To implement -w and -x, we bracket the search term with parentheses.
|
|---|
| 6956 | However, that set of parentheses had the default semantics of
|
|---|
| 6957 | "capturing", i.e., creating a backreferenceable matched quantity.
|
|---|
| 6958 | Instead, use (?:...), to create a non-capturing group.
|
|---|
| 6959 | * src/pcresearch.c (Pcompile): Use (?:...) rather than (...).
|
|---|
| 6960 | * NEWS (Bug fixes): Mention it.
|
|---|
| 6961 | * tests/pcre-wx-backref: New file.
|
|---|
| 6962 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 6963 | This addresses http://debbugs.gnu.org/16865
|
|---|
| 6964 |
|
|---|
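The effect of the extra capturing parentheses described in the entry above can be reproduced with a small sketch. It uses Python's `re` rather than PCRE, and `whole_line_match` is a hypothetical helper. Python reports the misdirected backreference as an error instead of a failed match, which the helper treats as "no match".

```python
import re


def whole_line_match(pattern, text, capturing):
    """Wrap PATTERN for a whole-line (-x style) match and test TEXT.

    With capturing=True the wrapper is (...), which becomes group 1 and
    renumbers every group in PATTERN. With capturing=False it is (?:...),
    which leaves the user's group numbers alone.
    """
    opener = "^(" if capturing else "^(?:"
    try:
        return re.search(opener + pattern + ")$", text) is not None
    except re.error:
        # Python rejects a backreference to a still-open group outright;
        # other engines may instead simply fail to match.
        return False


user_pattern = r"(a)\1"   # intended to match "aa"
ok = whole_line_match(user_pattern, "aa", capturing=False)
broken = whole_line_match(user_pattern, "aa", capturing=True)
```

With the non-capturing wrapper `\1` still refers to the user's `(a)`, so `ok` is true. With the capturing wrapper `\1` refers to the wrapper itself, so `broken` is false.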
| 6965 | 2014-02-20 Jim Meyering <meyering@fb.com>
|
|---|
| 6966 |
|
|---|
| 6967 | maint: post-release administrivia
|
|---|
| 6968 | * NEWS: Add header line for next release.
|
|---|
| 6969 | * .prev-version: Record previous version.
|
|---|
| 6970 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 6971 |
|
|---|
| 6972 | version 2.18
|
|---|
| 6973 | * NEWS: Record release date.
|
|---|
| 6974 |
|
|---|
| 6975 | tests: test for the non-UTF8 multi-byte performance regression
|
|---|
| 6976 | Test for the just-fixed performance regression.
|
|---|
| 6977 | With a 100-200x differential, it is reasonable to expect that
|
|---|
| 6978 | a very slow system will be able to complete the designated
|
|---|
| 6979 | task in a few seconds, while with the bug, even a very fast
|
|---|
| 6980 | system would exceed the timeout.
|
|---|
| 6981 | * tests/mb-non-UTF8-performance: New file.
|
|---|
| 6982 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 6983 | * tests/init.cfg (require_JP_EUC_locale_): New function.
|
|---|
| 6984 |
|
|---|
| 6985 | grep -i: avoid a performance regression in multibyte non-UTF8 locales
|
|---|
| 6986 | * src/main.c: Include dfa.h.
|
|---|
| 6987 | (trivial_case_ignore): Perform this optimization only for UTF8 locales.
|
|---|
| 6988 | This rectifies a 100-200x performance regression in non-UTF8 multi-byte
|
|---|
| 6989 | locales like ja_JP.eucJP. The regression was introduced by the 10x
|
|---|
| 6990 | UTF8/grep-i speedup, commit v2.16-4-g97318f5.
|
|---|
| 6991 | * NEWS (Bug fixes): Mention it.
|
|---|
| 6992 | Reported by Norihiro Tanaka in http://debbugs.gnu.org/16232#50
|
|---|
| 6993 |
|
|---|
| 6994 | maint: give dfa.c's using_utf8 function external scope
|
|---|
| 6995 | * src/dfa.c (using_utf8): Remove "static inline".
|
|---|
| 6996 | * src/dfa.h (using_utf8): Declare it.
|
|---|
| 6997 | * src/searchutils.c (is_mb_middle): Use using_utf8 rather than
|
|---|
| 6998 | rolling our own.
|
|---|
| 6999 |
|
|---|
| 7000 | 2014-02-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7001 |
|
|---|
| 7002 | tests: test [^^-^] in unibyte locales
|
|---|
| 7003 | This is a bug in the current dfa.c, which was reintroduced by the
|
|---|
| 7004 | recent reversion from RRI.
|
|---|
| 7005 | * tests/unibyte-negated-circumflex: New file.
|
|---|
| 7006 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7007 | * tests/init.cfg (require_unibyte_locale): New function.
|
|---|
| 7008 |
|
|---|
| 7009 | grep: fix bug with patterns like [^^-~] in unibyte locales
|
|---|
| 7010 | * NEWS: Document this.
|
|---|
| 7011 | * src/dfa.c (parse_bracket_exp): Escape patterns like [^^-~], or
|
|---|
| 7012 | Awk patterns like [\^-\]], so that they are not misinterpreted by
|
|---|
| 7013 | the system regex library. Check for system regex failure due to
|
|---|
| 7014 | memory exhaustion.
|
|---|
| 7015 |
|
|---|
| 7016 | 2014-02-17 Jim Meyering <meyering@fb.com>
|
|---|
| 7017 |
|
|---|
| 7018 | maint: post-release administrivia
|
|---|
| 7019 | * NEWS: Add header line for next release.
|
|---|
| 7020 | * .prev-version: Record previous version.
|
|---|
| 7021 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 7022 |
|
|---|
| 7023 | version 2.17
|
|---|
| 7024 | * NEWS: Record release date.
|
|---|
| 7025 |
|
|---|
| 7026 | 2014-02-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 7027 |
|
|---|
| 7028 | revert "grep: DFA now uses rational ranges in unibyte locales"
|
|---|
| 7029 | The correct course of action for grep is to defer range interpretation
|
|---|
| 7030 | to regex, because otherwise you can get mismatches between regexes with
|
|---|
| 7031 | backreferences and those without.
|
|---|
| 7032 |
|
|---|
| 7033 | For example, [A-Z]. will use RRI but ([A-Z])\1 won't, with the confusing
|
|---|
| 7034 | result that the first regex won't match a superset of the language
|
|---|
| 7035 | described by the second regex.
|
|---|
| 7036 |
|
|---|
| 7037 | The source of the confusion is that, even though grep's dfa.c was changed
|
|---|
| 7038 | to use range checking instead of strcoll, that code is only invoked if
|
|---|
| 7039 | dfaexec is called with backref = NULL, and that never happens for grep!
|
|---|
| 7040 |
|
|---|
| 7041 | In the end, all that's needed for RRI is compiling --with-included-regex,
|
|---|
| 7042 | and in that case the patch is almost a no-op. Almost, because there
|
|---|
| 7043 | are corner cases that aren't handled correctly (e.g. [a-[.e.]], or
|
|---|
| 7044 | regular expressions that include a NUL character), but this can be
|
|---|
| 7045 | handled separately.
|
|---|
| 7046 |
|
|---|
| 7047 | * NEWS: Revert paragraph introduced by commit v2.16-7-g1078b64.
|
|---|
| 7048 | * src/dfa.c (parse_bracket_exp): Revert back to regcomp/regexec.
|
|---|
| 7049 |
|
|---|
| 7050 | 2014-02-16 Mike Frysinger <vapier@gentoo.org>
|
|---|
| 7051 |
|
|---|
| 7052 | maint: ignore configure.lineno
|
|---|
| 7053 | * .gitignore: Add configure.lineno.
|
|---|
| 7054 |
|
|---|
| 7055 | 2014-02-11 Benno Schulenberg <bensberg@justemail.net>
|
|---|
| 7056 |
|
|---|
| 7057 | help: remove surplus newline
|
|---|
| 7058 | * src/main.c (usage): Remove inconsistent \n introduced by previous
|
|---|
| 7059 | patch.
|
|---|
| 7060 |
|
|---|
| 7061 | 2014-02-10 Benno Schulenberg <bensberg@justemail.net>
|
|---|
| 7062 |
|
|---|
| 7063 | help: fix a line ending, and use the same word for similar things
|
|---|
| 7064 | * src/main.c (usage): Change a stray 'n' to a newline, and use
|
|---|
| 7065 | the word "display" for showing version info as for help text.
|
|---|
| 7066 |
|
|---|
| 7067 | 2014-02-09 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 7068 |
|
|---|
| 7069 | speed up mb-boundary-detection after each preliminary match
|
|---|
| 7070 | After each kwsexec or dfaexec match, we must determine whether
|
|---|
| 7071 | the tentative match falls in the middle of a multi-byte character.
|
|---|
| 7072 | That is what our is_mb_middle function does, but it was expensive,
|
|---|
| 7073 | even when most input consisted of single-byte characters. The main
|
|---|
| 7074 | cost was for each call to mbrlen. This change constructs and uses
|
|---|
| 7075 | a cache of the lengths returned by mbrlen for unibyte values.
|
|---|
| 7076 | The largest speed-up (3x to 7x, CPU-dependent) is when most
|
|---|
| 7077 | lines contain a match, yet few are printed, e.g., when using
|
|---|
| 7078 | grep -v common-pattern ... to filter out all but a few lines.
|
|---|
| 7079 |
|
|---|
| 7080 | * src/search.h (build_mbclen_cache): Declare it.
|
|---|
| 7081 | * src/main.c: Include "search.h".
|
|---|
| 7082 | [MBS_SUPPORT] (main): Call build_mbclen_cache in a multibyte locale.
|
|---|
| 7083 | * src/searchutils.c [HAVE_LANGINFO_CODESET]: Include <langinfo.h>.
|
|---|
| 7084 | (mbclen_cache): New global.
|
|---|
| 7085 | (build_mbclen_cache): New function.
|
|---|
| 7086 | (is_mb_middle) [HAVE_LANGINFO_CODESET]: Use it.
|
|---|
| 7087 | * NEWS (Improvements): Mention it.
|
|---|
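The caching idea in the entry above can be sketched as follows. This is a minimal illustration, not the code in src/searchutils.c: the names `mbclen_table`, `build_table`, and `cached_mbclen` are hypothetical, and the sketch assumes `setlocale` has already been called by the program.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* One entry per byte value: the mbrlen result for that byte taken alone.
   (Hypothetical names; not the identifiers used by grep.)  */
static size_t mbclen_table[UCHAR_MAX + 1];
static int table_built;

/* Fill the table once, after the locale has been selected.  */
void build_table(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++) {
        char c = (char) i;
        mbstate_t mbs;
        memset(&mbs, 0, sizeof mbs);   /* fresh conversion state per byte */
        mbclen_table[i] = mbrlen(&c, 1, &mbs);
    }
    table_built = 1;
}

/* Look up the cached length instead of calling mbrlen again.
   A result of 1 means the byte is a complete character on its own;
   (size_t) -1 or (size_t) -2 means it is invalid or starts a longer
   sequence, so the caller still needs the slow path.  */
size_t cached_mbclen(unsigned char b)
{
    assert(table_built);
    return mbclen_table[b];
}
```

With a table like this, input that is mostly single-byte characters costs one array lookup per byte, and mbrlen is needed only for bytes that begin a longer sequence.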
| 7088 |
|
|---|
| 7089 | 2014-02-01 Jim Meyering <meyering@fb.com>
|
|---|
| 7090 |
|
|---|
| 7091 | maint: use to_uchar function rather than explicit casts
|
|---|
| 7092 | * src/system.h (to_uchar): Define function.
|
|---|
| 7093 | * src/kwsearch.c (Fexecute): Use to_uchar twice in place of casts.
|
|---|
| 7094 | * src/dfasearch.c (EGexecute): Likewise.
|
|---|
| 7095 | * src/main.c (prepend_args): Likewise.
|
|---|
| 7096 | * src/kwset.c (U): Define in terms of to_uchar.
|
|---|
| 7097 | * src/dfa.c (match_mb_charset): Use to_uchar, not an explicit cast.
|
|---|
| 7098 |
|
|---|
| 7099 | 2014-01-27 Jim Meyering <meyering@fb.com>
|
|---|
| 7100 |
|
|---|
| 7101 | maint: remove vestiges of support for long-disabled --mmap option
|
|---|
| 7102 | This option was disabled in March of 2010, and began to elicit a
|
|---|
| 7103 | warning in January of 2012. Its time has come.
|
|---|
| 7104 | * doc/grep.in.1: Remove mention.
|
|---|
| 7105 | * doc/grep.texi: Likewise.
|
|---|
| 7106 | * src/main.c (GROUP_SEPARATOR_OPTION, usage, MMAP_OPTION)
|
|---|
| 7107 | (long_options, main): Remove all traces.
|
|---|
| 7108 | * tests/Makefile.am (check_PROGRAMS): Remove mention of ignore-mmap.
|
|---|
| 7109 | * tests/ignore-mmap: Remove file.
|
|---|
| 7110 | * NEWS (Maintenance): Mention it.
|
|---|
| 7111 |
|
|---|
| 7112 | 2014-01-26 Jim Meyering <meyering@fb.com>
|
|---|
| 7113 |
|
|---|
| 7114 | maint: move two local variable declarations
|
|---|
| 7115 | * src/dfasearch.c (kwsmusts): Move one declaration down to the point
|
|---|
| 7116 | of definition. Move another into the sole scope where it is used.
|
|---|
| 7117 |
|
|---|
| 7118 | 2014-01-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 7119 |
|
|---|
| 7120 | dfasearch: skip kwset optimization when multi-byte+case-insensitive
|
|---|
| 7121 | Now that DFA searching works with multi-byte locales, the only remaining
|
|---|
| 7122 | reason to case-convert the searched input is the kwset optimization.
|
|---|
| 7123 | But multi-byte case-conversion is so expensive that it's not
|
|---|
| 7124 | worthwhile even to attempt that optimization.
|
|---|
| 7125 |
|
|---|
| 7126 | * src/dfasearch.c (kwsmusts): Skip this function in ignore-case mode
|
|---|
| 7127 | when the locale is multi-byte.
|
|---|
| 7128 | (EGexecute): Now that this code need not handle multi-byte case-ignoring
|
|---|
| 7129 | matches, remove the expensive copy/case-conversion code.
|
|---|
| 7130 | With no case-converted buffer, there is no longer any need to call
|
|---|
| 7131 | mb_case_map_apply, so remove it and associated code.
|
|---|
| 7132 | (kwsincr_case): Remove function. Now, every use of this function
|
|---|
| 7133 | is equivalent to a use of kwsincr. Replace all uses.
|
|---|
| 7134 | * tests/turkish-eyes: Test all of -E, -F and -G.
|
|---|
| 7135 |
|
|---|
| 7136 | 2014-01-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 7137 |
|
|---|
| 7138 | dfa: remove GREP-ifdef'd code in favor of code used by gawk
|
|---|
| 7139 | For many years, gawk and grep have used different #ifdef'd bits of
|
|---|
| 7140 | code relating to how the DFA matcher matches multibyte characters.
|
|---|
| 7141 | Remove the GREP-specific code in favor of the code gawk uses. This
|
|---|
| 7142 | permits us to avoid still more cases in which grep must resort to
|
|---|
| 7143 | the expensive process of copying/case-converting each input line
|
|---|
| 7144 | before matching against a case-converted regexp.
|
|---|
| 7145 | * src/dfa.c (parse_bracket_exp, atom): As above.
|
|---|
| 7146 |
|
|---|
| 7147 | 2014-01-25 Jim Meyering <meyering@fb.com>
|
|---|
| 7148 |
|
|---|
| 7149 | gnulib: update to latest
|
|---|
| 7150 |
|
|---|
| 7151 | 2014-01-17 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7152 |
|
|---|
| 7153 | grep: DFA now uses rational ranges in unibyte locales
|
|---|
| 7154 | Problem reported by Aharon Robbins in <http://bugs.gnu.org/16481>.
|
|---|
| 7155 | * NEWS:
|
|---|
| 7156 | * doc/grep.texi (Environment Variables)
|
|---|
| 7157 | (Character Classes and Bracket Expressions):
|
|---|
| 7158 | Document this.
|
|---|
| 7159 | * src/dfa.c (parse_bracket_exp): Treat unibyte locales like multibyte.
|
|---|
| 7160 |
|
|---|
| 7161 | 2014-01-17 Aharon Robbins <arnold@skeeve.com>
|
|---|
| 7162 |
|
|---|
| 7163 | grep: add undocumented '-X gawk' and '-X posixawk' options
|
|---|
| 7164 | See <http://bugs.gnu.org/16481>.
|
|---|
| 7165 | * src/grep.c (GAcompile, PAcompile): New functions.
|
|---|
| 7166 | (const): Use them.
|
|---|
| 7167 |
|
|---|
| 7168 | 2014-01-10 Pádraig Brady <P@draigBrady.com>
|
|---|
| 7169 |
|
|---|
| 7170 | tests: remove superfluous uses of printf
|
|---|
| 7171 | * tests/turkish-eyes: Remove unnecessary uses of printf.
|
|---|
| 7172 |
|
|---|
| 7173 | 2014-01-09 Jim Meyering <meyering@fb.com>
|
|---|
| 7174 |
|
|---|
| 7175 | grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales
|
|---|
| 7176 | These days, nearly everyone uses a multibyte locale, and grep is often
|
|---|
| 7177 | used with the --ignore-case (-i) option, but that option imposes a very
|
|---|
| 7178 | high cost in order to handle some unusual cases in just a few multibyte
|
|---|
| 7179 | locales. This change gets most of the performance of using LC_ALL=C
|
|---|
| 7180 | without eliminating the ability to search for multibyte strings.
|
|---|
| 7181 |
|
|---|
| 7182 | With the following example, I see an 11x speed-up with a 2.3GHz i7:
|
|---|
| 7183 | Generate a 10M-line file, with each line consisting of 40 'j's:
|
|---|
| 7184 |
|
|---|
| 7185 | yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 > k
|
|---|
| 7186 |
|
|---|
| 7187 | Time searching it for the simple/nonexistent string "foobar",
|
|---|
| 7188 | first with this patch (best-of-5 trials):
|
|---|
| 7189 |
|
|---|
| 7190 | LC_ALL=en_US.UTF-8 env time src/grep -i foobar k
|
|---|
| 7191 | 1.10 real 1.03 user 0.07 sys
|
|---|
| 7192 |
|
|---|
| 7193 | Back out that commit (temporarily), recompile, and rerun the experiment:
|
|---|
| 7194 |
|
|---|
| 7195 | git log -1 -p|patch -R -p1; make
|
|---|
| 7196 | LC_ALL=en_US.UTF-8 env time src/grep -i foobar k
|
|---|
| 7197 | 12.50 real 12.41 user 0.08 sys
|
|---|
| 7198 |
|
|---|
| 7199 | The trick is to realize that for some search strings, it is easy
|
|---|
| 7200 | to convert to an equivalent one that is handled much more efficiently.
|
|---|
| 7201 | E.g., convert this command:
|
|---|
| 7202 |
|
|---|
| 7203 | grep -i foobar k
|
|---|
| 7204 |
|
|---|
| 7205 | to this:
|
|---|
| 7206 |
|
|---|
| 7207 | grep '[fF][oO][oO][bB][aA][rR]' k
|
|---|
| 7208 |
|
|---|
| 7209 | That allows the matcher to search in buffer mode, rather than having to
|
|---|
| 7210 | extract/case-convert/search each line separately. Currently, we perform
|
|---|
| 7211 | this conversion only when search strings contain neither '\' nor '['.
|
|---|
| 7212 | See the comments for more detail.
|
|---|
| 7213 |
|
|---|
| 7214 | * src/main.c (trivial_case_ignore): New function.
|
|---|
| 7215 | (main): When possible, transform the regexp so we can drop the -i.
|
|---|
| 7216 | * tests/turkish-eyes: New file.
|
|---|
| 7217 | * tests/Makefile.am (TESTS): Use it.
|
|---|
| 7218 | * NEWS (Improvements): Mention it.
|
|---|
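The pattern rewrite described in the entry above can be sketched for the ASCII-only case. This is an illustration under stated assumptions, not the trivial_case_ignore function itself: `expand_case` is a hypothetical name, the sketch handles only single-byte letters, and it refuses any keyword containing '\' or '[', matching the restriction the entry mentions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Rewrite an ASCII keyword so each letter becomes a two-character
   bracket expression, e.g. "ab1" becomes "[aA][bB]1".
   Returns 0 on success, -1 if the keyword contains '\\' or '['
   or if the output buffer is too small.
   (Hypothetical helper; multibyte characters are not handled.)  */
int expand_case(const char *key, char *out, size_t out_size)
{
    size_t n = 0;

    assert(key != NULL && out != NULL);
    if (out_size == 0 || strpbrk(key, "\\[") != NULL)
        return -1;

    for (const char *p = key; *p != '\0'; p++) {
        unsigned char c = (unsigned char) *p;
        if (isalpha(c)) {
            if (n + 4 >= out_size)      /* 4 bytes plus the final NUL */
                return -1;
            out[n++] = '[';
            out[n++] = (char) tolower(c);
            out[n++] = (char) toupper(c);
            out[n++] = ']';
        } else {
            if (n + 1 >= out_size)      /* 1 byte plus the final NUL */
                return -1;
            out[n++] = (char) c;
        }
    }
    out[n] = '\0';
    return 0;
}
```

Once the keyword is rewritten this way, the -i flag can be dropped, so the matcher can scan the whole buffer rather than case-converting each line.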
| 7219 |
|
|---|
| 7220 | 2014-01-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7221 |
|
|---|
| 7222 | tests: port Solaris 10 /bin/sh patch back to GNU/Linux
|
|---|
| 7223 | Problem reported by Jim Meyering.
|
|---|
| 7224 | * tests/bre, tests/ere, tests/spencer1-locale:
|
|---|
| 7225 | Prefer re_shell, not re_shell_.
|
|---|
| 7226 | * tests/init.sh (re_shell): New var, which is exported instead of
|
|---|
| 7227 | re_shell_.
|
|---|
| 7228 |
|
|---|
| 7229 | Port to Solaris 10 /bin/sh.
|
|---|
| 7230 | Problem reported by Dagobert Michelsen in <http://bugs.gnu.org/16380>.
|
|---|
| 7231 | * tests/bre, tests/ere, tests/spencer1-locale:
|
|---|
| 7232 | Prefer re_shell_ to SHELL, if re_shell_ is set.
|
|---|
| 7233 | * tests/init.sh (re_shell_): Export if it's used.
|
|---|
| 7234 |
|
|---|
| 7235 | 2014-01-01 Jim Meyering <meyering@fb.com>
|
|---|
| 7236 |
|
|---|
| 7237 | maint: post-release administrivia
|
|---|
| 7238 | * NEWS: Add header line for next release.
|
|---|
| 7239 | * .prev-version: Record previous version.
|
|---|
| 7240 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 7241 |
|
|---|
| 7242 | version 2.16
|
|---|
| 7243 | * NEWS: Record release date.
|
|---|
| 7244 |
|
|---|
| 7245 | gnulib: update to latest, for maint.mk fix
|
|---|
| 7246 |
|
|---|
| 7247 | maint: update copyright dates for 2014
|
|---|
| 7248 | Do that by running "make update-copyright".
|
|---|
| 7249 |
|
|---|
| 7250 | gnulib: update to latest
|
|---|
| 7251 |
|
|---|
| 7252 | 2013-12-31 Jim Meyering <meyering@fb.com>
|
|---|
| 7253 |
|
|---|
| 7254 | pcre: use PCRE_NO_UTF8_CHECK properly
|
|---|
| 7255 | In order to obtain the behavior we want, i.e., to disable
|
|---|
| 7256 | error-on-invalid-UTF-in-input, apply this PCRE option in
|
|---|
| 7257 | pcre_exec, not when compiling.
|
|---|
| 7258 | * src/pcresearch.c (Pexecute): Use PCRE_NO_UTF8_CHECK here, ...
|
|---|
| 7259 | (Pcompile): ...rather than here.
|
|---|
| 7260 | * tests/pcre-invalid-utf8-input: Adjust test case to test for this.
|
|---|
| 7261 |
|
|---|
| 7262 | 2013-12-26 Jim Meyering <meyering@fb.com>
|
|---|
| 7263 |
|
|---|
| 7264 | maint: fix inconsistent spacing in expression
|
|---|
| 7265 | * src/main.c (prline): Fix inconsistent spacing in expression:
|
|---|
| 7266 | s/ / /.
|
|---|
| 7267 |
|
|---|
| 7268 | 2013-12-26 behoffski <behoffski@grouse.com.au>
|
|---|
| 7269 |
|
|---|
| 7270 | maint: fix a garbled comment
|
|---|
| 7271 | * src/dfa.c (XNMALLOC, etc.): Fix garbled comment wording.
|
|---|
| 7272 |
|
|---|
| 7273 | 2013-12-23 Jim Meyering <meyering@fb.com>
|
|---|
| 7274 |
|
|---|
| 7275 | maint: fix/improve a comment
|
|---|
| 7276 | * src/main.c (prline): Replace untrue FIXME comment with one
|
|---|
| 7277 | telling how the hard-to-reach code can be exercised.
|
|---|
| 7278 |
|
|---|
| 7279 | 2013-12-21 Santiago Ruano Rincón <santiago@debian.org>
|
|---|
| 7280 |
|
|---|
| 7281 | pcre: tell grep -P to relax its stance on invalid multibyte chars
|
|---|
| 7282 | Do not exit-2 for invalid UTF-8 characters. Just prior to this
|
|---|
| 7283 | change, this command would match no lines and fail like this:
|
|---|
| 7284 | $ printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?
|
|---|
| 7285 | grep: invalid UTF-8 byte sequence in input
|
|---|
| 7286 | 2
|
|---|
| 7287 | After this change, the same command matches both lines, and succeeds:
|
|---|
| 7288 | jM-^B$
|
|---|
| 7289 | j$
|
|---|
| 7290 | 0
|
|---|
| 7291 | * src/pcresearch.c (Pcompile): Use PCRE_NO_UTF8_CHECK, too, and
|
|---|
| 7292 | add a comment.
|
|---|
| 7293 | * tests/pcre-utf8: Add a test and a comment.
|
|---|
| 7294 | This change did not work with Debian unstable pcre-8.31-2
|
|---|
| 7295 | or with some 8.33 and 8.34-based versions, but does work with
|
|---|
| 7296 | Fedora 20's 8.33 and with a built-from-latest source library.
|
|---|
| 7297 | Based on a patch by Santiago Ruano Rincón.
|
|---|
| 7298 | See http://bugs.gnu.org/15758/
|
|---|
| 7299 |
|
|---|
| 7300 | 2013-12-21 Jim Meyering <meyering@fb.com>
|
|---|
| 7301 |
|
|---|
| 7302 | tests: avoid FP failure due to exhausted memory
|
|---|
| 7303 | * tests/long-line-vs-2GiB-read: Don't declare the test "failed"
|
|---|
| 7304 | when running out of memory. In that case, skip it.
|
|---|
| 7305 |
|
|---|
| 7306 | 2013-12-18 Jim Meyering <meyering@fb.com>
|
|---|
| 7307 |
|
|---|
| 7308 | maint: add comments and split some long lines
|
|---|
| 7309 | * src/main.c (do_execute): Add a comment.
|
|---|
| 7310 | Split some lines longer than 80 bytes.
|
|---|
| 7311 |
|
|---|
| 7312 | pcre: avoid a nominal leak
|
|---|
| 7313 | * src/pcresearch.c (Pcompile)[HAVE_LIBPCRE && !PCRE_STUDY_JIT_COMPILE]:
|
|---|
| 7314 | We would leak "re" if built with HAVE_LIBPCRE but without
|
|---|
| 7315 | PCRE_STUDY_JIT_COMPILE. Move the free out one level.
|
|---|
| 7316 |
|
|---|
| 7317 | maint: indent cpp directives to reflect nesting
|
|---|
| 7318 | * src/pcresearch.c: Insert spaces after a few "#", to indent
|
|---|
| 7319 | cpp directives to reflect their nesting.
|
|---|
| 7320 |
|
|---|
| 7321 | grep: handle lines longer than INT_MAX on more systems
|
|---|
| 7322 | When trying to exercise some long-line-handling code, I ran these
|
|---|
| 7323 | commands:
|
|---|
| 7324 | $ dd bs=1 seek=2G of=big < /dev/null; grep -l x big; echo $?
|
|---|
| 7325 | grep: big: Invalid argument
|
|---|
| 7326 | 2
|
|---|
| 7327 | grep should not have issued that diagnostic, and it should
|
|---|
| 7328 | have exited with status 1, not 2. What happened?
|
|---|
| 7329 | grep read the 2GiB of NULs, doubled its buffer size,
|
|---|
| 7330 | copied the 2GiB into the new 4GiB buffer, and proceeded
|
|---|
| 7331 | to call "read" with a byte-count argument of 2^32.
|
|---|
| 7332 | On at least Darwin 12.5.0, that makes read fail with EINVAL.
|
|---|
| 7333 | The solution is to use gnulib's safe_read wrapper.
|
|---|
| 7334 | * src/main.c: Include "safe-read.h"
|
|---|
| 7335 | (fillbuf): Use safe_read, rather than bare read. The latter
|
|---|
| 7336 | cannot handle a read size of 2^32 on some systems.
|
|---|
| 7337 | * bootstrap.conf (gnulib_modules): Add safe-read.
|
|---|
| 7338 | * tests/long-line-vs-2GiB-read: New file.
|
|---|
| 7339 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7340 | * NEWS (Bug fixes): Mention it.
|
|---|
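The failure mode in the entry above comes from handing read a byte count that some kernels reject. A wrapper in the spirit of safe_read can be sketched as below. This is not gnulib's implementation: `capped_read` and `READ_CAP` are hypothetical names, and the cap value is an assumption chosen only to stay under INT_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed upper bound for a single read request: just under INT_MAX,
   rounded down to a multiple of 8192.  (Hypothetical value.)  */
#define READ_CAP ((size_t) INT_MAX & ~(size_t) 8191)

/* Like read, but never asks for more than READ_CAP bytes and
   retries when the call is interrupted by a signal.  As with read,
   the result may be smaller than COUNT, so callers must loop.  */
ssize_t capped_read(int fd, void *buf, size_t count)
{
    assert(buf != NULL);
    if (count > READ_CAP)
        count = READ_CAP;

    for (;;) {
        ssize_t n = read(fd, buf, count);
        if (n >= 0 || errno != EINTR)
            return n;
        /* Interrupted before any data arrived: try again.  */
    }
}
```

Because a capped request only shortens the amount returned by one call, a caller that already loops until end of file needs no other change.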
| 7341 |
|
|---|
| 7342 | 2013-11-25 Jim Meyering <meyering@fb.com>
|
|---|
| 7343 |
|
|---|
| 7344 | tests: port to non-GNU sed
|
|---|
| 7345 | * tests/multibyte-white-space (utf8_space_characters): The generation
|
|---|
| 7346 | of test inputs relied on GNU sed's interpretation of \<, but that is
|
|---|
| 7347 | not portable, and caused spurious test failures. Adjust the sed regexp
|
|---|
| 7348 | to work on all versions.
|
|---|
| 7349 | Reported by Karl Dubost in http://bugs.gnu.org/15953.
|
|---|
| 7350 |
|
|---|
| 7351 | 2013-11-22 Jim Meyering <meyering@fb.com>
|
|---|
| 7352 |
|
|---|
| 7353 | maint: minor cleanup: xmalloc+strcpy -> xmemdup
|
|---|
| 7354 | * src/main.c (main): Replace an xmalloc+strcpy combination
|
|---|
| 7355 | with an equivalent use of xmemdup.
|
|---|
| 7356 |
|
|---|
| 7357 | 2013-11-21 Jim Meyering <meyering@fb.com>
|
|---|
| 7358 | Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7359 |
|
|---|
| 7360 | dfa: avoid undefined behavior of "1 << 31"
|
|---|
| 7361 | * src/dfa.c (charclass): Change type from "int" to "unsigned int".
|
|---|
| 7362 | (tstbit): Rather than shifting "1" left to form a mask, shift the
|
|---|
| 7363 | LHS bits to the right and use "1" as the mask. Also, return bool, rather
|
|---|
| 7364 | than "int".
|
|---|
| 7365 | (setbit, clrbit, dfastate): Don't shift "1" (aka (int)1) left by 31 bits.
|
|---|
| 7366 | Instead, use "1U" as the operand, to avoid undefined behavior.
|
|---|
| 7367 | Spotted by gcc's new -fsanitize=undefined.
|
|---|
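The two techniques named in the entry above (shift the word right when testing, and use 1U when building a mask) can be sketched as follows. The names `byteset`, `test_bit`, `set_bit`, and `clear_bit` are hypothetical stand-ins, not the definitions in src/dfa.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define WORD_BITS (CHAR_BIT * (unsigned int) sizeof (unsigned int))
#define NWORDS ((UCHAR_MAX + 1 + WORD_BITS - 1) / WORD_BITS)

/* A set of byte values, one bit per value, stored in unsigned words.  */
typedef unsigned int byteset[NWORDS];

/* Test a bit by shifting the word right and masking with 1,
   so no mask is ever formed by a left shift.  */
bool test_bit(unsigned int b, const byteset s)
{
    assert(b <= UCHAR_MAX);
    return (s[b / WORD_BITS] >> (b % WORD_BITS)) & 1;
}

/* Build the mask from 1U: shifting an unsigned 1 into the top bit is
   well defined, whereas (int) 1 << 31 is undefined with 32-bit int.  */
void set_bit(unsigned int b, byteset s)
{
    assert(b <= UCHAR_MAX);
    s[b / WORD_BITS] |= 1U << (b % WORD_BITS);
}

void clear_bit(unsigned int b, byteset s)
{
    assert(b <= UCHAR_MAX);
    s[b / WORD_BITS] &= ~(1U << (b % WORD_BITS));
}
```

Bit 31 of each word is the case that triggered the sanitizer report, so it is the one worth exercising when testing code like this.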
| 7368 |
|
|---|
| 7369 | 2013-11-02 Jim Meyering <meyering@fb.com>
|
|---|
| 7370 |
|
|---|
| 7371 | grep: fix regression with -P vs. invalid UTF-8 input
|
|---|
| 7372 | * src/pcresearch.c (Pexecute): Don't abort upon unexpected
|
|---|
| 7373 | PCRE-specific error code. Explicitly handle PCRE_ERROR_BADUTF8,
|
|---|
| 7374 | and change the default to print a diagnostic including the unhandled
|
|---|
| 7375 | integer PCRE error code and exit with status 2.
|
|---|
| 7376 | * tests/pcre-invalid-utf8-input: New file.
|
|---|
| 7377 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7378 | * NEWS (Bug fixes): Mention it.
|
|---|
| 7379 | * THANKS: Update.
|
|---|
| 7380 | Reported by Dave Reisner in http://bugs.gnu.org/15758.
|
|---|
| 7381 |
|
|---|
| 7382 | grep: fix regression involving \s and \S
|
|---|
| 7383 | Commit v2.14-40-g01ec90b made \s and \S work with multi-byte
|
|---|
| 7384 | characters, but it made it so any use like \s*, \s+, \s?, \s{3}
|
|---|
| 7385 | would malfunction in a multi-byte locale.
|
|---|
| 7386 | * src/dfa.c (lex): Also reset laststart.
|
|---|
| 7387 | * tests/backslash-s-and-repetition-operators: New file.
|
|---|
| 7388 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7389 | * NEWS (Bug fixes): Mention it.
|
|---|
| 7390 | * THANKS: Update.
|
|---|
| 7391 | Reported by Mirraz Mirraz in http://bugs.gnu.org/15773.
|
|---|
| 7392 |
|
|---|
| 7393 | 2013-11-01 Jim Meyering <meyering@fb.com>
|
|---|
| 7394 |
|
|---|
| 7395 | maint: NEWS: document a release-related bug fix
|
|---|
| 7396 | * NEWS (Bug fixes): Add an entry for a fix pulled from gnulib.
|
|---|
| 7397 |
|
|---|
| 7398 | 2013-10-26 Jim Meyering <meyering@fb.com>
|
|---|
| 7399 |
|
|---|
| 7400 | build: update gnulib submodule to latest
|
|---|
| 7401 | This pulls in a gnulib fix for maint.mk that ensures the procedure
|
|---|
| 7402 | described in README-release actually does what we want. Before this
|
|---|
| 7403 | change, that procedure resulted in a grep-2.15 tarball that would
|
|---|
| 7404 | lead to a grep binary whose --version-reported version number was
|
|---|
| 7405 | 2.14.51... rather than the expected 2.15.
|
|---|
| 7406 |
|
|---|
| 7407 | maint: avoid automake deprecation warning re ACLOCAL_AMFLAGS
|
|---|
| 7408 | * Makefile.am (ACLOCAL_AMFLAGS): Don't use this deprecated variable.
|
|---|
| 7409 | * configure.ac (AC_CONFIG_MACRO_DIRS): Use this instead.
|
|---|
| 7410 | (AUTOMAKE_OPTIONS): Require automake-1.12.
|
|---|
| 7411 |
|
|---|
| 7412 | maint: post-release administrivia
|
|---|
| 7413 | * NEWS: Add header line for next release.
|
|---|
| 7414 | * .prev-version: Record previous version.
|
|---|
| 7415 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 7416 |
|
|---|
| 7417 | version 2.15
|
|---|
| 7418 | * NEWS: Record release date.
|
|---|
| 7419 |
|
|---|
| 7420 | 2013-10-25 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7421 |
|
|---|
| 7422 | build: port to AIX
|
|---|
| 7423 | Problem reported by Pavel Kharitonov in <http://bugs.gnu.org/15690#68>.
|
|---|
| 7424 | * src/Makefile.am (LDADD): Add $(LIBTHREAD).
|
|---|
| 7425 |
|
|---|
| 7426 | build: avoid duplicate -funit-at-a-time etc. options
|
|---|
| 7427 | * configure.ac (WERROR_CFLAGS): Don't add -fdiagnostics-show-option
|
|---|
| 7428 | and -funit-at-a-time, as Gnulib does that for us now, and we're
|
|---|
| 7429 | merely piling on duplicates.
|
|---|
| 7430 |
|
|---|
| 7431 | 2013-10-24 Jim Meyering <meyering@fb.com>
|
|---|
| 7432 |
|
|---|
| 7433 | tests: port more tests to bourne shells with hex-challenged printf
|
|---|
| 7434 | * tests/pcre-utf8: Convert the hex \xHH literals for the euro symbol
|
|---|
| 7435 | to octal \OOO.
|
|---|
| 7436 | * tests/turkish-I: Likewise for "I with dot".
|
|---|
| 7437 | * tests/turkish-I-without-dot: Likewise for another Turkish I: U+0131.
|
|---|
| 7438 |
|
|---|
| 7439 | maint: clean up an ugly 'while' condition
|
|---|
| 7440 | * src/main.c (get_nondigit_option): Separate a slightly baroque
|
|---|
| 7441 | "while" expression into two separate statements, both inside the loop.
|
|---|
| 7442 |
|
|---|
| 7443 | 2013-10-23 Jim Meyering <meyering@fb.com>
|
|---|
| 7444 |
|
|---|
| 7445 | tests: port to bourne shells whose printf doesn't grok hex
|
|---|
| 7446 | Use octal escapes, not hex, in printf(1) format strings,
|
|---|
| 7447 | and in one case, use $AWK's printf so we can continue
|
|---|
| 7448 | to use the table of hex values.
|
|---|
| 7449 | * tests/char-class-multibyte: Use printf octal escapes, not hex,
|
|---|
| 7450 | for portability to shells like dash and Solaris 10's /bin/sh.
|
|---|
| 7451 | * tests/backslash-s-vs-invalid-multitype: Likewise.
|
|---|
| 7452 | * tests/surrogate-pair: Likewise.
|
|---|
| 7453 | * tests/unibyte-bracket-expr: Count in decimal and convert to octal.
|
|---|
| 7454 | * tests/multibyte-white-space (hex_printf): New function.
|
|---|
| 7455 | Use it in place of printf so we can retain the table of hex digits
|
|---|
| 7456 | without hitting the limitation of some bourne shells.
|
|---|
| 7457 | Reported by Paul Eggert in http://bugs.gnu.org/15690#11
|
|---|
| 7458 |
|
|---|
| 7459 | 2013-10-21 Jim Meyering <meyering@fb.com>
|
|---|
| 7460 |
|
|---|
| 7461 | gnulib: update to latest
|
|---|
| 7462 |
|
|---|
| 7463 | maint: remove now-unused wcscoll module
|
|---|
| 7464 | * bootstrap.conf (gnulib_modules): Remove wcscoll; no longer used.
|
|---|
| 7465 |
|
|---|
| 7466 | 2013-10-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7467 |
|
|---|
| 7468 | build: avoid chatter from Automake 1.14
|
|---|
| 7469 | * configure.ac (AM_INIT_AUTOMAKE): Add subdir-objects.
|
|---|
| 7470 |
|
|---|
| 7471 | build: port shell pattern to Solaris 10
|
|---|
| 7472 | * configure.ac: Don't use unquoted '^' in a pattern, as this
|
|---|
| 7473 | breaks 'configure' on Solaris 10, whose /bin/sh complains about it,
|
|---|
| 7474 | which causes 'configure' to exit even before it finds a decent shell.
|
|---|
| 7475 | Unix 7th edition shell accepted '^' as an alias for '|'.
|
|---|
| 7476 |
|
|---|
| 7477 | build: port to platforms that predefine _FORTIFY_SOURCE
|
|---|
| 7478 | Problem reported by Brenton Hoff (Bug#15663).
|
|---|
| 7479 | * configure.ac (_FORTIFY_SOURCE): Don't define if already defined.
|
|---|
| 7480 | This is what Emacs does.
|
|---|
| 7481 |
|
|---|
| 7482 | 2013-10-20 Jim Meyering <meyering@fb.com>
|
|---|
| 7483 |
|
|---|
| 7484 | build: update gnulib submodule to latest
|
|---|
| 7485 |
|
|---|
| 7486 | 2013-10-19 Jim Meyering <meyering@fb.com>
|
|---|
| 7487 |
|
|---|
| 7488 | tests: extend the multibyte-white-space test
|
|---|
| 7489 | * tests/multibyte-white-space (utf8_space_characters): Add more
|
|---|
| 7490 | single-byte whitespace characters. Align RHS hex values and
|
|---|
| 7491 | make the sed substitution less rigid, to accommodate.
|
|---|
| 7492 | Also, ensure that grep '\S' exits with status 1.
|
|---|
| 7493 |
|
|---|
| 7494 | maint: update bootstrap to latest from gnulib
|
|---|
| 7495 | * bootstrap: Update from gnulib.
|
|---|
| 7496 |
|
|---|
| 7497 | maint: fix typo in NEWS
|
|---|
| 7498 | * NEWS: Fix/improve example commands in most recent entry.
|
|---|
| 7499 | The LC_ALL envvar setting goes before grep, not before printf.
|
|---|
| 7500 | Don't reference src/ in the second example command, and do specify
|
|---|
| 7501 | the locale.
|
|---|
| 7502 |
|
|---|
| 7503 | 2013-10-09 Jim Meyering <meyering@fb.com>
|
|---|
| 7504 |
|
|---|
| 7505 | tests: add a test for better coverage of some tricky code
|
|---|
| 7506 | * tests/spencer1.tests: Add a non-range bracket expression representing the
|
|---|
| 7507 | same regexp, to cover the alternate code path, the one that does not require
|
|---|
| 7508 | a regcomp/exec call to interpret the regexp.
|
|---|
| 7509 |
|
|---|
| 7510 | 2013-10-01 Jim Meyering <meyering@fb.com>
|
|---|
| 7511 |
|
|---|
| 7512 | tests: ensure neither \s nor \S matches an invalid multibyte character
|
|---|
| 7513 | * tests/backslash-S-vs-invalid-multitype: New file.
|
|---|
| 7514 | Prompted by the bug report from Roman at
|
|---|
| 7515 | http://savannah.gnu.org/bugs/?40009
|
|---|
| 7516 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7517 |
|
|---|
| 7518 | dfa: fix \s and \S to work for multibyte
|
|---|
| 7519 | * src/dfa.c (lex): In multibyte mode, we can't treat \s and \S as we do
|
|---|
| 7520 | in single-byte mode. Map them to [[:space:]] and [^[:space:]] respectively,
|
|---|
| 7521 | to make the DFA matcher use the regex-matcher for this term.
|
|---|
| 7522 | * tests/multibyte-white-space: New file. Test for the bug.
|
|---|
| 7523 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7524 | This bug was introduced with the addition of DFA support
|
|---|
| 7525 | for \s and \S in commit v2.5.4-112-gf979ca0.
|
|---|
| 7526 |
|
|---|
| 7527 | 2013-09-30 Jim Meyering <meyering@fb.com>
|
|---|
| 7528 |
|
|---|
| 7529 | maint: change all references: s/POSIX\.2/POSIX/
|
|---|
| 7530 | There is no longer any point in referring to POSIX.N.
|
|---|
| 7531 | POSIX is sufficient.
|
|---|
| 7532 | * doc/grep.in.1: As above.
|
|---|
| 7533 | * src/main.c (main): Likewise.
|
|---|
| 7534 | * tests/file: Likewise.
|
|---|
| 7535 | * tests/options: Likewise.
|
|---|
| 7536 | * ChangeLog: Likewise.
|
|---|
| 7537 | * NEWS: Likewise.
|
|---|
| 7538 | * cfg.mk: Update, to match changed NEWS.
|
|---|
| 7539 | Inspired by Glenn Golden's suggestion in http://bugs.gnu.org/15486
|
|---|
| 7540 |
|
|---|
| 7541 | 2013-09-22 Jim Meyering <meyering@fb.com>
|
|---|
| 7542 |
|
|---|
| 7543 | dfa: remove dead disjunct
|
|---|
| 7544 | * src/dfa.c (parse_bracket_exp): Remove dead disjunct.
|
|---|
| 7545 | At that point, we know MB_CUR_MAX <= 1, so the test,
|
|---|
| 7546 | MB_CUR_MAX > 1 && ... is always false. Remove the disjunct.
|
|---|
| 7547 |
|
|---|
| 7548 | maint: dfa: improve comments and formatting
|
|---|
| 7549 | * src/dfa.c (add_utf8_anychar): Correct wording/alignment of a comment.
|
|---|
| 7550 | (dfaexec): Add curly braces around multi-line while statement within
|
|---|
| 7551 | a "then" block.
|
|---|
| 7552 | (ANYCHAR): Clarify comment: "." does not match an invalid UTF8 character.
|
|---|
| 7553 | (parse_bracket_exp): Improve comment.
|
|---|
| 7554 |
|
|---|
| 7555 | 2013-09-08 Jim Meyering <meyering@fb.com>
|
|---|
| 7556 |
|
|---|
| 7557 | dfa: appease a static analyzer, and save 95 stack bytes
|
|---|
| 7558 | * src/dfa.c (MAX_BRACKET_STRING_LEN): Rename from BRACKET_BUFFER_SIZE
|
|---|
| 7559 | and decrease from 128 to 32.
|
|---|
| 7560 | (parse_bracket_exp): Add one byte more than MAX_BRACKET_STRING_LEN
|
|---|
| 7561 | to the length of "str" buffer, to avoid appearance that we may store
|
|---|
| 7562 | the trailing NUL beyond the end of buffer. A string of length 32
|
|---|
| 7563 | or greater is rejected by earlier processing, so would never reach
|
|---|
| 7564 | this code. Addresses http://bugs.gnu.org/15307
|
|---|
| 7565 |
|
|---|
| 7566 | 2013-09-01 Corinna Vinschen <vinschen@redhat.com>
|
|---|
| 7567 |
|
|---|
| 7568 | fix Cygwin UTF-16 surrogate-pair handling with -i
|
|---|
| 7569 | grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin)
|
|---|
| 7570 | when converting an input string containing certain 4-byte UTF-8
|
|---|
| 7571 | sequences to lower case. The conversions to wchar_t and back to
|
|---|
| 7572 | a UTF-8 multibyte string did not take surrogate pairs into account.
|
|---|
| 7573 | * src/searchutils.c (mbtolower) [__CYGWIN__]: Detect and handle
|
|---|
| 7574 | surrogate pairs when converting.
|
|---|
| 7575 | * NEWS (Bug fixes): Mention it.
|
|---|
| 7576 | * tests/surrogate-pair: New test.
|
|---|
| 7577 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7578 | Reported by: Jim Burwell
|
|---|
| 7579 |
|
|---|
| 7580 | 2013-08-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7581 |
|
|---|
| 7582 | doc: mention how to use the latest gnulib
|
|---|
| 7583 | * README-hacking: Steal some text from coreutils/README-hacking.
|
|---|
| 7584 |
|
|---|
| 7585 | 2013-08-10 Jim Meyering <meyering@fb.com>
|
|---|
| 7586 |
|
|---|
| 7587 | build: update gnulib-related code
|
|---|
| 7588 | * gnulib: Update submodule to latest.
|
|---|
| 7589 | * bootstrap: Update from gnulib.
|
|---|
| 7590 | * gl/lib/regex_internal.h.diff: Update to reflect gnulib changes.
|
|---|
| 7591 | * bootstrap.conf: Partial sync from coreutils.
|
|---|
| 7592 |
|
|---|
| 7593 | 2013-08-09 Jim Meyering <meyering@fb.com>
|
|---|
| 7594 |
|
|---|
| 7595 | tests: simplify and factor newest test
|
|---|
| 7596 | * tests/char-class-multibyte2: Simplify file names.
|
|---|
| 7597 | Factor out $e_acute, so that the grep argument representation
|
|---|
| 7598 | is ascii (though the value is still UTF8).
|
|---|
| 7599 |
|
|---|
| 7600 | doc: NEWS: mention the DFA segfault fix
|
|---|
| 7601 | * NEWS (Bug fixes): List the DFA segfault fix.
|
|---|
| 7602 |
|
|---|
| 7603 | 2013-07-05 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7604 |
|
|---|
| 7605 | Redo comments and white space to better approach GNU style.
|
|---|
| 7606 |
|
|---|
| 7607 | 2013-07-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 7608 |
|
|---|
| 7609 | tests: add testcase for previous change
|
|---|
| 7610 | * tests/Makefile.am (TESTS): Add char-class-multibyte2.
|
|---|
| 7611 | * tests/char-class-multibyte2: New file.
|
|---|
| 7612 |
|
|---|
| 7613 | 2013-07-05 Mike Haertel <mike@ducky.net>
|
|---|
| 7614 |
|
|---|
| 7615 | dfa: fix multibyte character in brackets with repetition
|
|---|
| 7616 | Let FOO stand for any multibyte character (e.g. a CJK character) in the regexp.
|
|---|
| 7617 | It turns out the following much simpler regexp:
|
|---|
| 7618 | ([^.]*[FOO]){1,2}
|
|---|
| 7619 | is sufficient to cause the crash.
|
|---|
| 7620 |
|
|---|
| 7621 | In the first step of its parsing, DFA transforms regexp from human
|
|---|
| 7622 | readable syntax into reverse-polish form. For regexps with a{m,n}
|
|---|
| 7623 | repeat counts, it simply builds repeated copies of the representation
|
|---|
| 7624 | of a, with appropriate inserted CAT and QMARK operators. For the above
|
|---|
| 7625 | example with a regexp of the form a{1,2} it would build:
|
|---|
| 7626 |
|
|---|
| 7627 | <RPN representation for a>
|
|---|
| 7628 | <RPN representation for a>
|
|---|
| 7629 | QMARK
|
|---|
| 7630 | CAT
|
|---|
| 7631 |
|
|---|
| 7632 | When building repeated copies of RPN representations, additional
|
|---|
| 7633 | copies of the RPN representations are made by calling a function
|
|---|
| 7634 | copytoks() with arguments consisting of the start position and
|
|---|
| 7635 | length of the original copy.
|
|---|
| 7636 |
|
|---|
| 7637 | The problem is that the current code for copytoks() is simply
|
|---|
| 7638 | incorrect. It operates by calling addtok() for each individual
|
|---|
| 7639 | token in the source range being copied. But, in the particular
|
|---|
| 7640 | case that the token being added is MBCSET, addtok():
|
|---|
| 7641 |
|
|---|
| 7642 | (1) incorrectly assumes that the character set being added
|
|---|
| 7643 | is the most recent one (addtok has no argument to indicate which cset is
|
|---|
| 7644 | being added, so it just uses the latest one)
|
|---|
| 7645 |
|
|---|
| 7646 | (2) attempts to do some token sequence expansion into more primitive
|
|---|
| 7647 | operators so things like [FOO] are matched efficiently.
|
|---|
| 7648 |
|
|---|
| 7649 | Both of these assumptions are incorrect in the case that addtok()
|
|---|
| 7650 | is being called from copytoks(): (1) is simply not true, and
|
|---|
| 7651 | (2) is redundant--the expansion has already been done in the token sequence
|
|---|
| 7652 | being copied, so there is no need to do the expansion again.
|
|---|
| 7653 |
|
|---|
| 7654 | The correct function to add exactly one token, without further expansion,
|
|---|
| 7655 | is addtok_mb(). So here is my proposed fix, which is that copytoks()
|
|---|
| 7656 | should never call addtok(), but instead directly call addtok_mb()
|
|---|
| 7657 | (which is what addtok() eventually calls).
|
|---|
| 7658 |
|
|---|
| 7659 | * src/dfa.c (copytoks): Rewrite using addtok_mb directly.
|
|---|
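The repeat-count expansion described in the entry above can be sketched as follows. This is a minimal illustration in Python, not the code in src/dfa.c: the function name repeat_rpn and the use of plain strings as tokens are assumptions made for the example, and it handles only 1 <= m <= n. The point it shows is that each extra copy of the sub-expression is appended verbatim, with no second round of expansion, which is the behavior the fix gives copytoks() by calling addtok_mb() directly.

```python
def repeat_rpn(sub, min_rep, max_rep):
    """Return RPN tokens for sub{min_rep,max_rep}.

    `sub` is the already-expanded RPN token list of the sub-expression.
    Hypothetical helper for illustration; handles 1 <= min_rep <= max_rep.
    """
    if not 1 <= min_rep <= max_rep:
        raise ValueError("this sketch handles only 1 <= min_rep <= max_rep")
    out = list(sub)                      # the original copy of the sub-expression
    for _ in range(1, min_rep):          # mandatory repetitions
        out += sub                       # verbatim copy, no re-expansion
        out.append("CAT")
    for _ in range(min_rep, max_rep):    # optional repetitions
        out += sub                       # verbatim copy, no re-expansion
        out += ["QMARK", "CAT"]
    return out


# a{1,2} yields the sequence shown in the entry: a a QMARK CAT
print(repeat_rpn(["a"], 1, 2))
```

Because the copies are taken from tokens that were expanded once already, a token such as MBCSET is carried over unchanged instead of being re-interpreted against whichever character set happened to be parsed last.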
| 7660 |
|
|---|
| 7661 | 2013-05-28 Jim Meyering <meyering@fb.com>
|
|---|
| 7662 |
|
|---|
| 7663 | maint: align backslashes consistently
|
|---|
| 7664 | * tests/Makefile.am: Most backslashes were aligned with TABs,
|
|---|
| 7665 | so adjust the few that used spaces to conform.
|
|---|
| 7666 |
|
|---|
| 7667 | grep -F: avoid an infinite loop with invalid multi-byte search string
|
|---|
| 7668 | * src/kwsearch.c (Fexecute): Avoid an infinite loop when processing
|
|---|
| 7669 | a fixed (-F) multibyte search string that is an invalid byte sequence
|
|---|
| 7670 | in the current locale and that matches the bytes of the input twice
|
|---|
| 7671 | on a line. Reported by Daisuke GOTO in
|
|---|
| 7672 | http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4773
|
|---|
| 7673 | * tests/invalid-multibyte-infloop: New test.
|
|---|
| 7674 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7675 | * NEWS (Bug fixes): Mention it.
|
|---|
| 7676 |
|
|---|
| 7677 | 2013-04-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7678 |
|
|---|
| 7679 | * cfg.mk (old_NEWS_hash): Update.
|
|---|
| 7680 |
|
|---|
| 7681 | doc: document EREs like a{,10}
|
|---|
| 7682 | Problem reported by Eric Blake in
|
|---|
| 7683 | <http://lists.gnu.org/archive/html/bug-grep/2013-04/msg00005.html>.
|
|---|
| 7684 | * NEWS: Document the bug fix.
|
|---|
| 7685 | * doc/grep.in.1: Restore documentation for this feature, but mention
|
|---|
| 7686 | that it is a GNU extension.
|
|---|
| 7687 | * doc/grep.texi (Fundamental Structure): Mention that this feature
|
|---|
| 7688 | is a GNU extension.
|
|---|
| 7689 |
|
|---|
| 7690 | 2013-04-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7691 |
|
|---|
| 7692 | build: make dfa.c closer to Gawk's
|
|---|
| 7693 | * src/dfa.c: Include <stddef.h>, not <sys/types.h>.
|
|---|
| 7694 | stddef.h is smaller and is all we need and is portable nowadays.
|
|---|
| 7695 | Include <wchar.h> and <wctype.h> only if MBS_SUPPORT.
|
|---|
| 7696 |
|
|---|
| 7697 | 2013-01-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7698 |
|
|---|
| 7699 | grep: make dfa.h standalone
|
|---|
| 7700 | Problem reported by Aharon Robbins in
|
|---|
| 7701 | <http://lists.gnu.org/archive/html/bug-grep/2013-01/msg00007.html>.
|
|---|
| 7702 | * src/dfa.c: Include dfa.h first, so that it's tested standalone.
|
|---|
| 7703 | No need to include <regex.h>, since we are in charge of dfa.h and
|
|---|
| 7704 | know that it includes <regex.h>.
|
|---|
| 7705 | * src/dfa.h: Include <regex.h> and <stddef.h>, so that it's standalone.
|
|---|
| 7706 |
|
|---|
| 7707 | 2013-01-11 Stefano Lattarini <stefano.lattarini@gmail.com>
|
|---|
| 7708 |
|
|---|
| 7709 | build: update gettext version to 0.18.2
|
|---|
| 7710 | * configure.ac (AM_GNU_GETTEXT_VERSION): Update to 0.18.2.
|
|---|
| 7711 | This is necessary to have the gettext-provided m4 files use
|
|---|
| 7712 | AC_PROG_MKDIR_P rather than AM_PROG_MKDIR_P. This latter macro,
|
|---|
| 7713 | planned to disappear in Automake 1.14, has already been removed
|
|---|
| 7714 | in the development version of Automake, so that, without this
|
|---|
| 7715 | change, grep fails to bootstrap with bleeding-edge Automake.
|
|---|
| 7716 |
|
|---|
| 7717 | 2013-01-11 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7718 |
|
|---|
| 7719 | build: update gnulib submodule to latest
|
|---|
| 7720 |
|
|---|
| 7721 | 2013-01-11 Stefano Lattarini <stefano.lattarini@gmail.com>
|
|---|
| 7722 |
|
|---|
| 7723 | build: remove redundant use of $(INCLUDES)
|
|---|
| 7724 | * lib/Makefile.am (INCLUDES): Remove. Automake automatically adds
|
|---|
| 7725 | $(srcdir) and $(top_builddir) to the C preprocessor search path.
|
|---|
| 7726 | INCLUDES is deprecated in Automake 1.13 (causing a runtime
|
|---|
| 7727 | warning), and will be removed in Automake 1.14.
|
|---|
| 7728 |
|
|---|
| 7729 | 2013-01-04 Jim Meyering <jim@meyering.net>
|
|---|
| 7730 |
|
|---|
| 7731 | build: update gnulib submodule to latest
|
|---|
| 7732 |
|
|---|
| 7733 | maint: update all copyright year number ranges
|
|---|
| 7734 | Run "make update-copyright".
|
|---|
| 7735 |
|
|---|
| 7736 | 2012-11-20 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7737 |
|
|---|
| 7738 | grep: normalize diagnostics
|
|---|
| 7739 | * src/pcresearch.c (Pcompile): Use diagnostics formatted as
|
|---|
| 7740 | elsewhere, and translate them.
|
|---|
| 7741 |
|
|---|
| 7742 | 2012-11-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7743 |
|
|---|
| 7744 | grep: diagnose read errors from -f dir, porting to Solaris
|
|---|
| 7745 | Problem reported by Dennis Clarke for Solaris 10 in
|
|---|
| 7746 | <http://lists.gnu.org/archive/html/bug-grep/2012-11/msg00009.html>.
|
|---|
| 7747 | * src/main.c (main): For -f F, diagnose any read errors
|
|---|
| 7748 | encountered when reading F.
|
|---|
| 7749 | * tests/Makefile.am (XFAIL_TESTS): Remove grep-dir.
|
|---|
| 7750 | * tests/grep-dir: Don't assume that directories cannot be read
|
|---|
| 7751 | via fread, as POSIX allows this and it can happen on Solaris.
|
|---|
| 7752 |
|
|---|
| 7753 | 2012-11-09 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 7754 |
|
|---|
| 7755 | pcre: add PCRE-JIT support for grep
|
|---|
| 7756 | * NEWS: Document new feature.
|
|---|
| 7757 | * src/pcresearch.c [PCRE_STUDY_JIT_COMPILE] (jit_stack): New.
|
|---|
| 7758 | [PCRE_STUDY_JIT_COMPILE] (Pcompile): JIT-compile the regular expression
|
|---|
| 7759 | and allocate a stack for it. Based on a patch from Zoltan Herczeg.
|
|---|
| 7760 | * THANKS: Add Zoltan to the list.
|
|---|
| 7761 |
|
|---|
| 7762 | 2012-10-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7763 |
|
|---|
| 7764 | build: go back to AC_PROG_CC
|
|---|
| 7765 | * configure.ac: Go back to using AC_PROG_CC rather than AC_PROG_CC_STDC,
|
|---|
| 7766 | as the latter is obsolescent and the Autoconf bug involving the former
|
|---|
| 7767 | has been fixed.
|
|---|
| 7768 |
|
|---|
| 7769 | 2012-10-24 Jim Meyering <jim@meyering.net>
|
|---|
| 7770 |
|
|---|
| 7771 | build: use AC_PROG_CC_STDC rather than AC_PROG_CC
|
|---|
| 7772 | * configure.ac: Use AC_PROG_CC_STDC rather than AC_PROG_CC,
|
|---|
| 7773 | to accommodate autoconf-2.69-37+.
|
|---|
| 7774 |
|
|---|
| 7775 | build: update gnulib submodule to latest
|
|---|
| 7776 |
|
|---|
| 7777 | 2012-10-23 Eric Blake <eblake@redhat.com>
|
|---|
| 7778 |
|
|---|
| 7779 | build: default to --enable-gcc-warnings in a git tree
|
|---|
| 7780 | Anyone building from cloned sources can be assumed to have a new
|
|---|
| 7781 | enough environment, such that enabling gcc warnings by default will
|
|---|
| 7782 | be useful. Tarballs still default to no warnings, and the default
|
|---|
| 7783 | can still be overridden with --disable-gcc-warnings.
|
|---|
| 7784 | * configure.ac (gl_gcc_warnings): Set default based on environment.
|
|---|
| 7785 |
|
|---|
| 7786 | 2012-10-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 7787 |
|
|---|
| 7788 | maint: factor out STREQ definition
|
|---|
| 7789 | * src/main.c (STREQ): Remove definition.
|
|---|
| 7790 | * src/pcresearch.c (STREQ): Likewise.
|
|---|
| 7791 | * src/system.h (STREQ): Define it here instead.
|
|---|
| 7792 |
|
|---|
| 7793 | maint: correct syntax-check failures; adjust NEWS
|
|---|
| 7794 | * tests/pcre-utf8: Reverse order of compare arguments.
|
|---|
| 7795 | Remove all copyright year numbers except 2012.
|
|---|
| 7796 | Use skip_ "diagnostic...", rather than a bare "exit 77".
|
|---|
| 7797 | * NEWS: Start with a concise description of the bug.
|
|---|
| 7798 | * src/pcresearch.c (STREQ): Define, so that we can...
|
|---|
| 7799 | (Pcompile): use STREQ, not strcmp.
|
|---|
| 7800 |
|
|---|
| 7801 | 2012-10-03 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 7802 |
|
|---|
| 7803 | tests: include UTF-8 testcases for grep -P
|
|---|
| 7804 | * tests/Makefile.am (TESTS): Add pcre-utf8.
|
|---|
| 7805 | * tests/pcre-utf8: New file.
|
|---|
| 7806 |
|
|---|
| 7807 | 2012-10-03 Petr Pisar <ppisar@redhat.com>
|
|---|
| 7808 |
|
|---|
| 7809 | pcresearch: set UTF-8 flag correctly for UTF-8 locales
|
|---|
| 7810 | Otherwise, Unicode properties (\p{XXX}) do not work with characters
|
|---|
| 7811 | outside the 7-bit ASCII character set.
|
|---|
| 7812 |
|
|---|
| 7813 | * src/pcresearch.c (Pcompile): Look for UTF-8 locales and set PCRE_UTF8
|
|---|
| 7814 | if one is found.
|
|---|
| 7815 |
|
|---|
| 7816 | 2012-10-03 Jaroslav Škarvada <jskarvad@redhat.com>
|
|---|
| 7817 |
|
|---|
| 7818 | doc: fix a formatting bug in grep.1 template
|
|---|
| 7819 | * doc/grep.in.1: Insert .TP before the paragraph describing
|
|---|
| 7820 | --dereference-recursive (-R).
|
|---|
| 7821 |
|
|---|
| 7822 | 2012-10-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 7823 |
|
|---|
| 7824 | maint: placate gcc's -Wjump-misses-init warning
|
|---|
| 7825 | * src/kwsearch.c (Fexecute): Replace a "goto" and "return" with
|
|---|
| 7826 | a simple return statement, eliminating the label, since that was
|
|---|
| 7827 | the sole use.
|
|---|
| 7828 | * src/dfasearch.c (EGexecute): Likewise.
|
|---|
| 7829 |
|
|---|
| 7830 | 2012-09-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 7831 |
|
|---|
| 7832 | build: update gnulib submodule to latest
|
|---|
| 7833 |
|
|---|
| 7834 | 2012-09-01 Eric Blake <eblake@redhat.com>
|
|---|
| 7835 |
|
|---|
| 7836 | build: work with new glibc when not optimizing
|
|---|
| 7837 | Starting with glibc 2.15, the system headers refuse to compile
|
|---|
| 7838 | unconditional use of FORTIFY_SOURCE if optimization is disabled
|
|---|
| 7839 | but -Werror is in effect.
|
|---|
| 7840 |
|
|---|
| 7841 | * configure.ac (FORTIFY_SOURCE): Make conditional.
|
|---|
| 7842 |
|
|---|
| 7843 | 2012-08-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 7844 |
|
|---|
| 7845 | maint: post-release administrivia
|
|---|
| 7846 | * NEWS: Add header line for next release.
|
|---|
| 7847 | * .prev-version: Record previous version.
|
|---|
| 7848 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 7849 |
|
|---|
| 7850 | version 2.14
|
|---|
| 7851 | * NEWS: Record release date.
|
|---|
| 7852 |
|
|---|
| 7853 | 2012-08-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 7854 |
|
|---|
| 7855 | build: update gnulib and bootstrap
|
|---|
| 7856 |
|
|---|
| 7857 | tests: test for bug with -i and ^$ in a multi-byte locale
|
|---|
| 7858 | * tests/empty-line-mb: New file.
|
|---|
| 7859 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7860 |
|
|---|
| 7861 | grep -i '^$' in a multi-byte locale could report a false match
|
|---|
| 7862 | * src/dfasearch.c (EGexecute): Do not match the sentinel "newline"
|
|---|
| 7863 | that is appended to each buffer.
|
|---|
| 7864 | This bug may sound like a big deal (it certainly surprised me), but
|
|---|
| 7865 | realize that only the empty-line-matching regular expression '^$'
|
|---|
| 7866 | can trigger it, and then only when you add the unnecessary (and
|
|---|
| 7867 | arguably superfluous) -i, *and* run the command in a multi-byte
|
|---|
| 7868 | locale. Using a multi-byte locale for such a regular expression
|
|---|
| 7869 | is also pointless, and hurts performance.
|
|---|
| 7870 | * NEWS (Bug fixes): Mention it.
|
|---|
| 7871 | Reported by Alexander Katassonov <katasso@gmx.de>
|
|---|
| 7872 |
|
|---|
| 7873 | 2012-08-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 7874 |
|
|---|
| 7875 | tests: fix a skip diagnostic that mentioned the wrong locale
|
|---|
| 7876 | * tests/init.cfg (require_tr_utf8_locale_): s/en_US/tr_TR/
|
|---|
| 7877 |
|
|---|
| 7878 | 2012-08-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 7879 |
|
|---|
| 7880 | tests: skip failing test on FS/systems that lack SEEK_HOLE support
|
|---|
| 7881 | * tests/big-hole: Test for SEEK_HOLE support. If not available,
|
|---|
| 7882 | skip this test. Hence, this test is now skipped on linux-3.5.0 with
|
|---|
| 7883 | ext4 or tmpfs. The test runs (and passes) with at least btrfs, xfs,
|
|---|
| 7884 | or ocfs2.
|
|---|
| 7885 | * bootstrap.conf (gnulib_modules): Use the perl module.
|
|---|
| 7886 |
|
|---|
| 7887 | 2012-07-30 Jim Meyering <meyering@redhat.com>
|
|---|
| 7888 |
|
|---|
| 7889 | maint: optimize long-line processing
|
|---|
| 7890 | * src/main.c (grep): Use memrchr rather than an open-coded loop,
|
|---|
| 7891 | reducing the cost of the replaced code by 50% when processing very
|
|---|
| 7892 | long lines. If there were a rawmemrchr function (analogous to glibc's
|
|---|
| 7893 | rawmemchr), then the performance improvement would be even greater.
|
|---|
| 7894 |
|
|---|
| 7895 | 2012-07-27 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7896 |
|
|---|
| 7897 | maint: remove stat-size
|
|---|
| 7898 | * bootstrap.conf (gnulib_modules): Remove stat-size.
|
|---|
| 7899 | * src/main.c: Don't include stat-size.h; no longer needed.
|
|---|
| 7900 |
|
|---|
| 7901 | grep: don't falsely report compressed text files as binary
|
|---|
| 7902 | * NEWS: Document this.
|
|---|
| 7903 | * src/main.c (file_is_binary): Remove the heuristic based on
|
|---|
| 7904 | st_blocks, as it does not work for compressed file systems.
|
|---|
| 7905 | On Solaris, it'd be cheap to test whether the file system is known
|
|---|
| 7906 | to be uncompressed, which would allow the heuristic, but Solaris has
|
|---|
| 7907 | SEEK_HOLE so there's little point.
|
|---|
| 7908 |
|
|---|
| 7909 | grep: don't falsely report tiny text files as binary
|
|---|
| 7910 | * NEWS: Document this.
|
|---|
| 7911 | * src/main.c (file_is_binary): When we are already at apparent
|
|---|
| 7912 | EOF, skip the file-size check, as some servers use zero blocks
|
|---|
| 7913 | to store binary files. Reported by Martin Carroll in
|
|---|
| 7914 | <http://lists.gnu.org/archive/html/bug-grep/2012-07/msg00016.html>.
|
|---|
| 7915 |
|
|---|
| 7916 | 2012-07-26 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7917 |
|
|---|
| 7918 | doc: document -r/-R in man page
|
|---|
| 7919 | * doc/grep.in.1: Document -r vs. -R.
|
|---|
| 7920 |
|
|---|
| 7921 | 2012-07-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 7922 |
|
|---|
| 7923 | tests: avoid false positive upon kernel OOM-kill
|
|---|
| 7924 | * tests/big-match (skip_diagnostic): Handle case of 139 (SIGKILL)
|
|---|
| 7925 | with no diagnostic.
|
|---|
| 7926 |
|
|---|
| 7927 | build: update gnulib and bootstrap
|
|---|
| 7928 |
|
|---|
| 7929 | maint: fix misspellings in old ChangeLog
|
|---|
| 7930 | * ChangeLog-2009: Fix typos.
|
|---|
| 7931 |
|
|---|
| 7932 | 2012-07-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7933 |
|
|---|
| 7934 | grep: fix ptrdiff/size_t clash
|
|---|
| 7935 | Reported by Jaroslav Škarvada in <http://savannah.gnu.org/bugs/?36883>.
|
|---|
| 7936 | * src/dfasearch.c (EGexecute): Use size_t, not ptrdiff_t, for lengths.
|
|---|
| 7937 | Use regoff_t to store re_match's output, and test it before converting
|
|---|
| 7938 | it to size_t.
|
|---|
| 7939 |
|
|---|
| 7940 | 2012-07-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 7941 |
|
|---|
| 7942 | maint: correct log typo, to reflect in generated ChangeLog
|
|---|
| 7943 | * Makefile.am (gen-ChangeLog): Use --amend, now that we must
|
|---|
| 7944 | make our first log correction.
|
|---|
| 7945 | * build-aux/git-log-fix: New file.
|
|---|
| 7946 |
|
|---|
| 7947 | 2012-07-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 7948 |
|
|---|
| 7949 | maint: post-release administrivia
|
|---|
| 7950 | * NEWS: Add header line for next release.
|
|---|
| 7951 | * .prev-version: Record previous version.
|
|---|
| 7952 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 7953 |
|
|---|
| 7954 | version 2.13
|
|---|
| 7955 | * NEWS: Record release date.
|
|---|
| 7956 |
|
|---|
| 7957 | build: update gnulib submodule, bootstrap, init.sh
|
|---|
| 7958 |
|
|---|
| 7959 | 2012-06-17 Jim Meyering <meyering@redhat.com>
|
|---|
| 7960 |
|
|---|
| 7961 | tests: add another turkish-I-related test case
|
|---|
| 7962 | * tests/turkish-I-without-dot: Also exercise the case in which
|
|---|
| 7963 | the original string and the lower-case buffer have precisely
|
|---|
| 7964 | the same length (22 bytes here), yet internal offsets do differ.
|
|---|
| 7965 |
|
|---|
| 7966 | 2012-06-16 Jim Meyering <meyering@redhat.com>
|
|---|
| 7967 |
|
|---|
| 7968 | grep -i: work also when converting to lower-case inflates byte count
|
|---|
| 7969 | Commit v2.12-16-g7aa698d addressed the case in which the lower-case
|
|---|
| 7970 | representation of an input byte occupies fewer bytes than the original.
|
|---|
| 7971 | However, even with commit v2.12-20-g074842d, grep -i would still
|
|---|
| 7972 | misbehave when converting a character to lower-case increased its
|
|---|
| 7973 | byte count. The map-manipulation code assumed that the case conversion
|
|---|
| 7974 | could only shrink the byte count. With the consideration that it may
|
|---|
| 7975 | also inflate it, the deltas recorded in the map array must be signed,
|
|---|
| 7976 | and we must account for the one-to-two-or-more mapping when the
|
|---|
| 7977 | original-to-lower-case conversion causes the byte count to increase.
|
|---|
| 7978 | * src/searchutils.c (mbtolower): When a lower-case character occupies
|
|---|
| 7979 | more than one byte, set its remaining map slots to zero. Change the
|
|---|
| 7980 | type of the map to be signed, and compute the change in character
|
|---|
| 7981 | byte count as new_length - old_length.
|
|---|
| 7982 | * src/search.h: Include <stdint.h>, for decl of intmax_t.
|
|---|
| 7983 | (mb_case_map_apply): Adjust for signed increments:
|
|---|
| 7984 | each map entry is now signed.
|
|---|
| 7985 | (mb_len_map_t): Define type. Thanks to Paul Eggert for noticing
|
|---|
| 7986 | in review that using a bare "char" as the base type would be wrong on
|
|---|
| 7987 | systems for which it is an unsigned type (as with gcc's -funsigned-char).
|
|---|
| 7988 | * src/kwsearch.c (Fcompile, Fexecute): Likewise.
|
|---|
| 7989 | * src/dfasearch.c (kwsincr_case, EGexecute): Likewise.
|
|---|
| 7990 | * tests/turkish-I-without-dot: New test. Thanks to Paolo Bonzini
|
|---|
| 7991 | for the tip that in the tr_TR.utf8 locale, mapping "I" to lower case
|
|---|
| 7992 | increases the character's byte count.
|
|---|
| 7993 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 7994 | * tests/init.cfg (require_tr_utf8_locale_): New function.
|
|---|
| 7995 | * NEWS (Bug fixes): Expand the existing entry.
|
|---|
| 7996 |
|
|---|
| 7997 | 2012-06-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 7998 |
|
|---|
| 7999 | grep: handle -i when chars differ in length but line does not
|
|---|
| 8000 | * src/searchutils.c (mbtolower): Return the map back to the caller
|
|---|
| 8001 | if any input character's length differs from the corresponding output
|
|---|
| 8002 | character's, not merely if the total string length differs.
|
|---|
| 8003 | Problem reported by Johannes Meixner in
|
|---|
| 8004 | <http://lists.gnu.org/archive/html/bug-grep/2012-06/msg00029.html>.
|
|---|
| 8005 |
|
|---|
| 8006 | 2012-06-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 8007 |
|
|---|
| 8008 | tests: extend coverage of dfa.c's match_mb_charset
|
|---|
| 8009 | Add a test case to increase test coverage of part of dfa.c (the DFA
|
|---|
| 8010 | matcher used by grep and gawk). While thinking about removing the few
|
|---|
| 8011 | remaining uses of strncpy in dfa.c, I found that none of the existing
|
|---|
| 8012 | tests covered the 40+ lines of code at the end of match_mb_charset,
|
|---|
| 8013 | so constructed this test case to demonstrate that it's not dead code.
|
|---|
| 8014 | * tests/dfa-coverage: New test, for improved coverage.
|
|---|
| 8015 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 8016 |
|
|---|
| 8017 | 2012-06-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 8018 |
|
|---|
| 8019 | build: fix a subtly twisted "make distcheck" failure
|
|---|
| 8020 | "make distcheck" would fail when, during a test build,
|
|---|
| 8021 | an attempt to overwrite the deliberately-write-protected
|
|---|
| 8022 | $(srcdir)/grep.pot file would fail.
|
|---|
| 8023 | * bootstrap.conf (bootstrap_epilogue): Don't let the existence of
|
|---|
| 8024 | a large sparse file in the build directory induce "make distcheck"
|
|---|
| 8025 | failure. The existence of a large sparse test file named 8T-or-so
|
|---|
| 8026 | would make po/Makefile.in.in's use of grep (to search for "GNU grep"
|
|---|
| 8027 | as an indication that this is a GNU package) exit 2 without generating
|
|---|
| 8028 | any output, which made the first xgettext use --package-name=grep,
|
|---|
| 8029 | while that same search for "GNU grep" would succeed when run
|
|---|
| 8030 | from a pristine from-tarball build, thus making the second
|
|---|
| 8031 | xgettext invocation use --package-name='GNU grep'.
|
|---|
| 8032 | That mismatch:
|
|---|
| 8033 | -"Project-Id-Version: grep 2.12.18-1080\n"
|
|---|
| 8034 | +"Project-Id-Version: GNU grep 2.12.18-1080\n"
|
|---|
| 8035 | led to the attempt by Makefile.in.in's grep.pot-update rule to
|
|---|
| 8036 | overwrite ../../grep.pot in the read-only po/ source directory.
|
|---|
| 8037 |
|
|---|
| 8038 | 2012-06-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 8039 |
|
|---|
| 8040 | build: update gnulib submodule, bootstrap and init.sh
|
|---|
| 8041 | cfg.mk: Exempt dfa.c from the new no-strncpy test, for now.
|
|---|
| 8042 |
|
|---|
| 8043 | 2012-06-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 8044 |
|
|---|
| 8045 | grep: fix how -i works with a match containing the Turkish I-with-dot
|
|---|
| 8046 | Fix a long-standing problem in the way grep's -i interacts with
|
|---|
| 8047 | data whose byte count changes when we convert it to lower case.
|
|---|
| 8048 | For example, the UTF-8 Turkish I-with-dot (İ) occupies two bytes,
|
|---|
| 8049 | but its lower case analog, i, occupies just one byte. The code
|
|---|
| 8050 | converts both search string and the haystack data to lower case,
|
|---|
| 8051 | and then searches for the modified string in the modified buffer.
|
|---|
| 8052 | The trouble arose when using a lowercase buffer <offset,length>
|
|---|
| 8053 | pair to manipulate the original (longer) buffer.
|
|---|
| 8054 |
|
|---|
| 8055 | The solution is to change mbtolower to return additional information:
|
|---|
| 8056 | a malloc'd mapping vector. With that, the caller maps the lowercase-
|
|---|
| 8057 | relative <offset,length> to numbers that refer to the original buffer.
|
|---|
| 8058 | This mapping is used only when lengths actually differ, so the cost
|
|---|
| 8059 | in general should be small.
|
|---|
| 8060 |
|
|---|
| 8061 | * src/searchutils.c (mbtolower): Add the new map parameter.
|
|---|
| 8062 | * src/search.h (mb_case_map_apply): New function.
|
|---|
| 8063 | * src/kwsearch.c (Fexecute): Update mbtolower caller, and upon
|
|---|
| 8064 | success, apply the new map.
|
|---|
| 8065 | * src/dfasearch.c (EGexecute): Likewise.
|
|---|
| 8066 | * tests/Makefile.am (XFAIL_TESTS): Remove turkish-I from this list;
|
|---|
| 8067 | that test is no longer expected to fail.
|
|---|
| 8068 | * NEWS (Bug fixes): Mention it.
|
|---|
| 8069 | Reported by Ilya Basin in
|
|---|
| 8070 | http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3413 and later
|
|---|
| 8071 | by Strahinja Kustudic in http://savannah.gnu.org/bugs/?36567
|
|---|
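The offset-mapping idea in the entry above can be sketched as follows. This is an illustration in Python under stated assumptions, not grep's mbtolower or mb_case_map_apply: the names TR_LOWER, lower_with_map and map_match are hypothetical, the small table stands in for the C library's case conversion in a tr_TR.UTF-8 locale, and the map stores absolute original offsets rather than the per-byte deltas grep records.

```python
# Hypothetical lower-casing table standing in for the C library's
# towlower() in a tr_TR.UTF-8 locale; it is an assumption of this sketch.
TR_LOWER = {
    "\u0130": "i",   # I-with-dot: 2 bytes in UTF-8, 1 byte once lowered
    "I": "\u0131",   # Turkish I lowers to dotless i: 1 byte, then 2 bytes
}


def lower_with_map(text):
    """Lower-case `text` and build a byte-offset map.

    Returns (lowered_bytes, offsets), where offsets[k] is the byte offset
    in the original UTF-8 buffer of the character that produced lowered
    byte k. A final sentinel entry holds the original buffer's length.
    """
    low = bytearray()
    offsets = []
    pos = 0
    for ch in text:
        lowered = TR_LOWER.get(ch, ch.lower()).encode("utf-8")
        low += lowered
        offsets.extend([pos] * len(lowered))
        pos += len(ch.encode("utf-8"))
    offsets.append(pos)
    return bytes(low), offsets


def map_match(offsets, off, length):
    """Translate a match <off,length> in the lowered buffer into the
    corresponding <offset,length> in the original buffer."""
    start = offsets[off]
    end = offsets[off + length]
    return start, end - start


low, offs = lower_with_map("a\u0130b")
print(low, offs, map_match(offs, low.index(b"b"), 1))
```

For the input "a", I-with-dot, "b", the lowered buffer is 3 bytes while the original is 4, so a match on "b" at lowered offset 2 maps back to original offset 3. The same functions also cover the case in which lower-casing makes a character longer, as with "I" in this table.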
| 8072 |
|
|---|
| 8073 | 2012-06-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8074 |
|
|---|
| 8075 | grep: remove unnecessary "what-if-signal?" code
|
|---|
| 8076 | * src/main.c (fillbuf): Don't worry about EINTR when closing --
|
|---|
| 8077 | not possible, since we're not catching signals.
|
|---|
| 8078 |
|
|---|
| 8079 | 2012-05-16 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8080 |
|
|---|
| 8081 | grep: avoid nominal integer overflow
|
|---|
| 8082 | * src/dfa.c (add_utf8_anychar): Avoid signed integer overflow.
|
|---|
| 8083 | Although this works on all platforms we know about, strictly
|
|---|
| 8084 | speaking the behavior is undefined, and Sun C 5.8 warns about it.
|
|---|
| 8085 |
|
|---|
| 8086 | 2012-05-15 Jim Meyering <meyering@redhat.com>
|
|---|
| 8087 |
|
|---|
| 8088 | maint: avoid nit-picky syntax-check test failure; tweak big-hole test
|
|---|
| 8089 | * NEWS: Restore deleted newline in "old" NEWS, to fix a syntax-check
|
|---|
| 8090 | test failure.
|
|---|
| 8091 | * tests/big-hole: Use awk, rather than a shell loop: saves 3000 lines
|
|---|
| 8092 | of verbose shell output in the .log file.
|
|---|
| 8093 |
|
|---|
| 8094 | 2012-05-15 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8095 |
|
|---|
| 8096 | grep: sparse files are now considered binary
|
|---|
| 8097 | * NEWS: Document this.
|
|---|
| 8098 | * doc/grep.texi (File and Directory Selection): Likewise.
|
|---|
| 8099 | * bootstrap.conf (gnulib_modules): Add stat-size.
|
|---|
| 8100 | * src/main.c: Include stat-size.h.
|
|---|
| 8101 | (usable_st_size): New function, mostly stolen from coreutils.
|
|---|
| 8102 | (fillbuf): Use it.
|
|---|
| 8103 | (file_is_binary): New function, which looks for holes too.
|
|---|
| 8104 | (grep): Use it.
|
|---|
| 8105 | * tests/Makefile.am (TESTS): Add big-hole.
|
|---|
| 8106 | * tests/big-hole: New file.
|
|---|
| 8107 |
|
|---|
| 8108 | 2012-05-06 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8109 |
|
|---|
| 8110 | maint: quote 'like this' or "like this", not `like this'
|
|---|
| 8111 | See <http://lists.gnu.org/archive/html/bug-grep/2012-01/msg00125.html>.
|
|---|
| 8112 | * ChangeLog-2009, HACKING, NEWS, README-hacking, cfg.mk, configure.ac:
|
|---|
| 8113 | * lib/colorize-w32.c, m4/pcre.m4:
|
|---|
| 8114 | * src/Makefile.am, src/dfa.c, src/dosbuf.c, src/main.c:
|
|---|
| 8115 | * tests/backref, tests/help-version, tests/tests:
|
|---|
| 8116 | In commentary, quote 'like this' or "like this" rather than
|
|---|
| 8117 | `like this' or ``like this''.
|
|---|
| 8118 | * cfg.mk (old_NEWS_hash): Update due to changed old NEWS.
|
|---|
| 8119 | * doc/grep.texi (General Output Control): Quote sample text
|
|---|
| 8120 | with @samp, not with `...'.
|
|---|
| 8121 | * src/main.c (usage):
|
|---|
| 8122 | * tests/help-version: Quote 'like this' rather than `like this'
|
|---|
| 8123 | in diagnostics.
|
|---|
| 8124 |
|
|---|
| 8125 | exclude: process exclude and include directives in order
|
|---|
| 8126 | Also, change exclude and include directives so that they apply to
|
|---|
| 8127 | command-line arguments too. This restores the pre-2.6 behavior,
|
|---|
| 8128 | and fixes a bug reported by Quentin Arce in
|
|---|
| 8129 | <http://lists.gnu.org/archive/html/bug-grep/2012-04/msg00056.html>.
|
|---|
| 8130 | * NEWS: Document this.
|
|---|
| 8131 | * src/main.c (included_patterns): Remove. All uses removed.
|
|---|
| 8132 | (skipped_file): New function.
|
|---|
| 8133 | (grepdirent): New arg command_line; all callers changed. This is
|
|---|
| 8134 | needed because non-command-line files can invoke fts_open, and
|
|---|
| 8135 | their directory entries need to be distinguished from top-level
|
|---|
| 8136 | directory entries. Move code into the new skipped_file function.
|
|---|
| 8137 | (grepdesc): Check whether a command-line argument should be skipped.
|
|---|
| 8138 | (main): --include and --exclude options now share excluded_patterns
|
|---|
| 8139 | rather than having separate variables included_patterns and
|
|---|
| 8140 | excluded_patterns.
|
|---|
| 8141 | * tests/include-exclude: Add a test to detect the fixed bug.
|
|---|
| 8142 |
|
|---|
| 8143 | build: update gnulib submodule to latest
|
|---|
| 8144 |
|
|---|
| 8145 | 2012-04-30 Jim Meyering <meyering@redhat.com>
|
|---|
| 8146 |
|
|---|
| 8147 | cosmetic: binary operator goes *after* the newline, when split
|
|---|
| 8148 | * src/dfa.c (match_mb_charset): Join split lines.
|
|---|
| 8149 | (parse_bracket_exp): Move "||" from end of first split line
|
|---|
| 8150 | to the beginning of the continued line.
|
|---|
| 8151 | * src/dosbuf.c (dossified_pos): Likewise, but for "&&".
|
|---|
| 8152 |
|
|---|
| 8153 | grep: -K is not an option: remove it from list
|
|---|
| 8154 | The presence of "K" in the short-option string meant that
|
|---|
| 8155 | an erroneous "grep -K ..." would fail with a bare Usage/Try...
|
|---|
| 8156 | message, without the usual "invalid option -- 'K'". With this
|
|---|
| 8157 | removal, now grep prints the expected invalid option diagnostic.
|
|---|
| 8158 | * src/main.c (short_options): Remove "K".
|
|---|
| 8159 | Reported by Петр Досычев in
|
|---|
| 8160 | http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4488
|
|---|
| 8161 |
|
|---|
| 8162 | 2012-04-29 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8163 |
|
|---|
| 8164 | dfa: small fixes to single-byte range computation
|
|---|
| 8165 | * src/dfa.c (parse_bracket_exp): Do not call regexec with an invalid
|
|---|
| 8166 | subject. Move declarations before all statements.
|
|---|
| 8167 |
|
|---|
| 8168 | 2012-04-27 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8169 |
|
|---|
| 8170 | dfa: do not use hard-locale
|
|---|
| 8171 | * bootstrap.conf (gnulib_modules): Remove hard-locale.
|
|---|
| 8172 | * src/dfa.c (hard_LC_COLLATE): Remove.
|
|---|
| 8173 | (dfaparse): Do not initialize it.
|
|---|
| 8174 | (parse_bracket_exp): Always go through system regex matcher to find
|
|---|
| 8175 | single byte characters matching a range.
|
|---|
| 8176 |
|
|---|
| 8177 | drop support for Makefile.boot
|
|---|
| 8178 | * Makefile.am: Do not distribute README-boot and Makefile.boot.
|
|---|
| 8179 | * NEWS: Mention this change.
|
|---|
| 8180 | * README-alpha: Do not mention README-boot and Makefile.boot.
|
|---|
| 8181 | * Makefile.boot: Remove.
|
|---|
| 8182 | * README-boot: Remove.
|
|---|
| 8183 |
|
|---|
| 8184 | 2012-04-27 Aharon Robbins <arnold@skeeve.com>
|
|---|
| 8185 |
|
|---|
| 8186 | dfa: do not use strcoll to match multibyte characters in ranges
|
|---|
| 8187 | This does not affect the behavior of grep, which always defers
|
|---|
| 8188 | to glibc or gnulib when matching ranges.
|
|---|
| 8189 | * src/dfa.c (match_mb_charset): Compare wc directly to the range
|
|---|
| 8190 | endpoints.
|
|---|
| 8191 |
|
|---|
| 8192 | dfa: include stdbool.h explicitly
|
|---|
| 8193 | * src/dfa.c: Include stdbool.h explicitly.
|
|---|
| 8194 |
|
|---|
| 8195 | 2012-04-23 Jim Meyering <meyering@redhat.com>
|
|---|
| 8196 |
|
|---|
| 8197 | maint: post-release administrivia
|
|---|
| 8198 | * NEWS: Add header line for next release.
|
|---|
| 8199 | * .prev-version: Record previous version.
|
|---|
| 8200 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 8201 |
|
|---|
| 8202 | version 2.12
|
|---|
| 8203 | * NEWS: Record release date.
|
|---|
| 8204 |
|
|---|
| 8205 | build: update gnulib submodule to latest
|
|---|
| 8206 |
|
|---|
| 8207 | tests: skip annoyingly long gnulib lock tests
|
|---|
| 8208 | * bootstrap.conf (avoided_gnulib_modules): Define.
|
|---|
| 8209 | (gnulib_tool_option_extras): Use it.
|
|---|
| 8210 |
|
|---|
| 8211 | 2012-04-22 Jim Meyering <meyering@redhat.com>
|
|---|
| 8212 |
|
|---|
| 8213 | tests: avoid spurious quote-mismatch failure on OS/X
|
|---|
| 8214 | * tests/in-eq-out-infloop: Simplify expected error output, eliminating
|
|---|
| 8215 | expected quotes altogether, thus avoiding spurious OS/X-specific
|
|---|
| 8216 | failure due to mismatch of multi-byte vs. single-byte quotes.
|
|---|
| 8217 |
|
|---|
| 8218 | 2012-04-17 Jim Meyering <meyering@redhat.com>
|
|---|
| 8219 |
|
|---|
| 8220 | build: update gnulib submodule to latest
|
|---|
| 8221 | * bootstrap: Also update this file.
|
|---|
| 8222 |
|
|---|
| 8223 | 2012-04-17 Jim Meyering <meyering@redhat.com>
|
|---|
| 8224 |
|
|---|
| 8225 | grep: fix --devices=ACTION (-D) so stdin is once again exempt
|
|---|
| 8226 | An oversight in the 2.11 changes made it so "echo x|grep x" would
|
|---|
| 8227 | fail for those who set GREP_OPTIONS=--devices=skip.
|
|---|
| 8228 |
|
|---|
| 8229 | * src/main.c (grepdesc): Ignore skip-related options when reading
|
|---|
| 8230 | from standard input.
|
|---|
| 8231 | * tests/skip-device: New file. Test for the above.
|
|---|
| 8232 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 8233 | * doc/grep.texi (File and Directory Selection): Clarify this point,
|
|---|
| 8234 | documenting the stdin exemption.
|
|---|
| 8235 | * NEWS (Bug fixes): Mention it, and add a few "[fixed in ...]" notes.
|
|---|
| 8236 | Reported by Tino Keitel in http://bugs.debian.org/669084,
|
|---|
| 8237 | and forwarded to bug-grep by Aníbal Monsalve Salazar.
|
|---|
| 8238 |
|
|---|
| 8239 | 2012-04-13 Jim Meyering <meyering@redhat.com>
|
|---|
| 8240 |
|
|---|
| 8241 | maint: dfa: correct bogus formatting
|
|---|
| 8242 | * src/dfa.c (transit_state, dfaexec): s/++ * VAR/++*VAR/
|
|---|
| 8243 |
|
|---|
| 8244 | maint: dfa: add/improve comments
|
|---|
| 8245 | * src/dfa.c (transit_state_consume_1char): Note always-ignored
|
|---|
| 8246 | return value.
|
|---|
| 8247 | Fix typos: s/equivalent class/equivalence class/.
|
|---|
| 8248 |
|
|---|
| 8249 | maint: dfa: avoid unnecessary uses of strcpy/strncpy
|
|---|
| 8250 | * src/dfa.c (icatalloc): Use memcpy, not strcpy, given the length.
|
|---|
| 8251 | (dfamust): Combine MALLOC+strcpy into cleaner xmemdup.
|
|---|
| 8252 | (parse_bracket_exp): Likewise, but replace a use of strncpy.
|
|---|
| 8253 |
|
|---|
| 8254 | grep: handle symlinked directory loops as usual
|
|---|
| 8255 | * src/main.c (grepfile): Treat EMLINK just like ELOOP, for
|
|---|
| 8256 | systems like FreeBSD 9.0 on which we would otherwise report
|
|---|
| 8257 | "Too many links" rather than ignoring that type of failure.
|
|---|
| 8258 | E.g., "mkdir d; cd d; ln -s . a; grep -r ^" would print
|
|---|
| 8259 | "grep: a: Too many links" and would exit with status 2.
|
|---|
| 8260 | Now, it prints nothing and exits with status 1, as before.
|
|---|
| 8261 | Reported by Nelson H. F. Beebe.
|
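The EMLINK/ELOOP equivalence described in this entry can be sketched as below. This is a minimal illustration, not the code in src/main.c: the helper name is_symlink_loop_errno is hypothetical, and the sketch only shows the errno classification, not how grepfile acts on it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper (not a function from grep's sources): report
   whether an errno value from a failed open should be handled as a
   symlink loop.  Some systems report EMLINK where others use ELOOP.  */
bool
is_symlink_loop_errno (int err)
{
  return err == ELOOP || err == EMLINK;
}
```

A caller could then skip the diagnostic whenever this predicate holds, so both errno values lead to the same quiet outcome.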
|---|
| 8262 |
|
|---|
| 8263 | tests: avoid spurious failure of the symlink test
|
|---|
| 8264 | * tests/symlink: Ignore spurious "Binary file d matches" on
|
|---|
| 8265 | systems for which reading from a directory actually succeeds.
|
|---|
| 8266 | Reported by Bruno Haible and Nelson Beebe.
|
|---|
| 8267 |
|
|---|
| 8268 | 2012-04-09 Jim Meyering <meyering@redhat.com>
|
|---|
| 8269 |
|
|---|
| 8270 | tests: avoid syntax-check failure: reverse compare arguments
|
|---|
| 8271 | * tests/repetition-overflow: Fix reversed compare arguments.
|
|---|
| 8272 |
|
|---|
| 8273 | build: update gnulib submodule to latest
|
|---|
| 8274 |
|
|---|
| 8275 | 2012-03-18 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8276 |
|
|---|
| 8277 | grep: report overflow for ERE a{1000000000}
|
|---|
| 8278 | * NEWS: Document this.
|
|---|
| 8279 | * src/dfa.c (MIN): New macro.
|
|---|
| 8280 | (lex): Lexically analyze the repeat-count operator once, not
|
|---|
| 8281 | twice; the double-scan complicated the code and made it harder to
|
|---|
| 8282 | understand and fix. Adjust the repeat-count parsing so that it
|
|---|
| 8283 | better matches the behavior of the regex code, in three ways:
|
|---|
| 8284 | 1. Diagnose too-large repeat counts rather than treating them as
|
|---|
| 8285 | literal characters. 2. Use RE_INVALID_INTERVAL_ORD, not
|
|---|
| 8286 | RE_NO_BK_BRACES, to decide whether to treat invalid-syntax {...}s
|
|---|
| 8287 | as literals. 3. Use the same wording for {...}-related
|
|---|
| 8288 | diagnostics that the regex code uses.
|
|---|
| 8289 | * tests/bre.tests, tests/ere.tests, tests/repetition-overflow:
|
|---|
| 8290 | Adjust to match new behavior, and add a few tests.
|
|---|
| 8291 | * cfg.mk (exclude_file_name_regexp--sc_error_message_uppercase):
|
|---|
| 8292 | New macro, since the diagnostics start with uppercase letters.
|
|---|
| 8293 |
|
|---|
| 8294 | 2012-03-14 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8295 |
|
|---|
| 8296 | grep: -r no longer follows symlinks; use fts
|
|---|
| 8297 | Change -r to follow only command-line symlinks, and by default to
|
|---|
| 8298 | read only devices named on the command line. This is a simple
|
|---|
| 8299 | way to get a more-useful behavior when searching random
|
|---|
| 8300 | directories; the idea is to use 'find' if you want something fancy.
|
|---|
| 8301 | -R acts as before and gets a new alias --dereference-recursive.
|
|---|
| 8302 | The code now uses fts internally, so it is more robust and
|
|---|
| 8303 | faster with large hierarchies.
|
|---|
| 8304 | * .gitignore: Remove lib/savedir.c, lib/savedir.h.
|
|---|
| 8305 | * tests/symlink: New file
|
|---|
| 8306 | * Makefile.boot (LIB_OBJS_core): Remove isdir.o, savedir.o.
|
|---|
| 8307 | Perhaps other changes are needed too, but I'm not sure what
|
|---|
| 8308 | this makefile is for.
|
|---|
| 8309 | * NEWS: Document changes.
|
|---|
| 8310 | * doc/grep.texi (File and Directory Selection): Likewise.
|
|---|
| 8311 | * bootstrap.conf (gnulib_modules): Remove dirent, dirname, isdir, open.
|
|---|
| 8312 | Add fstatat, fts, openat-safer.
|
|---|
| 8313 | * lib/Makefile.am (libgreputils_a_SOURCES): Remove savedir.c, savedir.h.
|
|---|
| 8314 | * lib/savedir.c, lib/savedir.h: Remove.
|
|---|
| 8315 | * po/POTFILES.in: Add lib/openat-die.c.
|
|---|
| 8316 | * src/main.c: Include fcntl-safer.h, fts_.h. Don't include
|
|---|
| 8317 | isdir.h, savedir.h.
|
|---|
| 8318 | (struct stats, stats_base): Remove.
|
|---|
| 8319 | (long_options, usage, main): Add --dereference-recursive and
|
|---|
| 8320 | implement -r vs -R.
|
|---|
| 8321 | (filename_prefix_len, fts_options): New static vars.
|
|---|
| 8322 | (basic_fts_options, READ_COMMAND_LINE_DEVICES): New constants.
|
|---|
| 8323 | (devices): Now defaults to READ_COMMAND_LINE_DEVICES.
|
|---|
| 8324 | (reset, grep): Now takes just struct stat rather than file name and
|
|---|
| 8325 | struct stats. All callers changed.
|
|---|
| 8326 | (fillbuf): Now takes struct stat rather than struct stats.
|
|---|
| 8327 | All callers changed.
|
|---|
| 8328 | (grep): Don't worry about recursing too deeply; fts and grepdesc
|
|---|
| 8329 | handle this now.
|
|---|
| 8330 | (is_device_mode, grepdirent, grepdesc, grep_command_line_args):
|
|---|
| 8331 | New functions.
|
|---|
| 8332 | (grepfile): New args DIRDESC, FOLLOW, COMMAND_LINE. Remove struct stats
|
|---|
| 8333 | arg. All callers changed. Use openat_safer rather than open.
|
|---|
| 8334 | Use desc == STDIN_FILENO to tell whether we're reading "-".
|
|---|
| 8335 | Don't worry about EINTR when closing -- not possible, since we're
|
|---|
| 8336 | not catching signals.
|
|---|
| 8337 | * tests/Makefile.am (TESTS): Add symlink.
|
|---|
| 8338 | * tests/symlink: New file.
|
|---|
| 8339 |
|
|---|
| 8340 | 2012-03-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8341 |
|
|---|
| 8342 | tests: port big-match to non-GNU dd
|
|---|
| 8343 | * tests/big-match: Don't assume GNU dd extension "bs=1M".
|
|---|
| 8344 |
|
|---|
| 8345 | tests: test for bug with -r --exclude-dir and no file operand
|
|---|
| 8346 | * tests/include-exclude: Test for the bug and fix.
|
|---|
| 8347 |
|
|---|
| 8348 | 2012-03-12 Allan McRae <allan@archlinux.org>
|
|---|
| 8349 |
|
|---|
| 8350 | grep: fix segfault with -r --exclude-dir and no file operand
|
|---|
| 8351 | * src/main.c (grepdir): Don't invoke excluded_file_name on NULL.
|
|---|
| 8352 | * NEWS (Bug fixes): Mention it.
|
|---|
| 8353 |
|
|---|
| 8354 | 2012-03-09 Jim Meyering <meyering@redhat.com>
|
|---|
| 8355 |
|
|---|
| 8356 | tests: exercise two recently-fixed bugs
|
|---|
| 8357 | * tests/repetition-overflow: New test for bugs fixed by commit
|
|---|
| 8358 | v2.10-82-gcbbc1a4.
|
|---|
| 8359 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 8360 |
|
|---|
| 8361 | 2012-03-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 8362 |
|
|---|
| 8363 | maint: use an optimal-for-grep xz compression setting
|
|---|
| 8364 | * cfg.mk (XZ_OPT): Use -6e (determined empirically, see comments).
|
|---|
| 8365 | This sacrifices a meager 60 bytes of compressed tarball size for a
|
|---|
| 8366 | 55-MiB decrease in the memory required during decompression. I.e.,
|
|---|
| 8367 | using -9e would shave off only 60 bytes from the tar.xz file, yet
|
|---|
| 8368 | would force every decompression process to use 55 MiB more memory.
|
|---|
| 8369 |
|
|---|
| 8370 | build: update gnulib submodule to latest
|
|---|
| 8371 |
|
|---|
| 8372 | 2012-03-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 8373 |
|
|---|
| 8374 | maint: post-release administrivia
|
|---|
| 8375 | * NEWS: Add header line for next release.
|
|---|
| 8376 | * .prev-version: Record previous version.
|
|---|
| 8377 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 8378 |
|
|---|
| 8379 | version 2.11
|
|---|
| 8380 | * NEWS: Record release date.
|
|---|
| 8381 |
|
|---|
| 8382 | tests: avoid failure when using Solaris 10's sed
|
|---|
| 8383 | * tests/reversed-range-endpoints: Use a simpler sed expression to
|
|---|
| 8384 | sanitize actual output, so it also works with Solaris 10's /bin/sed.
|
|---|
| 8385 |
|
|---|
| 8386 | 2012-03-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 8387 |
|
|---|
| 8388 | maint: manually correct formatting in dfa.c's cpp definitions
|
|---|
| 8389 | * src/dfa.c: Adjust formatting in cpp definitions.
|
|---|
| 8390 |
|
|---|
| 8391 | maint: indent dfa.c
|
|---|
| 8392 | * src/dfa.c: Filter through indent like this:
|
|---|
| 8393 | HOME=. indent -Tsize_t -l79 --leave-preprocessor-space \
|
|---|
| 8394 | --dont-format-comments --no-tabs < dfa.c > k && mv k dfa.c
|
|---|
| 8395 |
|
|---|
| 8396 | doc: correct grep.1's descriptions of \w and \W (they omitted "_")
|
|---|
| 8397 | * doc/grep.in.1: Fix descriptions of \w and \W.
|
|---|
| 8398 | They did not mention "_".
|
|---|
| 8399 | * doc/grep.texi (The Backslash Character and Special Expressions):
|
|---|
| 8400 | [\w, \W]: List the "_" before the char class, not after: [_[:alnum:]],
|
|---|
| 8401 | for readability and to be consistent with the man page.
|
|---|
| 8402 |
|
|---|
| 8403 | 2012-03-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8404 |
|
|---|
| 8405 | maint: spelling fixes
|
|---|
| 8406 |
|
|---|
| 8407 | grep: fix integer-overflow issues in main program
|
|---|
| 8408 | * NEWS: Document this.
|
|---|
| 8409 | * bootstrap.conf (gnulib_modules): Add inttypes, xstrtoimax.
|
|---|
| 8410 | Remove xstrtoumax.
|
|---|
| 8411 | * src/main.c: Include <inttypes.h>, for INTMAX_MAX, PRIdMAX.
|
|---|
| 8412 | (context_length_arg, prtext, grepbuf, grep, grepfile)
|
|---|
| 8413 | (get_nondigit_option, main):
|
|---|
| 8414 | Use intmax_t, not int, for line counts.
|
|---|
| 8415 | (context_length_arg, main): Silently clamp line counts
|
|---|
| 8416 | to maximum value, since there's no practical difference between
|
|---|
| 8417 | doing that and using infinite-precision arithmetic.
|
|---|
| 8418 | (out_before, out_after, pending): Now intmax_t, not int.
|
|---|
| 8419 | (max_count, outleft): Now intmax_t, not off_t.
|
|---|
| 8420 | (prepend_args, prepend_default_options, main):
|
|---|
| 8421 | Use size_t, not int, for sizes.
|
|---|
| 8422 | (prepend_default_options): Check for int and size_t overflow.
|
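The idea of silently limiting an oversized line count to the maximum value can be sketched as follows. The entry names gnulib's xstrtoimax; this sketch assumes the standard strtoimax instead, and the function name parse_line_count is hypothetical.

```c
#include <inttypes.h>
#include <stdbool.h>

/* Hypothetical helper: parse a nonnegative line count.  A value too
   large for intmax_t is accepted and stored as INTMAX_MAX rather than
   rejected.  Return false for malformed or negative input.  */
bool
parse_line_count (const char *str, intmax_t *out)
{
  char *end;
  intmax_t value = strtoimax (str, &end, 10);

  /* Reject empty input, trailing junk, and negative numbers.  */
  if (end == str || *end != '\0' || value < 0)
    return false;

  /* On positive overflow strtoimax returns INTMAX_MAX, so VALUE is
     already the clamped result and needs no further adjustment.  */
  *out = value;
  return true;
}
```

Because no file can contain more than INTMAX_MAX lines in practice, the clamped value behaves the same as an arbitrarily large one.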
|---|
| 8423 |
|
|---|
| 8424 | grep: avoid mishandling of long lines
|
|---|
| 8425 | * src/pcresearch.c (Pexecute): Do not pass a line longer than
|
|---|
| 8426 | INT_MAX to pcre_exec, since its API does not permit that.
|
|---|
| 8427 |
|
|---|
| 8428 | grep: remove no-longer-used setrlimit code
|
|---|
| 8429 | This code has been unused and obsolescent ever since the regex
|
|---|
| 8430 | code stopped using the stack for large regular expressions.
|
|---|
| 8431 | * src/main.c [HAVE_SETRLIMIT]: Do not include <sys/time.h> or
|
|---|
| 8432 | <sys/resource.h>; no longer needed.
|
|---|
| 8433 | (set_rlimits): Remove. All callers changed.
|
|---|
| 8434 |
|
|---|
| 8435 | grep: fix some core dumps with long lines etc.
|
|---|
| 8436 | These problems mostly occur because the code attempts to stuff
|
|---|
| 8437 | sizes into int or into unsigned int; this doesn't work on most
|
|---|
| 8438 | 64-bit hosts and the errors can lead to core dumps.
|
|---|
| 8439 | * NEWS: Document this.
|
|---|
| 8440 | * src/dfa.c (token): Typedef to ptrdiff_t, since the enum's
|
|---|
| 8441 | range could be as small as -128 .. 127 on practical hosts.
|
|---|
| 8442 | (position.index): Now size_t, not unsigned int.
|
|---|
| 8443 | (leaf_set.elems): Now size_t *, not unsigned int *.
|
|---|
| 8444 | (dfa_state.hash, struct mb_char_classes.nchars, .nch_classes)
|
|---|
| 8445 | (.nranges, .nequivs, .ncoll_elems, struct dfa.cindex, .calloc, .tindex)
|
|---|
| 8446 | (.talloc, .depth, .nleaves, .nregexps, .nmultibyte_prop, .nmbcsets)
|
|---|
| 8447 | (.mbcsets_alloc): Now size_t, not int.
|
|---|
| 8448 | (dfa_state.first_end): Now token, not int.
|
|---|
| 8449 | (state_num): New type.
|
|---|
| 8450 | (struct mb_char_classes.cset): Now ptrdiff_t, not int.
|
|---|
| 8451 | (struct dfa.utf8_anychar_classes): Now token[5], not int[5].
|
|---|
| 8452 | (struct dfa.sindex, .salloc, .tralloc): Now state_num, not int.
|
|---|
| 8453 | (struct dfa.trans, .realtrans, .fails): Now state_num **, not int **.
|
|---|
| 8454 | (struct dfa.newlines): Now state_num *, not int *.
|
|---|
| 8455 | (prtok): Don't assume 'token' is no wider than int.
|
|---|
| 8456 | (lexleft, parens, depth): Now size_t, not int.
|
|---|
| 8457 | (charclass_index, nsubtoks)
|
|---|
| 8458 | (parse_bracket_exp, addtok, copytoks, closure, insert, merge, delete)
|
|---|
| 8459 | (state_index, epsclosure, state_separate_contexts)
|
|---|
| 8460 | (dfaanalyze, dfastate, build_state, realloc_trans_if_necessary)
|
|---|
| 8461 | (transit_state_singlebyte, match_anychar, match_mb_charset)
|
|---|
| 8462 | (check_matching_with_multibyte_ops, transit_state_consume_1char)
|
|---|
| 8463 | (transit_state, dfaexec, free_mbdata, dfaoptimize, dfafree)
|
|---|
| 8464 | (freelist, enlist, addlists, inboth, dfamust):
|
|---|
| 8465 | Don't assume indexes fit in 'int'.
|
|---|
| 8466 | (lex): Avoid overflow in string-to-{hi,lo} conversions.
|
|---|
| 8467 | (dfaanalyze): Redo indexing so that it works with size_t values,
|
|---|
| 8468 | which cannot go negative.
|
|---|
| 8469 | * src/dfa.h (dfaexec): Count argument is now size_t *, not int *.
|
|---|
| 8470 | (dfastate): State numbers are now ptrdiff_t, not int.
|
|---|
| 8471 | * src/dfasearch.c: Include "intprops.h", for TYPE_MAXIMUM.
|
|---|
| 8472 | (kwset_exact_matches): Now size_t, not int.
|
|---|
| 8473 | (EGexecute): Don't assume indexes fit in 'int'.
|
|---|
| 8474 | Check for overflow before converting a ptrdiff_t to a regoff_t,
|
|---|
| 8475 | as regoff_t is narrower than ptrdiff_t in 64-bit glibc (contra POSIX).
|
|---|
| 8476 | Check for memory exhaustion in re_search rather than treating
|
|---|
| 8477 | it merely as failure to match; use xalloc_die () to report any error.
|
|---|
| 8478 | * src/kwset.c (struct trie.accepting): Now size_t, not unsigned int.
|
|---|
| 8479 | (struct kwset.words): Now ptrdiff_t, not int.
|
|---|
| 8480 | * src/kwset.h (struct kwsmatch.index): Now size_t, not int.
|
|---|
| 8481 |
|
|---|
| 8482 | tests: test for problems with long matches
|
|---|
| 8483 | The new test is expensive, so add a category of expensive tests,
|
|---|
| 8484 | which are normally not run, and put the new test in this new
|
|---|
| 8485 | category. The idea of having expensive tests is taken from coreutils.
|
|---|
| 8486 | * HACKING: Mention RUN_EXPENSIVE_TESTS and similar env vars.
|
|---|
| 8487 | * Makefile.am (check-expensive): New rule.
|
|---|
| 8488 | * tests/Makefile.am (TESTS): Add big-match.
|
|---|
| 8489 | * tests/init.cfg (expensive_): New function, from coreutils.
|
|---|
| 8490 | * tests/big-match: New file.
|
|---|
| 8491 |
|
|---|
| 8492 | 2012-02-29 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8493 |
|
|---|
| 8494 | maint: use gnulib _Noreturn rather than __attribute__ ((noreturn))
|
|---|
| 8495 | * src/grep.h (__attribute__): Remove.
|
|---|
| 8496 | * src/dfa.h (__attribute__): Likewise.
|
|---|
| 8497 | (dfaerror): Use noreturn rather than __attribute__ ((noreturn)).
|
|---|
| 8498 | * src/main.c (usage): Likewise.
|
|---|
| 8499 |
|
|---|
| 8500 | 2012-02-26 Jim Meyering <meyering@redhat.com>
|
|---|
| 8501 |
|
|---|
| 8502 | build: update submodule, bootstrap, tests/init.sh from gnulib
|
|---|
| 8503 | * gl/lib/regcomp.c.diff: Adjust.
|
|---|
| 8504 | * bootstrap: Update from gnulib.
|
|---|
| 8505 | * tests/init.sh: Update from gnulib.
|
|---|
| 8506 |
|
|---|
| 8507 | 2012-02-26 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8508 |
|
|---|
| 8509 | dfa: merge calls to SUCCEEDS_IN_CONTEXT
|
|---|
| 8510 | * src/dfa.c (state_index): Use a single call to SUCCEEDS_IN_CONTEXT.
|
|---|
| 8511 |
|
|---|
| 8512 | dfa: fix a subtle constraint encoding bug
|
|---|
| 8513 | * src/dfa.c (SUCCEEDS_IN_CONTEXT, PREV_NEWLINE_DEPENDENT,
|
|---|
| 8514 | PREV_LETTER_DEPENDENT): Rewrite to handle all 3*3=9 possible
|
|---|
| 8515 | combinations of previous and next character contexts.
|
|---|
| 8516 | (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT): Remove.
|
|---|
| 8517 | (NO_CONSTRAINT, BEGLINE_CONSTRAINT, ENDLINE_CONSTRAINT,
|
|---|
| 8518 | BEGWORD_CONSTRAINT, ENDWORD_CONSTRAINT, LIMWORD_CONSTRAINT,
|
|---|
| 8519 | NOTLIMWORD_CONSTRAINT): Switch to new encoding.
|
|---|
| 8520 | * NEWS: Document resulting bugfix.
|
|---|
| 8521 | * tests/spencer1.tests: Add regression test.
|
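One way to picture a constraint that covers all 3*3=9 pairs of previous and next contexts is a 9-bit mask with one bit per pair. The layout below is an assumption made for illustration; it is not the encoding used in src/dfa.c, and every EX_ name is hypothetical.

```c
#include <stdbool.h>

/* Illustrative contexts; values chosen for this sketch only.  */
enum ex_context { EX_NONE = 0, EX_LETTER = 1, EX_NEWLINE = 2 };

/* One bit per (prev, next) pair: the bit index is prev * 3 + next.  */
#define EX_PAIR(prev, next) (1u << ((prev) * 3 + (next)))

/* All three pairs sharing a previous (or next) context.  */
#define EX_PREV(prev) \
  (EX_PAIR (prev, EX_NONE) | EX_PAIR (prev, EX_LETTER) \
   | EX_PAIR (prev, EX_NEWLINE))
#define EX_NEXT(next) \
  (EX_PAIR (EX_NONE, next) | EX_PAIR (EX_LETTER, next) \
   | EX_PAIR (EX_NEWLINE, next))

enum
{
  EX_NO_CONSTRAINT = 0x1ff,
  EX_BEGLINE = EX_PREV (EX_NEWLINE),
  EX_ENDLINE = EX_NEXT (EX_NEWLINE),
  /* A word begins when a non-letter is followed by a letter.  */
  EX_BEGWORD = (EX_PREV (EX_NONE) | EX_PREV (EX_NEWLINE))
               & EX_NEXT (EX_LETTER),
  /* A word ends when a letter is followed by a non-letter.  */
  EX_ENDWORD = EX_PREV (EX_LETTER)
               & (EX_NEXT (EX_NONE) | EX_NEXT (EX_NEWLINE)),
  EX_LIMWORD = EX_BEGWORD | EX_ENDWORD,
  EX_NOTLIMWORD = EX_NO_CONSTRAINT & ~EX_LIMWORD
};

/* Return true if CONSTRAINT permits the pair (PREV, NEXT).  */
bool
ex_succeeds_in_context (unsigned int constraint,
                        enum ex_context prev, enum ex_context next)
{
  return (constraint & EX_PAIR (prev, next)) != 0;
}
```

With a bit for every pair, a constraint such as "not at a word boundary" is simply the complement of the word-boundary mask, which a scheme that tests the previous and next contexts independently cannot express.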
|---|
| 8522 |
|
|---|
| 8523 | dfa: do not use MATCHES_*_CONTEXT directly
|
|---|
| 8524 | * src/dfa.c (dfastate): Use SUCCEEDS_IN_CONTEXT.
|
|---|
| 8525 |
|
|---|
| 8526 | dfa: change meaning of a state context
|
|---|
| 8527 | * src/dfa.c (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT): New.
|
|---|
| 8528 | (state_separate_contexts): Remove second argument.
|
|---|
| 8529 | (state_index): Do not mask away CTX_NONE.
|
|---|
| 8530 | (dfaanalyze): Adjust call to state_index and state_separate_contexts.
|
|---|
| 8531 | (dfastate): Adjust calls to state_index and state_separate_contexts.
|
|---|
| 8532 |
|
|---|
| 8533 | 2012-02-13 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8534 |
|
|---|
| 8535 | tests: fix loop in epipe test
|
|---|
| 8536 | * tests/epipe: Don't loop forever if the bug is present.
|
|---|
| 8537 | Problem reported by Jaroslav Skarvada.
|
|---|
| 8538 |
|
|---|
| 8539 | 2012-02-08 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8540 |
|
|---|
| 8541 | tests: work portably even if SIGPIPE is ignored
|
|---|
| 8542 | * tests/epipe: Don't rely on "trap - PIPE"; that's not portable.
|
|---|
| 8543 | Problem reported by Eric Blake in
|
|---|
| 8544 | <http://lists.gnu.org/archive/html/bug-grep/2012-02/msg00017.html>.
|
|---|
| 8545 | Also, use "ls -al" rather than "echo", in case "echo" is done by a
|
|---|
| 8546 | buggy shell that ignores write errors. And close grep's fd 3, as
|
|---|
| 8547 | a sanity check.
|
|---|
| 8548 |
|
|---|
| 8549 | 2012-02-07 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8550 |
|
|---|
| 8551 | tests: work even if SIGPIPE is ignored
|
|---|
| 8552 | * tests/epipe: Do not infinite-loop if SIGPIPE is already ignored.
|
|---|
| 8553 | It could be that the invoker of 'make check' ignores SIGPIPE,
|
|---|
| 8554 | for example.
|
|---|
| 8555 |
|
|---|
| 8556 | 2012-02-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 8557 |
|
|---|
| 8558 | build: accommodate -Wshadow and -Werror=suggest-attribute=pure
|
|---|
| 8559 | * src/dfa.c (state_separate_contexts): Add _GL_ATTRIBUTE_PURE.
|
|---|
| 8560 | (dfaexec): Rename parameter, s/newline/allow_nl/, to avoid
|
|---|
| 8561 | shadowing the global.
|
|---|
| 8562 |
|
|---|
| 8563 | 2012-02-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8564 |
|
|---|
| 8565 | dfa: refactor common context computations
|
|---|
| 8566 | * src/dfa.c (CTX_ANY, charclass_context, state_separate_contexts): New.
|
|---|
| 8567 | (dfaanalyze): Use state_separate_contexts.
|
|---|
| 8568 | (dfastate): Use charclass_context and state_separate_contexts. Rename
|
|---|
| 8569 | prev_context to separate_contexts.
|
|---|
| 8570 |
|
|---|
| 8571 | dfa: change newline/letter to a single context value
|
|---|
| 8572 | * src/dfa.c (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT,
|
|---|
| 8573 | SUCCEEDS_IN_CONTEXT, ACCEPTS_IN_CONTEXT): Take a single context value
|
|---|
| 8574 | for prev and curr.
|
|---|
| 8575 | (struct dfa_state): Replace newline and letter with context.
|
|---|
| 8576 | (wchar_context): New.
|
|---|
| 8577 | (state_index): Replace newline and letter with context. Compare
|
|---|
| 8578 | context values in the state struct. Adjust calls to pass contexts.
|
|---|
| 8579 | (wants_newline): Replace with wanted_context. Adjust calls to pass
|
|---|
| 8580 | contexts.
|
|---|
| 8581 | (dfastate): Replace wants_newline and wants_letter with wanted_context.
|
|---|
| 8582 | Adjust calls to pass contexts.
|
|---|
| 8583 | (build_state): Adjust calls to pass contexts.
|
|---|
| 8584 | (match_anychar, match_mb_charset, transit_state): Use wchar_context.
|
|---|
| 8585 | Adjust calls to pass contexts.
|
|---|
| 8586 |
|
|---|
| 8587 | 2012-02-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8588 |
|
|---|
| 8589 | dfa: introduce contexts for the values in d->success
|
|---|
| 8590 | Also initialize all tables in a single place in dfasyntax.
|
|---|
| 8591 |
|
|---|
| 8592 | * src/dfa.c (CTX_NONE, CTX_LETTER, CTX_NEWLINE, char_context): New.
|
|---|
| 8593 | (sbit, letters, newline): New.
|
|---|
| 8594 | (dfasyntax): Fill them.
|
|---|
| 8595 | (dfastate): Remove letters, newline, initialized.
|
|---|
| 8596 | (build_state): Use CTX_* constants.
|
|---|
| 8597 | (dfaexec): Remove sbit and sbit_init.
|
|---|
| 8598 |
|
|---|
| 8599 | 2012-02-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8600 |
|
|---|
| 8601 | dfa: remove useless check
|
|---|
| 8602 | * src/dfa.c (state_index): There is nothing that is a newline *and*
|
|---|
| 8603 | a letter. Remove redundant call to SUCCEEDS_IN_CONTEXT.
|
|---|
| 8604 |
|
|---|
| 8605 | 2012-01-22 Jim Meyering <meyering@redhat.com>
|
|---|
| 8606 |
|
|---|
| 8607 | build: update bootstrap from gnulib and adapt
|
|---|
| 8608 | * bootstrap: Update from gnulib.
|
|---|
| 8609 | * tests/init.sh: Update from gnulib.
|
|---|
| 8610 | * bootstrap.conf (bootstrap_epilogue): Remove now-unnecessary
|
|---|
| 8611 | snippet that edited gnulib-tests/gnulib.mk.
|
|---|
| 8612 | (gnulib_tool_option_extras): Add both --symlink and
|
|---|
| 8613 | --makefile-name=gnulib.mk. Remove use of $bt.
|
|---|
| 8614 | * lib/Makefile.am: Initialize numerous automake variables so that
|
|---|
| 8615 | generated code in gnulib.mk may use += to append to them.
|
|---|
| 8616 |
|
|---|
| 8617 | maint: convert `this' to 'this' quoting style in diagnostics
|
|---|
| 8618 | Now that gnulib's quote and quotearg modules use 'this' style,
|
|---|
| 8619 | change the few explicit uses in diagnostics to conform.
|
|---|
| 8620 | * src/egrep.c (after_options): Use 'this' style of quotes.
|
|---|
| 8621 | * src/fgrep.c (after_options): Likewise.
|
|---|
| 8622 | * src/grep.c (after_options): Likewise.
|
|---|
| 8623 | * src/main.c (usage): Likewise.
|
|---|
| 8624 |
|
|---|
| 8625 | build: update gnulib to latest; adjust quoting in tests
|
|---|
| 8626 | * gnulib: Update.
|
|---|
| 8627 | * tests/in-eq-out-infloop: Convert expected diagnostics to match
|
|---|
| 8628 | new quoting.
|
|---|
| 8629 |
|
|---|
| 8630 | 2012-01-22 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8631 |
|
|---|
| 8632 | doc: document recent diagnostics-related changes
|
|---|
| 8633 | * NEWS: Document changes re diagnostics related to GREP_COLORS,
|
|---|
| 8634 | directory loops, -s, "write error".
|
|---|
| 8635 |
|
|---|
| 8636 | grep: be quiet about GREP_COLORS syntax
|
|---|
| 8637 | * src/main.c (struct color_cap): fct now returns void,
|
|---|
| 8638 | since there is no longer any need to use what it returns.
|
|---|
| 8639 | (color_cap_mt_fct, color_cap_rv_fct, color_cap_ne_fct): Return void.
|
|---|
| 8640 | (parse_grep_colors): Do not output diagnostics and then exit with
|
|---|
| 8641 | status 0. Instead, ignore errors in GREP_COLORS. This is more
|
|---|
| 8642 | consistent with programs that (e.g.) ignore errors in termcap entries,
|
|---|
| 8643 | and it's more internally-consistent as some GREP_COLORS errors
|
|---|
| 8644 | were ignored but not others.
|
|---|
| 8645 |
|
|---|
| 8646 | grep: exit with nonzero status if directory loop
|
|---|
| 8647 | * src/main.c (grepdir): Exit with status 2 if a directory loop is
|
|---|
| 8648 | found, since the output might not be "right" (i.e., infinite...).
|
|---|
| 8649 |
|
|---|
| 8650 | grep: suppress read errors if -s
|
|---|
| 8651 | * src/main.c (reset, grep, grepfile): Do not report an input error
|
|---|
| 8652 | if -s is given.
|
|---|
| 8653 |
|
|---|
| 8654 | grep: don't say "write error" over and over
|
|---|
| 8655 | Problem reported by Travis Gummels in
|
|---|
| 8656 | <https://bugzilla.redhat.com/show_bug.cgi?id=741452>.
|
|---|
| 8657 | * src/main.c (write_error_seen): New static var.
|
|---|
| 8658 | (clean_up_stdout): New function.
|
|---|
| 8659 | (prline): Do not output 'write error' more than once; exit
|
|---|
| 8660 | after the first one. Use the same wording for the diagnostic
|
|---|
| 8661 | that close_stdout uses.
|
|---|
| 8662 | (main): Clean up with clean_up_stdout, not close_stdout, so that
|
|---|
| 8663 | grep doesn't output multiple "write error" diagnostics.
|
|---|
| 8664 | * tests/Makefile.am (TESTS): Add epipe.
|
|---|
| 8665 | * tests/epipe: New file.
|
|---|
| 8666 |
|
|---|
| 8667 | 2012-01-12 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8668 |
|
|---|
| 8669 | dfa: non-glibc word-constituent unibyte fix
|
|---|
| 8670 | * src/dfa.c (is_valid_unibyte_character): Fix typo that caused
|
|---|
| 8671 | this to incorrectly return 0 on unibyte non-glibc systems.
|
|---|
| 8672 | Problem reported by Aharon Robbins in
|
|---|
| 8673 | <http://lists.gnu.org/archive/html/bug-grep/2012-01/msg00084.html>.
|
|---|
| 8674 |
|
|---|
| 8675 | 2012-01-04 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8676 |
|
|---|
| 8677 | doc: document empty pattern better
|
|---|
| 8678 | * doc/grep.texi (Top, Fundamental Structure, Usage):
|
|---|
| 8679 | Explain how grep deals with the empty pattern.
|
|---|
| 8680 | Problem spotted by Bernhard Voelker in
|
|---|
| 8681 | <http://lists.gnu.org/archive/html/bug-grep/2012-01/msg00050.html>.
|
|---|
| 8682 |
|
|---|
| 8683 | grep: with no args, search "." only if command-line -r
|
|---|
| 8684 | * NEWS: Document this.
|
|---|
| 8685 | * doc/grep.texi (Environment Variables, grep Programs): Likewise.
|
|---|
| 8686 | * src/main.c (usage): Likewise.
|
|---|
| 8687 | (main): Implement this.
|
|---|
| 8688 | (prepend_default_options): Return a count of prepended options.
|
|---|
| 8689 | * tests/r-dot: Test the above.
|
|---|
| 8690 |
|
|---|
| 8691 | 2012-01-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 8692 |
|
|---|
| 8693 | tests: adjust test to match code, now that --mmap writes to stderr
|
|---|
| 8694 | * tests/ignore-mmap: Separate stdout and stderr; test both.
|
|---|
| 8695 |
|
|---|
| 8696 | deprecate the --mmap option
|
|---|
| 8697 | * src/main.c (main): Deprecate the --mmap option: issue a warning
|
|---|
| 8698 | when it is used.
|
|---|
| 8699 | (usage): Change description.
|
|---|
| 8700 | * doc/grep.texi (Other Options): Document the new behavior.
|
|---|
| 8701 | * NEWS (Changes in behavior): Mention it.
|
|---|
| 8702 |
|
|---|
| 8703 | 2012-01-03 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8704 |
|
|---|
| 8705 | dfa: fix incorrect comment
|
|---|
| 8706 | * src/dfa.c (dfastate): Fix comment for newline.
|
|---|
| 8707 |
|
|---|
| 8708 | dfa: fix rebase conflict
|
|---|
| 8709 | * src/dfa.c (dfaanalyze): Fix reference to nalloc.
|
|---|
| 8710 |
|
|---|
| 8711 | dfa: automatically resize position_sets
|
|---|
| 8712 | * src/dfa.c (insert, copy, merge): Resize arrays here.
|
|---|
| 8713 | (dfaanalyze): Do not track number of allocated elements here.
|
|---|
| 8714 | (dfastate): Allocate mbps with only one element.
|
|---|
| 8715 |
|
|---|
| 8716 | dfa: change position_set nelem to size_t
|
|---|
| 8717 | * src/dfa.c (REALLOC_IF_NECESSARY): Disable assertion, to avoid
|
|---|
| 8718 | warnings from -Wtype-limits.
|
|---|
| 8719 | (position_set): Change nelem to a size_t.
|
|---|
| 8720 |
|
|---|
| 8721 | dfa: move nalloc to position_set structure
|
|---|
| 8722 | * src/dfa.c (position_set): Add alloc.
|
|---|
| 8723 | (alloc_position_set): Initialize it.
|
|---|
| 8724 | (dfaanalyze): Use it instead of the nalloc array or nelem.
|
|---|
| 8725 |
|
|---|
| 8726 | dfa: remove dead assignment
|
|---|
| 8727 | * src/dfa.c (transit_state): transit_state_consume_1char will clear follows;
|
|---|
| 8728 | do not do this ourselves.
|
|---|
| 8729 |
|
|---|
| 8730 | dfa: introduce alloc_position_set
|
|---|
| 8731 | * src/dfa.c (alloc_position_set): New function, use it throughout.
|
|---|
| 8732 |
|
|---|
| 8733 | dfa: use a more compact data type for grps
|
|---|
| 8734 | * src/dfa.c (leaf_set): New.
|
|---|
| 8735 | (dfastate): Use the smaller type, leaf_set, for grps. Its prior type
|
|---|
| 8736 | contained an unused constraint field.
|
|---|
| 8737 |
|
|---|
| 8738 | dfa: use MALLOC/REALLOC always
|
|---|
| 8739 | * src/dfa.c (dfastate, enlist, dfamust): Use MALLOC and REALLOC.
|
|---|
| 8740 |
|
|---|
| 8741 | dfa: remove unnecessary braces
|
|---|
| 8742 | * src/dfa.c (dfastate): Remove unnecessary braces.
|
|---|
| 8743 |
|
|---|
| 8744 | dfa: x2nrealloc starting from a NULL pointer works
|
|---|
| 8745 | * src/dfa.c (parse_bracket_exp): Do not MALLOC mbcset parts the first time
|
|---|
| 8746 | they are encountered. Initialize chars_al correctly.
|
|---|
| 8747 |
|
|---|
| 8748 | 2012-01-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 8749 |
|
|---|
| 8750 | build: avoid build failure with --enable-gcc-warnings and recent gcc
|
|---|
| 8751 | * lib/colorize-posix.c: Disable -Wsuggest-attribute=const, to avoid
|
|---|
| 8752 | warning about this empty init_colorize function.
|
|---|
| 8753 |
|
|---|
| 8754 | 2012-01-03 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8755 |
|
|---|
| 8756 | remove lib/ms/
|
|---|
| 8757 | * configure.ac: Create lib/colorize.c as a symbolic link.
|
|---|
| 8758 | * lib/colorize-posix.c: New name of lib/colorize-impl.c.
|
|---|
| 8759 | * lib/colorize-w32.c: New name of lib/ms/colorize-impl.c.
|
|---|
| 8760 | * lib/colorize.c: Delete.
|
|---|
| 8761 | * lib/Makefile.am (EXTRA_DIST): Adjust.
|
|---|
| 8762 | * .gitignore: Adjust.
|
|---|
| 8763 | * cfg.mk: Adjust syntax-check exclusions.
|
|---|
| 8764 |
|
|---|
| 8765 | unify colorize.h headers
|
|---|
| 8766 | * lib/Makefile.am (EXTRA_DIST): Adjust.
|
|---|
| 8767 | * lib/colorize.h: Remove inline functions.
|
|---|
| 8768 | * lib/colorize-impl.c: Move them here as functions.
|
|---|
| 8769 | * lib/ms/colorize.h: Remove.
|
|---|
| 8770 | * src/Makefile.am (DEFAULT_HEADERS): Remove.
|
|---|
| 8771 |
|
|---|
| 8772 | 2012-01-02 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 8773 |
|
|---|
| 8774 | colorize: use isatty module
|
|---|
| 8775 | * bootstrap.conf: Add isatty module.
|
|---|
| 8776 | * gnulib: Update to latest.
|
|---|
| 8777 | * lib/colorize.h: Remove argument from should_colorize.
|
|---|
| 8778 | * lib/ms/colorize.h: Likewise.
|
|---|
| 8779 | * lib/colorize-impl.c: Factor isatty call out of here...
|
|---|
| 8780 | * lib/ms/colorize-impl.c: ... and here...
|
|---|
| 8781 | * src/main.c: ... into here.
|
|---|
| 8782 |
|
|---|
| 8783 | 2012-01-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 8784 |
|
|---|
| 8785 | tests: avoid minor "make check" failure
|
|---|
| 8786 | * tests/r-dot: Make executable, to avoid triggering a failed
|
|---|
| 8787 | consistency test in "make check".
|
|---|
| 8788 |
|
|---|
| 8789 | 2012-01-02 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8790 |
|
|---|
| 8791 | grep: -r with no args now searches "."
|
|---|
| 8792 | This is a patch I've been meaning to put in for years.
|
|---|
| 8793 | When I added support for "grep -r", I forgot to have "grep -r PAT"
|
|---|
| 8794 | search the working directory by default, instead of searching
|
|---|
| 8795 | standard input (which makes no sense, even if stdin is a directory).
|
|---|
| 8796 | This is not an upward compatible change, since "grep -r PAT <file"
|
|---|
| 8797 | will no longer search standard input, but that's OK; nobody should
|
|---|
| 8798 | be using "grep -r" that way anyway.
|
|---|
| 8799 | * NEWS: Document this.
|
|---|
| 8800 | * doc/grep.texi (File and Directory Selection, grep Programs, Usage):
|
|---|
| 8801 | Likewise.
|
|---|
| 8802 | * src/main.c (usage): Likewise.
|
|---|
| 8803 | (grepdir): If DIR is null, search the working directory, but do
|
|---|
| 8804 | not prepend "./" to the file names.
|
|---|
| 8805 | (main): If recursing and no operands are given, search ".".
|
|---|
| 8806 | * tests/Makefile.am (TESTS): Add r-dot.
|
|---|
| 8807 | * tests/r-dot: New file.
|
|---|
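The operand defaulting described in the entry above can be sketched in C. This is a minimal illustration, not the code in src/main.c: the function name `default_operand` and its parameter are hypothetical, and the non-recursive fallback to "-" (standard input) is assumed from grep's usual behavior when no file operands are given.

```c
#include <stdbool.h>

/* Hypothetical helper, not grep's actual code: choose the operand to
   search when the command line names no files.  With -r, search the
   working directory; otherwise read standard input, spelled "-".  */
static const char *
default_operand (bool recursive)
{
  return recursive ? "." : "-";
}
```

Under this sketch, "grep -r PAT" behaves like "grep -r PAT ." while plain "grep PAT" still reads standard input.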
| 8808 |
|
|---|
| 8809 | grep: prefer fputs to printf, _ to gettext
|
|---|
| 8810 | * lib/colorize.h (print_end_colorize):
|
|---|
| 8811 | * lib/ms/colorize-impl.c (print_end_colorize):
|
|---|
| 8812 | Use fputs instead of printf.
|
|---|
| 8813 | * src/main.c (usage): Likewise. Use _ instead of gettext.
|
|---|
| 8814 |
|
|---|
| 8815 | 2012-01-01 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8816 |
|
|---|
| 8817 | grep: check stdin like other files
|
|---|
| 8818 | * NEWS: Document this.
|
|---|
| 8819 | * src/main.c (grepfile): Revamp tests for input files so that
|
|---|
| 8820 | standard input is tested like other files. For example, report
|
|---|
| 8821 | an error if standard input equals standard output.
|
|---|
| 8822 | Prefer open+fstat to stat+open if possible, as open+fstat is
|
|---|
| 8823 | usually a bit faster and avoids a race condition.
|
|---|
| 8824 | * tests/in-eq-out-infloop: Add tests for cases like
|
|---|
| 8825 | 'grep pat <file >>file'.
|
|---|
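The open+fstat ordering and the input-equals-output test mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the code in grepfile: the helper names `same_inode` and `input_is_output` are hypothetical, and the comparison by device and inode number is assumed to match what the SAME_INODE macro named elsewhere in this log does.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* True if both stat results describe the same file, judged by device
   and inode number (assumed equivalent to SAME_INODE).  */
static bool
same_inode (const struct stat *a, const struct stat *b)
{
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: open NAME first, then fstat the descriptor, so
   the file that is examined is the file that was opened.  A stat
   followed by open could see two different files if NAME is replaced
   in between.  Returns 1 if NAME is the same file as OUT_FD, 0 if it
   is a different file, and -1 on error.  */
static int
input_is_output (const char *name, int out_fd)
{
  struct stat in_st, out_st;
  int result = -1;
  int fd = open (name, O_RDONLY);
  if (fd < 0)
    return -1;
  if (fstat (fd, &in_st) == 0 && fstat (out_fd, &out_st) == 0)
    result = same_inode (&in_st, &out_st);
  close (fd);
  return result;
}
```

A caller could pass STDOUT_FILENO as OUT_FD to detect commands such as 'grep pat <file >>file' before reading any input.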
| 8826 |
|
|---|
| 8827 | 2012-01-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 8828 |
|
|---|
| 8829 | maint: update all copyright year number ranges
|
|---|
| 8830 | Run "make update-copyright".
|
|---|
| 8831 |
|
|---|
| 8832 | 2011-12-31 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8833 |
|
|---|
| 8834 | grep: lower-case function names
|
|---|
| 8835 | These names used to be macros, but they're functions now.
|
|---|
| 8836 | All callers changed.
|
|---|
| 8837 | * src/main.c (pr_sgr_start): Rename from PR_SGR_START.
|
|---|
| 8838 | (pr_sgr_end): Rename from PR_SGR_END.
|
|---|
| 8839 | (pr_sgr_start_if): Rename from PR_SGR_START_IF.
|
|---|
| 8840 | (pr_sgr_end_if): Rename from PR_SGR_END_IF.
|
|---|
| 8841 |
|
|---|
| 8842 | ms: move Microsoft-specific stuff to lib/ms
|
|---|
| 8843 | * cfg.mk (exclude_file_name_regexp--sc_prohibit_strcmp)
|
|---|
| 8844 | (exclude_file_name_regexp--sc_require_config_h)
|
|---|
| 8845 | (exclude_file_name_regexp--sc_require_config_h_first):
|
|---|
| 8846 | New rules.
|
|---|
| 8847 | * lib/colorize.c, lib/colorize.h, lib/colorize-impl.c:
|
|---|
| 8848 | * lib/ms/colorize.h, lib/ms/colorize-impl.c: New files.
|
|---|
| 8849 | * configure.ac (GREP_SRC_INCLUDES): New macro.
|
|---|
| 8850 | * lib/Makefile.am (libgreputils_a_SOURCES): Add colorize.[ch].
|
|---|
| 8851 | (EXTRA_DIST): New macro.
|
|---|
| 8852 | * src/Makefile.am (DEFAULT_INCLUDES): New macro.
|
|---|
| 8853 | * src/main.c: Include colorize.h.
|
|---|
| 8854 | (PR_SGR_START, PR_SGR_END, PR_SGR_START_IF, PR_SGR_END_IF):
|
|---|
| 8855 | Now static functions, not macros.
|
|---|
| 8856 | (hstdout, norm_attr, w32_console_init, w32_sgr2attr)
|
|---|
| 8857 | (w32_clreol) [__MINGW32__]: Move to lib/ms/colorize-impl.c.
|
|---|
| 8858 | (pr_sgr_start, pr_sgr_end): Remove; callers changed to use new
|
|---|
| 8859 | print_start_colorize, print_end_colorize from colorize.h.
|
|---|
| 8860 | (init_colorize): Rename from w32_console_init and move to
|
|---|
| 8861 | colorize module; caller changed.
|
|---|
| 8862 | (should_colorize): Move to colorize module.
|
|---|
| 8863 |
|
|---|
| 8864 | grep: do input==output check more like dir loop check
|
|---|
| 8865 | * src/main.c (grepfile): Just use SAME_INODE; don't bother
|
|---|
| 8866 | with SAME_REGULAR_FILE. This works better on properly-working
|
|---|
| 8867 | POSIX hosts, since it handles the case where the file is changing
|
|---|
| 8868 | as we grep it. It works worse on hosts that don't support st_ino
|
|---|
| 8869 | properly, but in practice this isn't that much of a problem here.
|
|---|
| 8870 | * src/system.h (same_file_attributes, SAME_REGULAR_FILE):
|
|---|
| 8871 | Remove; no longer needed.
|
|---|
| 8872 |
|
|---|
| 8873 | build: update gnulib submodule to latest
|
|---|
| 8874 |
|
|---|
| 8875 | 2011-12-28 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8876 |
|
|---|
| 8877 | maint: remove now-unused/obsolete files
|
|---|
| 8878 | * README.DOS: Remove file.
|
|---|
| 8879 | * m4/djgpp.m4: Likewise.
|
|---|
| 8880 | * .gitignore: Remove reference to m4/djgpp.m4.
|
|---|
| 8881 |
|
|---|
| 8882 | 2011-12-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 8883 |
|
|---|
| 8884 | maint: distribute ChangeLog-2009
|
|---|
| 8885 | * Makefile.am (EXTRA_DIST): Add ChangeLog-2009.
|
|---|
| 8886 | Spotted by Eli Zaretskii.
|
|---|
| 8887 |
|
|---|
| 8888 | 2011-12-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 8889 |
|
|---|
| 8890 | main.c: add some 'const' directives
|
|---|
| 8891 | * src/main.c (color_dict, fg_color, bg_color, cap): Declare const.
|
|---|
| 8892 |
|
|---|
| 8893 | No semantic change.
|
|---|
| 8894 |
|
|---|
| 8895 | 2011-12-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 8896 |
|
|---|
| 8897 | main.c: correct indentation and formatting style
|
|---|
| 8898 | * src/main.c: Correct many formatting inconsistencies.
|
|---|
| 8899 | No semantic change.
|
|---|
| 8900 |
|
|---|
| 8901 | avoid new syntax-check failures
|
|---|
| 8902 | * cfg.mk (old_NEWS_hash): Update, to accommodate old NEWS modification.
|
|---|
| 8903 | * src/main.c: Indent solely with spaces, never with TABs.
|
|---|
| 8904 | (should_colorize): Remove useless parens in #if directive.
|
|---|
| 8905 |
|
|---|
| 8906 | 2011-12-28 Eli Zaretskii <eliz@gnu.org>
|
|---|
| 8907 |
|
|---|
| 8908 | Fix whitespace, indentation and documentation
|
|---|
| 8909 | * src/main.c (parse_grep_colors): Fix indentation.
|
|---|
| 8910 | (usage): Mention MS-Windows in help text for -U and -u options.
|
|---|
| 8911 |
|
|---|
| 8912 | update NEWS for MS-Windows changes
|
|---|
| 8913 | * NEWS: Mention MS-Windows related bugfixes and enhancements.
|
|---|
| 8914 |
|
|---|
| 8915 | Fix the test suite for MS-Windows.
|
|---|
| 8916 | * tests/include-exclude: Use --directories=skip, to avoid
|
|---|
| 8917 | gratuitous failures on systems that cannot grep directories.
|
|---|
| 8918 | * tests/reversed-range-endpoints: Don't reject program names with
|
|---|
| 8919 | leading directories and drive letters.
|
|---|
| 8920 | * tests/warn-char-classes: Likewise.
|
|---|
| 8921 |
|
|---|
| 8922 | Support color highlighting on MS-Windows
|
|---|
| 8923 | * src/main.c (SGR_START, SGR_END, PR_SGR_FMT, PR_SGR_FMT_IF): Remove.
|
|---|
| 8924 | (PR_SGR_START, PR_SGR_START_IF): Replace with pr_sgr_start.
|
|---|
| 8925 | (PR_SGR_END, PR_SGR_END_IF): Replace with pr_sgr_end.
|
|---|
| 8926 | (pr_sgr_start, pr_sgr_end, should_colorize): New functions.
|
|---|
| 8927 | (w32_console_init, w32_sgr2attr, w32_clreol) [__MINGW32__]: New functions.
|
|---|
| 8928 | (main): Use should_colorize. Invoke w32_console_init.
|
|---|
| 8929 |
|
|---|
| 8930 | 2011-12-24 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 8931 |
|
|---|
| 8932 | don't ignore errors when reading a directory
|
|---|
| 8933 | grep no longer silently suppresses errors when reading a directory
|
|---|
| 8934 | as if it were a text file. For example, "grep x ." now reports a
|
|---|
| 8935 | read error on most systems; formerly, it ignored the error.
|
|---|
| 8936 | Problem reported as an aside by Bob Proulx (Bug#10355).
|
|---|
| 8937 | * NEWS: Document this.
|
|---|
| 8938 | * src/main.c (grep, grepfile): Implement this. Simplify the code
|
|---|
| 8939 | considerably.
|
|---|
| 8940 | * src/system.h (is_EISDIR): Remove; no longer needed.
|
|---|
| 8941 |
|
|---|
| 8942 | --include etc. now work on command-line args more consistently
|
|---|
| 8943 | --include and --exclude apply only to non-directories and
|
|---|
| 8944 | --exclude-dir applies only to directories. "-" (standard input)
|
|---|
| 8945 | is never excluded, since it is not a file name.
|
|---|
| 8946 | This bug was discovered while fixing a read-directory bug (Bug#10355).
|
|---|
| 8947 | * NEWS: Document this.
|
|---|
| 8948 | * src/main.c (main): Implement this.
|
|---|
| 8949 | * tests/include-exclude: Test for it.
|
|---|
| 8950 |
|
|---|
| 8951 | 2011-12-24 Jim Meyering <meyering@redhat.com>
|
|---|
| 8952 |
|
|---|
| 8953 | build: update gnulib submodule to latest
|
|---|
| 8954 |
|
|---|
| 8955 | 2011-12-12 Arnold D. Robbins <arnold@skeeve.com>
|
|---|
| 8956 |
|
|---|
| 8957 | doc: improve grep.texi
|
|---|
| 8958 | * doc/grep.texi: General editing for improved aesthetics.
|
|---|
| 8959 | Also fix a few problems.
|
|---|
| 8960 |
|
|---|
| 8961 | 2011-12-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 8962 |
|
|---|
| 8963 | build: use gnulib's iswctype and wcscoll
|
|---|
| 8964 | * bootstrap.conf (gnulib_modules): Add iswctype and wcscoll.
|
|---|
| 8965 | * configure.ac: Remove explicit checks for those functions.
|
|---|
| 8966 | * src/mbsupport.h (MBS_SUPPORT): Define to 1 if not already defined.
|
|---|
| 8967 | Remove the conditional, now that we're guaranteed by gnulib to have
|
|---|
| 8968 | wcscoll and iswctype.
|
|---|
| 8969 | Suggested by Alan Hourihane in http://savannah.gnu.org/bugs/?34930
|
|---|
| 8970 |
|
|---|
| 8971 | disable the new input==output guard for additional options
|
|---|
| 8972 | * src/main.c (grepfile): Also do not reject input == output
|
|---|
| 8973 | when using a few other options.
|
|---|
| 8974 | * tests/in-eq-out-infloop: Test these new cases.
|
|---|
| 8975 | * NEWS (Bug fixes): Mention it.
|
|---|
| 8976 |
|
|---|
| 8977 | 2011-12-11 Nicolas Vigier <boklm@mars-attacks.org>
|
|---|
| 8978 |
|
|---|
| 8979 | do not reject "grep -qr . > out"
|
|---|
| 8980 | The recent fix to avoid an infinite disk-filling loop, commit 5e20a38a,
|
|---|
| 8981 | introduced a minor regression. If you use grep with -q and -r, and
|
|---|
| 8982 | redirect output to a file that will be traversed, then grep would
|
|---|
| 8983 | reject the command, even though it will generate no output.
|
|---|
| 8984 | In that case, there is no risk of an infinite loop.
|
|---|
| 8985 | * src/main.c (grepfile): Do not reject input == output when
|
|---|
| 8986 | using --quiet/--silent (-q).
|
|---|
| 8987 | Reported by J H Wilson in http://bugs.mageia.org/show_bug.cgi?id=3501
|
|---|
| 8988 | forwarded by Nicolas Vigier to https://savannah.gnu.org/bugs/?34917
|
|---|
| 8989 |
|
|---|
| 8990 | 2011-11-29 Arnold Robbins <arnold@skeeve.com>
|
|---|
| 8991 |
|
|---|
| 8992 | dfa: do not call nl_langinfo in !MBS_SUPPORT mode
|
|---|
| 8993 | * src/dfa.c (using_utf8) [!MBS_SUPPORT]: Remove erroneous "defined"
|
|---|
| 8994 | in cpp test for MBS_SUPPORT. Since commit a163349d, MBS_SUPPORT is 0/1.
|
|---|
| 8995 | This error caused trouble only in the !MBS_SUPPORT case.
|
|---|
| 8996 |
|
|---|
| 8997 | dfa: avoid warning from deficient compiler in !MBS_SUPPORT mode
|
|---|
| 8998 | * src/dfa.c (setbit_wc) [!MBS_SUPPORT]: Add explicit "return false;"
|
|---|
| 8999 | after "abort ();", to avoid a warning from deficient compilers.
|
|---|
| 9000 |
|
|---|
| 9001 | 2011-11-29 Jim Meyering <meyering@redhat.com>
|
|---|
| 9002 |
|
|---|
| 9003 | tests: use "compare exp out", not "compare out exp"
|
|---|
| 9004 | Likewise, when an empty file is expected, use "compare /dev/null out",
|
|---|
| 9005 | not "compare out /dev/null". I.e., specify the expected/desired contents
|
|---|
| 9006 | via the first file name. Prompted by a suggestion from Bruno Haible
|
|---|
| 9007 | in http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4020/focus=29154
|
|---|
| 9008 |
|
|---|
| 9009 | Run these commands:
|
|---|
| 9010 |
|
|---|
| 9011 | git grep -l -E 'compare [^ ]+ exp' \
|
|---|
| 9012 | |xargs perl -pi -e 's/(compare) (\S+) (exp\S*)/$1 $3 $2/'
|
|---|
| 9013 | git grep -l -E 'compare [^ ]+ /dev/null' \
|
|---|
| 9014 | |xargs perl -pi -e 's/(compare) (\S+) (\/dev\/null)/$1 $3 $2/'
|
|---|
| 9015 |
|
|---|
| 9016 | 2011-11-29 Jim Meyering <meyering@redhat.com>
|
|---|
| 9017 |
|
|---|
| 9018 | build: update gnulib submodule to latest
|
|---|
| 9019 |
|
|---|
| 9020 | 2011-11-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 9021 |
|
|---|
| 9022 | build: accommodate -Werror=suggest-attribute=pure
|
|---|
| 9023 | Now that we're using the latest manywarnings module from gnulib,
|
|---|
| 9024 | accommodate gcc's -Werror=suggest-attribute=pure option by marking
|
|---|
| 9025 | suggested functions with gnulib-defined _GL_ATTRIBUTE_PURE.
|
|---|
| 9026 | * src/kwset.c (hasevery): Mark function with pure attribute.
|
|---|
| 9027 | (bmexec): Likewise.
|
|---|
| 9028 | * src/dfa.c (nsubtoks, istrstr, find_pred, dfamusts): Likewise.
|
|---|
| 9029 | * configure.ac: Disable (for lib/) options that seem not to be worth
|
|---|
| 9030 | the trouble: -Wunsuffixed-float-constants and -Wformat-nonliteral.
|
|---|
| 9031 |
|
|---|
| 9032 | 2011-11-21 Bruno Haible <bruno@clisp.org>
|
|---|
| 9033 |
|
|---|
| 9034 | build: fix "make check" error on OSF/1
|
|---|
| 9035 | * tests/Makefile.am (TESTS_ENVIRONMENT): Test the value of the variable
|
|---|
| 9036 | BASH_VERSION, not the literal ASH_VERSION.
|
|---|
| 9037 |
|
|---|
| 9038 | 2011-11-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 9039 |
|
|---|
| 9040 | portability: work consistently on *BSD systems
|
|---|
| 9041 | * src/dfa.c (is_valid_unibyte_character): Define.
|
|---|
| 9042 | (IS_WORD_CONSTITUENT): Use it here, to make grep work consistently
|
|---|
| 9043 | even on *BSD systems, which use different tables for ctype macros
|
|---|
| 9044 | like isalpha. http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4022
|
|---|
| 9045 | With help from Bruno Haible.
|
|---|
| 9046 |
|
|---|
| 9047 | 2011-11-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 9048 |
|
|---|
| 9049 | maint: consistently use NULL, not 0, when comparing pointers
|
|---|
| 9050 | * src/dfa.c (dfaanalyze): Compare trans[s] with NULL, not 0.
|
|---|
| 9051 |
|
|---|
| 9052 | maint: remove an avoidable #ifdef/#endif pair
|
|---|
| 9053 | * src/dfa.c (dfaanalyze): Remove avoidable #ifdef around "{".
|
|---|
| 9054 |
|
|---|
| 9055 | tests: fix typo in last change
|
|---|
| 9056 | * tests/word-delim-multibyte: Use double quotes around $e_acute,
|
|---|
| 9057 | not single quotes. Spotted by Bruno Haible.
|
|---|
| 9058 | This and the preceding change do not resolve the XPASS failure
|
|---|
| 9059 | on OpenBSD 4.9 after all. See the explanation at
|
|---|
| 9060 | http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4022
|
|---|
| 9061 |
|
|---|
| 9062 | tests: avoid unwarranted test failure on *BSD-based systems
|
|---|
| 9063 | * tests/word-delim-multibyte (e_acute): Use a more portable
|
|---|
| 9064 | representation of e-acute. Reported by Bruno Haible.
|
|---|
| 9065 |
|
|---|
| 9066 | 2011-11-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 9067 |
|
|---|
| 9068 | maint: accommodate -Wdeclaration-after-statement, but only in dfa.c,
|
|---|
| 9069 | and because doing so does not impact readability/maintainability.
|
|---|
| 9070 | This is solely to accommodate gawk users who are stuck with ancient gcc.
|
|---|
| 9071 | This is no excuse to change any other code in grep.
|
|---|
| 9072 | * src/dfa.c (dfaoptimize, parse_bracket_exp): Move declaration
|
|---|
| 9073 | to precede first statement in block.
|
|---|
| 9074 |
|
|---|
| 9075 | 2011-11-16 Jim Meyering <meyering@redhat.com>
|
|---|
| 9076 |
|
|---|
| 9077 | maint: post-release administrivia
|
|---|
| 9078 | * NEWS: Add header line for next release.
|
|---|
| 9079 | * .prev-version: Record previous version.
|
|---|
| 9080 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 9081 |
|
|---|
| 9082 | version 2.10
|
|---|
| 9083 | * NEWS: Record release date.
|
|---|
| 9084 |
|
|---|
| 9085 | build: update gnulib submodule to latest
|
|---|
| 9086 |
|
|---|
| 9087 | 2011-11-13 Jim Meyering <meyering@redhat.com>
|
|---|
| 9088 |
|
|---|
| 9089 | maint: update bootstrap and init.sh from gnulib
|
|---|
| 9090 | * tests/init.sh: Update from gnulib.
|
|---|
| 9091 | * bootstrap: Likewise.
|
|---|
| 9092 |
|
|---|
| 9093 | 2011-11-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 9094 |
|
|---|
| 9095 | build: update gnulib for exclude-test fixes
|
|---|
| 9096 |
|
|---|
| 9097 | tests: make our "export" replacement efficient with modern shells
|
|---|
| 9098 | * tests/Makefile.am (TESTS_ENVIRONMENT): Use a trivial and efficient
|
|---|
| 9099 | implementation with a shell that supports "export var=val".
|
|---|
| 9100 | Use the sed-invoking replacement only when necessary.
|
|---|
| 9101 | Improved by Stefano Lattarini.
|
|---|
| 9102 |
|
|---|
| 9103 | tests: make the replacement export function more robust
|
|---|
| 9104 | * tests/Makefile.am (sed_quote_value): Also quote single quotes.
|
|---|
| 9105 | Remove sed's -e options. Not needed.
|
|---|
| 9106 |
|
|---|
| 9107 | 2011-11-12 Bruno Haible <bruno@clisp.org>
|
|---|
| 9108 |
|
|---|
| 9109 | tests: fix test suite execution failure on OSF/1 5.1
|
|---|
| 9110 | * tests/Makefile.am (TESTS_ENVIRONMENT): Use a shell function to
|
|---|
| 9111 | ensure that we use only the portable form of the 'export' shell
|
|---|
| 9112 | built-in.
|
|---|
| 9113 |
|
|---|
| 9114 | tests: don't assume that /bin/bash exists
|
|---|
| 9115 | * tests/fedora: Run using /bin/sh, not /bin/bash.
|
|---|
| 9116 |
|
|---|
| 9117 | tests: avoid unwarranted failures due to SATAN's timeout
|
|---|
| 9118 | * tests/init.cfg (require_timeout_): Also ensure that
|
|---|
| 9119 | timeout exits with its child's exit status.
|
|---|
| 9120 |
|
|---|
| 9121 | build: fix compilation error on MSVC 9 due to Pexecute() declaration
|
|---|
| 9122 | * src/pcresearch.c (WITHOUT_PCRE_NORETURN): Remove macro.
|
|---|
| 9123 | (Pexecute): Replace abort() call with code that does not trigger GCC
|
|---|
| 9124 | warnings.
|
|---|
| 9125 |
|
|---|
| 9126 | tests: fix high-bit-range test failure on OSF/1 5.1
|
|---|
| 9127 | * tests/high-bit-range: Use octal escape instead of hexadecimal escape
|
|---|
| 9128 | sequence.
|
|---|
| 9129 |
|
|---|
| 9130 | 2011-11-11 Jim Meyering <meyering@redhat.com>
|
|---|
| 9131 |
|
|---|
| 9132 | build: update gnulib for solaris test fix
|
|---|
| 9133 |
|
|---|
| 9134 | 2011-11-10 Jim Meyering <meyering@redhat.com>
|
|---|
| 9135 |
|
|---|
| 9136 | build: update gnulib submodule to latest
|
|---|
| 9137 |
|
|---|
| 9138 | maint: adjust the URL that will appear in the generated announcement
|
|---|
| 9139 | * cfg.mk (url_dir_list): Use this http://ftp.gnu.org/gnu/$(PACKAGE)
|
|---|
| 9140 | for the first link listed in the generated announcement.
|
|---|
| 9141 | announce-gen now provides the faster mirror link automatically.
|
|---|
| 9142 |
|
|---|
| 9143 | 2011-11-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 9144 |
|
|---|
| 9145 | build: stop distributing gzip'd releases; xz is enough
|
|---|
| 9146 | * configure.ac (AM_INIT_AUTOMAKE): Add no-dist-gzip.
|
|---|
| 9147 | * NEWS (Build-related): Mention that we're dropping .tar.gz.
|
|---|
| 9148 |
|
|---|
| 9149 | build: update gnulib submodule to latest
|
|---|
| 9150 |
|
|---|
| 9151 | 2011-10-14 Stefano Lattarini <stefano.lattarini@gmail.com>
|
|---|
| 9152 |
|
|---|
| 9153 | distcheck: ensure dist-hook fails if syntax-check fails
|
|---|
| 9154 | * Makefile.am (run-syntax-check): Fix logic, to ensure that
|
|---|
| 9155 | the recipe of this target returns a non-zero exit status if
|
|---|
| 9156 | "make syntax-check" fails.
|
|---|
| 9157 |
|
|---|
| 9158 | 2011-10-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 9159 |
|
|---|
| 9160 | build: update gnulib submodule to latest
|
|---|
| 9161 | This should fix a few portability problems, including one on HP-UX
|
|---|
| 9162 | and a test-float failure on PPC, reported by Andreas Metzler.
|
|---|
| 9163 |
|
|---|
| 9164 | 2011-10-10 Stefano Lattarini <stefano.lattarini@gmail.com>
|
|---|
| 9165 |
|
|---|
| 9166 | gitignore: merge top-level and tests/ .gitignore files
|
|---|
| 9167 | * tests/.gitignore: Remove; what little remained of its
|
|---|
| 9168 | contents has been moved ...
|
|---|
| 9169 | * .gitignore: ... here.
|
|---|
| 9170 |
|
|---|
| 9171 | tests: tiny simplification in TESTS_ENVIRONMENT definition
|
|---|
| 9172 | * tests/Makefile.am (TESTS_ENVIRONMENT): Remove redundant use of
|
|---|
| 9173 | `export'.
|
|---|
| 9174 |
|
|---|
| 9175 | 2011-10-10 Stefano Lattarini <stefano.lattarini@gmail.com>
|
|---|
| 9176 |
|
|---|
| 9177 | tests: support development version of automake too
|
|---|
| 9178 | This change implements a more correct and idiomatic use of the
|
|---|
| 9179 | features of the Automake-provided 'parallel-tests' harness.
|
|---|
| 9180 | Moreover, this change is required in order for the testsuite to
|
|---|
| 9181 | continue to work with the new testsuite harness that is planned
|
|---|
| 9182 | to be introduced in Automake 1.12 (which, as of the writing date,
|
|---|
| 9183 | is still under development and in late alpha state).
|
|---|
| 9184 |
|
|---|
| 9185 | * tests/Makefile.am (TESTS_ENVIRONMENT): The development version of
|
|---|
| 9186 | automake does not support setting the interpreter delegated to run
|
|---|
| 9187 | the test scripts in this variable; instead, use ...
|
|---|
| 9188 | (LOG_COMPILER): ... this variable.
|
|---|
| 9189 | * .gitignore: Ignore `.trs' files in directory `tests/'.
|
|---|
| 9190 | * build-aux/.gitignore: Ignore `test-driver' script.
|
|---|
| 9191 |
|
|---|
| 9192 | 2011-10-03 Eli Zaretskii <eliz@gnu.org>
|
|---|
| 9193 |
|
|---|
| 9194 | dfa: don't mishandle high-bit bytes in a regexp with signed-char
|
|---|
| 9195 | This appears to arise only on systems for which "char" is signed.
|
|---|
| 9196 | * src/dfa.c (FETCH_WC, FETCH): Produce an unsigned value, rather
|
|---|
| 9197 | than a sign-extended one. Fixes a bug on MS-Windows with compiling
|
|---|
| 9198 | patterns that include characters with the 8-th bit set.
|
|---|
| 9199 | (to_uchar): Define. From coreutils.
|
|---|
| 9200 | Reported by David Millis <tvtronix@yahoo.com>.
|
|---|
| 9201 | See http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3893
|
|---|
| 9202 | * NEWS (Bug fixes): Mention it.
|
|---|
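The sign-extension problem fixed in the entry above can be illustrated with a short C sketch. The name to_uchar comes from the entry itself, with a body assumed to follow the coreutils idiom; `fetch_byte` is a hypothetical stand-in for the FETCH macros, whose real definitions are not reproduced here.

```c
/* Convert a char, which may be signed, to unsigned char.  The
   conversion is modular, so a byte such as 0xE9 stays 233 rather than
   becoming -23.  */
static inline unsigned char
to_uchar (char ch)
{
  return ch;
}

/* Hypothetical stand-in for a FETCH-style macro: read one pattern
   byte as a non-negative int, suitable for use as a table index.  */
static int
fetch_byte (const char *p)
{
  return to_uchar (*p);
}
```

Without the conversion, a plain `int c = *p;` on a signed-char platform would yield a negative value for bytes with the high bit set, which is unsafe as an array index.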
| 9203 |
|
|---|
| 9204 | 2011-09-16 Jim Meyering <meyering@redhat.com>
|
|---|
| 9205 |
|
|---|
| 9206 | maint: dfa: simplify multi-byte-related conditionals
|
|---|
| 9207 | * src/dfa.c (setbit_case_fold_c, parse_bracket_exp, lex):
|
|---|
| 9208 | (addtok_mb, dfaparse): Change each "MBS_SUPPORT && MB_CUR_MAX > 1"
|
|---|
| 9209 | test to just "MB_CUR_MAX > 1".
|
|---|
| 9210 | * src/dfasearch.c (kwsincr_case, EGexecute): Likewise.
|
|---|
| 9211 | * src/kwsearch.c (Fcompile, Fexecute): Likewise.
|
|---|
| 9212 | * src/searchutils.c (kwsinit): Likewise.
|
|---|
| 9213 | * src/dfa.c (parse_bracket_exp): Convert
|
|---|
| 9214 | "if (!MBS_SUPPORT || MB_CUR_MAX == 1)" to
|
|---|
| 9215 | "if (MB_CUR_MAX == 1)" and do this:
|
|---|
| 9216 | - assert(!MBS_SUPPORT || MB_CUR_MAX == 1);
|
|---|
| 9217 | + assert(MB_CUR_MAX == 1);
|
|---|
| 9218 |
|
|---|
| 9219 | maint: dfa: simplify several expressions
|
|---|
| 9220 | * src/dfa.c (dfainit): Set d->mb_cur_max unconditionally, now
|
|---|
| 9221 | that MB_CUR_MAX is always usable. With that, simplify all
|
|---|
| 9222 | "MBS_SUPPORT && d->mb_cur_max > 1" to simply "d->mb_cur_max > 1".
|
|---|
| 9223 | (dfastate, dfaexec, dfainit, dfafree): Simplify, removing each
|
|---|
| 9224 | now-unnecessary "MBS_SUPPORT &&".
|
|---|
| 9225 |
|
|---|
| 9226 | maint: dfa: avoid in-function "#if MBS_SUPPORT" tests
|
|---|
| 9227 | * src/dfa.c (setbit_case_fold_c): Remove "#if MBS_SUPPORT" in favor
|
|---|
| 9228 | of simple "if (MBS_SUPPORT ...".
|
|---|
| 9229 | (dfaexec, addtok): Likewise.
|
|---|
| 9230 |
|
|---|
| 9231 | maint: ensure that MB_CUR_MAX is defined even when !MBS_SUPPORT
|
|---|
| 9232 | * src/mbsupport.h [!MBS_SUPPORT] (MB_CUR_MAX): Define to 1.
|
|---|
| 9233 |
|
|---|
| 9234 | build: fix compilation failure when MBS_SUPPORT is 0
|
|---|
| 9235 | * src/dfa.c (add_utf8_anychar): Always compile this function,
|
|---|
| 9236 | but when MBS_SUPPORT is 0, give it an empty body.
|
|---|
| 9237 | (prepare_wc_buf): Likewise.
|
|---|
| 9238 | [! MBS_SUPPORT] (setbit_wc): Define to always abort.
|
|---|
| 9239 |
|
|---|
| 9240 | maint: dfa: simplify dfaoptimize
|
|---|
| 9241 | * src/dfa.c (dfaoptimize): Simplify.
|
|---|
| 9242 | (dfacomp): Remove now-redundant "if (MBS_SUPPORT)" guard,
|
|---|
| 9243 | since dfaoptimize does nothing if !MBS_SUPPORT.
|
|---|
| 9244 |
|
|---|
| 9245 | maint: dfa: remove some #if MBS_SUPPORT guards
|
|---|
| 9246 | * src/dfa.c: Replace a few "#if MBS_SUPPORT" directives with
|
|---|
| 9247 | "if (MBS_SUPPORT)". Remove some altogether.
|
|---|
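The conversion these entries describe, from a preprocessor "#if MBS_SUPPORT" to an ordinary "if (MBS_SUPPORT ...)", can be sketched as below. This is an illustration only: `use_multibyte_path` is a hypothetical function, and the sketch assumes MBS_SUPPORT is always defined to 0 or 1, as a later entry in this log states.

```c
#include <stdbool.h>
#include <stdlib.h>   /* MB_CUR_MAX */

/* Assumption for this sketch: MBS_SUPPORT is always defined, to 0 or 1.  */
#ifndef MBS_SUPPORT
# define MBS_SUPPORT 0
#endif

/* Hypothetical function showing the converted style.  Both branches
   are always parsed and type-checked; when MBS_SUPPORT is 0 the
   compiler can discard the multibyte branch as dead code.  */
static bool
use_multibyte_path (void)
{
  if (MBS_SUPPORT && MB_CUR_MAX > 1)
    return true;
  return false;
}
```

The practical benefit is that code for the disabled configuration keeps compiling, so it is less likely to rot unnoticed than code hidden behind "#if".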
| 9248 |
|
|---|
| 9249 | maint: dfa: convert #if-MBS_SUPPORT (dfastate)
|
|---|
| 9250 | * src/dfa.c (dfastate): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9251 |
|
|---|
| 9252 | maint: dfa: convert #if-MBS_SUPPORT (dfastate)
|
|---|
| 9253 | * src/dfa.c (dfastate): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9254 |
|
|---|
| 9255 | maint: dfa: convert #if-MBS_SUPPORT (state_index)
|
|---|
| 9256 | * src/dfa.c (state_index): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9257 |
|
|---|
| 9258 | maint: dfa: convert #if-MBS_SUPPORT (dfaparse)
|
|---|
| 9259 | * src/dfa.c (dfaparse): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9260 |
|
|---|
| 9261 | maint: dfa: convert #if-MBS_SUPPORT (copytoks)
|
|---|
| 9262 | * src/dfa.c (copytoks): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9263 |
|
|---|
| 9264 | maint: dfa: convert #if-MBS_SUPPORT (lex)
|
|---|
| 9265 | * src/dfa.c (lex): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9266 |
|
|---|
| 9267 | maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp)
|
|---|
| 9268 | * src/dfa.c (parse_bracket_exp): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9269 |
|
|---|
| 9270 | maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp)
|
|---|
| 9271 | * src/dfa.c (parse_bracket_exp): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9272 |
|
|---|
| 9273 | maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp)
|
|---|
| 9274 | * src/dfa.c (parse_bracket_exp): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9275 |
|
|---|
| 9276 | maint: dfa: convert #if-MBS_SUPPORT (dfaexec)
|
|---|
| 9277 | * src/dfa.c (dfaexec): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9278 |
|
|---|
| 9279 | maint: dfa: convert #if-MBS_SUPPORT (dfaexec)
|
|---|
| 9280 | * src/dfa.c (dfaexec): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9281 | Also add curly braces around multi-line if/else blocks.
|
|---|
| 9282 |
|
|---|
| 9283 | maint: dfa: remove #if-MBS_SUPPORT (free_mbdata)
|
|---|
| 9284 | * src/dfa.c (free_mbdata): Remove the #if guard altogether.
|
|---|
| 9285 |
|
|---|
| 9286 | maint: dfa: convert #if-MBS_SUPPORT (dfaoptimize, dfacomp)
|
|---|
| 9287 | * src/dfa.c (dfaoptimize, dfacomp): Use regular "if",
|
|---|
| 9288 | not #if MBS_SUPPORT.
|
|---|
| 9289 |
|
|---|
| 9290 | maint: dfa: convert #if-MBS_SUPPORT (dfafree)
|
|---|
| 9291 | * src/dfa.c (dfafree): Use regular "if", not #if MBS_SUPPORT.
|
|---|
| 9292 |
|
|---|
| 9293 | maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp, part1)
|
|---|
| 9294 | * src/dfa.c (parse_bracket_exp): Remove in-function #if MBS_SUPPORT.
|
|---|
| 9295 |
|
|---|
| 9296 | maint: remove #if-MBS_SUPPORT declaration guards
|
|---|
| 9297 | * src/search.h: Don't bother to #if-out declarations.
|
|---|
| 9298 |
|
|---|
| 9299 | maint: convert #if-MBS_SUPPORT (EGexecute)
|
|---|
| 9300 | * src/dfasearch.c (EGexecute): Remove in-function #if MBS_SUPPORT.
|
|---|
| 9301 |
|
|---|
| 9302 | maint: convert #if-MBS_SUPPORT (kwsincr_case)
|
|---|
| 9303 | * src/dfasearch.c (kwsincr_case): Remove in-function #if MBS_SUPPORT.
|
|---|
| 9304 | Move decl's down.
|
|---|
| 9305 |
|
|---|
| 9306 | maint: convert #if-MBS_SUPPORT (Fcompile, etc.)
|
|---|
| 9307 | * src/kwsearch.c (Fcompile, Fexecute): Remove in-function #if MBS_SUPPORT.
|
|---|
| 9308 | (Fcompile): Rearrange some declarations. No semantic change.
|
|---|
| 9309 |
|
|---|
| 9310 | maint: convert #if-MBS_SUPPORT (kwsinit)
|
|---|
| 9311 | * src/searchutils.c (kwsinit): Remove in-function #if MBS_SUPPORT.
|
|---|
| 9312 |
|
|---|
| 9313 | maint: dfa: remove case-guarding #if-MBS_SUPPORT
|
|---|
| 9314 | * src/dfa.c [DEBUG] (prtok): Remove now-useless #if-MBS_SUPPORT.
|
|---|
| 9315 |
|
|---|
| 9316 | 2011-09-15 Jim Meyering <meyering@redhat.com>
|
|---|
| 9317 |
|
|---|
| 9318 | maint: remove #if MBS_SUPPORT around member declaration
|
|---|
| 9319 | * src/dfa.c (dfastate): Don't #ifdef-out "mbps" position_set member.
|
|---|
| 9320 |
|
|---|
| 9321 | maint: dfa: remove #if MBS_SUPPORT around struct definition
|
|---|
| 9322 | * src/dfa.c (struct mb_char_classes): Don't #ifdef-out declarations.
|
|---|
| 9323 |
|
|---|
| 9324 | build: avoid compilation failure when building without PCRE support
|
|---|
| 9325 | * src/pcresearch.c [!HAVE_LIBPCRE] (WITHOUT_PCRE_NORETURN): Define
|
|---|
| 9326 | to _Noreturn, not obsoleted-by-gnulib _GL_ATTRIBUTE_NORETURN.
|
|---|
| 9327 | Reported by Eric Blake.
|
|---|
| 9328 |
|
|---|
| 9329 | tests: stop using skip_test_; use skip_ instead
|
|---|
| 9330 | * tests/init.cfg (skip_test_): Remove definition. Use the improved
|
|---|
| 9331 | skip_ function from init.sh, now that it has the same feature.
|
|---|
| 9332 | * tests/euc-mb: s/skip_test_/skip_/
|
|---|
| 9333 | * tests/sjis-mb: Likewise.
|
|---|
| 9334 | * tests/fmbtest: Likewise.
|
|---|
| 9335 |
|
|---|
| 9336 | tests: skip tests that require MBS support
|
|---|
| 9337 | * tests/init.cfg (require_compiled_in_MB_support): New function.
|
|---|
| 9338 | * tests/char-class-multibyte: Use it here, since this test cannot
|
|---|
| 9339 | succeed without MBS support.
|
|---|
| 9340 | * tests/equiv-classes: Likewise.
|
|---|
| 9341 | * tests/euc-mb: Likewise.
|
|---|
| 9342 | * tests/fgrep-infloop: Likewise.
|
|---|
| 9343 | * tests/init.cfg: Likewise.
|
|---|
| 9344 | * tests/prefix-of-multibyte: Likewise.
|
|---|
| 9345 | * tests/turkish-I: Likewise.
|
|---|
| 9346 | * tests/sjis-mb: Likewise.
|
|---|
| 9347 |
|
|---|
| 9348 | tests: make fmbtest explain (to stderr, not log) why it is skipped
|
|---|
| 9349 | * tests/fmbtest: Use skip_ and fail_ to give better diagnostics.
|
|---|
| 9350 |
|
|---|
| 9351 | maint: dfa: improve comments
|
|---|
| 9352 | * src/dfa.c (match_mb_charset, match_anychar): Improve comments.
|
|---|
| 9353 |
|
|---|
| 9354 | 2011-09-14 Jim Meyering <meyering@redhat.com>
|
|---|
| 9355 |
|
|---|
| 9356 | build: update gnulib submodule to newer
|
|---|
| 9357 |
|
|---|
| 9358 | maint: correct indentation
|
|---|
| 9359 | * src/dfa.c (dfaexec): Reposition curly braces to match indentation style.
|
|---|
| 9360 | Remove useless comment.
|
|---|
| 9361 |
|
|---|
| 9362 | maint: move declaration "down" to inner scope where it is used
|
|---|
| 9363 | * src/dfa.c (dfaexec): Move decl of local down into scope where used.
|
|---|
| 9364 |
|
|---|
| 9365 | 2011-09-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 9366 |
|
|---|
| 9367 | doc: use "file name" consistently in grep's --help output
|
|---|
| 9368 | * src/main.c (usage): Use "file name", not "filename" in descriptions
|
|---|
| 9369 | of --with-filename (-H), --no-filename (-h) and --label=LABEL.
|
|---|
| 9370 | Suggested by Sequoia McDowell.
|
|---|
| 9371 |
|
|---|
| 9374 | 2011-08-31 Matthew Burgess <matthew@linuxfromscratch.org>
|
|---|
| 9375 |
|
|---|
| 9376 | tests: remove debug code that would cp to /t
|
|---|
| 9377 | * tests/unibyte-bracket-expr: Remove debug artifact introduced
|
|---|
| 9378 | by 2011-06-02 commit de5f7000, "tests: exercise a uni-byte [...]
|
|---|
| 9379 | bug: requires ru_RU.KOI8-R". [bug introduced in grep-2.9]
|
|---|
| 9380 |
|
|---|
| 9381 | 2011-08-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 9382 |
|
|---|
| 9383 | build: use largefile module and update to latest gnulib
|
|---|
| 9384 | * configure.ac: Remove AC_SYS_LARGEFILE, subsumed by ...
|
|---|
| 9385 | * bootstrap.conf (gnulib_modules): ...this. Use largefile module.
|
|---|
| 9386 | * gnulib: Update to latest.
|
|---|
| 9387 |
|
|---|
| 9388 | maint: clean up and plug a leak-on-OOM
|
|---|
| 9389 | * src/dfa.c (icatalloc): Clean up; use xrealloc in place of malloc
|
|---|
| 9390 | and realloc; remove conditionals that are unnecessary, now that
|
|---|
| 9391 | failed allocation results in exit.
|
|---|
| 9392 | (enlist): Use xrealloc in place of realloc; remove conditional.
|
|---|
| 9393 | (comsubs): Avoid leak upon failed enlist call.
|
|---|
| 9394 | (dfamust): Use xmalloc in place of malloc.
|
|---|
| 9395 | Remove conditionals, now that icpyalloc and icatalloc never return NULL.
|
|---|
| 9396 |
|
|---|
| 9397 | maint: use x2nrealloc, not xrealloc
|
|---|
| 9398 | * src/main.c (main): Use x2nrealloc, not xrealloc.
|
|---|
| 9399 |
|
|---|
| 9400 | 2011-07-24 Jim Meyering <meyering@redhat.com>
|
|---|
| 9401 |
|
|---|
| 9402 | tests: add a test to trigger the bug
|
|---|
| 9403 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9404 | * tests/in-eq-out-infloop: Exercise the bug/fix.
|
|---|
| 9405 |
|
|---|
| 9406 | exit 2 (rather than infloop) when an input file is also on stdout
|
|---|
| 9407 | This avoids a potential "infinite" disk-filling loop.
|
|---|
| 9408 | Reported in http://savannah.gnu.org/patch/?5316
|
|---|
| 9409 | and http://savannah.gnu.org/bugs/?17457.
|
|---|
| 9410 | * src/main.c: Include "quote.h".
|
|---|
| 9411 | (out_stat): New global.
|
|---|
| 9412 | (grepfile): Compare each regular file's dev/ino/etc.
|
|---|
| 9413 | with those from the file on stdout (if it too is regular).
|
|---|
| 9414 | (main): Set out_stat, if stdout is a regular file.
|
|---|
| 9415 | * src/system.h: Include "same-inode.h".
|
|---|
| 9416 | (same_file_attributes): Define. From diffutils.
|
|---|
| 9417 | (SAME_REGULAR_FILE): Define.
|
|---|
| 9418 | * bootstrap.conf (gnulib_modules): Use quote, not quotearg.
|
|---|
| 9419 | Use same-inode.
|
|---|
| 9420 | * NEWS (Bug fixes): Mention it.
|
|---|
| 9421 |
|
|---|
| 9422 | 2011-07-15 Reuben Thomas <rrt@sc3d.org>
|
|---|
| 9423 |
|
|---|
| 9424 | doc: improve documentation of character classes in the man page
|
|---|
| 9425 | * doc/grep.in.1: Reword documentation of character classes.
|
|---|
| 9426 |
|
|---|
| 9427 | 2011-07-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 9428 |
|
|---|
| 9429 | dfa: remove unnecessary inclusion of verify.h
|
|---|
| 9430 | * src/dfa.c: Don't include "verify.h".
|
|---|
| 9431 |
|
|---|
| 9432 | dfa: simplify use of *ALLOC macros
|
|---|
| 9433 | * src/dfa.c (XNMALLOC, XCALLOC): Redefine without outer cast-to-(t *).
|
|---|
| 9434 | (CALLOC, MALLOC, REALLOC): Remove type "t" parameter and adjust callers.
|
|---|
| 9435 |
|
|---|
| 9436 | dfa: change semantics of REALLOC_IF_NECESSARY's 3rd parameter
|
|---|
| 9437 | * src/dfa.c (REALLOC_IF_NECESSARY): Change meaning of 3rd param,
|
|---|
| 9438 | from "maximum index" to 1 greater than that: the required number
|
|---|
| 9439 | of *P-sized elements. Note that only some of the uses of
|
|---|
| 9440 | REALLOC_IF_NECESSARY needed to be adjusted; the others had already
|
|---|
| 9441 | required an extra element.
|
|---|
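The "required number of elements" convention can be illustrated with a function instead of a macro. This is a sketch under stated assumptions, not the REALLOC_IF_NECESSARY definition: the name ensure_capacity, the doubling growth policy, and the error return are all chosen for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: make sure *P can hold at least N_REQUIRED
   elements of ELEM_SIZE bytes each.  *N_ALLOC is the current capacity,
   counted in elements.  N_REQUIRED is a count, not a maximum index: to
   store into index I the caller passes I + 1.
   Returns 0 on success, -1 on overflow or allocation failure.  */
static int
ensure_capacity (void **p, size_t *n_alloc, size_t n_required,
                 size_t elem_size)
{
  if (n_required <= *n_alloc)
    return 0;

  /* Grow by doubling until the request fits.  */
  size_t new_n_alloc = *n_alloc ? *n_alloc : 1;
  while (new_n_alloc < n_required)
    {
      if (new_n_alloc > SIZE_MAX / 2)
        return -1;
      new_n_alloc *= 2;
    }

  /* Reject a byte count that would overflow size_t.  */
  if (elem_size != 0 && new_n_alloc > SIZE_MAX / elem_size)
    return -1;

  void *q = realloc (*p, new_n_alloc * elem_size);
  if (q == NULL)
    return -1;

  *p = q;
  *n_alloc = new_n_alloc;
  return 0;
}
```

Passing a count rather than a maximum index removes the "+ 1" that callers would otherwise have to remember, which is the kind of off-by-one hazard the entry is addressing.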
| 9442 |
|
|---|
| 9443 | dfa: rename REALLOC_IF_NECESSARY param/local for clarity
|
|---|
| 9444 | * src/dfa.c (REALLOC_IF_NECESSARY): Rename nalloc and new_nalloc
|
|---|
| 9445 | to n_alloc and new_n_alloc.
|
|---|
| 9446 |
|
|---|
| 9447 | dfa: prepare for a semantic change in REALLOC_IF_NECESSARY
|
|---|
| 9448 | * src/dfa.c (REALLOC_IF_NECESSARY): Remove "t" (type) parameter.
|
|---|
| 9449 | Use (*p) instead. Adjust all callers.
|
|---|
| 9450 |
|
|---|
| 9451 | dfa: add braces to REALLOC_IF_NECESSARY definition
|
|---|
| 9452 | * src/dfa.c (REALLOC_IF_NECESSARY): Add curly braces; use TABs
|
|---|
| 9453 | to right-indent.
|
|---|
| 9454 |
|
|---|
| 9455 | 2011-06-28 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9456 |
|
|---|
| 9457 | doc: improve documentation of character classes
|
|---|
| 9458 | * doc/grep.texi (Character classes): Mention explicitly when
|
|---|
| 9459 | examples refer to the C locale, explain better the general
|
|---|
| 9460 | meaning of character classes.
|
|---|
| 9461 |
|
|---|
| 9462 | 2011-06-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 9463 |
|
|---|
| 9464 | dfa: fix the root cause of the heap overrun
|
|---|
| 9465 | dfa's "insert" function was supposed to be maintaining the position
|
|---|
| 9466 | list sorted on *decreasing* index, but since the 2009-12-09 "Speed
|
|---|
| 9467 | up insert" commit, 62458291, it was using code that assumed the data
|
|---|
| 9468 | were sorted on *increasing* index. As such, sometimes it would no
|
|---|
| 9469 | longer merge constraints (not finding a match) and would append
|
|---|
| 9470 | entries that normally would have matched and been merged. Those
|
|---|
| 9471 | erroneous append operations resulted in the heap overrun fixed by
|
|---|
| 9472 | 2011-06-17 commit 0b91d692 by doubling the array size.
|
|---|
| 9473 | * src/dfa.c (insert): Fix the comparison.
|
|---|
| 9474 | (dfaanalyze): Now that that's fixed, revert commit 0b91d692,
|
|---|
| 9475 | allocating space for only d->nleaves entries, not double that.
|
|---|
| 9476 | As far as I can tell, this change has no effect other than
|
|---|
| 9477 | decreased memory usage, although it may improve performance
|
|---|
| 9478 | slightly, since the resulting list of positions is half as long
|
|---|
| 9479 | as it used to be.
|
|---|
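The invariant described above, a position list kept sorted on decreasing index with equal-index entries merged, can be sketched as follows. This is not the insert function from src/dfa.c: the types pos_t and pos_set_t, the fixed capacity, and the linear scan are simplifications assumed for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified types; not the real dfa.c definitions.  */
typedef struct { size_t index; unsigned int constraint; } pos_t;
typedef struct { pos_t elems[16]; size_t nelem; } pos_set_t;

/* Insert P into S, which is kept sorted on DECREASING index.  If an
   element with the same index already exists, merge by OR-ing the
   constraints instead of adding a duplicate entry.
   Returns 0 on success, -1 if the set is full.  */
static int
pos_insert (pos_t p, pos_set_t *s)
{
  size_t i = 0;

  /* Skip the elements whose index is greater than P's.  Comparing in
     the wrong direction here would miss existing matches and lead to
     duplicate entries being added.  */
  while (i < s->nelem && s->elems[i].index > p.index)
    i++;

  if (i < s->nelem && s->elems[i].index == p.index)
    {
      s->elems[i].constraint |= p.constraint;
      return 0;
    }

  if (s->nelem == sizeof s->elems / sizeof s->elems[0])
    return -1;

  /* Shift the tail right by one slot and store P at position I.  */
  for (size_t j = s->nelem; j > i; j--)
    s->elems[j] = s->elems[j - 1];
  s->elems[i] = p;
  s->nelem++;
  return 0;
}
```

Because matching entries are merged rather than duplicated, a set built this way holds at most one entry per distinct index.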
| 9480 |
|
|---|
| 9481 | 2011-06-28 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9482 |
|
|---|
| 9483 | dfa: use memcpy to copy position_sets
|
|---|
| 9484 | * src/dfa.c (copy): Use memcpy.
|
|---|
| 9485 |
|
|---|
| 9486 | dfa: use copyset to copy charclasses
|
|---|
| 9487 | * src/dfa.c (add_utf8_anychar): Change memcpy to copyset.
|
|---|
| 9488 |
|
|---|
| 9489 | gnulib: Update
|
|---|
| 9490 | Fixes mmap-anon.m4 conflict with fn_grep, reported by Rainer Orth.
|
|---|
| 9491 |
|
|---|
| 9492 | 2011-06-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 9493 |
|
|---|
| 9494 | maint: update bootstrap from gnulib
|
|---|
| 9495 | * bootstrap: Update to latest, so it no longer inserts empty lines
|
|---|
| 9496 | in .gitignore files.
|
|---|
| 9497 | * .gitignore: Let bootstrap move "!..." lines to end of file.
|
|---|
| 9498 |
|
|---|
| 9499 | post-release administrivia
|
|---|
| 9500 | * NEWS: Add header line for next release.
|
|---|
| 9501 | * .prev-version: Record previous version.
|
|---|
| 9502 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 9503 |
|
|---|
| 9504 | version 2.9
|
|---|
| 9505 | * NEWS: Record release date.
|
|---|
| 9506 |
|
|---|
| 9507 | build: avoid a warning when building with --disable-perl-regexp...
|
|---|
| 9508 | and --enable-gcc-warnings.
|
|---|
| 9509 | * src/pcresearch.c (WITHOUT_PCRE_NORETURN): Define.
|
|---|
| 9510 | Remove the unreachable return statement.
|
|---|
| 9511 | Reported by Eric Blake.
|
|---|
| 9512 |
|
|---|
| 9513 | tests: ensure that each test script is executable
|
|---|
| 9514 | This adds a rule run at "make check" time to ensure that
|
|---|
| 9515 | test scripts are consistently executable.
|
|---|
| 9516 | This change is not required for "make check", but makes it easier
|
|---|
| 9517 | for people to run scripts manually, although that is discouraged because
|
|---|
| 9518 | doing so makes it easy to omit important variable settings that
|
|---|
| 9519 | are normally provided via TESTS_ENVIRONMENT.
|
|---|
| 9520 | This change also makes each of the existing TESTS executable.
|
|---|
| 9521 | * tests/Makefile.am (check_executable_TESTS): New rule.
|
|---|
| 9522 | (check): Depend on it.
|
|---|
| 9523 | * tests/{all_scripts}: chmod 755.
|
|---|
| 9524 | Prompted by a report from Eric Blake.
|
|---|
| 9525 |
|
|---|
| 9526 | maint: update bootstrap from gnulib
|
|---|
| 9527 | * bootstrap: Update from gnulib.
|
|---|
| 9528 |
|
|---|
| 9529 | maint: update po/POTFILES.in
|
|---|
| 9530 | * po/POTFILES.in: Remove dfasearch.c, now that it no longer
|
|---|
| 9531 | contains a translatable diagnostic.
|
|---|
| 9532 |
|
|---|
| 9533 | tests: include-exclude: avoid false positive failure on FreeBSD
|
|---|
| 9534 | * tests/include-exclude: Avoid false-positive failure due to
|
|---|
| 9535 | matching "a" in a directory on FreeBSD, when searching a directory
|
|---|
| 9536 | without "-r". Search for '^aaa$' rather than just 'a'.
|
|---|
| 9537 | Adjust test inputs and expected output files accordingly.
|
|---|
| 9538 |
|
|---|
| 9539 | dfa: remove some useless casts
|
|---|
| 9540 | * src/dfa.c (icatalloc): Change type of "old" parameter
|
|---|
| 9541 | from "char const *" to "char *".
|
|---|
| 9542 | Don't cast-away const on realloc argument.
|
|---|
| 9543 | Remove now-unnecessary const-discarding cast.
|
|---|
| 9544 | Don't (void)-cast strcpy result.
|
|---|
| 9545 | * src/dosbuf.c (undossify_input): Remove anachronistic
|
|---|
| 9546 | cast-to-"char *" of realloc argument.
|
|---|
| 9547 |
|
|---|
| 9548 | dfa: more heap-allocation-related overflow protection
|
|---|
| 9549 | * src/dfa.c (enlist): Use xnrealloc, not realloc.
|
|---|
| 9550 | Also, remove unnecessary cast-to-(char *).
|
|---|
| 9551 | (dfamust): Use xnmalloc, not malloc. Before, this code would
|
|---|
| 9552 | return upon malloc failure (xnmalloc exits upon failure), but
|
|---|
| 9553 | later, via the *ALLOC macros, it could already exit, so this
|
|---|
| 9554 | new potential exit point is nothing new. The same applies
|
|---|
| 9555 | to enlist, since it is called only through dfamust.
|
|---|
| 9556 |
|
|---|
| 9557 | tests: update init.sh; simplify TESTS_ENVIRONMENT
|
|---|
| 9558 | * tests/init.sh: Update from coreutils.
|
|---|
| 9559 | * tests/Makefile.am (TESTS_ENVIRONMENT): Remove shell_or_perl_
|
|---|
| 9560 | function. Instead, just use $(SHELL), since grep has no test
|
|---|
| 9561 | that starts with #!/usr/bin/perl.
|
|---|
| 9562 |
|
|---|
| 9563 | 2011-06-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 9564 |
|
|---|
| 9565 | build: update gnulib submodule to latest
|
|---|
| 9566 |
|
|---|
| 9567 | build: avoid configure/gnulib-related errors
|
|---|
| 9568 | * bootstrap.conf: Remove now-unnecessary code to exclude
|
|---|
| 9569 | gettext/intl-related m4 tests.
|
|---|
| 9570 |
|
|---|
| 9571 | 2011-06-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 9572 |
|
|---|
| 9573 | maint: tighten up superfluous code
|
|---|
| 9574 | * src/main.c (parse_grep_colors): Use xstrdup in place of xmalloc,
|
|---|
| 9575 | a useless test, strlen, and strcpy.
|
|---|
| 9576 |
|
|---|
| 9577 | 2011-06-19 Paul Eggert <eggert@cs.ucla.edu>
|
|---|
| 9578 |
|
|---|
| 9579 | dfa: avoid possibility of overflow
|
|---|
| 9580 | * src/dfa.c (REALLOC_IF_NECESSARY, CALLOC, MALLOC, REALLOC):
|
|---|
| 9581 | Use functions from xalloc.h to avoid overflow.
|
|---|
| 9582 | * src/dfasearch.c (GEAcompile): Use xnrealloc rather than realloc.
|
|---|
| 9583 | * src/pcresearch.c (Pcompile): Use xnmalloc, not xmalloc.
|
|---|
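The overflow that the xalloc.h functions guard against is the multiplication of an element count by an element size. A minimal sketch of that check follows. The name checked_nmalloc is hypothetical, and unlike xnmalloc, which exits on failure, this version returns NULL.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for N objects of S bytes each,
   refusing the request when N * S would overflow size_t.  A plain
   malloc (n * s) would instead wrap around and return a buffer that is
   far smaller than the caller expects.  */
static void *
checked_nmalloc (size_t n, size_t s)
{
  if (s != 0 && n > SIZE_MAX / s)
    return NULL;
  return malloc (n * s);
}
```

The division-based test avoids performing the overflowing multiplication in the first place, so it is valid for any pair of size_t values.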
| 9584 |
|
|---|
| 9585 | 2011-06-17 Jim Meyering <meyering@redhat.com>
|
|---|
| 9586 |
|
|---|
| 9587 | build: update gnulib submodule to latest
|
|---|
| 9588 |
|
|---|
| 9589 | dfa: correct two uses of btowc
|
|---|
| 9590 | * src/dfa.c (setbit_c, setbit_case_fold_c): Compare the btowc
|
|---|
| 9591 | return value against WEOF, not EOF. Suggested by Eli Zaretskii.
|
|---|
| 9592 | On a system like MinGW with unsigned wint_t, comparing a btowc
|
|---|
| 9593 | return value against EOF (-1) would always be false.
|
|---|
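The corrected comparison can be sketched as follows. The function name byte_is_char is hypothetical and is not taken from src/dfa.c; btowc and WEOF are the standard <wchar.h> interfaces.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: return nonzero if the single byte B is a
   complete character in the current locale.  btowc returns a wint_t,
   whose failure value is WEOF.  Comparing the result against EOF (an
   int, -1) is the mistake the entry describes.  */
static int
byte_is_char (unsigned char b)
{
  wint_t wc = btowc (b);
  return wc != WEOF;
}
```

Using the sentinel that matches the return type keeps the test independent of how a platform defines wint_t.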
| 9594 |
|
|---|
| 9595 | dfa: don't overrun a malloc'd buffer for certain regexps
|
|---|
| 9596 | * src/dfa.c (dfaanalyze): Allocate space for twice as many
|
|---|
| 9597 | positions as there are leaves. Before this change, for some
|
|---|
| 9598 | regular expressions, DFA analysis would have inserted far more
|
|---|
| 9599 | "positions" than dfa->nleaves (up to double).
|
|---|
| 9600 | Reported by Raymond Russell in http://savannah.gnu.org/bugs/?33547
|
|---|
| 9601 | * tests/dfa-heap-overrun: Trigger the overrun.
|
|---|
| 9602 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9603 | * NEWS (Bug fixes): Mention it.
|
|---|
| 9604 |
|
|---|
| 9605 | 2011-06-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 9606 |
|
|---|
| 9607 | tests: don't ignore sjis-mb test failure
|
|---|
| 9608 | I made changes that caused grep to segfault during "make check" --
|
|---|
| 9609 | as seen in dmesg output -- yet no test failed(!), and there was no
|
|---|
| 9610 | trace of the segfault in the logs.
|
|---|
| 9611 | * tests/sjis-mb (test_grep_reject): Ensure that output is empty.
|
|---|
| 9612 | Don't ignore test failure.
|
|---|
| 9613 |
|
|---|
| 9614 | 2011-06-07 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9615 |
|
|---|
| 9616 | dfa: optimize wide characters in a bracket expression
|
|---|
| 9617 | * src/dfa.c (addtok): Compile characters to an alternation. Handle the
|
|---|
| 9618 | case when nothing else remains in the MBCSET.
|
|---|
| 9619 |
|
|---|
| 9620 | dfa: refactor to prepare for upcoming optimizations
|
|---|
| 9621 | * src/dfa.c (parse_bracket_exp): Move optimization of MBCSET from here...
|
|---|
| 9622 | (addtok): ... to here.
|
|---|
| 9623 |
|
|---|
| 9624 | 2011-06-07 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9625 |
|
|---|
| 9626 | dfa: correct handling of single-byte character ranges
|
|---|
| 9627 | This provides a better fix for the unibyte-bracket-expr and high-bit-range
|
|---|
| 9628 | testcases, and fixes the latent bug tested by bogus-wctob.
|
|---|
| 9629 |
|
|---|
| 9630 | * src/dfa.c (setbit_case_fold): Remove, replace with...
|
|---|
| 9631 | (setbit_wc, setbit_c, setbit_case_fold_c): ... these.
|
|---|
| 9632 | (parse_bracket_exp): Use setbit_case_fold_c when iterating over
|
|---|
| 9633 | single-byte sequences. Use setbit_wc for multi-byte character sets,
|
|---|
| 9634 | and setbit_case_fold_c for single-byte character sets.
|
|---|
| 9635 | (lex): Use setbit_case_fold_c for single-byte character sets.
|
|---|
| 9636 |
|
|---|
| 9637 | 2011-06-07 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9638 |
|
|---|
| 9639 | tests: exercise latent bug in character ranges
|
|---|
| 9640 | * tests/bogus-wctob: New.
|
|---|
| 9641 | * Makefile.am (TESTS): Add it.
|
|---|
| 9642 |
|
|---|
| 9643 | 2011-06-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 9644 |
|
|---|
| 9645 | tests: exercise a uni-byte [...] bug: requires ru_RU.KOI8-R
|
|---|
| 9646 | * tests/unibyte-bracket-expr: New file.
|
|---|
| 9647 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9648 | * init.cfg (require_ru_RU_koi8_r): New function.
|
|---|
| 9649 |
|
|---|
| 9650 | fix the [...] bug also for relatively unusual uni-byte encodings
|
|---|
| 9651 | * src/dfa.c (setbit_case_fold): Also handle uni-byte locales
|
|---|
| 9652 | like the one mentioned in the original report: see 2011-05-07
|
|---|
| 9653 | commit d98338eb. Re-reported by Santiago Ruano Rincón.
|
|---|
| 9654 | Note that most uni-byte locales are not affected.
|
|---|
| 9655 | * NEWS (Bug fixes): Mention it.
|
|---|
| 9656 |
|
|---|
| 9657 | tests: use skip_test_, not skip_
|
|---|
| 9658 | Use skip_test_, not skip_. The former prints its message both to
|
|---|
| 9659 | the log file and to FD 9 (redirected to tty via tests/Makefile.am),
|
|---|
| 9660 | while skip_ prints only to stderr, which goes to the log file.
|
|---|
| 9661 | * tests/init.cfg (skip_test_): New function.
|
|---|
| 9662 | Use skip_test_ in place of skip_ everywhere.
|
|---|
| 9663 | * tests/fmbtest: s/skip_/skip_test_/
|
|---|
| 9664 | * tests/sjis-mb: Likewise.
|
|---|
| 9665 | * tests/euc-mb: Likewise.
|
|---|
| 9666 |
|
|---|
| 9667 | tests: fmbtest: factor
|
|---|
| 9668 | * tests/fmbtest: Factor out locale-name duplication.
|
|---|
| 9669 |
|
|---|
| 9670 | tests: fix skip-inducing typo in fmbtest
|
|---|
| 9671 | * tests/fmbtest: Fix locale name typo (s/cz_CZ/cs_CZ/)
|
|---|
| 9672 | that would cause this test to be skipped every time.
|
|---|
| 9673 |
|
|---|
| 9674 | 2011-06-07 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9675 |
|
|---|
| 9676 | gnulib: adjust included modules
|
|---|
| 9677 | * bootstrap.conf (gnulib_modules): Drop strtoul, rename wctype to
|
|---|
| 9678 | wctype-h.
|
|---|
| 9679 |
|
|---|
| 9680 | 2011-05-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 9681 |
|
|---|
| 9682 | grep -P: don't abort upon exceeding PCRE's backtracking limit
|
|---|
| 9683 | * src/pcresearch.c (Pexecute): Handle PCRE_ERROR_MATCHLIMIT.
|
|---|
| 9684 | * tests/Makefile.am (XFAIL_TESTS): Remove pcre-abort.
|
|---|
| 9685 | * tests/pcre-abort: Expect failure, no output, and increase
|
|---|
| 9686 | the length of the input string, in case the backtracking limit
|
|---|
| 9687 | is ever raised. Adjust comment.
|
|---|
| 9688 | * NEWS (Bug fixes): Mention it.
|
|---|
| 9689 |
|
|---|
| 9690 | tests: show how to make grep -P abort
|
|---|
| 9691 | * tests/pcre-abort: New file.
|
|---|
| 9692 | Minimal testcase by Paolo Bonzini, derived from a report
|
|---|
| 9693 | by www.beaver@list.ru.
|
|---|
| 9694 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9695 | (XFAIL_TESTS): Add it here, too, since this test always fails, for now.
|
|---|
| 9696 |
|
|---|
| 9697 | tests: fix oddities in pcre-z
|
|---|
| 9698 | * tests/pcre-z: Redirect stderr inside $(), not outside.
|
|---|
| 9699 | Remove double quotes around $REGEX (which is just 'a') within
|
|---|
| 9700 | double-quoted "$(...)". Split a long line.
|
|---|
| 9701 |
|
|---|
| 9702 | tests: factor out a new require_pcre_ function
|
|---|
| 9703 | * tests/init.cfg (require_pcre_): New function, factored out of...
|
|---|
| 9704 | * tests/pcre-z: ...here. Use the function.
|
|---|
| 9705 | * tests/pcre: Likewise.
|
|---|
| 9706 |
|
|---|
| 9707 | tests: clean up pcre
|
|---|
| 9708 | * tests/pcre: Skip (don't pass) the test when PCRE support is disabled.
|
|---|
| 9709 | Don't redirect so much to /dev/null, now that all test output goes to
|
|---|
| 9710 | pcre.log. Remove unnecessary braces and diagnostic about failing test.
|
|---|
| 9711 |
|
|---|
| 9712 | 2011-05-13 Jim Meyering <meyering@redhat.com>
|
|---|
| 9713 |
|
|---|
| 9714 | post-release administrivia
|
|---|
| 9715 | * NEWS: Add header line for next release.
|
|---|
| 9716 | * .prev-version: Record previous version.
|
|---|
| 9717 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 9718 |
|
|---|
| 9719 | version 2.8
|
|---|
| 9720 | * NEWS: Record release date.
|
|---|
| 9721 |
|
|---|
| 9722 | build: update gnulib, for fixed getcwd test
|
|---|
| 9723 |
|
|---|
| 9724 | build: update gnulib submodule to latest
|
|---|
| 9725 |
|
|---|
| 9726 | maint: remove syntax-checking sc_tight_scope rule
|
|---|
| 9727 | * src/Makefile.am (sc_tight_scope): Remove rule.
|
|---|
| 9728 | Now it's provided via gnulib's maint.mk.
|
|---|
| 9729 | * cfg.mk (sc_tight_scope): Likewise.
|
|---|
| 9730 |
|
|---|
| 9731 | 2011-05-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 9732 |
|
|---|
| 9733 | maint: use consistent declaration syntax
|
|---|
| 9734 | * src/grep.h (matchers): Declare consistently, so the sc_tight_scope
|
|---|
| 9735 | rule detects this as an extern-marked variable.
|
|---|
| 9736 |
|
|---|
| 9737 | 2011-05-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 9738 |
|
|---|
| 9739 | maint: use gnulib's new readme-release module
|
|---|
| 9740 | * bootstrap.conf (gnulib_modules): Add readme-release.
|
|---|
| 9741 | (bootstrap_epilogue): Add the recommended perl one-liner.
|
|---|
| 9742 | * README-release: Remove file; it is now generated from gnulib.
|
|---|
| 9743 | * .gitignore: Add it.
|
|---|
| 9744 | * gnulib: Update submodule to latest.
|
|---|
| 9745 |
|
|---|
| 9746 | tests: exercise bug with 0x80..0xff in [...]
|
|---|
| 9747 | * tests/high-bit-range: New test, inspired by an example in the
|
|---|
| 9748 | report by Igor O. Ladygin: http://bugs.debian.org/624387,
|
|---|
| 9749 | via Santiago Ruano Rincón's http://savannah.gnu.org/bugs/?33198
|
|---|
| 9750 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9751 |
|
|---|
| 9752 | fix a bug whereby echo c|grep '[c]' would fail for any c in 0x80..0xff
|
|---|
| 9753 | * src/dfa.c (setbit_case_fold) [MBS_SUPPORT]: Set the bit also
|
|---|
| 9754 | when wctob returns EOF.
|
|---|
| 9755 | * NEWS (Bug fixes): Mention it.
|
|---|
| 9756 |
|
|---|
| 9757 | 2011-05-02 Reuben Thomas <rrt@sc3d.org>
|
|---|
| 9758 |
|
|---|
| 9759 | doc: correct comment about mmap
|
|---|
| 9760 | * doc/grep.texi (Other Options) [--mmap]: This option is now
|
|---|
| 9761 | ignored, so using it can have no effect on performance.
|
|---|
| 9762 |
|
|---|
| 9763 | 2011-05-02 Arnold D. Robbins <arnold@skeeve.com>
|
|---|
| 9764 |
|
|---|
| 9765 | build: move add_utf8_anychar into MBS ifdef
|
|---|
| 9766 |
|
|---|
| 9767 | 2011-05-01 Arnold D. Robbins <arnold@skeeve.com>
|
|---|
| 9768 |
|
|---|
| 9769 | maint: remove GAWK ifndef; no longer needed
|
|---|
| 9770 |
|
|---|
| 9771 | 2011-05-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 9772 |
|
|---|
| 9773 | maint: remove now-unnecessary use of gnulib's strtol module
|
|---|
| 9774 | * bootstrap.conf (gnulib_modules): Remove now-obsolete "strtol".
|
|---|
| 9775 |
|
|---|
| 9776 | 2011-04-29 Jim Meyering <meyering@redhat.com>
|
|---|
| 9777 |
|
|---|
| 9778 | maint: tweak README-release
|
|---|
| 9779 | * README-release: Add note to check the NixOS/Hydra autobuilder results.
|
|---|
| 9780 |
|
|---|
| 9781 | 2011-04-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 9782 |
|
|---|
| 9783 | build: update gnulib submodule to latest
|
|---|
| 9784 |
|
|---|
| 9785 | maint: add the tight_scope syntax-checking rule
|
|---|
| 9786 | This ensures that the only externally scoped symbols are ones
|
|---|
| 9787 | that are explicitly marked as "extern" or white-listed like "main".
|
|---|
| 9788 | * src/Makefile.am (sc_tight_scope): New rule, copied from coreutils.
|
|---|
| 9789 | * cfg.mk (sc_tight_scope): Define, to hook to it from the top level.
|
|---|
| 9790 |
|
|---|
| 9791 | maint: mark some function declarations as extern
|
|---|
| 9792 | * src/search.h: Add "extern" keyword to each function declaration.
|
|---|
| 9793 |
|
|---|
| 9794 | 2011-04-23 Jim Meyering <meyering@redhat.com>
|
|---|
| 9795 |
|
|---|
| 9796 | maint: fix doubled-word typos in comments
|
|---|
| 9797 | * src/dfa.c (SUCCEEDS_IN_CONTEXT): Remove doubled "a".
|
|---|
| 9798 | * src/dfa.c (BACKREF): s/it it/it is/
|
|---|
| 9799 |
|
|---|
| 9800 | 2011-04-09 Jim Meyering <meyering@redhat.com>
|
|---|
| 9801 |
|
|---|
| 9802 | maint: fix typos in comments: s/can not/cannot/
|
|---|
| 9803 | * src/dfa.c (check_matching_with_multibyte_ops, dfastate): As above.
|
|---|
| 9804 |
|
|---|
| 9805 | 2011-03-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 9806 |
|
|---|
| 9807 | maint: stop using .x-sc_* files to list syntax-check exemptions
|
|---|
| 9808 | Instead, use the new mechanism, in which you merely set a
|
|---|
| 9809 | variable (derived from the rule name) in cfg.mk to an ERE
|
|---|
| 9810 | matching the exempted file names.
|
|---|
| 9811 | * gnulib: Update to latest, to get maint.mk that implements this.
|
|---|
| 9812 | * .x-sc_bindtextdomain: Remove file.
|
|---|
| 9813 | * .x-sc_prohibit_tab_based_indentation: Likewise.
|
|---|
| 9814 | * .x-sc_prohibit_xalloc_without_use: Likewise.
|
|---|
| 9815 | * .x-sc_space_tab: Likewise.
|
|---|
| 9816 | * cfg.mk: Define variables to exempt the same files.
|
|---|
| 9817 |
|
|---|
| 9818 | build: correct my change of 2011-01-28
|
|---|
| 9819 | Do not override original dist-hook rule.
|
|---|
| 9820 | * Makefile.am (run-syntax-check): Rename from overriding dist-hook.
|
|---|
| 9821 | (dist-hook): Depend on run-syntax-check.
|
|---|
| 9822 |
|
|---|
| 9823 | 2011-02-27 Jim Meyering <meyering@redhat.com>
|
|---|
| 9824 |
|
|---|
| 9825 | maint: update from gnulib
|
|---|
| 9826 | * bootstrap: Update from gnulib.
|
|---|
| 9827 | * tests/init.sh: Likewise.
|
|---|
| 9828 | * gnulib: Update to latest.
|
|---|
| 9829 |
|
|---|
| 9830 | 2011-01-27 Jim Meyering <meyering@redhat.com>
|
|---|
| 9831 |
|
|---|
| 9832 | build: update gnulib submodule to latest
|
|---|
| 9833 |
|
|---|
| 9834 | build: run syntax-check rules as part of "make dist"
|
|---|
| 9835 | * Makefile.am (dist-hook): Depend on syntax-check.
|
|---|
| 9836 | Suggested by Reuben Thomas.
|
|---|
| 9837 |
|
|---|
| 9838 | 2011-01-26 Jim Meyering <meyering@redhat.com>
|
|---|
| 9839 |
|
|---|
| 9840 | maint: remove unneeded #include directives
|
|---|
| 9841 | * lib/savedir.c: Don't include <stddef.h>. Not needed.
|
|---|
| 9842 | * src/dfa.c: Likewise.
|
|---|
| 9843 |
|
|---|
| 9844 | 2011-01-22 Jim Meyering <meyering@redhat.com>
|
|---|
| 9845 |
|
|---|
| 9846 | build: avoid new syntax-check failures
|
|---|
| 9847 | * .x-sc_bindtextdomain: New file, used to avoid a spurious
|
|---|
| 9848 | failure from the new syntax-check rule.
|
|---|
| 9849 | * NEWS: Remove a trailing space.
|
|---|
| 9850 |
|
|---|
| 9851 | 2011-01-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 9852 |
|
|---|
| 9853 | tests: add a known-to-fail test
|
|---|
| 9854 | * tests/turkish-I: New test.
|
|---|
| 9855 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9856 | (XFAIL_TESTS): Add here, too.
|
|---|
| 9857 | Reported by Ilya Basin.
|
|---|
| 9858 |
|
|---|
| 9859 | maint: sort test names in Makefile.am
|
|---|
| 9860 | * tests/Makefile.am (TESTS): Sort test names.
|
|---|
| 9861 |
|
|---|
| 9862 | 2011-01-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 9863 |
|
|---|
| 9864 | doc: remove erroneous "{,m}" item from grep man page
|
|---|
| 9865 | * doc/grep.in.1: Remove item describing bogus {,m} regex notation.
|
|---|
| 9866 | Reported by Fernando Basso.
|
|---|
| 9867 |
|
|---|
| 9868 | 2011-01-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 9869 |
|
|---|
| 9870 | maint: update copyright year ranges to include 2011
|
|---|
| 9871 | Run "make update-copyright", so "make syntax-check" works in 2011.
|
|---|
| 9872 |
|
|---|
| 9873 | build: update gnulib submodule to latest
|
|---|
| 9874 |
|
|---|
| 9875 | 2010-12-20 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9876 |
|
|---|
| 9877 | main: fix exit status on xmalloc failures
|
|---|
| 9878 | * NEWS: Update.
|
|---|
| 9879 | * src/main.c (main): Set exit_failure. Reported by Guy Shaw.
|
|---|
| 9880 |
|
|---|
| 9881 | add comment above fn_grep
|
|---|
| 9882 | * configure.ac (fn_grep): Add comment suggested by Bruno Haible.
|
|---|
| 9883 |
|
|---|
| 9884 | 2010-11-14 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9885 |
|
|---|
| 9886 | grep: add include guards
|
|---|
| 9887 | * src/system.h: Add multiple inclusion guards.
|
|---|
| 9888 | * src/grep.h: Likewise.
|
|---|
| 9889 |
|
|---|
| 9890 | configure: fix M4 quotation
|
|---|
| 9891 | * configure.ac: Add extra brackets around [...] patterns.
|
|---|
| 9892 |
|
|---|
| 9893 | configure: remove dependency on grep that supports long lines and -e
|
|---|
| 9894 | * configure.ac (fn_grep): New. Set GREP and EGREP to it, replace
|
|---|
| 9895 | with newly-built grep before AC_OUTPUT. Reported by Florin Iucha
|
|---|
| 9896 | <http://savannah.gnu.org/bugs/?31646>.
|
|---|
| 9897 |
|
|---|
| 9898 | 2010-11-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 9899 |
|
|---|
| 9900 | build: update gnulib to latest
|
|---|
| 9901 |
|
|---|
| 9902 | tests: don't hard-code a 5-second timeout; that's not always enough
|
|---|
| 9903 | Instead, time the command in the C locale and use 10 times that
|
|---|
| 9904 | duration -- rounded up to whole seconds -- as the timeout when running
|
|---|
| 9905 | it in the UTF-8 locale.
|
|---|
| 9906 | * tests/backref-multibyte-slow: Compute a performance-relative timeout.
|
|---|
| 9907 | Reported by Gilles Espinasse, regarding an imac 400. For more details,
|
|---|
| 9908 | see http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3360
|
|---|
| 9909 |
|
|---|
| 9910 | 2010-10-09 Jim Meyering <meyering@redhat.com>
|
|---|
| 9911 |
|
|---|
| 9912 | maint: describe policy on copyright year number ranges
|
|---|
| 9913 | * README: Mention coreutils' long-standing policy on use of M-N
|
|---|
| 9914 | ranges in copyright year lists. Requested by Richard Stallman.
|
|---|
| 9915 |
|
|---|
| 9916 | 2010-10-04 Dmitry V. Levin <ldv@altlinux.org>
|
|---|
| 9917 |
|
|---|
| 9918 | build: compile gnulib without -Wcast-align to avoid warnings on ARM
|
|---|
| 9919 | * configure.ac (GNULIB_WARN_CFLAGS): Remove -Wcast-align.
|
|---|
| 9920 |
|
|---|
| 9921 | 2010-09-30 Jim Meyering <meyering@redhat.com>
|
|---|
| 9922 |
|
|---|
| 9923 | maint: don't define gpg_key_ID; it's now obtained automatically
|
|---|
| 9924 | * cfg.mk (gpg_key_ID): Remove definition. No longer needed.
|
|---|
| 9925 |
|
|---|
| 9926 | 2010-09-23 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9927 |
|
|---|
| 9928 | tests: add testcase for previous fix
|
|---|
| 9929 | * tests/inconsistent-ranges: New.
|
|---|
| 9930 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9931 |
|
|---|
| 9932 | 2010-09-23 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9933 |
|
|---|
| 9934 | dfa: process range expressions consistently with system regex
|
|---|
| 9935 | The actual meaning of range expressions in glibc is not exactly strcoll,
|
|---|
| 9936 | which makes the behavior of grep hard to predict when compiled with the
|
|---|
| 9937 | system regex. Leave to the system regex matcher the decision of which
|
|---|
| 9938 | single-byte characters are matched by a range expression.
|
|---|
| 9939 |
|
|---|
| 9940 | This partially reverts a change made in commit 0d38a8bb (which made
|
|---|
| 9941 | sense at the time, but not now that src/dfa.c is not doing multibyte
|
|---|
| 9942 | character set matching anymore).
|
|---|
| 9943 |
|
|---|
| 9944 | * src/dfa.c (in_coll_range): Remove.
|
|---|
| 9945 | (parse_bracket_exp): Use system regex to find which single-char
|
|---|
| 9946 | bytes match a range expression.
|
|---|
| 9947 |
|
|---|
| 9948 | 2010-09-23 Bruno Haible <bruno@clisp.org>
|
|---|
| 9949 |
|
|---|
| 9950 | build: fix link error on systems that have libiconv but not libintl
|
|---|
| 9951 | * src/Makefile.am (LDADD): Add $(LIBICONV).
|
|---|
| 9952 |
|
|---|
| 9953 | 2010-09-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 9954 |
|
|---|
| 9955 | build: avoid compilation failure on the Hurd
|
|---|
| 9956 | * src/dfasearch.c (dfawarn): Rename enum symbols to use DW_ prefix,
|
|---|
| 9957 | so as not to collide with "GNU", which is defined by the Hurd.
|
|---|
| 9958 | Reported by Matthias Lanzinger in http://savannah.gnu.org/bugs/?31096
|
|---|
| 9959 |
|
|---|
| 9960 | 2010-09-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 9961 |
|
|---|
| 9962 | maint: avoid obsolete gnulib modules
|
|---|
| 9963 | * bootstrap.conf (gnulib_modules): Don't use obsolete atexit module.
|
|---|
| 9964 | Use malloc-gnu and realloc-gnu -- malloc and realloc are obsolete.
|
|---|
| 9965 |
|
|---|
| 9966 | maint: update README-release
|
|---|
| 9967 | * README-release: Reflect changes in coreutils' version of this file.
|
|---|
| 9968 |
|
|---|
| 9969 | 2010-09-20 Aharon Robbins <arnold@skeeve.com>
|
|---|
| 9970 |
|
|---|
| 9971 | dfa: fix compilation when not using MBS
|
|---|
| 9972 | * src/dfa.c (prepare_wc_buf) [!MBS_SUPPORT]: Do not compile this
|
|---|
| 9973 | function.
|
|---|
| 9974 |
|
|---|
| 9975 | 2010-09-16 Jim Meyering <meyering@redhat.com>
|
|---|
| 9976 |
|
|---|
| 9977 | post-release administrivia
|
|---|
| 9978 | * NEWS: Add header line for next release.
|
|---|
| 9979 | * .prev-version: Record previous version.
|
|---|
| 9980 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 9981 |
|
|---|
| 9982 | version 2.7
|
|---|
| 9983 | * NEWS: Record release date.
|
|---|
| 9984 |
|
|---|
| 9985 | 2010-09-13 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9986 |
|
|---|
| 9987 | tests: add equiv-classes
|
|---|
| 9988 | * configure.ac (USE_INCLUDED_REGEX): Add Automake conditional.
|
|---|
| 9989 | * tests/equiv-classes: New test.
|
|---|
| 9990 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 9991 | (XFAIL_TESTS) [USE_INCLUDED_REGEX]: Mark it as expected failure.
|
|---|
| 9992 |
|
|---|
| 9993 | 2010-09-13 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 9994 |
|
|---|
| 9995 | dfa: fall back to glibc matcher if a MBCSET is found
|
|---|
| 9996 | This patch enables full support of equivalence classes and multicharacter
|
|---|
| 9997 | collation symbols. It can also reduce performance problems in some
|
|---|
| 9998 | cases for multibyte grep. Both of these changes however depend on the
|
|---|
| 9999 | glibc version installed in the system.
|
|---|
| 10000 |
|
|---|
| 10001 | For UTF-8 it will trigger only in the presence of MBCSET, e.g. [a-z].
|
|---|
| 10002 | For other character sets all brackets and `.` as well will trigger it.
|
|---|
| 10003 |
|
|---|
| 10004 | * NEWS: Document this.
|
|---|
| 10005 | * src/dfa.c (dfaexec): Fall back to glibc for multibyte matches,
|
|---|
| 10006 | if possible.
|
|---|
| 10007 |
|
|---|
| 10008 | 2010-09-13 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10009 |
|
|---|
| 10010 | build: update gnulib submodule to latest
|
|---|
| 10011 | This is done to include commit "regex: Pass the system regex if its only
|
|---|
| 10012 | problem is 32-bit regoff_t".
|
|---|
| 10013 |
|
|---|
| 10014 | * gnulib: Update to e2b0e1a.
|
|---|
| 10015 |
|
|---|
| 10016 | 2010-09-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 10017 |
|
|---|
| 10018 | build: update gnulib submodule to latest
|
|---|
| 10019 |
|
|---|
| 10020 | tests: update init.sh from gnulib
|
|---|
| 10021 | * tests/init.sh: Update from gnulib.
|
|---|
| 10022 |
|
|---|
| 10023 | 2010-09-08 Patrick Boyd <pboyd04@gmail.com>
|
|---|
| 10024 |
|
|---|
| 10025 | dfa: reduce stack usage
|
|---|
| 10026 | * src/dfa.c (dfaanalyze): Allocate GRPS and LABELS arrays from heap,
|
|---|
| 10027 | not on the stack. With this change, grep can now run in these UEFI
|
|---|
| 10028 | simulators:
|
|---|
| 10029 | http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK
|
|---|
| 10030 | http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK2
|
|---|
| 10031 |
|
|---|
| 10032 | 2010-09-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10033 |
|
|---|
| 10034 | tests/portability: avoid spurious failure with OpenBSD's /bin/sh
|
|---|
| 10035 | * tests/warn-char-classes: Don't use "set -x" here. It causes
|
|---|
| 10036 | a spurious test failure on OpenBSD 4.7 when using its /bin/sh,
|
|---|
| 10037 | since the command, /bin/sh -xc 'P=1 : 2> err' emits "P=1" into err.
|
|---|
| 10038 | To enable set -x, run the test with "VERBOSE=yes", e.g.,
|
|---|
| 10039 | make check -C tests TESTS=warn-char-classes VERBOSE=yes
|
|---|
| 10040 |
|
|---|
| 10041 | 2010-09-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 10042 |
|
|---|
| 10043 | build: update gnulib submodule to latest
|
|---|
| 10044 |
|
|---|
| 10045 | 2010-09-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 10046 |
|
|---|
| 10047 | tests: remove .sh suffix from remaining test scripts.
|
|---|
| 10048 | * tests/backref: Rename from backref.sh.
|
|---|
| 10049 | * tests/bre: Rename from bre.sh.
|
|---|
| 10050 | * tests/ere: Rename from ere.sh.
|
|---|
| 10051 | * tests/file: Rename from file.sh.
|
|---|
| 10052 | * tests/khadafy: Rename from khadafy.sh.
|
|---|
| 10053 | * tests/options: Rename from options.sh.
|
|---|
| 10054 | * tests/pcre: Rename from pcre.sh.
|
|---|
| 10055 | * tests/spencer1: Rename from spencer1.sh.
|
|---|
| 10056 | * tests/spencer2: Rename from spencer2.sh.
|
|---|
| 10057 | * tests/status: Rename from status.sh.
|
|---|
| 10058 | * tests/yesno: Rename from yesno.sh.
|
|---|
| 10059 | * tests/Makefile.am: Reflect renamings.
|
|---|
| 10060 |
|
|---|
| 10061 | tests: convert remaining tests to use init.sh
|
|---|
| 10062 | * tests/file.sh: Use init.sh. Use Exit, not exit. Use grep, not ${GREP}.
|
|---|
| 10063 | * tests/khadafy.sh: Likewise.
|
|---|
| 10064 | * tests/options.sh: Likewise.
|
|---|
| 10065 | * tests/spencer1.sh: Likewise.
|
|---|
| 10066 | * tests/spencer2.sh: Likewise.
|
|---|
| 10067 | * tests/status.sh: Likewise.
|
|---|
| 10068 | * tests/spencer1.awk: Use grep, not ${GREP}.
|
|---|
| 10069 | Don't ignore failure to generate intermediate shell script.
|
|---|
| 10070 | * tests/Makefile.am (CLEANFILES): Remove altogether, now that
|
|---|
| 10071 | all tests use init.sh.
|
|---|
| 10072 | (TESTS_ENVIRONMENT): Don't set GREP. It's no longer used.
|
|---|
| 10073 |
|
|---|
| 10074 | tests: remove warning.sh
|
|---|
| 10075 | * tests/warning.sh: Remove file. All it did was print a warning.
|
|---|
| 10076 | * tests/Makefile.am (TESTS): Remove warning.sh.
|
|---|
| 10077 |
|
|---|
| 10078 | tests: convert pcre.sh to use init.sh
|
|---|
| 10079 | * tests/pcre.sh: Use init.sh. Use Exit, not exit. Use grep, not ${GREP}.
|
|---|
| 10080 |
|
|---|
| 10081 | tests: convert bre.sh to use init.sh
|
|---|
| 10082 | * tests/bre.sh: Use init.sh.
|
|---|
| 10083 | Use Exit, not exit.
|
|---|
| 10084 | Use "$abs_top_srcdir/tests/", not "$srcdir/" to specify inputs.
|
|---|
| 10085 | Source generated bre.script, rather than invoking $SHELL.
|
|---|
| 10086 | * tests/ere.sh: Likewise.
|
|---|
| 10087 | * tests/bre.awk: Use grep, not ${GREP}.
|
|---|
| 10088 | * tests/ere.awk: Likewise.
|
|---|
| 10089 | * tests/Makefile.am (CLEANFILES): Remove bre.script and ere.script.
|
|---|
| 10090 |
|
|---|
| 10091 | tests: convert to use init.sh
|
|---|
| 10092 | * tests/yesno.sh: Use init.sh.
|
|---|
| 10093 | Use Exit, not exit.
|
|---|
| 10094 | Use grep, not $GREP.
|
|---|
| 10095 | * tests/backref.sh: Likewise.
|
|---|
| 10096 | * tests/Makefile.am (CLEANFILES): Remove yesno.txt.
|
|---|
| 10097 |
|
|---|
| 10098 | build: update gnulib submodule to latest
|
|---|
| 10099 |
|
|---|
| 10100 | build: update build/test tools from gnulib
|
|---|
| 10101 | * bootstrap: Update from gnulib.
|
|---|
| 10102 | * tests/init.sh: Likewise.
|
|---|
| 10103 |
|
|---|
| 10104 | 2010-09-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 10105 |
|
|---|
| 10106 | maint: add lib/version-etc.c to the list in POTFILES.in
|
|---|
| 10107 | * po/POTFILES.in: Add lib/version-etc.c.
|
|---|
| 10108 |
|
|---|
| 10109 | 2010-09-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 10110 |
|
|---|
| 10111 | grep: diagnose and exit-2 for bogus REs like [:space:], [:digit:], etc.
|
|---|
| 10112 | When I make a mistake like this:
|
|---|
| 10113 | grep '[:lower:]' ...
|
|---|
| 10114 | be it in a script or on the command line, I want to know about
|
|---|
| 10115 | it as soon as possible. I don't want grep to print a mere warning
|
|---|
| 10116 | that it is interpreting this suspicious and almost guaranteed-wrong
|
|---|
| 10117 | regular expression as a set of just 6 bytes. And I certainly don't
|
|---|
| 10118 | want grep to silently do the wrong thing, even if that would be
|
|---|
| 10119 | officially standards-conforming. It's obvious that I intended
|
|---|
| 10120 | [[:lower:]], and I want my error to be diagnosed in a way that is
|
|---|
| 10121 | most likely to get my attention. Thus, with this change, grep now
|
|---|
| 10122 | prints a diagnostic and exits with status 2 the moment it
|
|---|
| 10123 | encounters an offending [:char_class:] construct.
|
|---|
| 10124 |
|
|---|
| 10125 | This changes the way grep works by default, rather than
|
|---|
| 10126 | putting this new behavior on an option. A new option
|
|---|
| 10127 | would seldom be used in scripts (not portable), and would
|
|---|
| 10128 | probably be used only rarely by those who need it the most.
|
|---|
| 10129 | This new functionality provides a valuable safety measure
|
|---|
| 10130 | and incurs truly negligible risk.
|
|---|
| 10131 |
|
|---|
| 10132 | For strict POSIX compliance, set POSIXLY_CORRECT in
|
|---|
| 10133 | your environment. That disables this new feature.
|
|---|
| 10134 |
|
|---|
| 10135 | Revert the changes from commit 2cd3bcea, "grep: add
|
|---|
| 10136 | --warnings={always,never,auto}.", and then do the following:
|
|---|
| 10137 |
|
|---|
| 10138 | * src/dfasearch.c (dfawarn): Call getenv("POSIXLY_CORRECT") here;
|
|---|
| 10139 | Remove "warning: " from the diagnostic, now that it's more than
|
|---|
| 10140 | a warning, and exit with status 2.
|
|---|
| 10141 | * NEWS (New features): Describe the new semantics.
|
|---|
| 10142 | * tests/warn-char-classes: Adjust one test to accommodate this change.
|
|---|
| 10143 | * doc/grep.texi (Character Classes and Bracket Expressions): Document.
|
|---|
| 10144 | (Environment Variables): Cross-reference it.
|
|---|
| 10145 | Remove reference to obsolete getopt illegal vs. invalid difference.
|
|---|
| 10146 | Thanks to Paul Eggert for suggestions and an initial prod.
|
|---|
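
The check described in the entry above can be approximated by the sketch below. It is a simplified stand-in written for illustration, not the logic in src/dfa.c or src/dfasearch.c: the names `find_suspicious_class` and `exit_status` are hypothetical, and the scanner handles only enough bracket-expression syntax to tell `[:lower:]` apart from `[[:lower:]]`.

```python
def find_suspicious_class(pattern):
    """Return the first bracket expression that looks like [:name:], else None."""
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\":            # skip an escaped character outside brackets
            i += 2
            continue
        if c != "[":
            i += 1
            continue
        start = i                # start of a bracket expression
        j = i + 1
        if j < n and pattern[j] == "^":
            j += 1
        if j < n and pattern[j] == "]":
            j += 1               # a leading ']' is a literal member
        while j < n and pattern[j] != "]":
            if pattern[j] == "[" and j + 1 < n and pattern[j + 1] in ":.=":
                # nested [:class:], [.symbol.] or [=equiv=]: skip it whole
                close = pattern.find(pattern[j + 1] + "]", j + 2)
                if close == -1:
                    return None  # malformed; leave it to a real parser
                j = close + 2
            else:
                j += 1
        if j >= n:
            return None          # unterminated bracket expression
        body = pattern[start + 1:j]
        if len(body) >= 3 and body[0] == ":" and body[-1] == ":":
            return pattern[start:j + 1]
        i = j + 1
    return None


def exit_status(pattern, env):
    """Return 2 for a suspicious pattern unless POSIXLY_CORRECT is set in env."""
    if "POSIXLY_CORRECT" in env:
        return 0
    return 2 if find_suspicious_class(pattern) else 0
```

Here `exit_status("[:lower:]", {})` yields 2, while the intended `[[:lower:]]` yields 0, and setting POSIXLY_CORRECT in the environment mapping turns the check off, mirroring the behavior the entry describes.
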
| 10147 |
|
|---|
| 10148 | 2010-08-30 Jim Meyering <meyering@redhat.com>
|
|---|
| 10149 |
|
|---|
| 10150 | maint: use gnulib's standard --version-printing code
|
|---|
| 10151 | This includes author names and keeps the copyright year up to date.
|
|---|
| 10152 | * bootstrap.conf (gnulib_modules): Add propername and version-etc-fsf.
|
|---|
| 10153 | * src/main.c (AUTHORS): Define.
|
|---|
| 10154 | (main): Use version_etc, rather than hard-coding the copyright text.
|
|---|
| 10155 | Prompted by a patch from Paolo Bonzini.
|
|---|
| 10156 |
|
|---|
| 10157 | 2010-08-27 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10158 |
|
|---|
| 10159 | dfa: warn on [:space:] and similar
|
|---|
| 10160 | * src/dfa.c (parse_bracket_exp): Warn on regular expressions such as
|
|---|
| 10161 | [:space:].
|
|---|
| 10162 | * src/dfa.h (dfawarn): New prototype.
|
|---|
| 10163 | * src/dfasearch.c (dfawarn): New.
|
|---|
| 10164 | * NEWS: Document.
|
|---|
| 10165 |
|
|---|
| 10166 | tests: add test for warnings
|
|---|
| 10167 | * tests/Makefile.am (TESTS): Add warn-char-class.
|
|---|
| 10168 | * tests/warn-char-class: New.
|
|---|
| 10169 |
|
|---|
| 10170 | grep: add --warnings={always,never,auto}.
|
|---|
| 10171 | * src/grep.h (no_warnings): New declaration.
|
|---|
| 10172 | * src/main.c (no_warnings): New.
|
|---|
| 10173 | (WARNINGS_OPTION): Add to enum.
|
|---|
| 10174 | (main): Add --warnings. Handle color_option == 2 together with it.
|
|---|
| 10175 |
|
|---|
| 10176 | tests: add failing test for grep from a directory
|
|---|
| 10177 | * tests/Makefile.am (TESTS, XFAIL_TESTS): Add grep-dir.
|
|---|
| 10178 | * tests/grep-dir: New.
|
|---|
| 10179 |
|
|---|
| 10180 | tests: add test for previous commit
|
|---|
| 10181 | * tests/Makefile.am (TESTS): Add grep-dev-null.
|
|---|
| 10182 | * tests/grep-dev-null: New.
|
|---|
| 10183 |
|
|---|
| 10184 | search: fix "grep -Fif /dev/null"
|
|---|
| 10185 | * bootstrap.conf: Include gnulib module minmax.
|
|---|
| 10186 | * src/searchutils.c (mbtolower): Handle *N == 0 case.
|
|---|
| 10187 | * src/system.h: Include minmax.h from gnulib.
|
|---|
| 10188 |
|
|---|
| 10189 | 2010-08-27 Adam Katz <savannah@kopis.com>
|
|---|
| 10190 |
|
|---|
| 10191 | Remove declaration after statement in dfa.c
|
|---|
| 10192 | * dfa.c (dfaexec): Declare saved_end at the beginning of the function.
|
|---|
| 10193 |
|
|---|
| 10194 | 2010-08-13 Jim Meyering <meyering@redhat.com>
|
|---|
| 10195 |
|
|---|
| 10196 | make --include=FILE work once again
|
|---|
| 10197 | The semantics of excluded_file_name changed (when operating on
|
|---|
| 10198 | an "included" file name list).
|
|---|
| 10199 | * src/main.c (main): Adjust for changed semantics of excluded_file_name
|
|---|
| 10200 | simply by removing a negation.
|
|---|
| 10201 | * NEWS (Bug fixes): Mention this fix.
|
|---|
| 10202 | * tests/include-exclude: Add a test for this.
|
|---|
| 10203 | Reported by Joe Perches in http://savannah.gnu.org/bugs/?29876.
|
|---|
| 10204 |
|
|---|
| 10205 | 2010-07-16 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10206 |
|
|---|
| 10207 | doc: document \s and \S
|
|---|
| 10208 | * doc/grep.texi (The Backslash Character and Special Expressions):
|
|---|
| 10209 | Document \s and \S escapes.
|
|---|
| 10210 |
|
|---|
| 10211 | 2010-05-29 Karl Berry <karl@gnu.org>
|
|---|
| 10212 |
|
|---|
| 10213 | doc: discuss matches that span two or more lines
|
|---|
| 10214 | * doc/grep.texi (Usage): Discuss matching across lines.
|
|---|
| 10215 | (Character Classes and Bracket Expressions) <[:space:]>: refer to it.
|
|---|
| 10216 |
|
|---|
| 10217 | 2010-05-25 Jim Meyering <meyering@redhat.com>
|
|---|
| 10218 |
|
|---|
| 10219 | build: use latest gettext: 0.18
|
|---|
| 10220 | * configure.ac: Use gettext-0.18.
|
|---|
| 10221 | * bootstrap.conf (gnulib_modules): Use gettext-h, not gettext,
|
|---|
| 10222 | since the latter drags in a dependency on gettext 0.18.
|
|---|
| 10223 | Suggested by Bruno Haible.
|
|---|
| 10224 |
|
|---|
| 10225 | maint: update helper scripts from gnulib
|
|---|
| 10226 | * tests/init.sh: Update from gnulib.
|
|---|
| 10227 | * bootstrap: Likewise.
|
|---|
| 10228 |
|
|---|
| 10229 | build: update gnulib submodule to latest
|
|---|
| 10230 |
|
|---|
| 10231 | maint: don't emit an extra newline in each of two diagnostics
|
|---|
| 10232 | * src/main.c (context_length_arg, grepdir): Remove a stray \n in
|
|---|
| 10233 | each of two diagnostics.
|
|---|
| 10234 |
|
|---|
| 10235 | 2010-05-24 Bruno Haible <bruno@clisp.org>
|
|---|
| 10236 |
|
|---|
| 10237 | search: Avoid out-of-bounds access.
|
|---|
| 10238 | * src/dfasearch.c (EGexecute): Avoid access beyond end of buffer
|
|---|
| 10239 | that could happen if start != beg - buf.
|
|---|
| 10240 |
|
|---|
| 10241 | 2010-05-23 Aharon Robbins <arnold@skeeve.com>
|
|---|
| 10242 |
|
|---|
| 10243 | dfa: fix signedness warnings
|
|---|
| 10244 | * src/dfa.c (dfaexec): Cast p when passing it to prepare_wc_buf.
|
|---|
| 10245 |
|
|---|
| 10246 | 2010-05-09 Jim Meyering <meyering@redhat.com>
|
|---|
| 10247 |
|
|---|
| 10248 | tests: update init.sh
|
|---|
| 10249 | * tests/init.sh: Update from gnulib.
|
|---|
| 10250 |
|
|---|
| 10251 | tests: normalize init.sh-sourcing code
|
|---|
| 10252 | * tests/backref-multibyte-slow: Use one-line idiom.
|
|---|
| 10253 | * tests/backref-word: Likewise.
|
|---|
| 10254 | * tests/case-fold-backref: Likewise.
|
|---|
| 10255 | * tests/case-fold-backslash-w: Likewise.
|
|---|
| 10256 | * tests/case-fold-char-class: Likewise.
|
|---|
| 10257 | * tests/case-fold-char-range: Likewise.
|
|---|
| 10258 | * tests/case-fold-char-type: Likewise.
|
|---|
| 10259 | * tests/char-class-multibyte: Likewise.
|
|---|
| 10260 | * tests/dfaexec-multibyte: Likewise.
|
|---|
| 10261 | * tests/empty: Likewise.
|
|---|
| 10262 | * tests/euc-mb: Likewise.
|
|---|
| 10263 | * tests/fedora: Likewise.
|
|---|
| 10264 | * tests/fgrep-infloop: Likewise.
|
|---|
| 10265 | * tests/fmbtest: Likewise.
|
|---|
| 10266 | * tests/foad1: Likewise.
|
|---|
| 10267 | * tests/ignore-mmap: Likewise.
|
|---|
| 10268 | * tests/include-exclude: Likewise.
|
|---|
| 10269 | * tests/max-count-vs-context: Likewise.
|
|---|
| 10270 | * tests/pcre-z: Likewise.
|
|---|
| 10271 | * tests/prefix-of-multibyte: Likewise.
|
|---|
| 10272 | * tests/reversed-range-endpoints: Likewise.
|
|---|
| 10273 | * tests/sjis-mb: Likewise.
|
|---|
| 10274 | * tests/spencer1-locale: Likewise.
|
|---|
| 10275 | * tests/word-delim-multibyte: Likewise.
|
|---|
| 10276 | * tests/word-multi-file: Likewise.
|
|---|
| 10277 |
|
|---|
| 10278 | tests: update help-version
|
|---|
| 10279 | * tests/help-version: Update from coreutils.
|
|---|
| 10280 |
|
|---|
| 10281 | 2010-05-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 10282 |
|
|---|
| 10283 | tests: enable glibc's malloc-perturbing option
|
|---|
| 10284 | * tests/Makefile.am (MALLOC_PERTURB_): Define, in case it's not already
|
|---|
| 10285 | set in your environment.
|
|---|
| 10286 | (TESTS_ENVIRONMENT): Propagate MALLOC_PERTURB_ setting to test scripts.
|
|---|
| 10287 |
|
|---|
| 10288 | 2010-05-06 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10289 |
|
|---|
| 10290 | dfa: speed up [[:digit:]] and [[:xdigit:]]
|
|---|
| 10291 | There's no "multibyte pain" in these two classes, since POSIX
|
|---|
| 10292 | and ISO C99 mandate their contents.
|
|---|
| 10293 |
|
|---|
| 10294 | Time for "./grep -x '[[:digit:]]' /usr/share/dict/linux.words"
|
|---|
| 10295 | Before: 1.5s, after: 0.07s. (sed manages only 0.5s).
|
|---|
| 10296 |
|
|---|
| 10297 | * src/dfa.c (predicates): Declare struct dfa_ctype separately
|
|---|
| 10298 | from definition. Add sb_only.
|
|---|
| 10299 | (find_pred): Return const struct dfa_ctype *.
|
|---|
| 10300 | (parse_bracket_exp): Return const struct dfa_ctype *. Do
|
|---|
| 10301 | not fill MBCSET for sb_only character types.
|
|---|
| 10302 |
|
|---|
| 10303 | 2010-05-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 10304 |
|
|---|
| 10305 | tests: readability: use awk rather than obfuscated sed
|
|---|
| 10306 | * tests/backref-multibyte-slow: Generate input using an awk for-loop
|
|---|
| 10307 | rather than expensive and harder-to-read sed pipes.
|
|---|
| 10308 | Remove stray "set -x" and "wc -l in".
|
|---|
| 10309 |
|
|---|
| 10310 | dfa: avoid segfault when processing an invalid multi-byte sequence
|
|---|
| 10311 | * src/dfa.c (dfaexec): Handle the cases in which mbrtowc returns
|
|---|
| 10312 | (size_t)-1 or (size_t)-2, rather than setting mblen_buf[i] to an
|
|---|
| 10313 | outrageously large value.
|
|---|
| 10314 |
|
|---|
| 10315 | 2010-05-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10316 |
|
|---|
| 10317 | grep: remove redundant syntax bit
|
|---|
| 10318 | * grep.c (Gcompile): Remove RE_HAT_LISTS_NOT_NEWLINE.
|
|---|
| 10319 |
|
|---|
| 10320 | tests: add test for newly-fixed performance problem
|
|---|
| 10321 | * tests/backref-multibyte-slow: New.
|
|---|
| 10322 | * tests/Makefile.am: Add it.
|
|---|
| 10323 |
|
|---|
| 10324 | 2010-05-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10325 |
|
|---|
| 10326 | dfa: convert to wide character line-by-line
|
|---|
| 10327 | This provides a nice speedup for -m in general, but especially
|
|---|
| 10328 | it avoids quadratic complexity in case we have to go to glibc.
|
|---|
| 10329 |
|
|---|
| 10330 | * NEWS: Document change.
|
|---|
| 10331 | * src/dfa.c (prepare_wc_buf): Extract out of dfaexec. Convert
|
|---|
| 10332 | only up to the next newline.
|
|---|
| 10333 | (dfaexec): Exit multibyte processing loop if past buf_end.
|
|---|
| 10334 | Call prepare_wc_buf again after processing a newline.
|
|---|
| 10335 |
|
|---|
| 10336 | 2010-05-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 10337 |
|
|---|
| 10338 | maint: remove useless #if HAVE_STDLIB_H
|
|---|
| 10339 | * src/mbsupport.h: Don't test HAVE_STDLIB_H.
|
|---|
| 10340 |
|
|---|
| 10341 | 2010-04-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 10342 |
|
|---|
| 10343 | dfa: don't #ifdef-out member declarations
|
|---|
| 10344 | * src/dfa.c (struct dfa): Remove "#if MBS_SUPPORT" guard that made
|
|---|
| 10345 | several member declarations conditional on this cpp definition.
|
|---|
| 10346 | (token): Likewise.
|
|---|
| 10347 | Reported by Anders Wallin.
|
|---|
| 10348 |
|
|---|
| 10349 | tests: ensure that the --mmap option is ignored
|
|---|
| 10350 | * tests/ignore-mmap: New file.
|
|---|
| 10351 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 10352 | Reported by Jaroslav Škarvada in <http://savannah.gnu.org/bugs/?29614>
|
|---|
| 10353 |
|
|---|
| 10354 | 2010-04-20 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10355 |
|
|---|
| 10356 | dfa: honor RE_DOT_NEWLINE and RE_DOT_NOT_NULL in UTF-8 period optimization
|
|---|
| 10357 | * src/dfa.c (add_utf8_anychar): Check for RE_DOT_NEWLINE and
|
|---|
| 10358 | RE_DOT_NOT_NULL.
|
|---|
| 10359 |
|
|---|
| 10360 | grep: fix --mmap not being ignored
|
|---|
| 10361 | * NEWS: Document bugfix.
|
|---|
| 10362 | * main.c (main): Ignore MMAP_OPTION.
|
|---|
| 10363 |
|
|---|
| 10364 | 2010-04-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 10365 |
|
|---|
| 10366 | maint: avoid syntax-check failure due to indentation via TABs
|
|---|
| 10367 | * src/dfa.c (atom): Expand TABs in indentation.
|
|---|
| 10368 |
|
|---|
| 10369 | build: update gnulib submodule to latest
|
|---|
| 10370 |
|
|---|
| 10371 | maint: restrict scope of two globals to dfasearch.c
|
|---|
| 10372 | * src/dfasearch.c (patterns, pcount): Declare these file-scoped
|
|---|
| 10373 | globals to be static.
|
|---|
| 10374 |
|
|---|
| 10375 | 2010-04-19 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10376 |
|
|---|
| 10377 | dfa: optimize UTF-8 period
|
|---|
| 10378 | * NEWS: Document improvement.
|
|---|
| 10379 | * src/dfa.c (struct dfa): Add utf8_anychar_classes.
|
|---|
| 10380 | (add_utf8_anychar): New.
|
|---|
| 10381 | (atom): Simplify if/else nesting. Call add_utf8_anychar for ANYCHAR
|
|---|
| 10382 | in UTF-8 locales.
|
|---|
| 10383 | (dfaoptimize): Abort on ANYCHAR.
|
|---|
| 10384 |
|
|---|
| 10385 | dfa: drop ORTOP
|
|---|
| 10386 | * src/dfa.c (token, prtok, addtok_mb, nsubtoks, dfaanalyze, dfamust):
|
|---|
| 10387 | Remove ORTOP.
|
|---|
| 10388 | (regexp): Remove parameter, always add OR at the end, adjust callers.
|
|---|
| 10389 | (atom): Adjust caller.
|
|---|
| 10390 | (dfaparse): Adjust caller. Always add OR at the end.
|
|---|
| 10391 |
|
|---|
| 10392 | dfa: fix {0,0}
|
|---|
| 10393 | * NEWS: Document change.
|
|---|
| 10394 | * src/dfa.c (struct dfa): Remove "broken" field.
|
|---|
| 10395 | (lex): Do not set it.
|
|---|
| 10396 | (closure): On {0,0}, backup and lex another closure without
|
|---|
| 10397 | adding a CAT.
|
|---|
| 10398 | (dfabroken): Remove.
|
|---|
| 10399 | * src/dfa.h (dfabroken): Remove.
|
|---|
| 10400 | * tests/spencer1.tests: Add testcases for {m,n}.
|
|---|
| 10401 |
|
|---|
| 10402 | dfa: simplify dfainit
|
|---|
| 10403 | * src/dfa.c (dfainit): Use memset.
|
|---|
| 10404 |
|
|---|
| 10405 | 2010-04-17 Jim Meyering <meyering@redhat.com>
|
|---|
| 10406 |
|
|---|
| 10407 | doc: fix a nit in HACKING
|
|---|
| 10408 | * HACKING: Correct size of .git/ dir: 9MB, not 30MB.
|
|---|
| 10409 |
|
|---|
| 10410 | tests: add an expected-to-fail test using \< in a multi-byte locale
|
|---|
| 10411 | * tests/word-delim-multibyte: New test. Currently failing.
|
|---|
| 10412 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 10413 | (XFAIL_TESTS): Define, temporarily.
|
|---|
| 10414 | Reported by Jaroslav Škarvada in http://savannah.gnu.org/bugs/?29537.
|
|---|
| 10415 |
|
|---|
| 10416 | 2010-04-16 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10417 |
|
|---|
| 10418 | test: cover just-fixed bug
|
|---|
| 10419 | * tests/empty: Test -Fw too.
|
|---|
| 10420 |
|
|---|
| 10421 | grep: fix matching the empty string with grep -Fw
|
|---|
| 10422 | * NEWS: Document fix.
|
|---|
| 10423 | * src/kwsearch.c (Fexecute): The empty string is a valid match if it is
|
|---|
| 10424 | a whole word.
|
|---|
| 10425 |
|
|---|
| 10426 | 2010-04-15 Jim Meyering <meyering@redhat.com>
|
|---|
| 10427 |
|
|---|
| 10428 | maint: update init.sh and HACKING
|
|---|
| 10429 | * HACKING: Sync from coreutils.
|
|---|
| 10430 | * tests/init.sh: Update from gnulib.
|
|---|
| 10431 |
|
|---|
| 10432 | 2010-04-13 Jim Meyering <meyering@redhat.com>
|
|---|
| 10433 |
|
|---|
| 10434 | build: update gnulib submodule to latest; adapt
|
|---|
| 10435 | * COPYING: Remove empty line.
|
|---|
| 10436 | * README: Likewise.
|
|---|
| 10437 | * doc/fdl.texi: Likewise.
|
|---|
| 10438 | * tests/backref-word: Likewise.
|
|---|
| 10439 |
|
|---|
| 10440 | 2010-04-11 Stefano Lattarini <stefano.lattarini@gmail.com>
|
|---|
| 10441 |
|
|---|
| 10442 | tests: accept the Debian timeout program
|
|---|
| 10443 | * tests/init.cfg: test timeout with `timeout 10s true'
|
|---|
| 10444 |
|
|---|
| 10445 | 2010-04-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10446 |
|
|---|
| 10447 | dfa: convert "cannot happen" code/comment to use assert
|
|---|
| 10448 | * src/dfa.c (dfamust): There were numerous "cannot happen" comments,
|
|---|
| 10449 | some associated with "if (expr) goto done;". Replace each with an
|
|---|
| 10450 | equivalent "assert (!expr);".
|
|---|
| 10451 |
|
|---|
| 10452 | build: use gnulib's isblank module
|
|---|
| 10453 | * bootstrap.conf (gnulib_modules): Use gnulib's isblank module,
|
|---|
| 10454 | now that we rely on the function by that name.
|
|---|
| 10455 |
|
|---|
| 10456 | maint: undo TAB-conversion change to gl/lib/*.c.diff
|
|---|
| 10457 | This fixes a bootstrap failure due to the patches not applying.
|
|---|
| 10458 | * .x-sc_prohibit_tab_based_indentation: Add ^gl/lib/.*\.c\.diff$
|
|---|
| 10459 | * gl/lib/regcomp.c.diff: Revert today's TAB->space change.
|
|---|
| 10460 | * gl/lib/regex_internal.c.diff: Likewise.
|
|---|
| 10461 | * gl/lib/regexec.c.diff: Likewise.
|
|---|
| 10462 |
|
|---|
| 10463 | 2010-04-08 Arnold D. Robbins <arnold@skeeve.com>
|
|---|
| 10464 |
|
|---|
| 10465 | dfa: fix declaration of dfabroken in dfa.h
|
|---|
| 10466 | * dfa.h (dfabroken) [GAWK]: Fix declaration to match that in dfa.c.
|
|---|
| 10467 |
|
|---|
| 10468 | 2010-04-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10469 |
|
|---|
| 10470 | maint: add syntax-check rule to enforce the new no-leading-TABs policy
|
|---|
| 10471 | * cfg.mk (sc_prohibit_tab_based_indentation): New rule, from coreutils.
|
|---|
| 10472 | (sc_prohibit_emacs__indent_tabs_mode__setting): Likewise.
|
|---|
| 10473 | (old_NEWS_hash): Update.
|
|---|
| 10474 | * .x-sc_prohibit_tab_based_indentation: List exempt files.
|
|---|
| 10475 |
|
|---|
| 10476 | 2010-04-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10477 |
|
|---|
| 10478 | convert all TABs to equivalent spaces in indentation
|
|---|
| 10479 | Using this file,
|
|---|
| 10480 |
|
|---|
| 10481 | cat > leading-blank.exempt <<\EOF
|
|---|
| 10482 | (?:^|\/)ChangeLog[^/]*$
|
|---|
| 10483 | (?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
|
|---|
| 10484 | \.(?:am|mk)$
|
|---|
| 10485 | EOF
|
|---|
| 10486 |
|
|---|
| 10487 | run this command to convert all non-conforming leading white
|
|---|
| 10488 | space to be all spaces:
|
|---|
| 10489 |
|
|---|
| 10490 | git ls-files \
|
|---|
| 10491 | | pcregrep -vf leading-blank.exempt \
|
|---|
| 10492 | | xargs pcregrep -l '^ *\t' \
|
|---|
| 10493 | | xargs perl -MText::Tabs -ni -le \
|
|---|
| 10494 | '$m=/^( *\t[ \t]*)(.*)/; print $m ? expand($1) . $2 : $_'
|
|---|
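
For readers less familiar with Perl, the per-line transformation in the one-liner above can be sketched in Python. This is an illustrative equivalent under stated assumptions, not part of the original change: the function name `expand_leading_tabs` is hypothetical, and a tab stop of 8 is assumed because that is the Text::Tabs default.

```python
import re

# Leading white space that contains at least one TAB, followed by the rest
# of the line. DOTALL lets the remainder keep a trailing newline, if any.
_LEADING = re.compile(r"^( *\t[ \t]*)(.*)", re.DOTALL)


def expand_leading_tabs(line, tabsize=8):
    """Expand TABs in the leading white space only; leave other TABs alone."""
    m = _LEADING.match(line)
    if not m:
        return line              # no TAB in the indentation: unchanged
    return m.group(1).expandtabs(tabsize) + m.group(2)
```

Only the indentation is rewritten, so a TAB that appears after the first non-blank character (for example inside a string literal) is preserved.
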
| 10495 |
|
|---|
| 10496 | 2010-04-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10497 |
|
|---|
| 10498 | build: include cfg.mk in the distribution tarball
|
|---|
| 10499 | * Makefile.am (EXTRA_DIST): Add cfg.mk.
|
|---|
| 10500 |
|
|---|
| 10501 | 2010-04-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10502 |
|
|---|
| 10503 | maint: Makefile.am tweak (no semantic change)
|
|---|
| 10504 | * Makefile.am (EXTRA_DIST): List one per line. Sort.
|
|---|
| 10505 |
|
|---|
| 10506 | build: include cfg.mk in the distribution tarball
|
|---|
| 10507 | * Makefile.am (EXTRA_DIST): Add cfg.mk.
|
|---|
| 10508 |
|
|---|
| 10509 | 2010-04-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 10510 |
|
|---|
| 10511 | dfa: move definition of __attribute__ back into dfa.h
|
|---|
| 10512 | * src/dfa.c (__attribute__): Move definition back to...
|
|---|
| 10513 | * src/dfa.h: ... this file. It is essential for non-gcc compilers.
|
|---|
| 10514 | Reported by Arnold Robbins.
|
|---|
| 10515 |
|
|---|
| 10516 | 2010-04-07 Arnold D. Robbins <arnold@skeeve.com>
|
|---|
| 10517 |
|
|---|
| 10518 | dfa: move internals from dfa.h to dfa.c
|
|---|
| 10519 | * src/dfa.h: Move internals into dfa.c.
|
|---|
| 10520 | * src/dfa.c: The dfa internals are now totally local to this file.
|
|---|
| 10521 | (dfaalloc, dfamusts, dfabroken): New functions to access features.
|
|---|
| 10522 | * src/dfasearch.c (dfa): Change this global variable from struct to pointer.
|
|---|
| 10523 | Adapt to that change, and use new functions, dfamusts and dfaalloc.
|
|---|
| 10524 |
|
|---|
| 10525 | 2010-04-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 10526 |
|
|---|
| 10527 | mbtolower: avoid potential NULL-dereference
|
|---|
| 10528 | * src/searchutils.c: Include <assert.h>.
|
|---|
| 10529 | (mbtolower): Assert that 0 < *n, to avoid possibility of NULL-deref.
|
|---|
| 10530 | Remove dead increment.
|
|---|
| 10531 |
|
|---|
| 10532 | maint: tell git to ignore more build products
|
|---|
| 10533 | * .gitignore: Also ignore results of "make ID" and "make tags".
|
|---|
| 10534 |
|
|---|
| 10535 | build: update gnulib submodule to latest
|
|---|
| 10536 |
|
|---|
| 10537 | tests: use init.sh consistently
|
|---|
| 10538 | * tests/euc-mb: Call "path_prepend_ ." on a line by itself,
|
|---|
| 10539 | and with a comment. This makes it so all of the srcdir/init.sh
|
|---|
| 10540 | lines are consistent, project-wide, and so that the addition of "."
|
|---|
| 10541 | to PATH for this test is properly documented.
|
|---|
| 10542 | * tests/sjis-mb: Likewise.
|
|---|
| 10543 |
|
|---|
| 10544 | maint: avoid new syntax-check failure, ...
|
|---|
| 10545 | ...now that the sole use of xmalloc no longer matches the
|
|---|
| 10546 | regular expression used by the syntax-check rule.
|
|---|
| 10547 | * .x-sc_prohibit_xalloc_without_use: Exempt src/kwset.c.
|
|---|
| 10548 |
|
|---|
| 10549 | grep: make kwset's obstack use xmalloc, not malloc
|
|---|
| 10550 | This insidious bug could make grep fail to diagnose a failed malloc,
|
|---|
| 10551 | and then proceed to dereference the resulting NULL pointer.
|
|---|
| 10552 | Note that this bug was unlikely ever to cause real trouble; without
|
|---|
| 10553 | the fix, grep would segfault upon OOM; now it exits with a diagnostic.
|
|---|
| 10554 | * src/kwset.c (malloc) [GREP]: Define without the "(s)" macro
|
|---|
| 10555 | parameter, so that unadorned uses of malloc are also mapped to xmalloc.
|
|---|
| 10556 | One such use is in the expansion of obstack_init.
|
|---|
| 10557 | Report and patch by Nelson H. F. Beebe, in
|
|---|
| 10558 | http://thread.gmane.org/gmane.comp.gnu.grep.bugs/2995
|
|---|
| 10559 |
|
|---|
| 10560 | tests: improve help-version (sync from gzip's version)
|
|---|
| 10561 | * tests/help-version: Cross-check $VERSION and --version output.
|
|---|
| 10562 | * tests/Makefile.am (TESTS_ENVIRONMENT): Export VERSION=$(VERSION).
|
|---|
| 10563 |
|
|---|
| 10564 | 2010-04-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 10565 |
|
|---|
| 10566 | doc: update THANKS
|
|---|
| 10567 | * THANKS: Update.
|
|---|
| 10568 |
|
|---|
| 10569 | 2010-04-06 Aharon Robbins <arnold@skeeve.com>
|
|---|
| 10570 |
|
|---|
| 10571 | build: avoid conflict with WCHAR definition from Cygwin's <windows.h>
|
|---|
| 10572 | * src/dfa.h (enum token): Remove the definition from this file.
|
|---|
| 10573 | Replace with a declaration and typedef. Moved to ...
|
|---|
| 10574 | * src/dfa.c (enum token): ... here.
|
|---|
| 10575 | Reported by Corinna Vinschen.
|
|---|
| 10576 |
|
|---|
| 10577 | 2010-04-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 10578 |
|
|---|
| 10579 | doc: add HACKING
|
|---|
| 10580 | * HACKING: New file. Copied from coreutils, with s/coreutils/grep/
|
|---|
| 10581 | and a few minor edits.
|
|---|
| 10582 |
|
|---|
| 10583 | 2010-04-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 10584 |
|
|---|
| 10585 | tests: pull fixed init.sh from gnulib
|
|---|
| 10586 | * tests/init.sh: Update from gnulib.
|
|---|
| 10587 |
|
|---|
| 10588 | maint: fix new argmatch-related syntax-check failures
|
|---|
| 10589 | * configure.ac (ARGMATCH_DIE): Use usage(EXIT_FAILURE), not exit(1).
|
|---|
| 10590 | * po/POTFILES.in: Add lib/argmatch.c.
|
|---|
| 10591 |
|
|---|
| 10592 | maint: update cfg.mk to work with gnulib's newer "make syntax-check"
|
|---|
| 10593 | * cfg.mk: Update to use new _sc_search_regexp interface. Run this:
|
|---|
| 10594 | perl -pi -e 's/\b_prohibit_regexp\b/_sc_search_regexp/;'
|
|---|
| 10595 | -e 's/\bmsg=/halt=/; s/\bre=/prohibit=/;' cfg.mk
|
|---|
| 10596 | and then adjust backslashes so they still line up.
|
|---|
| 10597 |
|
|---|
| 10598 | maint: update tests/init.sh from gnulib
|
|---|
| 10599 | This ensures that the explanation for any skipped or failed test
|
|---|
| 10600 | is printed on stderr, not buried in each .log file.
|
|---|
| 10601 | * tests/init.sh: Update from gnulib.
|
|---|
| 10602 | * tests/init.cfg (stderr_fileno_): Define to 9, to match the
|
|---|
| 10603 | literal 2>&9 in tests/Makefile.am.
|
|---|
| 10604 |
|
|---|
| 10605 | build: update gnulib submodule to latest
|
|---|
| 10606 |
|
|---|
| 10607 | 2010-04-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 10608 |
|
|---|
| 10609 | maint: use argmatch, for better --directories=INVAL diagnostics
|
|---|
| 10610 | Before, you'd see this:
|
|---|
| 10611 | grep: unknown directories method
|
|---|
| 10612 |
|
|---|
| 10613 | Now, you'll see this:
|
|---|
| 10614 | grep: invalid argument `INVAL' for `--directories'
|
|---|
| 10615 | Valid arguments are:
|
|---|
| 10616 | - `read'
|
|---|
| 10617 | - `recurse'
|
|---|
| 10618 | - `skip'
|
|---|
| 10619 | Usage: src/grep [OPTION]... PATTERN [FILE]...
|
|---|
| 10620 | Try `src/grep --help' for more information.
|
|---|
| 10621 |
|
|---|
| 10622 | * bootstrap.conf: Add argmatch.
|
|---|
| 10623 | * configure.ac: Define ARGMATCH_DIE and ARGMATCH_DIE_DECL.
|
|---|
| 10624 | * src/main.c (directories_type): Define.
|
|---|
| 10625 | (directories_args, directories_types): Define.
|
|---|
| 10626 | All of the above so we can...
|
|---|
| 10627 | (main): Use XARGMATCH.
|
|---|
| 10628 | (usage): Declare extern, now that argmatch calls it via ARGMATCH_DIE.
|
|---|
| 10629 |
|
|---|
| 10630 | 2010-04-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 10631 |
|
|---|
| 10632 | dfa.c: const correctness; and remove useless casts of realloc and malloc
|
|---|
| 10633 | * src/dfa.c (icatalloc, icpyalloc, istrstr, enlist): As above.
|
|---|
| 10634 | (inboth, dfamust, comsubs): Likewise.
|
|---|
| 10635 |
|
|---|
| 10636 | dfa.c: use a better (unsigned) type for an index: int->unsigned int
|
|---|
| 10637 | * src/dfa.c (dfaexec): Use "unsigned int" for a logically unsigned index.
|
|---|
| 10638 |
|
|---|
| 10639 | maint: style: use sizeof VAR, rather than sizeof TYPE, where possible
|
|---|
| 10640 | * src/dfa.c (copyset, zeroset): Prefer sizeof EXPR, over sizeof TYPE,
|
|---|
| 10641 | for improved readability/maintainability.
|
|---|
| 10642 | (equal, parse_bracket_exp, addtok_wc, dfaparse, dfaexec): Likewise.
|
|---|
| 10643 |
|
|---|
| 10644 | 2010-04-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 10645 |
|
|---|
| 10646 | dfa.c: use a better (unsigned) type for an index: int->size_t
|
|---|
| 10647 | * src/dfa.c (parse_bracket_exp): Use size_t as type of index, not int.
|
|---|
| 10648 |
|
|---|
| 10649 | maint: const-correctness
|
|---|
| 10650 | * src/dfa.c (tstbit, copyset, equal, charclass_index): Declare read-only
|
|---|
| 10651 | "charclass" parameters to be "const". No semantic change.
|
|---|
| 10652 |
|
|---|
| 10653 | maint: include <wchar.h> and <wctype.h> unconditionally
|
|---|
| 10654 | * src/main.c: Include <wchar.h> and <wctype.h> unconditionally.
|
|---|
| 10655 | Their presence/usefulness are assured by gnulib.
|
|---|
| 10656 | * src/dfa.c: Likewise.
|
|---|
| 10657 | * src/search.h: Likewise.
|
|---|
| 10658 |
|
|---|
| 10659 | maint: MBS_SUPPORT: define to 0/1, not undef/1
|
|---|
| 10660 | Prepare to remove many of these #ifdefs.
|
|---|
| 10661 | * src/mbsupport.h (MBS_SUPPORT): Define to 0/1, not undef/1.
|
|---|
| 10662 | Change each "#ifdef MBS_SUPPORT" to "#if MBS_SUPPORT". Use this:
|
|---|
| 10663 | perl -pi -e 's/ifdef (MBS_SUPPORT)/if $1/' $(g grep -l ifdef.MBS_SUPPO)
|
|---|
| 10664 | * src/dfa.c: s/#ifdef MBS_SUPPORT/#if MBS_SUPPORT/
|
|---|
| 10665 | * src/dfa.h: Likewise.
|
|---|
| 10666 | * src/dfasearch.c: Likewise.
|
|---|
| 10667 | * src/kwsearch.c: Likewise.
|
|---|
| 10668 | * src/main.c: Likewise.
|
|---|
| 10669 | * src/search.h: Likewise.
|
|---|
| 10670 | * src/searchutils.c: Likewise.
|
|---|
| 10671 |
|
|---|
| 10672 | 2010-04-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 10673 |
|
|---|
| 10674 | maint: use STREQ in place of strcmp
|
|---|
| 10675 | perl -pi -e 's/\bstrcmp *\((.*?)\) == 0/STREQ ($1)/' src/main.c
|
|---|
| 10676 | perl -pi -e 's/\bstrcmp *\((.*?)\) != 0/!STREQ ($1)/' src/main.c
|
|---|
| 10677 |
|
|---|
| 10678 | * src/dfa.c (STREQ): Define.
|
|---|
| 10679 | Use it instead of strcmp.
|
|---|
| 10680 | * src/main.c (STREQ): Likewise.
|
|---|
| 10681 | * cfg.mk (local-checks-to-skip): Remove sc_prohibit_strcmp,
|
|---|
| 10682 | to enable the strcmp-prohibition.
|
|---|
| 10683 |
|
|---|
| 10684 | 2010-04-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 10685 |
|
|---|
| 10686 | maint: enable the useless_cpp_parens syntax check
|
|---|
| 10687 | * cfg.mk (local-checks-to-skip): Remove sc_useless_cpp_parens.
|
|---|
| 10688 | * src/main.c (devices, fillbuf, exit_on_match): Remove useless parens.
|
|---|
| 10689 | (print_line_head, grepfile, set_limits, main): Likewise.
|
|---|
| 10690 | * src/vms_fab.h: Likewise.
|
|---|
| 10691 | * vms/config_vms.h: Likewise.
|
|---|
| 10692 | * src/mbsupport.h: Likewise.
|
|---|
| 10693 |
|
|---|
| 10694 | cleanup and improvement: parse command line arguments consistently
|
|---|
| 10695 | * src/main.c: Include c-ctype.h, for this:
|
|---|
| 10696 | (prepend_args): Use c_isspace, not ISSPACE.
|
|---|
| 10697 | This is important so that we parse arguments consistently,
|
|---|
| 10698 | and independently of the current locale.
|
|---|
| 10699 | * bootstrap.conf (gnulib_modules): Add c-ctype.
|
|---|
| 10700 | * src/system.h: Remove IS* definitions here, too.
|
|---|
| 10701 | * src/dfasearch.c (WCHAR): Use isalnum, not ISALNUM.
|
|---|
| 10702 | * src/kwsearch.c (WCHAR): Likewise.
|
|---|
| 10703 | * src/searchutils.c (kwsinit): Use tolower, not TOLOWER.
|
|---|
| 10704 |
|
|---|
| 10705 | cleanup: rely on gnulib's ctype.h functions; remove IS* macros and is_*
|
|---|
| 10706 | * src/dfa.c (setbit_case_fold, prednames): Use official names.
|
|---|
| 10707 | (IS_WORD_CONSTITUENT, lex): Likewise.
|
|---|
| 10708 | (ISALNUM, ISALPHA, ISCNTRL, ISDIGIT, ISGRAPH): Remove definitions.
|
|---|
| 10709 | (ISLOWER, ISPRINT, ISPUNCT, ISSPACE, ISUPPER, ISXDIGIT): Likewise.
|
|---|
| 10710 | (is_alnum, is_alpha, is_blank, is_cntrl, is_digit, is_graph): Likewise.
|
|---|
| 10711 | (is_lower, is_print, is_punct, is_space, is_upper, is_xdigit): Likewise.
|
|---|
| 10712 | (isgraph): Likewise.
|
|---|
| 10713 |
|
|---|
| 10714 | build: update gnulib submodule to latest, and adjust
|
|---|
| 10715 | * src/main.c (parse_grep_colors): Adjust diagnostics not to trigger
|
|---|
| 10716 | the sc_error_message_period and sc_error_message_uppercase
|
|---|
| 10717 | syntax-check rules.
|
|---|
| 10718 |
|
|---|
| 10719 | maint: remove all VMS-related code
|
|---|
| 10720 | * configure.ac (AC_CONFIG_FILES): Remove vms/Makefile.
|
|---|
| 10721 | * Makefile.am (SUBDIRS): Remove vms.
|
|---|
| 10722 | * src/Makefile.am (EXTRA_DIST): Remove vms_fab.c and vms_fab.h.
|
|---|
| 10723 | * src/vms_fab.c, src/vms_fab.h, vms/make.com: Remove files.
|
|---|
| 10724 | * vms/Makefile.am, vms/README, vms/config_vms.h: Likewise.
|
|---|
| 10725 |
|
|---|
| 10726 | post-release administrivia
|
|---|
| 10727 | * NEWS: Add header line for next release.
|
|---|
| 10728 | * .prev-version: Record previous version.
|
|---|
| 10729 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 10730 |
|
|---|
| 10731 | version 2.6.3
|
|---|
| 10732 | * NEWS: Record release date.
|
|---|
| 10733 |
|
|---|
| 10734 | 2010-04-02 Jim Meyering <meyering@redhat.com>
|
|---|
| 10735 |
|
|---|
| 10736 | grep: avoid used-undefined error with truncated multibyte input
|
|---|
| 10737 | * src/dfa.c (addtok_wc): Don't use buf[0] (it's undefined) when
|
|---|
| 10738 | wcrtomb returns <= 0.
|
|---|
| 10739 |
|
|---|
| 10740 | MBS_SUPPORT-removal: * src/dfa.c (dfastate):
|
|---|
| 10741 |
|
|---|
| 10742 | 2010-04-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 10743 |
|
|---|
| 10744 | maint: avoid unnecessary 2nd getenv("TERM")
|
|---|
| 10745 | * src/main.c (main): Don't call getenv("TERM") twice -- in the same
|
|---|
| 10746 | expression, even.
|
|---|
| 10747 |
|
|---|
| 10748 | tests: remove all unportable uses of echo
|
|---|
| 10749 | * src/main.c: Use printf rather than echo -ne in a comment.
|
|---|
| 10750 | * tests/fedora: Use printf (not echo) also in ok/fail functions.
|
|---|
| 10751 | * cfg.mk (sc_prohibit_echo_minus_en): New rule, to prohibit
|
|---|
| 10752 | any future introduction.
|
|---|
| 10753 |
|
|---|
| 10754 | tests: add explicit requirement for en_US.UTF-8
|
|---|
| 10755 | * tests/char-class-multibyte: Use require_en_utf8_locale_,
|
|---|
| 10756 | rather than open-coding it.
|
|---|
| 10757 | * tests/prefix-of-multibyte: Require the locale explicitly.
|
|---|
| 10758 | * tests/fgrep-infloop: Likewise.
|
|---|
| 10759 | This fixes test failures that would arise on systems without
|
|---|
| 10760 | that particular locale. Reported by Ludovic Courtès.
|
|---|
| 10761 |
|
|---|
| 10762 | tests: new function, to require an en_US UTF8 locale
|
|---|
| 10763 | * tests/init.cfg (require_en_utf8_locale_): New function.
|
|---|
| 10764 |
|
|---|
| 10765 | tests: use printf, not echo -n, echo -e, or any combination
|
|---|
| 10766 | * tests/fedora: Using printf is more portable.
|
|---|
| 10767 |
|
|---|
| 10768 | grep: remove unnecessary code
|
|---|
| 10769 | * src/main.c (print_line_middle): Now that we use RE_ICASE
|
|---|
| 10770 | (enabled in commit 70e23616, "dfa: rewrite handling of multibyte
|
|---|
| 10771 | case_fold lexing"), this case-conversion code is useless and wasteful.
|
|---|
| 10772 | Remove it.
|
|---|
| 10773 |
|
|---|
| 10774 | doc: fix typo: s/AM_V_AT/AM_V_at/
|
|---|
| 10775 | * doc/Makefile.am (egrep.1 fgrep.1): The former has case consistent
|
|---|
| 10776 | with its sister variable, AM_V_GEN, but the latter is the one that
|
|---|
| 10777 | actually works.
|
|---|
| 10778 |
|
|---|
| 10779 | doc: generated files are best made read-only, ...
|
|---|
| 10780 | ...to minimize risk of accidentally modifying the generated file
|
|---|
| 10781 | rather than its template. These are tiny, so no risk, but it's
|
|---|
| 10782 | good to be consistent, so generated files are easier to spot.
|
|---|
| 10783 | * doc/Makefile.am (egrep.1 fgrep.1): When generating these files,
|
|---|
| 10784 | ensure that they too are created read-only.
|
|---|
| 10785 |
|
|---|
| 10786 | doc: generate grep.1 from template
|
|---|
| 10787 | * doc/Makefile.am (grep.1): New rule.
|
|---|
| 10788 | (CLEANFILES): Add grep.1 to the list.
|
|---|
| 10789 | * .gitignore: Add /doc/grep.1.
|
|---|
| 10790 | * doc/grep.in.1: Replace hard-coded "2.5.1-cvs" with @VERSION@.
|
|---|
| 10791 | Update copyright year list.
|
|---|
| 10792 | Omit the line-splitting \(co directive so that update-copyright
|
|---|
| 10793 | will perform future updates automatically.
|
|---|
| 10794 | Egmont Koblinger reported the outdated version string
|
|---|
| 10795 | and copyright year list in the man page:
|
|---|
| 10796 | http://savannah.gnu.org/bugs/?29390
|
|---|
| 10797 |
|
|---|
| 10798 | doc: prepare to generate grep.1
|
|---|
| 10799 | * doc/grep.1: Rename to...
|
|---|
| 10800 | * doc/grep.in.1: ...this.
|
|---|
| 10801 |
|
|---|
| 10802 | 2010-03-31 Eric Blake <eblake@redhat.com>
|
|---|
| 10803 |
|
|---|
| 10804 | build: avoid another warning
|
|---|
| 10805 | Noticed on cygwin:
|
|---|
| 10806 | get-mb-cur-max.c: In function 'main':
|
|---|
| 10807 | get-mb-cur-max.c:27: error: unused parameter 'argc' [-Wunused-parameter]
|
|---|
| 10808 |
|
|---|
| 10809 | * tests/get-mb-cur-max.c (main): Use argc.
|
|---|
| 10810 |
|
|---|
| 10811 | 2010-03-31 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10812 |
|
|---|
| 10813 | tests: fix on systems with broken sh
|
|---|
| 10814 | * tests/Makefile.am (TESTS_ENVIRONMENT): Adjust coreutils remnants.
|
|---|
| 10815 | * tests/bre.sh: Invoke script with $SHELL if defined.
|
|---|
| 10816 | * tests/ere.sh: Likewise.
|
|---|
| 10817 | * tests/spencer1-locale: Likewise.
|
|---|
| 10818 | * tests/spencer1.sh: Likewise.
|
|---|
| 10819 |
|
|---|
| 10820 | tests: improve empty test
|
|---|
| 10821 | * tests/empty: Add more tests, note expected failure.
|
|---|
| 10822 |
|
|---|
| 10823 | tests: improve empty test with respect to locales
|
|---|
| 10824 | * tests/empty: Add tests for multiple locales.
|
|---|
| 10825 |
|
|---|
| 10826 | grep: fix grep -F against empty string
|
|---|
| 10827 | * src/searchutils.c (is_mb_middle): Do not return true for empty matches
|
|---|
| 10828 | when p == buf.
|
|---|
| 10829 |
|
|---|
| 10830 | tests: rename empty.sh to empty
|
|---|
| 10831 | * tests/empty.sh: Rename to...
|
|---|
| 10832 | * tests/empty: ... this.
|
|---|
| 10833 | * tests/Makefile.am (TESTS): Adjust.
|
|---|
| 10834 |
|
|---|
| 10835 | tests: convert empty.sh to new style
|
|---|
| 10836 | * tests/empty.sh: Convert to init.sh, add 10-second timeout.
|
|---|
| 10837 |
|
|---|
| 10838 | tests: use get-mb-cur-max in char-class-multibyte
|
|---|
| 10839 | * tests/char-class-multibyte: Use get-mb-cur-max to detect UTF-8 support.
|
|---|
| 10840 | Rewrite previous locale detection code as a grep test.
|
|---|
| 10841 |
|
|---|
| 10842 | tests: fix -Wformat failure
|
|---|
| 10843 | * tests/get-mb-cur-max.c (main): Cast MB_CUR_MAX to int.
|
|---|
| 10844 |
|
|---|
| 10845 | 2010-03-30 Jim Meyering <meyering@redhat.com>
|
|---|
| 10846 |
|
|---|
| 10847 | doc: add a "Reply-To" to the suggested announcement mail header
|
|---|
| 10848 | * README-release: Add "Reply-To" with the list address,
|
|---|
| 10849 | to minimize risk of replies to the other announcement recipients.
|
|---|
| 10850 | Suggestion from Eric Blake.
|
|---|
| 10851 |
|
|---|
| 10852 | 2010-03-29 Jim Meyering <meyering@redhat.com>
|
|---|
| 10853 |
|
|---|
| 10854 | build: avoid compiler warning when building test program
|
|---|
| 10855 | * tests/Makefile.am (AM_CPPFLAGS, AM_CFLAGS, AM_LDFLAGS): Define,
|
|---|
| 10856 | so that all the usual C compile-and-link machinery comes into play.
|
|---|
| 10857 | * tests/get-mb-cur-max.c: Include "progname.h".
|
|---|
| 10858 | Remove unnecessary inclusion of <ctype.h>.
|
|---|
| 10859 | Mike Frysinger reported the "implicit decl of set_program_name" warning.
|
|---|
| 10860 |
|
|---|
| 10861 | build: detect PCRE support also when <pcre/pcre.h> is the header
|
|---|
| 10862 | * m4/pcre.m4: Also check for <pcre/pcre.h>.
|
|---|
| 10863 | * src/pcresearch.c: Include <pcre/pcre.h>, if needed.
|
|---|
| 10864 | Guard inclusions with HAVE_PCRE_H and HAVE_PCRE_PCRE_H, not HAVE_LIBPCRE.
|
|---|
| 10865 | * NEWS (Bug fixes): Mention it.
|
|---|
| 10866 | Dmitry V. Levin reported that PCRE support was not detected
|
|---|
| 10867 | on systems with <pcre.h> not in the default include path.
|
|---|
| 10868 |
|
|---|
| 10869 | post-release administrivia
|
|---|
| 10870 | * NEWS: Add header line for next release.
|
|---|
| 10871 | * .prev-version: Record previous version.
|
|---|
| 10872 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 10873 |
|
|---|
| 10874 | version 2.6.2
|
|---|
| 10875 | * NEWS: Record release date.
|
|---|
| 10876 |
|
|---|
| 10877 | 2010-03-29 Eric Blake <eblake@redhat.com>
|
|---|
| 10878 |
|
|---|
| 10879 | build: avoid warnings on cygwin
|
|---|
| 10880 | * lib/savedir.c (isdir): Avoid shadowing a declaration.
|
|---|
| 10881 | * src/main.c (get_nondigit_option): Cast away const to avoid
|
|---|
| 10882 | compiler warning.
|
|---|
| 10883 |
|
|---|
| 10884 | maint: ignore new test executable
|
|---|
| 10885 | * .gitignore: Enhance.
|
|---|
| 10886 |
|
|---|
| 10887 | 2010-03-29 Jim Meyering <meyering@redhat.com>
|
|---|
| 10888 |
|
|---|
| 10889 | doc: consolidate redundant-looking entries
|
|---|
| 10890 | * NEWS: Consolidate the two --include/exclude-related entries.
|
|---|
| 10891 | Suggested by Eric Blake.
|
|---|
| 10892 |
|
|---|
| 10893 | 2010-03-29 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10894 |
|
|---|
| 10895 | tests: use $(...) consistently
|
|---|
| 10896 | * tests/backref.sh: Use `...' instead of ``...'' in comments.
|
|---|
| 10897 | * tests/bre.awk: Use $(...) instead of `...`.
|
|---|
| 10898 | * tests/ere.awk: Use $(...) instead of `...`.
|
|---|
| 10899 | * tests/euc-mb: Use $(...) instead of `...`.
|
|---|
| 10900 | * tests/fmbtest: Use $(...) instead of `...`.
|
|---|
| 10901 | * tests/foad1: Use $(...) instead of `...`.
|
|---|
| 10902 | * tests/pcre-z: Use $(...) instead of `...`. Quote output of grep.
|
|---|
| 10903 | * tests/spencer1-locale.awk: Use $(...) instead of `...`.
|
|---|
| 10904 | * tests/spencer1.awk: Use $(...) instead of `...`.
|
|---|
| 10905 | * tests/yesno.sh: Use $(...) instead of `...`.
|
|---|
| 10906 |
|
|---|
| 10907 | 2010-03-29 Jim Meyering <meyering@redhat.com>
|
|---|
| 10908 |
|
|---|
| 10909 | build: make doc/Makefile.am cleaner and more robust
|
|---|
| 10910 | * doc/Makefile.am (egrep.1 fgrep.1): Generate robustly, i.e.,
|
|---|
| 10911 | do not redirect directly to $@.
|
|---|
| 10912 | Use $(AM_V_GEN).
|
|---|
| 10913 | Do not distribute intermediate files like fgrep.man and egrep.man.
|
|---|
| 10914 | Likewise, do not use them to generate their %.1 images.
|
|---|
| 10915 | Instead, generate the .1 files directly.
|
|---|
| 10916 |
|
|---|
| 10917 | 2010-03-29 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10918 |
|
|---|
| 10919 | tests: add program to detect locales
|
|---|
| 10920 | * tests/Makefile.am (check_PROGRAMS): Add get-mb-cur-max.
|
|---|
| 10921 | * tests/get-mb-cur-max.c: New.
|
|---|
| 10922 | * tests/euc-mb: Use it. Fail if the former detection test fails.
|
|---|
| 10923 | * tests/sjis-mb: Use it. Fail if the former detection test fails. Expand
|
|---|
| 10924 | comments.
|
|---|
| 10925 |
|
|---|
| 10926 | 2010-03-29 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10927 |
|
|---|
| 10928 | tests: add tests for SJIS character sets
|
|---|
| 10929 | The attached test will be skipped unless (on a glibc system) you run
|
|---|
| 10930 | something like
|
|---|
| 10931 |
|
|---|
| 10932 | mkdir /usr/lib/locale/ja_JP.SHIFT_JIS
|
|---|
| 10933 | zcat /usr/share/i18n/charmaps/SHIFT_JIS.gz | \
|
|---|
| 10934 | localedef \
|
|---|
| 10935 | -f - \
|
|---|
| 10936 | -i /usr/share/i18n/locales/ja_JP \
|
|---|
| 10937 | /usr/lib/locale/ja_JP.SHIFT_JIS
|
|---|
| 10938 |
|
|---|
| 10939 | * tests/Makefile.am: Add sjis-mb.
|
|---|
| 10940 | * tests/sjis-mb: New.
|
|---|
| 10941 |
|
|---|
| 10942 | 2010-03-29 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 10943 |
|
|---|
| 10944 | grep -F: fix a bug with SJIS character sets
|
|---|
| 10945 | Commit db9d6 would erroneously skip matches in SJIS character sets. In
|
|---|
| 10946 | this character set, low bytes (i.e., ASCII bytes) are also valid second
|
|---|
| 10947 | bytes in a double-byte character, so you have to continue looking for
|
|---|
| 10948 | a match, even if you match in the middle of a double-byte character.
|
|---|
| 10949 |
|
|---|
| 10950 | * src/kwsearch.c: Ensure that beg is advanced by at least one byte,
|
|---|
| 10951 | but do not fail immediately after matching in the middle of a double-byte
|
|---|
| 10952 | character.
|
|---|
| 10953 |
|
|---|
| 10954 | 2010-03-28 Bruno Haible <bruno@clisp.org>
|
|---|
| 10955 |
|
|---|
| 10956 | build: update after change in gnulib's lib-ignore module
|
|---|
| 10957 | * src/Makefile.am (AM_LDFLAGS): Define. Use gnulib's new
|
|---|
| 10958 | $(IGNORE_UNUSED_LIBRARIES_CFLAGS).
|
|---|
| 10959 |
|
|---|
| 10960 | 2010-03-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 10961 |
|
|---|
| 10962 | tests: disable new texinfo-acronym syntax-check from gnulib
|
|---|
| 10963 | * cfg.mk (local-checks-to-skip): Add new sc_texinfo_acronym, to skip it.
|
|---|
| 10964 |
|
|---|
| 10965 | 2010-03-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 10966 |
|
|---|
| 10967 | tests: exercise fix for improper match of incomplete MB char prefix
|
|---|
| 10968 | * tests/prefix-of-multibyte: New file.
|
|---|
| 10969 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 10970 |
|
|---|
| 10971 | 2010-03-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 10972 |
|
|---|
| 10973 | grep -F: fix a multi-byte erroneous-match-in-middle bug
|
|---|
| 10974 | Just as Perl prints nothing in this case,
|
|---|
| 10975 | printf '\357\274\241\n' | perl -CIO -lne '/\357/ and print'
|
|---|
| 10976 |
|
|---|
| 10977 | grep should also print nothing when used as follows.
|
|---|
| 10978 | However, these would mistakenly match with grep prior to 2.6.2:
|
|---|
| 10979 | printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\357'
|
|---|
| 10980 | printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\357\274'
|
|---|
| 10981 |
|
|---|
| 10982 | * src/searchutils.c (is_mb_middle): New parameter: the length of the
|
|---|
| 10983 | match, in bytes, as determined by kwsexec. Use this to detect when
|
|---|
| 10984 | the nominal match found by kwsexec must be skipped because it is for
|
|---|
| 10985 | an incomplete multi-byte character that is a prefix of a character
|
|---|
| 10986 | in the input.
|
|---|
| 10987 | * src/dfasearch.c (EGexecute): Update caller.
|
|---|
| 10988 | * src/kwsearch.c (Fexecute): Likewise.
|
|---|
| 10989 | * src/search.h: Update prototype.
|
|---|
| 10990 | * NEWS (Bug fixes): Mention it.
|
|---|
| 10991 | Report and analysis by Norihiro Tanaka.
|
|---|
| 10992 |
|
|---|
| 10993 | 2010-03-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 10994 |
|
|---|
| 10995 | tests: add tests for the fgrep-infloop bug
|
|---|
| 10996 | * tests/init.cfg (require_timeout_): New function.
|
|---|
| 10997 | * tests/fgrep-infloop: New file. Test for the above fix.
|
|---|
| 10998 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 10999 |
|
|---|
| 11000 | 2010-03-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 11001 |
|
|---|
| 11002 | grep -F: avoid infinite loop when searching for incomplete MB character
|
|---|
| 11003 | Searching for an incomplete non-prefix of a multi-byte character
|
|---|
| 11004 | should find no match.
|
|---|
| 11005 |
|
|---|
| 11006 | Just as these print nothing,
|
|---|
| 11007 | printf '\357\274\241\357\274\241\n' \
|
|---|
| 11008 | | perl -CIO -ne '/\241\357/ and print'
|
|---|
| 11009 | printf '\357\274\241\n' | perl -CIO -ne '/\274\241/ and print'
|
|---|
| 11010 | printf '\357\274\241\n' | perl -CIO -ne '/\241/ and print'
|
|---|
| 11011 | printf '\357\274\241\n' | perl -CIO -ne '/\274/ and print'
|
|---|
| 11012 |
|
|---|
| 11013 | These should also print nothing, but with grep-2.6 and grep-2.6.1,
|
|---|
| 11014 | they would infloop:
|
|---|
| 11015 | printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\241'
|
|---|
| 11016 | printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\274'
|
|---|
| 11017 | printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\274\241'
|
|---|
| 11018 |
|
|---|
| 11019 | * src/kwsearch.c (Fexecute): Don't infloop when searching for
|
|---|
| 11020 | an incomplete non-prefix part of a multi-byte character.
|
|---|
| 11021 | * NEWS (Bug fixes): Mention it.
|
|---|
| 11022 | Reported and diagnosed by Norihiro Tanaka.
|
|---|
| 11023 |
|
|---|
| 11024 | 2010-03-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 11025 |
|
|---|
| 11026 | tests: rename: fmbtest.sh -> fmbtest
|
|---|
| 11027 | * tests/fmbtest.sh: Rename to ...
|
|---|
| 11028 | * tests/fmbtest: ...this, dropping the .sh suffix.
|
|---|
| 11029 | * tests/Makefile.am (TESTS): Reflect renaming.
|
|---|
| 11030 |
|
|---|
| 11031 | tests: convert fmbtest.sh to use init.sh
|
|---|
| 11032 | * tests/fmbtest.sh: Use init.sh and adapt accordingly:
|
|---|
| 11033 | Use "grep", not ${GREP}. Use Exit, not exit.
|
|---|
| 11034 |
|
|---|
| 11035 | tests: also exercise the --include + glob path
|
|---|
| 11036 | * tests/include-exclude: Exercise Javier's fix.
|
|---|
| 11037 |
|
|---|
| 11038 | 2010-03-28 Javier Villavicencio <the_paya@gentoo.org>
|
|---|
| 11039 |
|
|---|
| 11040 | grep -r: fix --include with globs, too
|
|---|
| 11041 | The previous fix addressed only the non-glob case.
|
|---|
| 11042 | * src/main.c (main): Use add_exclude's EXCLUDE_WILDCARDS option,
|
|---|
| 11043 | to enable the use of fnmatch with --include=GLOB.
|
|---|
| 11044 | gnulib: Update to latest, for the fixed exclude.c.
|
|---|
| 11045 |
|
|---|
| 11046 | 2010-03-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 11047 |
|
|---|
| 11048 | grep -r: fix --include with non-globs
|
|---|
| 11049 | * lib/savedir.c (savedir): Fix logic error. Introduced by commit
|
|---|
| 11050 | bf3bd92c, "build: adapt to the newer exclude API we now get from gnulib"
|
|---|
| 11051 | * tests/include-exclude: Test for this bug by exercising --include, too.
|
|---|
| 11052 | * NEWS (Bug fixes): Mention it.
|
|---|
| 11053 | Reported by Philipp Kohlbecher in http://savannah.gnu.org/bugs/?29358
|
|---|
| 11054 |
|
|---|
| 11055 | 2010-03-27 Jim Meyering <meyering@redhat.com>
|
|---|
| 11056 |
|
|---|
| 11057 | kwset: correct comments; require non-NULL kwsmatch argument
|
|---|
| 11058 | * src/kwset.c (kwsexec): Correct comments. This function has been
|
|---|
| 11059 | returning an offset, not a pointer, for 9 years.
|
|---|
| 11060 | Do not test for kwsmatch == NULL. All callers pass non-NULL.
|
|---|
| 11061 | (cwexec): Likewise.
|
|---|
| 11062 | * src/kwset.h (kwsexec): Mark the 4th parameter, kwsmatch, as non-NULL.
|
|---|
| 11063 | Include "arg-nonnull.h".
|
|---|
| 11064 |
|
|---|
| 11065 | build: add -I$(top_builddir)/lib so we also find generated .h files
|
|---|
| 11066 | * src/Makefile.am (AM_CPPFLAGS): Rename from INCLUDES to avoid
|
|---|
| 11067 | warning from automake -Wall.
|
|---|
| 11068 | Add -I$(top_builddir)/lib, so we find generated .h files like
|
|---|
| 11069 | getopt.h in a non-srcdir build.
|
|---|
| 11070 |
|
|---|
| 11071 | build: remove superfluous LOCALEDIR definition
|
|---|
| 11072 | * src/Makefile.am (INCLUDES): Remove unnecessary definition of
|
|---|
| 11073 | LOCALEDIR here. Now, it's defined via gnulib's configmake.h.
|
|---|
| 11074 | * src/system.h: Include "configmake.h" for its LOCALEDIR definition.
|
|---|
| 11075 |
|
|---|
| 11076 | grep: don't segfault upon use of --include or --exclude* options
|
|---|
| 11077 | * lib/savedir.c (isdir1): Fix fatal typo: deref "dir" argument,
|
|---|
| 11078 | not the global (initially-NULL) "path". Reported by Standish Parsley.
|
|---|
| 11079 | * tests/include-exclude: New file.
|
|---|
| 11080 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 11081 | * NEWS (Bug fixes): Mention it.
|
|---|
| 11082 |
|
|---|
| 11083 | 2010-03-26 Jim Meyering <meyering@redhat.com>
|
|---|
| 11084 |
|
|---|
| 11085 | tests: rename: foad1.sh -> foad1
|
|---|
| 11086 | * tests/foad1.sh: Rename to ...
|
|---|
| 11087 | * tests/foad1: ...this, dropping the .sh suffix.
|
|---|
| 11088 | * tests/Makefile.am (TESTS): Reflect renaming.
|
|---|
| 11089 |
|
|---|
| 11090 | tests: convert foad1.sh to use init.sh
|
|---|
| 11091 | This fixes a spurious test failure when "make check" is run with
|
|---|
| 11092 | certain envvars set, e.g., "make check GREP_COLOR=always"
|
|---|
| 11093 | * tests/foad1.sh: Use init.sh and adapt accordingly:
|
|---|
| 11094 | Use "grep", not ${GREP}. Test VERBOSE against "yes", not "1",
|
|---|
| 11095 | to be consistent with init.sh.
|
|---|
| 11096 | Use Exit, not exit.
|
|---|
| 11097 | Reported by Nelson H. F. Beebe.
|
|---|
| 11098 |
|
|---|
| 11099 | tests: insulate tests from envvar settings
|
|---|
| 11100 | * tests/init.cfg (vars_): Unset each envvar that can affect how
|
|---|
| 11101 | grep works. This protects only those tests that have been
|
|---|
| 11102 | converted to use init.sh.
|
|---|
| 11103 |
|
|---|
| 11104 | 2010-03-25 Eric Blake <eblake@redhat.com>
|
|---|
| 11105 |
|
|---|
| 11106 | maint: ignore 'make dist pdf' droppings
|
|---|
| 11107 | * .gitignore: Add more exemptions.
|
|---|
| 11108 |
|
|---|
| 11109 | 2010-03-25 Jim Meyering <meyering@redhat.com>
|
|---|
| 11110 |
|
|---|
| 11111 | tests: avoid spurious test failure due to lack of a French UTF8 locale
|
|---|
| 11112 | * tests/init.cfg: New file. If either $LOCALE_FR or $LOCALE_FR_UTF8
|
|---|
| 11113 | is set to "none", reset it to the empty string.
|
|---|
| 11114 | Reported by Mike Frysinger and Sven Joachim.
|
|---|
| 11115 | * tests/Makefile.am (EXTRA_DIST): Add init.cfg.
|
|---|
| 11116 |
|
|---|
| 11117 | build: do not use pkg-config to test for PCRE support
|
|---|
| 11118 | * configure.ac: Do not use PKG_PROG_PKG_CONFIG or PKG_CHECK_MODULES.
|
|---|
| 11119 | Do not modify CPPFLAGS; that belongs to those who invoke make.
|
|---|
| 11120 | Instead, use autoconf's AC_CHECK_HEADERS and AC_SEARCH_LIBS via the
|
|---|
| 11121 | new macro, gl_FUNC_PCRE, defined in...
|
|---|
| 11122 | * m4/pcre.m4 (gl_FUNC_PCRE): New macro, to handle pcre-related
|
|---|
| 11123 | configure-time tests.
|
|---|
| 11124 | * src/Makefile.am (grep_LDADD): Use LIB_PCRE, not PCRE_LIBS.
|
|---|
| 11125 | * src/pcresearch.c: Test HAVE_LIBPCRE via "#if", not "#ifdef".
|
|---|
| 11126 | All other cpp tests of this symbol used "#if".
|
|---|
| 11127 | Prompted by a suggestion from Bruno Haible.
|
|---|
| 11128 | * NEWS (Build-related): Mention this.
|
|---|
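
The "#if" versus "#ifdef" distinction in the pcresearch.c item above matters whenever a feature macro can be defined with the value 0. A minimal sketch, assuming a hypothetical macro HAVE_FEATURE_X (not a name from grep's configure output) that is defined to 0:

```c
#include <assert.h>

/* Hypothetical configuration result: the macro is defined, but to 0. */
#define HAVE_FEATURE_X 0

/* Reports whether an "#ifdef" test treats the feature as enabled. */
int enabled_per_ifdef(void)
{
#ifdef HAVE_FEATURE_X
  return 1;   /* taken: the macro is defined, whatever its value */
#else
  return 0;
#endif
}

/* Reports whether an "#if" test treats the feature as enabled. */
int enabled_per_if(void)
{
#if HAVE_FEATURE_X
  return 1;
#else
  return 0;   /* taken: the macro's value is 0 */
#endif
}

/* The two tests disagree for a macro defined to 0. */
void check_feature_tests(void)
{
  assert(enabled_per_ifdef() == 1);
  assert(enabled_per_if() == 0);
}
```

Using one form consistently avoids the two tests disagreeing about the same symbol.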
| 11129 |
|
|---|
| 11130 | doc: correct and amend NEWS entries for 2.6.1
|
|---|
| 11131 | * NEWS (Bug fixes): Correct character ranges bug description.
|
|---|
| 11132 | Add an example from Dmitry V. Levin.
|
|---|
| 11133 | Add that the word-with-backref bug was introduced in 2.5.1.
|
|---|
| 11134 | * cfg.mk (old_NEWS_hash): Update to match.
|
|---|
| 11135 |
|
|---|
| 11136 | post-release administrivia
|
|---|
| 11137 | * NEWS: Add header line for next release.
|
|---|
| 11138 | * .prev-version: Record previous version.
|
|---|
| 11139 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 11140 |
|
|---|
| 11141 | version 2.6.1
|
|---|
| 11142 | * NEWS: Record release date.
|
|---|
| 11143 |
|
|---|
| 11144 | 2010-03-25 Tony Abou-Assaleh <taa@acm.org>
|
|---|
| 11145 |
|
|---|
| 11146 | tests: use awk's -v option more portably
|
|---|
| 11147 | * tests/spencer1-locale: Add a space between awk's "-v" option and
|
|---|
| 11148 | the following VAR=value string, to avoid test failure on Mac OS X.
|
|---|
| 11149 |
|
|---|
| 11150 | 2010-03-25 Norihirio Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 11151 |
|
|---|
| 11152 | dfa/grep: fix compilation with MBS_SUPPORT
|
|---|
| 11153 | * src/dfa.c (cur_mb_len): Initialize to 1 and always make it available.
|
|---|
| 11154 | (setbit_case_fold): Do not use wint_t in prototype if !MBS_SUPPORT.
|
|---|
| 11155 | (parse_bracket_exp): Fix compilation with !MBS_SUPPORT.
|
|---|
| 11156 | * src/kwsearch.c (kwsinit): Do not use mbtolower and MB_CUR_MAX
|
|---|
| 11157 | if !MBS_SUPPORT.
|
|---|
| 11158 | * src/searchutils.c (kwsinit): Do not refer to MB_CUR_MAX if !MBS_SUPPORT.
|
|---|
| 11159 |
|
|---|
| 11160 | * tests/char-class-multibyte: Skip if UTF-8 matching does not work.
|
|---|
| 11161 | * tests/fmbtest.sh: Likewise.
|
|---|
| 11162 |
|
|---|
| 11163 | 2010-03-25 Jim Meyering <meyering@redhat.com>
|
|---|
| 11164 |
|
|---|
| 11165 | build: avoid warnings about unnecessary use of "return"
|
|---|
| 11166 | * src/grep.c (Gcompile, Ecompile, Acompile): Do not "return X"
|
|---|
| 11167 | from a function returning void, not even when X itself is a
|
|---|
| 11168 | function returning void. This avoids warnings from Sun Studio 11
|
|---|
| 11169 | reported by Dagobert Michelsen.
|
|---|
| 11170 | * src/egrep.c (Ecompile): Likewise.
|
|---|
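
The warning described above comes from a "return" statement with an expression inside a function returning void, which ISO C does not allow even when the expression itself has type void. A minimal sketch of the before and after forms; compile_common, compile_new and call_count are hypothetical names, not grep's functions:

```c
#include <assert.h>

static int calls;

/* Stand-in for the shared helper that does the real work. */
static void compile_common(void)
{
  calls++;
}

/* Form that strict compilers warn about (kept as a comment only):
     void compile_old(void) { return compile_common(); }  */

/* Accepted form: call the helper, then fall off the end. */
void compile_new(void)
{
  compile_common();
}

/* Accessor so callers can observe that the helper ran. */
int call_count(void)
{
  return calls;
}
```
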
| 11171 |
|
|---|
| 11172 | 2010-03-25 Norihirio Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 11173 |
|
|---|
| 11174 | grep: fix printing when -w is used and regex is needed for matching
|
|---|
| 11175 | * NEWS: Document bugfix.
|
|---|
| 11176 | * src/dfasearch.c (EGexecute): After assess_pattern_match, len is either
|
|---|
| 11177 | invalid or end-beg; jump to success.
|
|---|
| 11178 | * tests/Makefile.am (TESTS): Add new test.
|
|---|
| 11179 | * tests/backref-word: New.
|
|---|
| 11180 |
|
|---|
| 11181 | 2010-03-25 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11182 |
|
|---|
| 11183 | dfa: fix single byte character ranges
|
|---|
| 11184 | * src/dfa.c (in_coll_range): Fix ordering for second strcoll. Reported
|
|---|
| 11185 | by Dmitry V. Levin.
|
|---|
| 11186 | * tests/spencer1-locale.awk: Also test single-byte character sets.
|
|---|
| 11187 | * NEWS: Add a note about this bugfix.
|
|---|
| 11188 | * THANKS: Add Dmitry.
|
|---|
| 11189 |
|
|---|
| 11190 | 2010-03-25 Norihirio Tanaka <noritnk@kcn.ne.jp>
|
|---|
| 11191 |
|
|---|
| 11192 | grep: reset state after truncated or invalid multibyte sequences
|
|---|
| 11193 | * src/searchutils.c (is_mb_middle): When treating an invalid sequence
|
|---|
| 11194 | or a truncated multibyte character as a single byte character, reset
|
|---|
| 11195 | mbstate.
|
|---|
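
The idea in the is_mb_middle item above can be sketched with the standard mbrtowc interface: when a sequence is invalid or truncated, count one byte as one character and clear the conversion state before continuing. This is an illustrative sketch only; count_chars is a hypothetical helper, not the code in searchutils.c.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Count characters in buf[0..len-1] under the current locale.
   Invalid or truncated sequences count as one single-byte character. */
size_t count_chars(const char *buf, size_t len)
{
  mbstate_t state;
  size_t count = 0;

  assert(buf != NULL);
  memset(&state, 0, sizeof state);   /* initial conversion state */
  while (len > 0)
    {
      size_t n = mbrtowc(NULL, buf, len, &state);
      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or truncated: consume one byte and reset the
             state, since it is unspecified after an error.  */
          n = 1;
          memset(&state, 0, sizeof state);
        }
      else if (n == 0)
        n = 1;                       /* a null character is one byte */
      buf += n;
      len -= n;
      count++;
    }
  return count;
}
```

Without the reset, a later call could continue from the leftover state of the bad sequence.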
| 11196 |
|
|---|
| 11197 | grep: do lowercase conversion in print_line_middle only for single-byte case
|
|---|
| 11198 | * src/main.c (print_line_middle): Restrict match_icase code
|
|---|
| 11199 | to MB_CUR_MAX == 1. Adjust comments.
|
|---|
| 11200 |
|
|---|
| 11201 | 2010-03-25 Jim Meyering <meyering@redhat.com>
|
|---|
| 11202 |
|
|---|
| 11203 | tests: provide framework_failure_ function
|
|---|
| 11204 | The shell function "framework_failure" was called in the unusual
|
|---|
| 11205 | event that some fundamental test set-up operation would fail.
|
|---|
| 11206 | However it was not defined. Define it, but with a trailing underscore
|
|---|
| 11207 | to impinge less on the test writer's name space. Adjust all uses.
|
|---|
| 11208 | * tests/init.sh (framework_failure_): New function.
|
|---|
| 11209 | * tests/case-fold-backref: s/framework_failure/framework_failure_/
|
|---|
| 11210 | * tests/case-fold-char-class: Likewise.
|
|---|
| 11211 | * tests/case-fold-char-range: Likewise.
|
|---|
| 11212 | * tests/case-fold-char-type: Likewise.
|
|---|
| 11213 | * tests/char-class-multibyte: Likewise.
|
|---|
| 11214 | * tests/dfaexec-multibyte: Likewise.
|
|---|
| 11215 | * tests/max-count-vs-context: Likewise.
|
|---|
| 11216 | * tests/word-multi-file: Likewise.
|
|---|
| 11217 |
|
|---|
| 11218 | 2010-03-24 Jim Meyering <meyering@redhat.com>
|
|---|
| 11219 |
|
|---|
| 11220 | doc: tweak THANKS
|
|---|
| 11221 | * THANKS: Update Arnold's name and address, per request.
|
|---|
| 11222 |
|
|---|
| 11223 | portability: use gnulib's lseek wrapper
|
|---|
| 11224 | * bootstrap.conf (gnulib_modules): Use gnulib's lseek wrapper,
|
|---|
| 11225 | for improved portability. lseek does not fail with ESPIPE on
|
|---|
| 11226 | pipes on some systems.
|
|---|
| 11227 |
|
|---|
| 11228 | build: avoid link failure on Solaris 8
|
|---|
| 11229 | * bootstrap.conf (gnulib_modules): Add wctob.
|
|---|
| 11230 | * NEWS (Portability): Mention this.
|
|---|
| 11231 | Reported by Dagobert Michelsen in <http://sv.gnu.org/bugs/?29325>.
|
|---|
| 11232 |
|
|---|
| 11233 | 2010-03-24 Petr Písař <petr.pisar@atlas.cz>
|
|---|
| 11234 |
|
|---|
| 11235 | doc: translate new --help message
|
|---|
| 11236 | * src/main.c: Translate "after_options".
|
|---|
| 11237 |
|
|---|
| 11238 | 2010-03-24 Jim Meyering <meyering@redhat.com>
|
|---|
| 11239 |
|
|---|
| 11240 | doc: NEWS make it clear that the bug was introduced in 2.6
|
|---|
| 11241 | * NEWS: Clarify.
|
|---|
| 11242 |
|
|---|
| 11243 | 2010-03-24 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11244 |
|
|---|
| 11245 | tests: fix char-class-multibyte
|
|---|
| 11246 | * tests/char-class-multibyte: Make it pass.
|
|---|
| 11247 |
|
|---|
| 11248 | 2010-03-23 Jim Meyering <meyering@redhat.com>
|
|---|
| 11249 |
|
|---|
| 11250 | build: avoid compilation failure when MBS_SUPPORT not defined
|
|---|
| 11251 | * src/dfa.c (setbit_case_fold) [!MBS_SUPPORT]: Fix curly brace mismatch.
|
|---|
| 11252 |
|
|---|
| 11253 | 2010-03-23 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11254 |
|
|---|
| 11255 | dfa: fix sigsegv on multibyte character classes
|
|---|
| 11256 | Reported by Jaroslav Škarvada <jskarvad@redhat.com>. This is
|
|---|
| 11257 | unfortunate. grep needs an automatic testcase generator.
|
|---|
| 11258 |
|
|---|
| 11259 | * NEWS: Document bug.
|
|---|
| 11260 | * THANKS: Mention reporter.
|
|---|
| 11261 | * src/dfa.c (setbit_case_fold): Change type of first argument for
|
|---|
| 11262 | self-documentation.
|
|---|
| 11263 | (parse_bracket_exp): Fix call.
|
|---|
| 11264 | * tests/Makefile.am: Add new testcase.
|
|---|
| 11265 | * tests/char-class-multibyte: New testcase.
|
|---|
| 11266 |
|
|---|
| 11267 | 2010-03-23 Jim Meyering <meyering@redhat.com>
|
|---|
| 11268 |
|
|---|
| 11269 | post-release administrivia
|
|---|
| 11270 | * NEWS: Add header line for next release.
|
|---|
| 11271 | * .prev-version: Record previous version.
|
|---|
| 11272 | * cfg.mk (old_NEWS_hash): Auto-update.
|
|---|
| 11273 |
|
|---|
| 11274 | version 2.6
|
|---|
| 11275 | * NEWS: Record release date.
|
|---|
| 11276 |
|
|---|
| 11277 | build: avoid warnings: tell gcc and clang that dfaerror never returns
|
|---|
| 11278 | * src/dfa.h (__attribute__): Define.
|
|---|
| 11279 | (dfaerror): Declare with the "noreturn" attribute.
|
|---|
| 11280 | * src/dfasearch.c (dfaerror): Add an unreachable use of abort.
|
|---|
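
The three items above follow a common pattern: define __attribute__ away on compilers that lack it, declare the error function "noreturn", and end its body with an unreachable abort. A minimal sketch under those assumptions; report_fatal and checked_div are hypothetical names, not grep's dfaerror:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* On compilers without GNU attributes, make __attribute__ expand to nothing. */
#ifndef __GNUC__
# define __attribute__(x)
#endif

/* Tell the compiler that this function never returns to its caller. */
void report_fatal(const char *msg) __attribute__((noreturn));

void report_fatal(const char *msg)
{
  fprintf(stderr, "fatal: %s\n", msg);
  exit(EXIT_FAILURE);
  abort();   /* unreachable; guards against exit ever being replaced
                by something that returns */
}

/* Because report_fatal is noreturn, the compiler knows the division
   below is reached only when b is nonzero.  */
int checked_div(int a, int b)
{
  if (b == 0)
    report_fatal("division by zero");
  return a / b;
}
```
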
| 11281 |
|
|---|
| 11282 | 2010-03-22 Eric Blake <eblake@redhat.com>
|
|---|
| 11283 |
|
|---|
| 11284 | build: fix cygwin build
|
|---|
| 11285 | Portions of gnulib depend on -lintl, and cygwin does not allow
|
|---|
| 11286 | lazy linking.
|
|---|
| 11287 |
|
|---|
| 11288 | * src/Makefile.am (LDADD): Include libraries in correct order.
|
|---|
| 11289 |
|
|---|
| 11290 | 2010-03-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11291 |
|
|---|
| 11292 | grep: remove --mmap
|
|---|
| 11293 | mmap is a bad idea for sequentially accessed files because it will cause
|
|---|
| 11294 | a page fault for every read page. Just consider it a failed experiment,
|
|---|
| 11295 | and ignore --mmap while accepting it for backwards compatibility.
|
|---|
| 11296 |
|
|---|
| 11297 | * configure.ac (AC_FUNC_MMAP): Remove.
|
|---|
| 11298 | * doc/grep.texi (Other options): Say --mmap is ignored.
|
|---|
| 11299 | * src/grep.c (mmap_option): Remove.
|
|---|
| 11300 | (long_options): Do not reference it.
|
|---|
| 11301 | (bufmapped, initial_bufoffset): Remove.
|
|---|
| 11302 | (reset, fillbuf): Remove HAVE_MMAP code.
|
|---|
| 11303 | (grepfile): Remove bufmapped reference.
|
|---|
| 11304 | (usage): Say --mmap is ignored.
|
|---|
| 11305 |
|
|---|
| 11306 | 2010-03-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11307 |
|
|---|
| 11308 | grep: rename files for intuitiveness
|
|---|
| 11309 | * Makefile.am (libgrep_a_SOURCES, grep_SOURCES, egrep_SOURCES,
|
|---|
| 11310 | fgrep_SOURCES): Adjust.
|
|---|
| 11311 | * grep.c: Rename to main.c.
|
|---|
| 11312 | * esearch.c: Rename to egrep.c.
|
|---|
| 11313 | * fsearch.c: Rename to fgrep.c.
|
|---|
| 11314 | * gsearch.c: Rename to grep.c.
|
|---|
| 11315 |
|
|---|
| 11316 | grep: kill GREP_PROGRAM/EGREP_PROGRAM/FGREP_PROGRAM
|
|---|
| 11317 | * NEWS: Document slight semantic change.
|
|---|
| 11318 | * TODO: #ifdefs are gone.
|
|---|
| 11319 | * po/POTFILES.in: Update.
|
|---|
| 11320 | * src/Makefile.am (grep_SOURCES, egrep_SOURCES, fgrep_SOURCES): Remove
|
|---|
| 11321 | grep.c/egrep.c/fgrep.c.
|
|---|
| 11322 | (noinst_LIBRARIES): Change libsearch.a to libgrep.a.
|
|---|
| 11323 | (libsearch_a_SOURCES): Rename to libgrep_a_SOURCES, add grep.c.
|
|---|
| 11324 | (LDADD): Change libsearch.a to libgrep.a.
|
|---|
| 11325 | * src/esearch.c: Add before_options and after_options.
|
|---|
| 11326 | * src/fsearch.c: Likewise.
|
|---|
| 11327 | * src/gsearch.c: Likewise.
|
|---|
| 11328 | * src/grep.c (short_options, long_options): Remove GREP_PROGRAM
|
|---|
| 11329 | special-casing.
|
|---|
| 11330 | (usage): Use before_options and after_options, look at matchers.
|
|---|
| 11331 | (setmatcher): Merge with install_matcher.
|
|---|
| 11332 | (main): Call setmatcher (NULL) instead of install_matcher.
|
|---|
| 11333 | * src/grep.h (GREP_PROGRAM): Remove.
|
|---|
| 11334 | (before_options, after_options): Add.
|
|---|
| 11335 |
|
|---|
| 11336 | thank Eric Blake
|
|---|
| 11337 | * THANKS: Add Eric Blake, who reported the warning fixed by 774d0ee.
|
|---|
| 11338 |
|
|---|
| 11339 | grep: libify *search.c
|
|---|
| 11340 | * src/Makefile.am (libsearch_a_SOURCES): Add dfasearch.c, kwsearch.c,
|
|---|
| 11341 | pcresearch.c.
|
|---|
| 11342 | * src/esearch.c, src/fsearch.c, src/gsearch.c: Only include search.h.
|
|---|
| 11343 | * src/dfasearch.c (GEAcompile, EGexecute): Export.
|
|---|
| 11344 | * src/kwsearch.c (Fcompile, Fexecute): Export.
|
|---|
| 11345 | * src/pcresearch.c (Pcompile, Pexecute): Export.
|
|---|
| 11346 | * src/search.h: Add new exported functions.
|
|---|
| 11347 |
|
|---|
| 11348 | grep: prepare for libification of *search.c
|
|---|
| 11349 | * src/dfasearch.c (Ecompile): Remove.
|
|---|
| 11350 | * src/esearch.c: Place it here...
|
|---|
| 11351 | * src/gsearch.c: ... and here.
|
|---|
| 11352 |
|
|---|
| 11353 | grep: split search.c
|
|---|
| 11354 | * po/POTFILES.in: Update.
|
|---|
| 11355 | * src/Makefile.am (grep_SOURCES, egrep_SOURCES, fgrep_SOURCES): Move
|
|---|
| 11356 | kwset.c and dfa.c to libsearch.a. Add searchutils.c there too.
|
|---|
| 11357 | * src/search.h, src/dfasearch.c, src/pcresearch.c, src/kwsearch.c,
|
|---|
| 11358 | src/searchutils.c: New files, split out of src/search.c.
|
|---|
| 11359 | * src/esearch.c, src/fsearch.c: Include the new files instead of search.c.
|
|---|
| 11360 | * src/gsearch.c: Likewise, plus move Gcompile/Acompile here.
|
|---|
| 11361 |
|
|---|
| 11362 | grep: remove one #ifdef
|
|---|
| 11363 | * search.c (GEAcompile) [EGREP_PROGRAM]: Use common code. Inline IF_BK.
|
|---|
| 11364 |
|
|---|
| 11365 | 2010-03-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11366 |
|
|---|
| 11367 | grep: eliminate {COMPILE,EXECUTE}_{RET,ARGS,FCT}
|
|---|
| 11368 | Modern compilers warn about type mismatches.
|
|---|
| 11369 |
|
|---|
| 11370 | * src/grep.c (do_execute): Write full declaration.
|
|---|
| 11371 | * src/grep.h (COMPILE_RET, COMPILE_ARGS, COMPILE_FCT, EXECUTE_RET,
|
|---|
| 11372 | EXECUTE_ARGS, EXECUTE_FCT): Remove.
|
|---|
| 11373 | (compile_fp_t, execute_fp_t): Write full declaration.
|
|---|
| 11374 | * src/search.c (GEAcompile, Gcompile, Acompile, Ecompile, EGexecute,
|
|---|
| 11375 | Fcompile, Fexecute, Pcompile, Pexecute): Write full declaration.
|
|---|
| 11376 |
|
|---|
| 11377 | 2010-03-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11378 |
|
|---|
| 11379 | grep: make egrep/fgrep use struct matcher
|
|---|
| 11380 | * Makefile.am (grep_SOURCES): Add gsearch.c.
|
|---|
| 11381 | (EXTRA_DIST): Add search.c.
|
|---|
| 11382 | * esearch.c (matchers): New.
|
|---|
| 11383 | * fsearch.c (matchers): New.
|
|---|
| 11384 | * gsearch.c: New.
|
|---|
| 11385 | * search.c (matchers): Remove.
|
|---|
| 11386 | * grep.c: Always compile most !GREP_PROGRAM sections.
|
|---|
| 11387 | (main): Use first matcher if none is explicitly provided. Remove
|
|---|
| 11388 | "default" matcher.
|
|---|
| 11389 | * grep.h (struct matcher): Adjust comments.
|
|---|
| 11390 |
|
|---|
| 11391 | grep: change struct matcher termination
|
|---|
| 11392 | * src/grep.c (setmatcher): Look for NULL matchers[i].name.
|
|---|
| 11393 | * src/grep.h (struct matcher): Change name to pointer. Adjust comments.
|
|---|
| 11394 | * src/search.c (matchers): Terminate with three NULLs.
|
|---|
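
The termination scheme described above is the usual sentinel-terminated table: the name becomes a pointer, and a NULL name marks the end. A minimal sketch, assuming a simplified two-field entry (the real struct matcher has three members, hence "three NULLs"); matcher_entry, find_matcher and the exec_* functions are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*execute_fn)(const char *text);

struct matcher_entry
{
  const char *name;      /* NULL marks the end of the table */
  execute_fn execute;
};

static int exec_basic(const char *text) { return (int) strlen(text); }
static int exec_fixed(const char *text) { return text[0] != '\0'; }

static const struct matcher_entry table[] = {
  { "basic", exec_basic },
  { "fixed", exec_fixed },
  { NULL, NULL }         /* sentinel: every member is NULL */
};

/* Return the entry called NAME, or NULL if the table has none. */
const struct matcher_entry *find_matcher(const char *name)
{
  size_t i;

  assert(name != NULL);
  for (i = 0; table[i].name != NULL; i++)
    if (strcmp(table[i].name, name) == 0)
      return &table[i];
  return NULL;
}
```

With a sentinel, the lookup loop needs no separate element count, and the first entry can serve as the default.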
| 11395 |
|
|---|
| 11396 | grep: remove one #ifdef
|
|---|
| 11397 | * search.c (Ecompile): Always go through GEAcompile to use same code path
|
|---|
| 11398 | for both grep and egrep.
|
|---|
| 11399 |
|
|---|
| 11400 | grep: remove getpagesize.h
|
|---|
| 11401 | * src/getpagesize.h: Remove.
|
|---|
| 11402 | * src/Makefile.am (noinst_HEADERS): Remove getpagesize.h.
|
|---|
| 11403 |
|
|---|
| 11404 | 2010-03-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 11405 |
|
|---|
| 11406 | build: use the fcntl-h module, not "fcntl"
|
|---|
| 11407 | * bootstrap.conf (gnulib_modules): We might need fcntl.h somewhere,
|
|---|
| 11408 | but don't use the fcntl function. Reported by Bruno Haible.
|
|---|
| 11409 |
|
|---|
| 11410 | build: avoid link failure on systems using gnulib's fcntl but not open
|
|---|
| 11411 | * bootstrap.conf (gnulib_modules): Using gnulib's fcntl module
|
|---|
| 11412 | and including <fcntl.h>, but not also using gnulib's "open" module
|
|---|
| 11413 | would result in link failure due to references to rpl_open
|
|---|
| 11414 | on systems requiring the replacement (e.g., Cygwin and Darwin).
|
|---|
| 11415 |
|
|---|
| 11416 | build: avoid compilation failure on systems using rpl_open
|
|---|
| 11417 | This new build failure has arisen as a result of using gnulib's
|
|---|
| 11418 | "fcntl" module. Now that an inadequate "open" syscall is replaced
|
|---|
| 11419 | by gnulib's wrapper, it is essential to include <fcntl.h>.
|
|---|
| 11420 | * src/grep.c: Include <fcntl.h>.
|
|---|
| 11421 | This is required, for grepfile's use of open, at least on
|
|---|
| 11422 | Cygwin and Darwin.
|
|---|
| 11423 |
|
|---|
| 11424 | maint: use gnulib's fcntl module, just in case
|
|---|
| 11425 | * bootstrap.conf (gnulib_modules): Add fcntl.
|
|---|
| 11426 | Grep uses at least O_BINARY, which may be defined therein.
|
|---|
| 11427 |
|
|---|
| 11428 | maint: remove TYPE_* definitions from src/system.h
|
|---|
| 11429 | * src/system.h (TYPE_MAXIMUM, TYPE_MINIMUM, TYPE_SIGNED): Remove
|
|---|
| 11430 | definitions. They are provided by intprops.h.
|
|---|
| 11431 | * src/grep.c: Include "intprops.h".
|
|---|
| 11432 | * bootstrap.conf (gnulib_modules): Add intprops.
|
|---|
| 11433 |
|
|---|
| 11434 | maint: alphabetize #include directives
|
|---|
| 11435 | * src/grep.c: Alphabetize #include directives.
|
|---|
| 11436 |
|
|---|
| 11437 | 2010-03-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 11438 |
|
|---|
| 11439 | build: stop using gnulib's memmove module
|
|---|
| 11440 | * bootstrap.conf (gnulib_modules): Remove obsolete module: memmove.
|
|---|
| 11441 |
|
|---|
| 11442 | build: reinstate gnulib's fcntl-h-tests
|
|---|
| 11443 | * bootstrap.conf (gnulib_tool_option_extras): Do not avoid
|
|---|
| 11444 | the fcntl-h-tests. I cannot reproduce the failure.
|
|---|
| 11445 |
|
|---|
| 11446 | 2010-03-20 Eric Blake <eblake@redhat.com>
|
|---|
| 11447 |
|
|---|
| 11448 | build: allow compilation on cygwin
|
|---|
| 11449 | Gnulib is incompatible with -Wunused-macros. Additionally,
|
|---|
| 11450 | cygwin 1.7.1 coupled with --enable-gcc-warnings tripped on:
|
|---|
| 11451 |
|
|---|
| 11452 | grep.c: In function 'print_line_middle':
|
|---|
| 11453 | grep.c:805: error: array subscript has type 'char' [-Wchar-subscripts]
|
|---|
| 11454 | grep.c: In function 'main':
|
|---|
| 11455 | grep.c:1833: error: 'optarg' redeclared without dllimport attribute: previous dllimport ignored [-Wattributes]
|
|---|
| 11456 | grep.c:1834: error: 'optind' redeclared without dllimport attribute after being referenced with dll linkage
|
|---|
| 11457 |
|
|---|
| 11458 | * configure.ac (GNULIB_WARN_FLAGS): Disable -Wunused-macros.
|
|---|
| 11459 | * src/grep.c (print_line_middle): Use correct type to tolower.
|
|---|
| 11460 | (main): Drop useless redeclarations.
|
|---|
| 11461 | * .gitignore: Ignore more built files.
|
|---|
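
The -Wchar-subscripts diagnostic quoted above typically arises where tolower is implemented as a table lookup and is handed a plain char, which may be negative. A minimal sketch of the usual remedy, converting through unsigned char first; lowercase_in_place is a hypothetical helper, not the code in print_line_middle:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Lowercase the first N bytes of S in place.  The cast to
   unsigned char keeps the argument of tolower in the range the
   C standard requires (an unsigned char value or EOF).  */
void lowercase_in_place(char *s, size_t n)
{
  size_t i;

  assert(s != NULL);
  for (i = 0; i < n; i++)
    s[i] = (char) tolower((unsigned char) s[i]);
}
```
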
| 11462 |
|
|---|
| 11463 | 2010-03-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 11464 |
|
|---|
| 11465 | tests: ensure that all programs handle [b-a] consistently
|
|---|
| 11466 | * tests/reversed-range-endpoints: New test.
|
|---|
| 11467 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 11468 |
|
|---|
| 11469 | 2010-03-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 11470 |
|
|---|
| 11471 | build: update gnulib submodule to latest
|
|---|
| 11472 | This pulls in the latest regex module from gnulib, including a fix
|
|---|
| 11473 | to make it honor the RE_NO_EMPTY_RANGES syntax bit.
|
|---|
| 11474 |
|
|---|
| 11475 | tests: temporarily disable irrelevant-to-grep failing C++ fcntl-h-tests
|
|---|
| 11476 | * bootstrap.conf (gnulib_tool_option_extras): Temporarily add
|
|---|
| 11477 | --avoid=fcntl-h-tests, until the C++ part of that test is fixed.
|
|---|
| 11478 |
|
|---|
| 11479 | 2010-03-20 Jim Meyering <meyering@redhat.com>
|
|---|
| 11480 |
|
|---|
| 11481 | reject reversed-endpoint ranges, with all regex variants
|
|---|
| 11482 | * src/search.c: Add RE_NO_EMPTY_RANGES to the syntax bits
|
|---|
| 11483 | in three places, so that all of grep, egrep, and grep -E reject
|
|---|
| 11484 | a range with reversed endpoints like '[b-a]'. This is required,
|
|---|
| 11485 | when using the latest version of gnulib's regex module, since it
|
|---|
| 11486 | now honors the RE_NO_EMPTY_RANGES flag, rather than acting as if
|
|---|
| 11487 | it were always set.
|
|---|
| 11488 | Based on a change by Matthew Burgess.
|
|---|
| 11489 |
|
|---|
| 11490 | 2010-03-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 11491 |
|
|---|
| 11492 | maint: correct macro parameter parentheses
|
|---|
| 11493 | * src/dfa.c (FETCH_WC, FETCH): Parenthesize macro parameters.
|
|---|
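
Why macro parameters need parentheses can be shown with a small, self-contained case. The macros below are hypothetical and far simpler than FETCH_WC or FETCH; they only illustrate the precedence problem:

```c
#include <assert.h>

/* Unparenthesized parameter: the argument's operators mix with the
   macro body's operators.  */
#define DOUBLE_BAD(x)  (x * 2)

/* Parenthesized parameter: the argument is evaluated as a unit. */
#define DOUBLE_GOOD(x) ((x) * 2)

/* Expands to (a + b * 2). */
int double_bad(int a, int b)
{
  return DOUBLE_BAD(a + b);
}

/* Expands to ((a + b) * 2). */
int double_good(int a, int b)
{
  return DOUBLE_GOOD(a + b);
}

/* With a = 1 and b = 2 the two expansions differ. */
void check_macros(void)
{
  assert(double_bad(1, 2) == 5);
  assert(double_good(1, 2) == 6);
}
```
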
| 11494 |
|
|---|
| 11495 | 2010-03-19 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11496 |
|
|---|
| 11497 | tests: change help-version to per-program functions
|
|---|
| 11498 | * help-version: Change each *_args variable to a *_setup function.
|
|---|
| 11499 |
|
|---|
| 11500 | dfa: fix wchar_t/wint_t type mismatch
|
|---|
| 11501 | * src/dfa.c (FETCH_WC): Pass a local wchar_t variable to mbrtowc.
|
|---|
| 11502 | (FETCH): Rename temporary second argument to FETCH_WC.
|
|---|
| 11503 | (parse_bracket_exp): Always use FETCH_WC.
|
|---|
| 11504 |
|
|---|
| 11505 | 2010-03-19 Jim Meyering <meyering@redhat.com>
|
|---|
| 11506 |
|
|---|
| 11507 | doc: add README-prereq, referenced from README-hacking
|
|---|
| 11508 | * README-prereq: New file. Cloned from coreutils, s/coreutils/grep/
|
|---|
| 11509 | Reported by Tony Abou-Assaleh.
|
|---|
| 11510 |
|
|---|
| 11511 | 2010-03-19 Arnold Robbins <arnold@skeeve.com>
|
|---|
| 11512 |
|
|---|
| 11513 | maint: sync dfa comments from gawk
|
|---|
| 11514 | * src/dfa.h (struct dfa) [newlines]: Amend comment.
|
|---|
| 11515 | * src/dfa.c: Update copyright year list to include gawk's.
|
|---|
| 11516 |
|
|---|
| 11517 | 2010-03-17 Jim Meyering <meyering@redhat.com>
|
|---|
| 11518 |
|
|---|
| 11519 | maint: remove obsolete "cvs-clean" make target
|
|---|
| 11520 | * Makefile.am (cvs-clean): Remove obsolete target.
|
|---|
| 11521 |
|
|---|
| 11522 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11523 |
|
|---|
| 11524 | dfa: initialize struct mbcset using memset
|
|---|
| 11525 | * src/dfa.c (parse_bracket_exp): Use memset to initialize workmbc.
|
|---|
| 11526 |
|
|---|
| 11527 | dfa: spell out "unsigned int"
|
|---|
| 11528 | * dfa.c (setbit, tstbit, clrbit, setbit_case_fold, lex, dfaoptimize,
|
|---|
| 11529 | free_mbdata): Put "int" after unsigned.
|
|---|
| 11530 | * dfa.h (struct position, struct dfa): Likewise.
|
|---|
| 11531 |
|
|---|
| 11532 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11533 |
|
|---|
| 11534 | dfa: optimize simple character sets under UTF-8 charsets
|
|---|
| 11535 | Only use a bitset when possible without involving MBCSET. Testcase:
|
|---|
| 11536 | yes 'the quick brown fox jumps over the lazy dog' | sed 100000q | \
|
|---|
| 11537 | time grep -c [ABCDEFGHIJKLMNOPQRSTUVWXYZ,]
|
|---|
| 11538 |
|
|---|
| 11539 | Before: 51ms (best of three runs); after: 16ms (best of three runs).
|
|---|
| 11540 |
|
|---|
| 11541 | * src/dfa.c (parse_bracket_exp): For simple bracket expressions
|
|---|
| 11542 | under UTF-8, use a CSET.
|
|---|
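
The bitset mentioned above can be pictured as one bit per possible byte value, so a membership test is a shift and a mask. A minimal sketch in the spirit of the setbit and tstbit helpers named elsewhere in this log; the byteset type and its functions are hypothetical, not dfa.c's definitions:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define NCHARS    (UCHAR_MAX + 1)
#define WORD_BITS (sizeof (unsigned int) * CHAR_BIT)
#define NWORDS    ((NCHARS + WORD_BITS - 1) / WORD_BITS)

/* One bit for each possible byte value. */
typedef unsigned int byteset[NWORDS];

/* Make the set empty. */
void byteset_clear(byteset s)
{
  memset(s, 0, sizeof (byteset));
}

/* Add byte C to the set. */
void byteset_add(byteset s, unsigned char c)
{
  s[c / WORD_BITS] |= 1u << (c % WORD_BITS);
}

/* Return nonzero if byte C is in the set. */
int byteset_has(const byteset s, unsigned char c)
{
  return (s[c / WORD_BITS] >> (c % WORD_BITS)) & 1u;
}

/* Build the set for a bracket expression like [A,] and probe it. */
void check_byteset(void)
{
  byteset s;

  byteset_clear(s);
  byteset_add(s, 'A');
  byteset_add(s, ',');
  assert(byteset_has(s, 'A'));
  assert(!byteset_has(s, 'a'));
}
```

A lookup of this kind costs the same regardless of how many characters the bracket expression lists.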
| 11543 |
|
|---|
| 11544 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11545 |
|
|---|
| 11546 | dfa: speed up handling of brackets
|
|---|
| 11547 | This patch has two sides. One is to fold the parsing of brackets in the
|
|---|
| 11548 | single- and multi-byte cases. The second is to leverage this change,
|
|---|
| 11549 | and use a bitset to test for single-byte characters in the charset.
|
|---|
| 11550 | Splitting the two would be very hard.
|
|---|
| 11551 |
|
|---|
| 11552 | Testcase:
|
|---|
| 11553 | yes 'the quick brown fox jumps over the lazy dog' | sed 100000q | \
|
|---|
| 11554 | time grep -c [ABCDEFGHIJKLMNOPQRSTUVWXYZ,]
|
|---|
| 11555 |
|
|---|
| 11556 | Before: 59ms (best of three runs); after: 51ms (best of three runs).
|
|---|
| 11557 | Nice, but mostly providing infrastructure for the next patch.
|
|---|
| 11558 |
|
|---|
| 11559 | * src/dfa.c (setbit_case_fold): Try applying towlower/towupper.
|
|---|
| 11560 | (looking_at): Remove.
|
|---|
| 11561 | (FETCH_WC): New.
|
|---|
| 11562 | (fetch_wc): Merge into FETCH_WC [MBS_SUPPORT].
|
|---|
| 11563 | (FETCH) [MBS_SUPPORT]: Call FETCH_WC.
|
|---|
| 11564 | (prednames, find_pred, is_blank and other predicates): Move above,
|
|---|
| 11565 | remove K&R syntax support.
|
|---|
| 11566 | (parse_bracket_exp): New name of parse_bracket_exp_mb, rewritten to
|
|---|
| 11567 | include single-byte character set parsing of brackets.
|
|---|
| 11568 | (lex): Adjust for fetch_wc->FETCH_WC change, remove single-byte
|
|---|
| 11569 | character set parsing of brackets.
|
|---|
| 11570 | (match_mb_charset): Test against work_mbc->cset.
|
|---|
| 11571 | * src/dfa.h (struct mb_char_classes): Add cset.
|
|---|
| 11572 |
|
|---|
| 11573 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11574 |
|
|---|
| 11575 | syntax-check: remove space-tab exception
|
|---|
| 11576 | * .x-sc_space_tab: Remove.
|
|---|
| 11577 | * src/dfa.c: Fix space-tab occurrence.
|
|---|
| 11578 |
|
|---|
| 11579 | THANKS: fix Jim Meyering's email address
|
|---|
| 11580 | * THANKS: Jim is now with Red Hat.
|
|---|
| 11581 |
|
|---|
| 11582 | dfa: add missing function
|
|---|
| 11583 | * src/dfa.c (using_utf8): New.
|
|---|
| 11584 | (addtok_wc, free_mbdata, dfaoptimize) [!MBS_SUPPORT]: Do not define.
|
|---|
| 11585 | (dfacomp) [!MBS_SUPPORT]: Do not call dfaoptimize.
|
|---|
| 11586 |
|
|---|
| 11587 | tests: fix typo
|
|---|
| 11588 | * fedora: Fix typo.
|
|---|
| 11589 |
|
|---|
| 11590 | tests: use Exit
|
|---|
| 11591 | * euc-mb: exit with "Exit 0".
|
|---|
| 11592 |
|
|---|
| 11593 | grep: remove more register keywords
|
|---|
| 11594 | * dosbuf.c: Remove register keywords.
|
|---|
| 11595 | * grep.c: Remove register keywords.
|
|---|
| 11596 | * kwset.c: Remove register keywords.
|
|---|
| 11597 | * search.c: Remove register keywords.
|
|---|
| 11598 |
|
|---|
| 11599 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11600 |
|
|---|
| 11601 | dfa: run simple UTF-8 regexps as a single-byte character set
|
|---|
| 11602 | This provides a speedup whenever fgrep is "almost" sufficient but
|
|---|
| 11603 | not quite (e.g. grep ^abc). This affects test cases such as
|
|---|
| 11604 | https://savannah.gnu.org/bugs/?29117, which are already worked around
|
|---|
| 11605 | by the line-by-line matching patch c32c04; without that patch the
|
|---|
| 11606 | speedup can reach 1000x even on non-contrived testcases.
|
|---|
| 11607 |
|
|---|
| 11608 | * src/dfa.c (dfaoptimize): New.
|
|---|
| 11609 | (dfacomp): Call it.
|
|---|
| 11610 |
|
|---|
| 11611 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11612 |
|
|---|
| 11613 | tests: fix syntax-check failures
|
|---|
| 11614 | * tests/case-fold-backref: Use "foo" instead of "the".
|
|---|
| 11615 | * tests/dfaexec-multibyte: Remove trailing blanks.
|
|---|
| 11616 |
|
|---|
| 11617 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11618 |
|
|---|
| 11619 | grep: remove check_multibyte_string, fix non-UTF8 missed match
|
|---|
| 11620 | Avoid computing ahead something that can be computed lazily as efficiently
|
|---|
| 11621 | (or more efficiently in the case of UTF-8, though this is left as TODO).
|
|---|
| 11622 | At the same time, "soften" the rejection condition for matching in the
|
|---|
| 11623 | middle of a multibyte sequence to fix bug 23814.
|
|---|
| 11624 |
|
|---|
| 11625 | Multibyte "grep -i" would still be very slow if it wasn't for the workaround
|
|---|
| 11626 | patch c32c042 (grep: match multibyte charsets line-by-line when using -i,
|
|---|
| 11627 | 2010-03-08).
|
|---|
| 11628 |
|
|---|
| 11629 | * NEWS: Document bugfix.
|
|---|
| 11630 | * src/search.c (check_multibyte_string): Rewrite as...
|
|---|
| 11631 | (is_mb_middle): ... this.
|
|---|
| 11632 | (EGexecute, Fexecute): Adjust.
|
|---|
| 11633 | * tests/Makefile.am (TESTS): Add euc-mb.
|
|---|
| 11634 | * tests/euc-mb: New testcase.
|
|---|
| 11635 |
|
|---|
| 11636 | 2010-03-17 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11637 |
|
|---|
| 11638 | dfa: cache MB_CUR_MAX for dfaexec
|
|---|
| 11639 | * src/dfa.c (state_index, dfaexec): Use d->mb_cur_max.
|
|---|
| 11640 | (dfainit): Initialize it.
|
|---|
| 11641 | (free_mbdata): New, extracted out of dfafree.
|
|---|
| 11642 | (dfafree): Use it.
|
|---|
| 11643 |
|
|---|
| 11644 | dfa: improve documentation of struct dfa
|
|---|
| 11645 | * src/dfa.h (struct dfa): Reword some comments.
|
|---|
| 11646 |
|
|---|
| 11647 | tests: factor name of output files into a variable
|
|---|
| 11648 | * tests/case-fold-backref, tests/case-fold-char-class,
|
|---|
| 11649 | tests/case-fold-char-range, tests/case-fold-char-type,
|
|---|
| 11650 | tests/dfaexec-multibyte: Use a variable for the output filename,
|
|---|
| 11651 | as it is common to the grep and compare invocations.
|
|---|
| 11652 |
|
|---|
| 11653 | tests: use different output files to simplify reading failed .log files
|
|---|
| 11654 | * tests/case-fold-backref, tests/case-fold-char-class,
|
|---|
| 11655 | tests/case-fold-char-range, tests/case-fold-char-type: Use a different
|
|---|
| 11656 | name for each output file from grep.
|
|---|
| 11657 | * tests/dfaexec-multibyte: Likewise, and merge some grep invocations.
|
|---|
| 11658 |
|
|---|
| 11659 | tests: add another grep -i testcase, from bug 16179
|
|---|
| 11660 | * tests/case-fold-backref: New.
|
|---|
| 11661 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 11662 |
|
|---|
| 11663 | 2010-03-16 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11664 |
|
|---|
| 11665 | dfa: rewrite handling of multibyte case_fold lexing
|
|---|
| 11666 | Let dfacomp do the folding to lowercase of multibyte input strings,
|
|---|
| 11667 | and remove it from grep.c. Input strings to kwset.c are still folded
|
|---|
| 11668 | outside kwset.c, so we still need to do mbtolower in search.c.
|
|---|
| 11669 |
|
|---|
| 11670 | * NEWS: Document bugfixes.
|
|---|
| 11671 | * .x-sc_cast_of_argument_to_free: Remove.
|
|---|
| 11672 | * src/dfa.c (wctok, addtok_wc): New.
|
|---|
| 11673 | (cur_mb_index, update_mb_len_index): Remove.
|
|---|
| 11674 | (FETCH): Do not call it.
|
|---|
| 11675 | (parse_bracket_exp_mb) [GREP]: Disable case-folding of ranges and
|
|---|
| 11676 | characters.
|
|---|
| 11677 | (addtok): Extract part to...
|
|---|
| 11678 | (addtok_mb): ... this new function.
|
|---|
| 11679 | (lex): Call fetch_wc in the main loop for MB_CUR_MAX > 1. Return WCHAR
|
|---|
| 11680 | for normal characters if MB_CUR_MAX > 1.
|
|---|
| 11681 | (atom): Handle WCHAR instead of treating multibyte characters specially.
|
|---|
| 11682 | Do case folding of multibyte characters here.
|
|---|
| 11683 | (dfacomp): Remove case_fold special casing.
|
|---|
| 11684 | * src/dfa.h (WCHAR): New.
|
|---|
| 11685 | * src/grep.c (mb_icase_keys): Remove.
|
|---|
| 11686 | (main): Do not call it.
|
|---|
| 11687 | * src/search.c (kwsinit): Init transition table only for MB_CUR_MAX == 1.
|
|---|
| 11688 | (mbtolower): New.
|
|---|
| 11689 | (kwsincr_case): New.
|
|---|
| 11690 | (kwsmusts): Call it instead of kwsincr.
|
|---|
| 11691 | (check_multibyte_string): Remove.
|
|---|
| 11692 | (check_multibyte_string_no_icase): Rename to check_multibyte_string.
|
|---|
| 11693 | (GEAcompile, EGexecute, Fcompile): Use mbtolower instead of the old
|
|---|
| 11694 | check_multibyte_string.
|
|---|
| 11695 | * tests/Makefile.am (TESTS): Add case-fold-backslash-w.
|
|---|
| 11696 | * tests/foad1.sh: Enable fixed tests.
|
|---|
| 11697 | * tests/case-fold-backslash-w: New.
|
|---|
| 11698 |
|
|---|
| 11699 | 2010-03-16 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11700 |
|
|---|
| 11701 | grep: match multibyte charsets line-by-line when using -i
|
|---|
| 11702 | The turtle combination -i + MB_CUR_MAX>1 requires case conversion ahead
|
|---|
| 11703 | of time. Avoid doing this repeatedly when many matches succeed. Together
|
|---|
| 11704 | with the previous changes, this fixes https://savannah.gnu.org/bugs/?29117
|
|---|
| 11705 | and https://savannah.gnu.org/bugs/?14472.
|
|---|
| 11706 |
|
|---|
| 11707 | * NEWS: Document new speedup.
|
|---|
| 11708 | * src/grep.c (do_execute): New.
|
|---|
| 11709 | (grepbuf): Use it.
|
|---|
| 11710 |
|
|---|
| 11711 | 2010-03-15 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11712 |
|
|---|
| 11713 | dfa: fix handling of ranges in multibyte character sets
|
|---|
| 11714 | * src/dfa.c (parse_bracket_exp_mb): Add separate ranges for
|
|---|
| 11715 | lowercase and uppercase endpoints if folding case.
|
|---|
| 11716 | * tests/Makefile.am (TESTS): Add case-fold-char-range.
|
|---|
| 11717 | * tests/case-fold-char-range: New.
|
|---|
| 11718 |
|
|---|
| 11719 | tests: add more UTF-8 test cases
|
|---|
| 11720 | * tests/Makefile.am (TESTS): Add spencer1-locale.
|
|---|
| 11721 | (EXTRA_DIST): Add spencer1-locale.awk.
|
|---|
| 11722 | * tests/spencer1-locale.awk: New.
|
|---|
| 11723 | * tests/spencer1-locale: New.
|
|---|
| 11724 |
|
|---|
| 11725 | 2010-03-15 Jim Meyering <meyering@redhat.com>
|
|---|
| 11726 |
|
|---|
| 11727 | tests: complete the renaming fedora.sh -> fedora
|
|---|
| 11728 | * tests/Makefile.am (TESTS): Rename fedora.sh -> fedora here, too.
|
|---|
| 11729 |
|
|---|
| 11730 | 2010-03-15 Jim Meyering <meyering@redhat.com>
|
|---|
| 11731 |
|
|---|
| 11732 | * tests/fedora.sh: Rename to...
|
|---|
| 11733 | * tests/fedora: ...this, to reflect new convention:
|
|---|
| 11734 | Use the lack of a suffix to indicate we've converted to the new
|
|---|
| 11735 | init.sh-using test framework.
|
|---|
| 11736 |
|
|---|
| 11737 | tests: adjust fedora.sh to handle traps more portably
|
|---|
| 11738 |
|
|---|
| 11739 | 2010-03-15 Jim Meyering <meyering@redhat.com>
|
|---|
| 11740 |
|
|---|
| 11741 | tests: adjust fedora.sh to handle traps more portably
|
|---|
| 11742 | * tests/fedora.sh: Use "Exit", not "exit".
|
|---|
| 11743 |
|
|---|
| 11744 | tests: for each test, set an envvar to its name
|
|---|
| 11745 | * tests/Makefile.am (TESTS_ENVIRONMENT): Set GREP_TEST_NAME for
|
|---|
| 11746 | each test. This is used to help make the output of hundreds of
|
|---|
| 11747 | independent, often-parallel valgrind runs more manageable.
|
|---|
| 11748 |
|
|---|
| 11749 | 2010-03-14 Jim Meyering <meyering@redhat.com>
|
|---|
| 11750 |
|
|---|
| 11751 | tests: clean up fedora.sh
|
|---|
| 11752 | * tests/fedora.sh: Use "grep", not ${GREP}.
|
|---|
| 11753 | Use init.sh.
|
|---|
| 11754 | Use timeout 10, not sleep 1 (three times).
|
|---|
| 11755 | The latter would always sleep for 3 seconds, and the test would
|
|---|
| 11756 | fail with a false positive on a slow system or with a heavily
|
|---|
| 11757 | instrumented (valgrind) executable.
|
|---|
| 11758 |
|
|---|
| 11759 | 2010-03-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 11760 |
|
|---|
| 11761 | build: avoid build failure with --enable-gcc-warnings
|
|---|
| 11762 | * src/dfa.c: Don't include <assert.h>, now that it is not used.
|
|---|
| 11763 | [DEBUG]: Remove #ifdef block.
|
|---|
| 11764 |
|
|---|
| 11765 | 2010-03-12 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11766 |
|
|---|
| 11767 | syntax-check: enable space-tab
|
|---|
| 11768 | * cfg.mk (local-checks-to-skip): Enable space-tab.
|
|---|
| 11769 | * .x-sc_space_tab: Add exceptions.
|
|---|
| 11770 | * tests/status.sh: Fix occurrence.
|
|---|
| 11771 |
|
|---|
| 11772 | syntax-check: enable m4-quote-check
|
|---|
| 11773 | * cfg.mk (local-checks-to-skip): Enable m4-quote-check.
|
|---|
| 11774 | * configure.ac: Fix occurrence.
|
|---|
| 11775 |
|
|---|
| 11776 | syntax-check: enable makefile-TAB-only-indentation
|
|---|
| 11777 | * cfg.mk (local-checks-to-skip): Enable makefile-TAB-only-indentation.
|
|---|
| 11778 | * Makefile.am: Fix only occurrence.
|
|---|
| 11779 |
|
|---|
| 11780 | grep: fix error-message-uppercase
|
|---|
| 11781 | * cfg.mk (local-checks-to-skip): Enable error-message-uppercase.
|
|---|
| 11782 | * src/dfa.c (parse_bracket_exp_mb, lex, dfaparse): Fix occurrences.
|
|---|
| 11783 | * src/search.c (Pcompile, Pexecute): Fix occurrences.
|
|---|
| 11784 |
|
|---|
| 11785 | dfa, grep: cleanup if-before-free and cast-of-argument-to-free
|
|---|
| 11786 | * .x-sc_avoid_if_before_free: Remove.
|
|---|
| 11787 | * .x-sc_cast_of_alloca_return_value: Remove.
|
|---|
| 11788 | * .x-sc_cast_of_x_alloc_return_value: Remove.
|
|---|
| 11789 | * .x-sc_cast_of_argument_to_free: Temporarily add src/search.c.
|
|---|
| 11790 | * cfg.mk (local-checks-to-skip): Remove sc_cast_of_argument_to_free.
|
|---|
| 11791 | * src/dfa.c (ifree): Remove.
|
|---|
| 11792 | (dfamust, build_state, transit_state, dfafree): Do not do if-before-free,
|
|---|
| 11793 | do not cast free argument to ptr_t or char *.
|
|---|
| 11794 | (freelist): Call free instead of ifree.
|
|---|
| 11795 | * src/dfa.h (ptr_t): Remove.
|
|---|
| 11796 |
|
|---|
| 11797 | 2010-03-12 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11798 |
|
|---|
| 11799 | dfa: remove CRANGE dead code
|
|---|
| 11800 | The only use of CRANGE was removed by commit 193830d. In theory it is
|
|---|
| 11801 | more correct to do what CRANGE did, but in practice it seems like it did
|
|---|
| 11802 | not work.
|
|---|
| 11803 |
|
|---|
| 11804 | * src/dfa.h (token): Remove CRANGE.
|
|---|
| 11805 | * src/dfa.c (atom): Do not handle CRANGE.
|
|---|
| 11806 | (prtok): Likewise.
|
|---|
| 11807 |
|
|---|
| 11808 | 2010-03-12 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11809 |
|
|---|
| 11810 | dfa: get rid of x*alloc
|
|---|
| 11811 | * src/dfa.c: Include xalloc.h.
|
|---|
| 11812 | (xmalloc, xrealloc, xcalloc): Remove.
|
|---|
| 11813 |
|
|---|
| 11814 | grep: cleanup one const cast
|
|---|
| 11815 | * src/search.c (GEAcompile): Do not reuse motif when operating on the
|
|---|
| 11816 | (const) pattern, so we can make it non-const. Remove cast from free.
|
|---|
| 11817 |
|
|---|
| 11818 | kwset/system: remove ptr_t
|
|---|
| 11819 | * src/kwset.h: Declare kwset using an incomplete struct type.
|
|---|
| 11820 | * src/system.h (ptr_t): Remove.
|
|---|
| 11821 |
|
|---|
| 11822 | 2010-03-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 11823 |
|
|---|
| 11824 | tests: add test cases for dfaexec bug
|
|---|
| 11825 | * tests/dfaexec-multibyte: New test.
|
|---|
| 11826 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 11827 | Reported by Paolo Bonzini in http://bugzilla.redhat.com/544407
|
|---|
| 11828 | and http://bugzilla.redhat.com/544406 .
|
|---|
| 11829 |
|
|---|
| 11830 | 2010-03-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 11831 |
|
|---|
| 11832 | dfa: manually merge gawk's dfaexec
|
|---|
| 11833 | * src/dfa.c (dfaexec): Adjust API: return pointer, not offset, and
|
|---|
| 11834 | take an "end" pointer parameter, rather than integral "size".
|
|---|
| 11835 | Adjust comment accordingly.
|
|---|
| 11836 | (build_state): Maintain d->newlines.
|
|---|
| 11837 | (copytoks): Update multibyte_prop indices.
|
|---|
| 11838 | (SKIP_REMAINS_MB_IF_INITIAL_STATE): Update a cast.
|
|---|
| 11839 | Return NULL, rather than (size_t) -1.
|
|---|
| 11840 | (realloc_trans_if_necessary): Realloc d->newlines.
|
|---|
| 11841 | * src/dfa.h (struct dfa): New member, "newlines".
|
|---|
| 11842 | (struct dfa) [GAWK]: New member, "broken".
|
|---|
| 11843 | (dfaexec): Update prototype and copy the new comment from dfa.c.
|
|---|
| 11844 |
|
|---|
| 11845 | dfa: make search.c use the new dfaexec API
|
|---|
| 11846 |
|
|---|
| 11847 | * src/search.c: Adjust to new dfaexec API.
|
|---|
| 11848 | Now, dfaexec returns a pointer, not an integer,
|
|---|
| 11849 | and the third parameter is END, not buffer size.
|
|---|
| 11850 | * src/dfa.c (dfaexec): Rewrite the function's comment.
|
|---|
| 11851 | Don't just clobber *END. While doing that happens to be
|
|---|
| 11852 | fine for gawk's usage, in grep, *END usually points to the
|
|---|
| 11853 | first byte of the next buffer. Save the initial value,
|
|---|
| 11854 | and restore it just before returning.
|
|---|
| 11855 | * src/dfa.h (dfaexec): Update comment; include parameter names.
|
|---|
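
The save-and-restore approach described in the entry above can be illustrated with a small sketch. This is not grep's dfaexec; `find_newline` is a hypothetical helper that, like the new API, takes an "end" pointer and returns a pointer (or NULL). It assumes the byte at *END is writable, as it is when *END is the first byte of the next buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of grep: return a pointer to the first
   newline in [BEGIN, END), or NULL if there is none.  The byte at *END
   must be writable because it is used as a sentinel.  */
static char *
find_newline (char *begin, char *end)
{
  char saved;
  char *p = begin;

  assert (begin != NULL && end != NULL && begin <= end);

  saved = *end;          /* Save the byte that follows the buffer.  */
  *end = '\n';           /* Sentinel: the loop needs no bounds check.  */
  while (*p != '\n')
    p++;
  *end = saved;          /* Restore it just before returning.  */

  return p < end ? p : NULL;
}
```

Clobbering *END without the restore would corrupt whatever the caller keeps just past the searched region.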
| 11856 |
|
|---|
| 11857 | 2010-03-12 Jim Meyering <meyering@redhat.com>
|
|---|
| 11858 |
|
|---|
| 11859 | dfa: appease static analyzers
|
|---|
| 11860 | * src/dfa.c (transit_state_singlebyte): Call abort rather
|
|---|
| 11861 | than returning in a "can't happen" scenario.
|
|---|
| 11862 | This stops clang from emitting a false-positive report (I think it
|
|---|
| 11863 | was used-uninitialized) about a caller.
|
|---|
| 11864 |
|
|---|
| 11865 | 2010-03-11 Jim Meyering <meyering@redhat.com>
|
|---|
| 11866 |
|
|---|
| 11867 | dfa: do not accept [[:UPPER:]] or [[:LOWER:]] internally
|
|---|
| 11868 | * src/dfa.c (parse_bracket_exp_mb): Those class names are not
|
|---|
| 11869 | valid, and rejected elsewhere, so there is no point in allowing
|
|---|
| 11870 | upper or mixed-case versions here.
|
|---|
| 11871 |
|
|---|
| 11872 | 2010-03-11 Jim Meyering <meyering@redhat.com>
|
|---|
| 11873 |
|
|---|
| 11874 | maint: remove a trailing space
|
|---|
| 11875 | * src/search.c (EXECUTE_FCT): Remove trailing space.
|
|---|
| 11876 |
|
|---|
| 11877 | maint: remove all uses of PARAMS
|
|---|
| 11878 | Remove most with this:
|
|---|
| 11879 | git grep -lw PARAMS |xargs perl -pi -e 's/\bPARAMS *\((.*)\);/$1;/'
|
|---|
| 11880 | Remove the remainder manually.
|
|---|
| 11881 |
|
|---|
| 11882 | 2010-03-11 Jim Meyering <meyering@redhat.com>
|
|---|
| 11883 |
|
|---|
| 11884 | maint: remove all uses of PARAMS
|
|---|
| 11885 | * lib/savedir.h (PARAMS): Remove definitions manually.
|
|---|
| 11886 | Remove the remaining ones via this command:
|
|---|
| 11887 | git grep -l define.PARAMS |xargs perl -ni -e '/define PARAMS/ or print'
|
|---|
| 11888 | * src/dfa.h (PARAMS): Remove definitions.
|
|---|
| 11889 | * src/system.h (PARAMS): Likewise.
|
|---|
| 11890 | Remove most uses with this:
|
|---|
| 11891 | git grep -lw PARAMS |xargs perl -pi -e 's/\bPARAMS *\((.*)\);/$1;/'
|
|---|
| 11892 | Remove the remainder manually.
|
|---|
| 11893 |
|
|---|
| 11894 | maint: remove now-useless prototypes
|
|---|
| 11895 | * src/dfa.c: Remove the prototype of each static, non-recursive
|
|---|
| 11896 | function whose definition precedes first use.
|
|---|
| 11897 |
|
|---|
| 11898 | grep: plug an inconsequential leak
|
|---|
| 11899 | * src/grep.c (main): Plug a leak: free "keys".
|
|---|
| 11900 |
|
|---|
| 11901 | grep: avoid useless allocations for empty GREP_OPTIONS
|
|---|
| 11902 | * src/grep.c (prepend_default_options): Ignore GREP_OPTIONS
|
|---|
| 11903 | when it's empty, not just when it's undefined.
|
|---|
| 11904 | There are still relatively harmless leaks when GREP_OPTIONS
|
|---|
| 11905 | is set and non-empty. We'll address those, eventually.
|
|---|
| 11906 |
|
|---|
| 11907 | 2010-03-09 Jim Meyering <meyering@redhat.com>
|
|---|
| 11908 |
|
|---|
| 11909 | build: record build-from-clone tool requirements
|
|---|
| 11910 | * bootstrap.conf (buildreq): This makes bootstrap fail with
|
|---|
| 11911 | a clear explanation of the problem. Otherwise, you'd get into
|
|---|
| 11912 | the build process and fail with something far more cryptic.
|
|---|
| 11913 |
|
|---|
| 11914 | dfa: remove a trailing blank
|
|---|
| 11915 | * src/dfa.c (dfaexec): No trailing blanks allowed.
|
|---|
| 11916 |
|
|---|
| 11917 | dfa: sync a tiny change from gawk
|
|---|
| 11918 | * src/dfa.c (state_index) [MBS_SUPPORT]: Initialize .mbps.nelem member
|
|---|
| 11919 | unconditionally. Also initialize .mbps.elems.
|
|---|
| 11920 |
|
|---|
| 11921 | dfa: avoid a leak (work_mbc->chars)
|
|---|
| 11922 | * src/dfa.c (parse_bracket_exp_mb): Remove useless (and leaked) MALLOC.
|
|---|
| 11923 |
|
|---|
| 11924 | doc+bootstrap: document build-from-git-clone process
|
|---|
| 11925 | * bootstrap: Update from coreutils/gnulib.
|
|---|
| 11926 | * README-hacking: New file, nearly identical to the one in coreutils.
|
|---|
| 11927 |
|
|---|
| 11928 | 2010-03-08 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11929 |
|
|---|
| 11930 | more work on TODO
|
|---|
| 11931 | * TODO: More work on the first section. Use clearer section headers.
|
|---|
| 11932 |
|
|---|
| 11933 | 2010-03-08 Reuben Thomas <rrt@sc3d.org>
|
|---|
| 11934 |
|
|---|
| 11935 | bring TODO up-to-date
|
|---|
| 11936 | * TODO: Merge with TODO section of http://www.gnu.org/software/grep/devel.html
|
|---|
| 11937 | and remove done items. Some small bits of tidying also.
|
|---|
| 11938 |
|
|---|
| 11939 | 2010-03-07 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11940 |
|
|---|
| 11941 | simplify parsing of [a-z]
|
|---|
| 11942 | * src/dfa.c (in_coll_range): New.
|
|---|
| 11943 | (lex): Use it instead of regcomp/regexec.
|
|---|
| 11944 |
|
|---|
| 11945 | Small refactoring in src/dfa.c
|
|---|
| 11946 | * src/dfa.c (parse_bracket_exp_mb): Return MBCSET.
|
|---|
| 11947 | (lex): Assign return value of parse_bracket_exp_mb to lasttok, return it.
|
|---|
| 11948 |
|
|---|
| 11949 | use do...while(0) idiom
|
|---|
| 11950 | * dfa.c (FETCH): Wrap with do...while(0).
|
|---|
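
As a minimal sketch of why a multi-statement macro is wrapped in do...while(0): the wrapper makes the expansion a single statement, so it can be followed by a semicolon and used in an unbraced if/else. `NEXT_BYTE` and `count_bytes_up_to` are hypothetical names for illustration, not grep's FETCH.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro, not grep's FETCH: store the next byte of the
   string P in C and advance P, or set C to -1 at the end of P.  */
#define NEXT_BYTE(c, p)                         \
  do                                            \
    {                                           \
      if (*(p) == '\0')                         \
        (c) = -1;                               \
      else                                      \
        (c) = (unsigned char) *(p)++;           \
    }                                           \
  while (0)

/* Count the bytes of S, stopping after at most LIMIT of them.  */
static int
count_bytes_up_to (const char *s, int limit)
{
  int n = 0;
  int c;

  assert (s != NULL);
  for (;;)
    {
      /* With a bare { ... } block instead of do...while(0), the
         semicolon after the macro call would end the "if" statement
         and the "else" below would be a syntax error.  */
      if (n < limit)
        NEXT_BYTE (c, s);
      else
        c = -1;
      if (c < 0)
        break;
      n++;
    }
  return n;
}
```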
| 11951 |
|
|---|
| 11952 | 2010-03-06 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11953 |
|
|---|
| 11954 | extract common code from if/else
|
|---|
| 11955 | * dfa.c (dfaexec): Simplify logic for MB_CUR_MAX > 1 case.
|
|---|
| 11956 |
|
|---|
| 11957 | remove register variable hacks
|
|---|
| 11958 | * dfa.c (dfaexec): We can extract the address of a variable without fearing
|
|---|
| 11959 | performance problems; modern compilers know better.
|
|---|
| 11960 |
|
|---|
| 11961 | remove register keywords
|
|---|
| 11962 | * dfa.c (dfaexec): Modern compilers just ignore it.
|
|---|
| 11963 |
|
|---|
| 11964 | allow grep -Pz
|
|---|
| 11965 | * NEWS: Document grep -P improvements.
|
|---|
| 11966 | * src/search.c (Pcompile): Remove restriction on grep -Pz.
|
|---|
| 11967 | * tests/pcre-z: New.
|
|---|
| 11968 | * tests/Makefile.am (TESTS): Add pcre-z.
|
|---|
| 11969 |
|
|---|
| 11970 | fix cross-line matching in PCRE backend
|
|---|
| 11971 | * search.c (Pexecute): Split the buffer in lines and match each line
|
|---|
| 11972 | separately.
|
|---|
| 11973 | * tests/fedora.sh: Add regression testsuite.
|
|---|
| 11974 |
|
|---|
| 11975 | fix formatting of NEWS
|
|---|
| 11976 | * NEWS: Fix formatting of 2.6 entries.
|
|---|
| 11977 |
|
|---|
| 11978 | fix a bug in handling of -i and character type
|
|---|
| 11979 | * dfa.c (parse_bracket_exp_mb): Convert [[:lower:]] and [[:upper:]] to
|
|---|
| 11980 | [[:alpha:]] when folding case.
|
|---|
| 11981 | * tests/case-fold-char-type: New file. Test for the bug.
|
|---|
| 11982 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 11983 | * NEWS (Bug fixes): Mention it.
|
|---|
| 11984 |
|
|---|
| 11985 | fix previous test case change
|
|---|
| 11986 | * tests/case-fold-char-class: Do not reset fail to 0 after first test.
|
|---|
| 11987 |
|
|---|
| 11988 | 2010-03-06 Mike Frysinger <vapier@gentoo.org>
|
|---|
| 11989 |
|
|---|
| 11990 | grep(1) man page: touchup --label option
|
|---|
| 11991 | * doc/grep.1 (--label): Don't italicize ending period. Point to -H
|
|---|
| 11992 | option.
|
|---|
| 11993 |
|
|---|
| 11994 | 2010-03-06 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 11995 |
|
|---|
| 11996 | augment case-fold-char-class test case
|
|---|
| 11997 | * tests/case-fold-char-class: Test matching lowercase against uppercase
|
|---|
| 11998 | as well as vice versa.
|
|---|
| 11999 |
|
|---|
| 12000 | 2010-03-05 Reuben Thomas <rrt@sc3d.org>
|
|---|
| 12001 |
|
|---|
| 12002 | doc: improve the discussion of PCRE
|
|---|
| 12003 | * doc/grep.1: Add a sentence about Perl regular expressions,
|
|---|
| 12004 | and point to pcresyntax(3) and pcrepattern(3).
|
|---|
| 12005 | * doc/grep.texi: Likewise.
|
|---|
| 12006 |
|
|---|
| 12007 | 2010-03-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 12008 |
|
|---|
| 12009 | maint: dfa-sync: comment and dead-to-grep code: no semantic change
|
|---|
| 12010 | * src/dfa.c: Sync a comment and some #ifdef GAWK code.
|
|---|
| 12011 |
|
|---|
| 12012 | maint: dfa-sync: don't malloc zero
|
|---|
| 12013 | * src/dfa.c (dfacomp): Skip case_fold logic when length is zero.
|
|---|
| 12014 | This is probably "no semantic change", but does improve efficiency in
|
|---|
| 12015 | a degenerate case.
|
|---|
| 12016 |
|
|---|
| 12017 | maint: dfa-sync: use CALLOC rather than equiv. MALLOC+initialize-loop
|
|---|
| 12018 | * src/dfa.c (dfaanalyze): Sync from gawk. No semantic change.
|
|---|
| 12019 |
|
|---|
| 12020 | dfa.c: add support for \s and \S
|
|---|
| 12021 | * src/dfa.c (lex): Sync from gawk's dfa.c.
|
|---|
| 12022 |
|
|---|
| 12023 | maint: dfa-sync: add omitted array initializer
|
|---|
| 12024 | * src/dfa.c (prednames): Add a "0" to final initializer.
|
|---|
| 12025 | No semantic change.
|
|---|
| 12026 |
|
|---|
| 12027 | fix a bug in handling of -i and character classes
|
|---|
| 12028 | * dfa.c (parse_bracket_exp_mb): Sync one part of this function
|
|---|
| 12029 | from gawk's dfa.c, which was patched by Arnold D. Robbins.
|
|---|
| 12030 | * tests/case-fold-char-class: New file. Test for the bug.
|
|---|
| 12031 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 12032 | (TESTS_ENVIRONMENT): Propagate LOCALE_FR and LOCALE_FR_UTF8
|
|---|
| 12033 | definitions into tests.
|
|---|
| 12034 | * NEWS (Bug fixes): Mention it.
|
|---|
| 12035 |
|
|---|
| 12036 | 2010-03-05 Paolo Bonzini <pbonzini@redhat.com>
|
|---|
| 12037 |
|
|---|
| 12038 | Fedora Grep regression test suite
|
|---|
| 12039 | * tests/Makefile.am (TESTS): Add fedora.sh.
|
|---|
| 12040 | (CLEANFILES): Add several new files.
|
|---|
| 12041 | * tests/fedora.sh: New file, originally by Lubomir Rintel but somewhat
|
|---|
| 12042 | rewritten to avoid bashisms.
|
|---|
| 12043 |
|
|---|
| 12044 | 2010-03-05 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12045 |
|
|---|
| 12046 | convert AUTHORS file to UTF-8
|
|---|
| 12047 | * AUTHORS: Convert to UTF-8.
|
|---|
| 12048 |
|
|---|
| 12049 | eliminate invalid "ptr += (ptr2 - ptr1)"
|
|---|
| 12050 | * lib/savedir.c (savedir): new_name_space and name_space do not point into
|
|---|
| 12051 | the same object, so computing their difference is invalid. Similarly,
|
|---|
| 12052 | summing the difference to namep is invalid because namep and the result
|
|---|
| 12053 | point into different objects. Avoid this.
|
|---|
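
The pointer issue in the entry above can be sketched as follows. This is not the savedir code; `grow_keep_cursor` is a hypothetical helper. It assumes the usual fix: compute the cursor's offset while the cursor and base still point into the same object, then apply that offset to the new block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not from lib/savedir.c: grow *BUF to NEW_SIZE
   bytes and return the new location of the cursor P, which points
   into the old block.  On failure return NULL; *BUF is left untouched
   and still valid, so the caller can free it rather than leak it.  */
static char *
grow_keep_cursor (char **buf, char *p, size_t new_size)
{
  size_t offset;
  char *new_buf;

  assert (buf != NULL && *buf != NULL && p >= *buf);

  /* Valid: P and *BUF point into the same object.  */
  offset = (size_t) (p - *buf);

  new_buf = realloc (*buf, new_size);
  if (new_buf == NULL)
    return NULL;

  *buf = new_buf;
  /* Valid: the result points into the new object.  */
  return new_buf + offset;
}
```

Subtracting the old base from the new one, or adding that difference to a pointer into the old block, would combine pointers into different objects, which is undefined behavior in C.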
| 12054 |
|
|---|
| 12055 | fix for bug 21276
|
|---|
| 12056 | * lib/savedir.c (isdir1): Use realloc instead of calloc. Remove
|
|---|
| 12057 | dead code.
|
|---|
| 12058 | (savedir): Do not leak name_space if allocation of new_name_space fails.
|
|---|
| 12059 |
|
|---|
| 12060 | 2010-03-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 12061 |
|
|---|
| 12062 | tests: add a test based on an example from Paolo Bonzini
|
|---|
| 12063 | * tests/word-multi-file: New test.
|
|---|
| 12064 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 12065 |
|
|---|
| 12066 | doc: document release procedure
|
|---|
| 12067 | * README-release: New file.
|
|---|
| 12068 |
|
|---|
| 12069 | build: update gnulib submodule to latest
|
|---|
| 12070 |
|
|---|
| 12071 | 2010-02-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12072 |
|
|---|
| 12073 | add --group-separator=FOO and --no-group-separator
|
|---|
| 12074 | * src/grep.c (group_separator): New.
|
|---|
| 12075 | (long_options): Add --group-separator=FOO and --no-group-separator.
|
|---|
| 12076 | (prtext): Print group_separator instead of SEP_STR_GROUP. Optionally
|
|---|
| 12077 | suppress the separator altogether.
|
|---|
| 12078 | (main): Handle GROUP_SEPARATOR_OPTION.
|
|---|
| 12079 | * doc/grep.texi (Context control): Document it.
|
|---|
| 12080 | * NEWS: Mention it.
|
|---|
| 12081 | * tests/yesno.sh: Add testcases.
|
|---|
| 12082 |
|
|---|
| 12083 | 2010-02-21 Jim Meyering <meyering@redhat.com>
|
|---|
| 12084 |
|
|---|
| 12085 | tests: don't use "echo -n"
|
|---|
| 12086 | * tests/foad1.sh: Use printf, not echo -n. The latter is not portable.
|
|---|
| 12087 | Reported by Daniel Richman.
|
|---|
| 12088 |
|
|---|
| 12089 | 2010-02-08 Jim Meyering <meyering@redhat.com>
|
|---|
| 12090 |
|
|---|
| 12091 | remove useless DJGPP-specific code
|
|---|
| 12092 | * src/grep.c (grepfile): Remove now-useless DJGPP-specific code.
|
|---|
| 12093 | Now, all S_IS* macros are guaranteed to be defined via gnulib.
|
|---|
| 12094 |
|
|---|
| 12095 | 2010-02-07 Jim Meyering <meyering@redhat.com>
|
|---|
| 12096 |
|
|---|
| 12097 | tests: add help-version sanity tests from coreutils
|
|---|
| 12098 | * tests/help-version: New test, from coreutils.
|
|---|
| 12099 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 12100 | (TESTS_ENVIRONMENT) [built_programs]: Define it.
|
|---|
| 12101 |
|
|---|
| 12102 | tests: correct TESTS_ENVIRONMENT's PATH setting
|
|---|
| 12103 | * tests/Makefile.am (TESTS_ENVIRONMENT): Set PATH to start with
|
|---|
| 12104 | $(abs_top_builddir)/src, so that we test the programs we've just built.
|
|---|
| 12105 |
|
|---|
| 12106 | grep: use the correct exit status (2) upon write failure, not 1
|
|---|
| 12107 | * src/grep.c (main): Initialize exit_failure to EXIT_TROUBLE.
|
|---|
| 12108 | * NEWS (Bug fixes): Mention this fix.
|
|---|
| 12109 |
|
|---|
| 12110 | maint: enable the prohibit_magic_number_exit syntax check
|
|---|
| 12111 | * cfg.mk (local-checks-to-skip): Remove sc_prohibit_magic_number_exit,
|
|---|
| 12112 | to enable that check.
|
|---|
| 12113 | * src/system.h (EXIT_TROUBLE): Define.
|
|---|
| 12114 | * src/grep.c: Use symbolic names, EXIT_SUCCESS, EXIT_FAILURE, and
|
|---|
| 12115 | EXIT_TROUBLE, not 0, 1, 2.
|
|---|
| 12116 | * src/search.c: Likewise.
|
|---|
| 12117 | * src/vms_fab.c (string): Likewise.
|
|---|
| 12118 |
|
|---|
| 12119 | 2010-02-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 12120 |
|
|---|
| 12121 | doc: adjust NEWS item
|
|---|
| 12122 | * NEWS: Correct a description.
|
|---|
| 12123 |
|
|---|
| 12124 | 2010-02-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 12125 |
|
|---|
| 12126 | tests: exercise surprising -m1 vs. --context behavior
|
|---|
| 12127 | * tests/max-count-vs-context: New test. Exercise the surprising,
|
|---|
| 12128 | but documented, behavior reported by Markus Jochim in
|
|---|
| 12129 | http://savannah.gnu.org/bugs/?28588.
|
|---|
| 12130 | * tests/Makefile.am (TESTS): Add it.
|
|---|
| 12131 |
|
|---|
| 12132 | tests: use init.sh from gnulib
|
|---|
| 12133 | * tests/init.sh: New file, from gnulib.
|
|---|
| 12134 | * tests/Makefile.am (EXTRA_DIST): Add it.
|
|---|
| 12135 | (TESTS_ENVIRONMENT): Add variables and features.
|
|---|
| 12136 | (VERBOSE): Define.
|
|---|
| 12137 |
|
|---|
| 12138 | maint: remove unused Makefile rule
|
|---|
| 12139 | * tests/Makefile.am (dist-hook): Remove rule. No longer needed.
|
|---|
| 12140 |
|
|---|
| 12141 | maint: adjust formatting in tests/Makefile.am
|
|---|
| 12142 | * tests/Makefile.am (TESTS, CLEANFILES): Align and sort.
|
|---|
| 12143 |
|
|---|
| 12144 | build: avoid warnings in gnulib-supplied regex files
|
|---|
| 12145 | Now that we enable more warnings in lib/, we choose
|
|---|
| 12146 | to avoid some via patches applied by bootstrap, using
|
|---|
| 12147 | files in the gl/ hierarchy. Other, less-important
|
|---|
| 12148 | warnings are avoided simply by turning off the
|
|---|
| 12149 | -Wold-style-definition option and using a slightly
|
|---|
| 12150 | relaxed set of warnings $(GNULIB_WARN_CFLAGS) in lib/.
|
|---|
| 12151 | * gl/lib/regcomp.c.diff: Avoid warnings.
|
|---|
| 12152 | * gl/lib/regex_internal.c.diff: Likewise.
|
|---|
| 12153 | * gl/lib/regex_internal.h.diff: Likewise.
|
|---|
| 12154 | * gl/lib/regexec.c.diff: Likewise.
|
|---|
| 12155 | * configure.ac (GNULIB_PORTCHECK): Disable only -Wold-style-definition.
|
|---|
| 12156 | * lib/Makefile.am (AM_CFLAGS): Use $(GNULIB_WARN_CFLAGS) rather
|
|---|
| 12157 | than the slightly more strict $(WARN_CFLAGS).
|
|---|
| 12158 |
|
|---|
| 12159 | tests: adjust spencer #37 to pass with gnulib's regex code
|
|---|
| 12160 | * tests/spencer1.tests: Change #37 to expect an exit status of 2, not 1.
|
|---|
| 12161 | grep 'a[b-a]' reports "Invalid range end".
|
|---|
| 12162 |
|
|---|
| 12163 | maint: use regex from gnulib, rather than our bit-rotting one
|
|---|
| 12164 | * bootstrap.conf (gnulib_modules): Add regex.
|
|---|
| 12165 | * configure.ac: Don't use jm_INCLUDED_REGEX.
|
|---|
| 12166 | Update use of cache variable.
|
|---|
| 12167 | * lib/regex.c: Remove file.
|
|---|
| 12168 | * lib/regex.h: Likewise.
|
|---|
| 12169 | * m4/regex.m4: Likewise.
|
|---|
| 12170 | * POTFILES.in: Update to match.
|
|---|
| 12171 |
|
|---|
| 12172 | build: update gnulib submodule to latest
|
|---|
| 12173 |
|
|---|
| 12174 | 2010-01-28 Jim Meyering <meyering@redhat.com>
|
|---|
| 12175 |
|
|---|
| 12176 | maint: update to latest gnulib; adjust cfg.mk
|
|---|
| 12177 | * gnulib: Update submodule to latest.
|
|---|
| 12178 | * cfg.mk (old_NEWS_hash): Update to reflect NEWS Copyright line change.
|
|---|
| 12179 |
|
|---|
| 12180 | 2010-01-06 Jim Meyering <meyering@redhat.com>
|
|---|
| 12181 |
|
|---|
| 12182 | maint: avoid old jm_* macros
|
|---|
| 12183 | There were jm_* macros here, until very recently.
|
|---|
| 12184 | * cfg.mk (sc_prohibit_jm_in_m4): New rule, from coreutils.
|
|---|
| 12185 |
|
|---|
| 12186 | maint: remove decl.m4
|
|---|
| 12187 | * m4/decl.m4: Remove unused file.
|
|---|
| 12188 |
|
|---|
| 12189 | maint: rely on gnulib's new isdir.h
|
|---|
| 12190 | * src/grep.c: Include "isdir.h".
|
|---|
| 12191 | * src/system.h: Remove declaration of isdir.
|
|---|
| 12192 |
|
|---|
| 12193 | build: rename local to avoid shadowing global, dfa
|
|---|
| 12194 | * src/dfa.c (dfamust): Rename parameter: s/dfa/d/.
|
|---|
| 12195 |
|
|---|
| 12196 | build: avoid warning from -Wmissing-prototypes
|
|---|
| 12197 | * src/dfa.c (match_mb_charset): Declare to be static.
|
|---|
| 12198 |
|
|---|
| 12199 | build: avoid shadowing warning for "link"
|
|---|
| 12200 | * src/kwset.c (link): Define to kwset_link, to avoid shadowing
|
|---|
| 12201 | the function.
|
|---|
| 12202 |
|
|---|
| 12203 | build: avoid shadowing warning for unused "rs"
|
|---|
| 12204 | * src/dfa.c (transit_state): Remove dead stores;
|
|---|
| 12205 | move a declaration "down".
|
|---|
| 12206 | Ignore transit_state_consume_1char return value.
|
|---|
| 12207 |
|
|---|
| 12208 | build: avoid shadowing warnings
|
|---|
| 12209 | * src/dfa.c (match_mb_charset): Rename parameter: s/index/idx/.
|
|---|
| 12210 | (check_matching_with_multibyte_ops, match_anychar): Likewise.
|
|---|
| 12211 |
|
|---|
| 12212 | build: avoid warning about unused definition of N_
|
|---|
| 12213 | * src/dfa.c (N_): Remove unused definition.
|
|---|
| 12214 |
|
|---|
| 12215 | build: avoid format-string warnings
|
|---|
| 12216 | * src/search.c (dfaerror): Use literal "%s" as format string.
|
|---|
| 12217 | (kwsmusts, GEAcompile): Likewise.
|
|---|
| 12218 | (Pcompile): Likewise.
|
|---|
| 12219 |
|
|---|
| 12220 | build: add configure-time --enable-gcc-warnings option; avoid warnings
|
|---|
| 12221 | * bootstrap.conf (gnulib_modules): Add "manywarnings" module.
|
|---|
| 12222 | * configure.ac: Add --enable-gcc-warnings, derived from code in bison.
|
|---|
| 12223 | * src/Makefile.am (AM_CFLAGS): Set to $(WARN_CFLAGS) $(WERROR_CFLAGS).
|
|---|
| 12224 | * lib/Makefile.am (AM_CFLAGS): Likewise, but append.
|
|---|
| 12225 |
|
|---|
| 12226 | build: remove now-useless -I../intl option
|
|---|
| 12227 | * src/Makefile.am (INCLUDES): Remove -I../intl, now that intl is gone.
|
|---|
| 12228 |
|
|---|
| 12229 | maint: avoid more warnings
|
|---|
| 12230 | * src/grep.c (MAX): Remove definition of unused macro.
|
|---|
| 12231 | (usage): Declare with __attribute__ ((noreturn)).
|
|---|
| 12232 | Split long strings into chunks of length < 509.
|
|---|
| 12233 |
|
|---|
| 12234 | fix a possible bug: remove errant semicolon
|
|---|
| 12235 | * src/grep.c (prline): Remove erroneous semicolon-after-if-expr.
|
|---|
| 12236 |
|
|---|
| 12237 | maint: avoid compilation warnings
|
|---|
| 12238 | * bootstrap.conf (gnulib_modules): Add ignore-value.
|
|---|
| 12239 | * src/search.c (check_multibyte_string_no_icase): A variant of
|
|---|
| 12240 | check_multibyte_string that does *not* convert case, and hence
|
|---|
| 12241 | does not modify its BUF parameter.
|
|---|
| 12242 | (check_multibyte_string): Use xcalloc in place of xmalloc+memset.
|
|---|
| 12243 | Use ignore_value to ignore the return value from wcrtomb. This is
|
|---|
| 12244 | ok, since we know the input is a valid upper case wide character.
|
|---|
| 12245 | (Fexecute, EGexecute): Update callers of check_multibyte_string
|
|---|
| 12246 | to use both it and check_multibyte_string_no_icase.
|
|---|
| 12247 |
|
|---|
| 12248 | maint: avoid warnings about unused fwrite return value
|
|---|
| 12249 | * bootstrap.conf (gnulib_modules): Add unlocked-io.
|
|---|
| 12250 | * src/system.h: Include "unlocked-io.h".
|
|---|
| 12251 |
|
|---|
| 12252 | maint: remove {m4,lib}/.gitignore; they were undergoing too much churn
|
|---|
| 12253 | * .gitignore: Ignore all of m4/* except m4/djgpp.m4
|
|---|
| 12254 | and all of lib/* except Makefile.am, savedir.c and savedir.h.
|
|---|
| 12255 | * m4/.gitignore: Remove file.
|
|---|
| 12256 | * lib/.gitignore: Remove file.
|
|---|
| 12257 |
|
|---|
| 12258 | 2010-01-05 Jim Meyering <meyering@redhat.com>
|
|---|
| 12259 |
|
|---|
| 12260 | build: run gnulib's tests, too
|
|---|
| 12261 | * Makefile.am (SUBDIRS): Add gnulib-tests.
|
|---|
| 12262 | * gnulib-tests/Makefile.am: New file.
|
|---|
| 12263 | * bootstrap.conf (bootstrap_epilogue): New function, from coreutils.
|
|---|
| 12264 | (gnulib_tool_option_extras): Define.
|
|---|
| 12265 | * configure.ac: Add gnulib-tests/Makefile.
|
|---|
| 12266 |
|
|---|
| 12267 | 2010-01-03 Jim Meyering <meyering@redhat.com>
|
|---|
| 12268 |
|
|---|
| 12269 | maint: record update-copyright options for this package
|
|---|
| 12270 | * cfg.mk: Next time, just run "make update-copyright".
|
|---|
| 12271 |
|
|---|
| 12272 | 2010-01-01 Jim Meyering <meyering@redhat.com>
|
|---|
| 12273 |
|
|---|
| 12274 | maint: update all FSF copyright year lists to include 2010
|
|---|
| 12275 | Use this command:
|
|---|
| 12276 | git ls-files |grep -vE '^(\..*|COPYING|gnulib)$' |xargs \
|
|---|
| 12277 | env UPDATE_COPYRIGHT_USE_INTERVALS=1 build-aux/update-copyright
|
|---|
| 12278 |
|
|---|
| 12279 | 2009-12-23 Jim Meyering <meyering@redhat.com>
|
|---|
| 12280 |
|
|---|
| 12281 | fix multi-byte-locale read-beyond-end-of-buffer error
|
|---|
| 12282 | Avoid read-beyond-end-of-buffer errors, evoked by running this:
|
|---|
| 12283 | LC_ALL=en_US.UTF-8 valgrind src/grep -f <(printf 'a\nb\n') <(echo c)
|
|---|
| 12284 |
|
|---|
| 12285 | Conditional jump or move depends on uninitialised value(s)
|
|---|
| 12286 | at 0x78136D: __gconv_transform_utf8_internal (in /lib/libc-2.11.so)
|
|---|
| 12287 | by 0x7E7232: mbrtowc (in /lib/libc-2.11.so)
|
|---|
| 12288 | by 0x8055773: dfaexec (dfa.c:2816)
|
|---|
| 12289 | by 0x804D7B0: EGexecute (search.c:353)
|
|---|
| 12290 | by 0x804ACD8: grepbuf (grep.c:1036)
|
|---|
| 12291 | by 0x804B023: grep (grep.c:1156)
|
|---|
| 12292 | by 0x804B460: grepfile (grep.c:1287)
|
|---|
| 12293 | by 0x804CF0D: main (grep.c:2282)
|
|---|
| 12294 |
|
|---|
| 12295 | Conditional jump or move depends on uninitialised value(s)
|
|---|
| 12296 | at 0x7E7248: mbrtowc (in /lib/libc-2.11.so)
|
|---|
| 12297 | by 0x8055773: dfaexec (dfa.c:2816)
|
|---|
| 12298 | by 0x804D7B0: EGexecute (search.c:353)
|
|---|
| 12299 | by 0x804ACD8: grepbuf (grep.c:1036)
|
|---|
| 12300 | by 0x804B023: grep (grep.c:1156)
|
|---|
| 12301 | by 0x804B460: grepfile (grep.c:1287)
|
|---|
| 12302 | by 0x804CF0D: main (grep.c:2282)
|
|---|
| 12303 |
|
|---|
| 12304 | * src/dfa.c (dfaexec) [MBS_SUPPORT]: Do not access one byte beyond
|
|---|
| 12305 | end of buffer.
|
|---|
| 12306 |
|
|---|
| 12307 | 2009-12-23 Jim Meyering <meyering@redhat.com>
|
|---|
| 12308 |
|
|---|
| 12309 | build: update gnulib submodule to latest
|
|---|
| 12310 |
|
|---|
| 12311 | 2009-12-23 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12312 |
|
|---|
| 12313 | Speed up insert.
|
|---|
| 12314 | Suggested by Johan Walles <johan.walles@gmail.com> (bug 23354).
|
|---|
| 12315 |
|
|---|
| 12316 | * src/dfa.c (insert): Use binary search.
|
|---|
| 12317 |
|
|---|
| 12318 | 2009-12-23 Johan Walles <johan.walles@gmail.com>
|
|---|
| 12319 |
|
|---|
| 12320 | Decrease epsclosure memory usage
|
|---|
| 12321 | Fixes bug 23321.
|
|---|
| 12322 |
|
|---|
| 12323 | * src/dfa.c (epsclosure): Make visited an array of char.
|
|---|
| 12324 |
|
|---|
| 12325 | 2009-12-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12326 |
|
|---|
| 12327 | Make 'grep -1 -2' and 'grep -1v2' equivalent to 'grep -2'
|
|---|
| 12328 | Fixes bug 12128.
|
|---|
| 12329 |
|
|---|
| 12330 | * src/grep.c (get_nondigit_option): Reset the buffer every time
|
|---|
| 12331 | a non-digit option is found or a new argument is started.
|
|---|
| 12332 |
|
|---|
| 12333 | 2009-12-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12334 |
|
|---|
| 12335 | Improve description of --label
|
|---|
| 12336 | Fixes bug 22681.
|
|---|
| 12337 |
|
|---|
| 12338 | * doc/grep.1 (--label): Use -H in the example, improve wording.
|
|---|
| 12339 | * doc/grep.texi (Output Line Prefix Control): Likewise.
|
|---|
| 12340 |
|
|---|
| 12341 | 2009-12-22 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12342 |
|
|---|
| 12343 | Avoid using an invalid memchr result.
|
|---|
| 12344 | Related to bug 13161. I cannot find a testcase, but it is better to be
|
|---|
| 12345 | defensive considering that such bugs were found in the past.
|
|---|
| 12346 |
|
|---|
| 12347 | * src/search.c (EGexecute, Fexecute): Check for memchr return values.
|
|---|
| 12348 |
|
|---|
| 12349 | 2009-12-11 Jim Meyering <meyering@redhat.com>
|
|---|
| 12350 |
|
|---|
| 12351 | build: update gnulib submodule to latest
|
|---|
| 12352 |
|
|---|
| 12353 | 2009-12-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 12354 |
|
|---|
| 12355 | maint: enable prohibit_have_config_h check
|
|---|
| 12356 | * cfg.mk (local-checks-to-skip): Enable sc_prohibit_have_config_h.
|
|---|
| 12357 | * lib/regex.c: Remove useless cpp test of HAVE_CONFIG_H.
|
|---|
| 12358 | * lib/savedir.c: Likewise.
|
|---|
| 12359 | * src/grep.c: Likewise.
|
|---|
| 12360 | * src/kwset.c: Likewise.
|
|---|
| 12361 | * src/search.c: Likewise.
|
|---|
| 12362 |
|
|---|
| 12363 | maint: enable cast_of_x_alloc_return_value check
|
|---|
| 12364 | * cfg.mk (local-checks-to-skip): Enable sc_cast_of_x_alloc_return_value.
|
|---|
| 12365 | * .x-sc_cast_of_x_alloc_return_value:
|
|---|
| 12366 | * src/dfa.c (CALLOC, MALLOC, REALLOC): Remove casts.
|
|---|
| 12367 | * src/dosbuf.c (undossify_input): Likewise.
|
|---|
| 12368 | * src/grep.c (print_line_middle, prepend_default_options): Likewise.
|
|---|
| 12369 |
|
|---|
| 12370 | maint: enable cast_of_alloca_return_value check
|
|---|
| 12371 | * cfg.mk (local-checks-to-skip): Enable sc_cast_of_alloca_return_value.
|
|---|
| 12372 | * .x-sc_cast_of_alloca_return_value: New file.
|
|---|
| 12373 |
|
|---|
| 12374 | 2009-12-04 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12375 |
|
|---|
| 12376 | fix "grep -Ff" on CRLF-terminated files
|
|---|
| 12377 | * src/search.c (Fcompile) [HAVE_DOS_FILE_CONTENTS]: Recognize \r\n as
|
|---|
| 12378 | a line terminator.
|
|---|
| 12379 |
|
|---|
| 12380 | fix compilation with included regex
|
|---|
| 12381 | * Makefile.am (libgreputils_a_DEPENDENCIES): New.
|
|---|
| 12382 |
|
|---|
| 12383 | switch to pkg-config for PCRE detection
|
|---|
| 12384 | * configure.ac: Use pkg-config to detect PCRE.
|
|---|
| 12385 | * src/Makefile.am (grep_LDADD): Link grep with PCRE_LIBS.
|
|---|
| 12386 |
|
|---|
| 12387 | 2009-12-04 Jim Meyering <meyering@redhat.com>
|
|---|
| 12388 |
|
|---|
| 12389 | maint: remove "missing" script
|
|---|
| 12390 | * missing: Remove now-unused file.
|
|---|
| 12391 |
|
|---|
| 12392 | maint: make .gitignore ignore more
|
|---|
| 12393 | * .gitignore: Ignore more.
|
|---|
| 12394 |
|
|---|
| 12395 | maint: enable useless-if-before-free check
|
|---|
| 12396 | * cfg.mk (local-checks-to-skip): Enable sc_avoid_if_before_free.
|
|---|
| 12397 | * .x-sc_avoid_if_before_free: New file. Exempt regex.c and dfa.c,
|
|---|
| 12398 | in case anyone ever tries to merge their contents with other versions.
|
|---|
| 12399 | * src/grep.c (print_line_middle, grepdir): Remove useless if-before-free.
|
|---|
| 12400 | * src/search.c (IF_BK, EXECUTE_FCT): Likewise.
|
|---|
| 12401 |
|
|---|
| 12402 | maint: enable po-check
|
|---|
| 12403 | * cfg.mk (local-checks-to-skip): Enable sc_po_check.
|
|---|
| 12404 | * po/POTFILES.in: Sort and update.
|
|---|
| 12405 |
|
|---|
| 12406 | 2009-12-03 Paolo Bonzini <bonzini@gnu.org>
|
|---|
| 12407 |
|
|---|
| 12408 | update gnulib, fixing missing inclusion of stdbool.h
|
|---|
| 12409 | * gnulib: Update.
|
|---|
| 12410 |
|
|---|
| 12411 | 2009-11-30 Jim Meyering <meyering@redhat.com>
|
|---|
| 12412 |
|
|---|
| 12413 | maint: enable two checks
|
|---|
| 12414 | * cfg.mk (local-checks-to-skip): Enable two:
|
|---|
| 12415 | sc_prohibit_xalloc_without_use sc_two_space_separator_in_usage
|
|---|
| 12416 | * src/grep.c (usage): Conform: use two spaces, not 1.
|
|---|
| 12417 | * src/kwset.c (malloc): Define as a function-macro so that the
|
|---|
| 12418 | syntax-check rule sees that we are indeed using xmalloc here.
|
|---|
| 12419 |
|
|---|
| 12420 | maint: enable makefile_path_separator check
|
|---|
| 12421 | * cfg.mk (local-checks-to-skip): Enable sc_makefile_path_separator_check,
|
|---|
| 12422 | now that the sole offender, an old po/Makefile.in.in, is gone.
|
|---|
| 12423 |
|
|---|
| 12424 | maint: remove now-generated file: po/Makefile.in.in
|
|---|
| 12425 | * po/Makefile.in.in: Remove file, now generated via bootstrap.
|
|---|
| 12426 |
|
|---|
| 12427 | maint: enable makefile @...@ check
|
|---|
| 12428 | * cfg.mk (local-checks-to-skip): Enable sc_makefile_check.
|
|---|
| 12429 | * lib/Makefile.am (libgreputils_a_LIBADD): Use $(...), rather than
|
|---|
| 12430 | anachronistic @...@ notation.
|
|---|
| 12431 | * src/Makefile.am (LDADD): Likewise.
|
|---|
| 12432 | * tests/Makefile.am (AWK): Remove definition.
|
|---|
| 12433 |
|
|---|
| 12434 | maint: enable trailing_blank check
|
|---|
| 12435 | * cfg.mk (local-checks-to-skip): Enable sc_trailing_blank.
|
|---|
| 12436 | * AUTHORS: Remove trailing blanks.
|
|---|
| 12437 | * COPYING: Likewise.
|
|---|
| 12438 | * README: Likewise.
|
|---|
| 12439 | * README-alpha: Likewise.
|
|---|
| 12440 | * README-boot: Likewise.
|
|---|
| 12441 | * THANKS: Likewise.
|
|---|
| 12442 | * TODO: Likewise.
|
|---|
| 12443 | * src/dfa.c: Likewise.
|
|---|
| 12444 | * src/mbsupport.h: Likewise.
|
|---|
| 12445 | * tests/backref.sh: Likewise.
|
|---|
| 12446 | * tests/file.sh: Likewise.
|
|---|
| 12447 | * tests/options.sh: Likewise.
|
|---|
| 12448 | * tests/tests: Likewise.
|
|---|
| 12449 | * vms/README: Likewise.
|
|---|
| 12450 | * vms/make.com: Likewise.
|
|---|
| 12451 |
|
|---|
| 12452 | maint: enable unmarked_diagnostics check
|
|---|
| 12453 | * cfg.mk (local-checks-to-skip): Enable sc_unmarked_diagnostics.
|
|---|
| 12454 | * src/grep.c (fillbuf): Mark a diagnostic for translation.
|
|---|
| 12455 | (reset): Likewise.
|
|---|
| 12456 |
|
|---|
| 12457 | maint: enable require_config_h checks
|
|---|
| 12458 | * cfg.mk (local-checks-to-skip): Enable sc_require_config_h
|
|---|
| 12459 | and sc_require_config_h_first.
|
|---|
| 12460 | * src/dosbuf.c: Include <config.h>.
|
|---|
| 12461 | * src/vms_fab.c: Likewise.
|
|---|
| 12462 | * .x-sc_require_config_h: New file: list the exceptions.
|
|---|
| 12463 | * .x-sc_require_config_h_first: Likewise.
|
|---|
| 12464 |
|
|---|
| 12465 | maint: use gnulib's progname module; enable set_program_name check
|
|---|
| 12466 | * bootstrap.conf (gnulib_modules): Add progname.
|
|---|
| 12467 | * src/grep.c: Include "progname.h".
|
|---|
| 12468 | (program_name): Remove declaration.
|
|---|
| 12469 | (main): Call set_program_name.
|
|---|
| 12470 | * cfg.mk (local-checks-to-skip): Add sc_program_name.
|
|---|
| 12471 |
|
|---|
| 12472 | maint: enable "file system" check
|
|---|
| 12473 | * cfg.mk (local-checks-to-skip): Enable sc_file_system.
|
|---|
| 12474 | * lib/savedir.c (savedir): Tweak spelling. Remove trailing blanks.
|
|---|
| 12475 |
|
|---|
| 12476 | maint: enable immutable_NEWS check
|
|---|
| 12477 | * NEWS: Move copyright to the bottom.
|
|---|
| 12478 | Use the format required by release-related tools.
|
|---|
| 12479 | * .prev-version: New file.
|
|---|
| 12480 | * cfg.mk (old_NEWS_hash): Define.
|
|---|
| 12481 | (local-checks-to-skip): Enable check: sc_immutable_NEWS.
|
|---|
| 12482 |
|
|---|
| 12483 | maint: disable the many failing syntax-checks
|
|---|
| 12484 | * cfg.mk: New file.
|
|---|
| 12485 | (local-checks-to-skip): Define to the list of disabled rules.
|
|---|
| 12486 | Subsequent change-sets will enable them, one by one.
|
|---|
| 12487 |
|
|---|
| 12488 | build: require automake-1.11, enable silent-rules, parallel tests, xz
|
|---|
| 12489 | * configure.ac (AM_INIT_AUTOMAKE): Create xz-compressed tarballs,
|
|---|
| 12490 | not bzip2-compressed ones. Enable automake's silent-rules,
|
|---|
| 12491 | parallel tests, and test PASS/FAIL coloring options.
|
|---|
| 12492 | Use AC_CONFIG_HEADERS, not AM_CONFIG_HEADER. Quote the argument.
|
|---|
| 12493 |
|
|---|
| 12494 | build: use git-version-gen for inter-release version strings
|
|---|
| 12495 | * configure.ac (AC_INIT): Use git-version-gen.
|
|---|
| 12496 |
|
|---|
| 12497 | build: add several build- and release-related gnulib modules
|
|---|
| 12498 | * bootstrap.conf (gnulib_modules): Add announce-gen update-copyright
|
|---|
| 12499 | do-release-commit-and-tag git-version-gen gnu-web-doc-update
|
|---|
| 12500 | gnupload maintainer-makefile useless-if-before-free
|
|---|
| 12501 |
|
|---|
| 12502 | build: adapt to the newer closeout module from gnulib
|
|---|
| 12503 | * src/grep.c: Include "exitfail.h".
|
|---|
| 12504 | (main) [-q]: Set the global variable, exit_failure, rather than
|
|---|
| 12505 | calling the now-removed close_stdout_set_file_name function.
|
|---|
| 12506 |
|
|---|
| 12507 | build: adapt to the newer exclude API we now get from gnulib
|
|---|
| 12508 | * src/grep.c (main): Adapt to newer exclude.c: add EXCLUDE_WILDCARDS as
|
|---|
| 12509 | the new "option" argument in calls to add_exclude and add_exclude_file.
|
|---|
| 12510 |
|
|---|
| 12511 | build: get more lib/* files from gnulib, adjust savedir
|
|---|
| 12512 | * bootstrap.conf (gnulib_modules): Add the following:
|
|---|
| 12513 | closeout exclude hard-locale isdir strtoumax.
|
|---|
| 12514 | * lib/.gitignore, m4/.gitignore: Update.
|
|---|
| 12515 | * lib/closeout.c, lib/closeout.h: Remove.
|
|---|
| 12516 | * lib/exclude.c, lib/exclude.h: Remove.
|
|---|
| 12517 | * lib/hard-locale.c, lib/hard-locale.h: Remove.
|
|---|
| 12518 | * lib/strtoumax.c: Remove.
|
|---|
| 12519 | * lib/isdir.c: Remove.
|
|---|
| 12520 | * lib/Makefile.am: Remove here, too.
|
|---|
| 12521 | * lib/savedir.c: Adapt to new exclude module:
|
|---|
| 12522 | s/excluded_filename/excluded_file_name/ and remove 3rd argument.
|
|---|
| 12523 |
|
|---|
| 12524 | build: update gnulib submodule to latest
|
|---|
| 12525 |
|
|---|
| 12526 | maint: generate ChangeLog from git logs
|
|---|
| 12527 | * Makefile.am (dist-hook, gen-ChangeLog): New rules.
|
|---|
| 12528 | * bootstrap.conf (gnulib_modules): Add gitlog-to-changelog.
|
|---|
| 12529 | Ensure that ChangeLog exists.
|
|---|
| 12530 | * ChangeLog-2009: Rename from ChangeLog.
|
|---|
| 12531 | * ChangeLog: Remove file.
|
|---|
| 12532 | * .gitignore: Add ChangeLog.
|
|---|
| 12533 |
|
|---|
| 12534 | maint: list gnulib modules one per line
|
|---|
| 12535 | * bootstrap.conf (gnulib_modules): List them one per line.
|
|---|
| 12536 |
|
|---|
| 12537 | 2009-11-29 Tony Abou-Assaleh <taa@acm.org>
|
|---|
| 12538 |
|
|---|
| 12539 | Acknowledge new maintainers, update README-alpha
|
|---|
| 12540 | * AUTHORS: Add new maintainers.
|
|---|
| 12541 | * THANKS: Likewise.
|
|---|
| 12542 | * README-alpha: Change CVS references to Git.
|
|---|