webdump, branch HEAD
HTML to plain-text converter for webpages
278d829beb658d1eb18dba03c804d4a949e7bd43
2026-03-13T11:52:07Z
2026-03-13T11:52:07Z
typo: url -> URL
Hiltjo Posthuma
hiltjo@codemadness.org
commit 278d829beb658d1eb18dba03c804d4a949e7bd43
parent 180713b09a920e1e8a8e26fba3a966aaf0a8bc98
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 13 Mar 2026 12:52:07 +0100
typo: url -> URL
180713b09a920e1e8a8e26fba3a966aaf0a8bc98
2026-03-11T18:04:20Z
2026-03-11T18:04:20Z
bump version to 0.2
Hiltjo Posthuma
hiltjo@codemadness.org
commit 180713b09a920e1e8a8e26fba3a966aaf0a8bc98
parent 5b615707234222a1ecb7c7c637c629d4f45116ff
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 11 Mar 2026 19:04:20 +0100
bump version to 0.2
5b615707234222a1ecb7c7c637c629d4f45116ff
2026-03-10T23:59:58Z
2026-03-11T00:02:18Z
fix: reassign internal buffer on realloc
Hiltjo Posthuma
hiltjo@codemadness.org
commit 5b615707234222a1ecb7c7c637c629d4f45116ff
parent c455f534c9eaf07f1c1304e74edc245fba57af46
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 11 Mar 2026 00:59:58 +0100
fix: reassign internal buffer on realloc
This could point to an old buffer in a different memory region.
(realloc does not necessarily grow the allocation in place; it may move it to a different memory area).
Found by Clang ASAN
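The class of bug described above can be sketched as follows. This is a minimal illustration, not webdump's actual code: the `struct buf` type and the `buf_append` function are hypothetical names. The point is that any pointer into the old block has to be rebuilt from an offset after `realloc`, because the block may have moved.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not webdump's real data structure. */
struct buf {
	char *data;  /* start of the allocation */
	char *cur;   /* write position inside data */
	size_t cap;  /* allocated size in bytes */
};

/* Append len bytes, growing the buffer when needed.
 * Returns 0 on success, -1 on allocation failure. */
static int
buf_append(struct buf *b, const char *s, size_t len)
{
	size_t used = b->data ? (size_t)(b->cur - b->data) : 0;
	size_t newcap;
	char *p;

	if (len == 0)
		return 0;
	if (used + len > b->cap) {
		newcap = (used + len) * 2;
		p = realloc(b->data, newcap);
		if (p == NULL)
			return -1; /* the old block is still valid here */
		b->data = p;
		b->cap = newcap;
		/* realloc may have moved the block: rebuild the interior
		 * pointer from the saved offset instead of reusing it */
		b->cur = b->data + used;
	}
	memcpy(b->cur, s, len);
	b->cur += len;
	return 0;
}
```

Starting from a zero-initialized `struct buf`, repeated calls to `buf_append` keep `cur` valid even when the allocation moves.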
c455f534c9eaf07f1c1304e74edc245fba57af46
2026-03-10T23:16:11Z
2026-03-10T23:16:11Z
fix: Value stored to 'datalen' is never read
Hiltjo Posthuma
hiltjo@codemadness.org
commit c455f534c9eaf07f1c1304e74edc245fba57af46
parent a26127c95ac65dc8cc5e1a99884923aa2ed81a04
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 11 Mar 2026 00:16:11 +0100
fix: Value stored to 'datalen' is never read
Found by clang-analyzer and cppcheck
a26127c95ac65dc8cc5e1a99884923aa2ed81a04
2026-03-10T23:07:46Z
2026-03-10T23:07:46Z
code-style: remove temporary variable, no need to initialize it
Hiltjo Posthuma
hiltjo@codemadness.org
commit a26127c95ac65dc8cc5e1a99884923aa2ed81a04
parent 83f319cce4257402c42efb6c7f784136ed19d528
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 11 Mar 2026 00:07:46 +0100
code-style: remove temporary variable, no need to initialize it
Removes the "Unused code: Dead assignment" warning from clang-analyzer.
83f319cce4257402c42efb6c7f784136ed19d528
2026-03-10T17:38:06Z
2026-03-10T17:40:24Z
bump LICENSE year
Hiltjo Posthuma
hiltjo@codemadness.org
commit 83f319cce4257402c42efb6c7f784136ed19d528
parent b1bbbf832f4d58fa11002a4deed5bfdbcc6d36a0
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 10 Mar 2026 18:38:06 +0100
bump LICENSE year
b1bbbf832f4d58fa11002a4deed5bfdbcc6d36a0
2026-03-10T17:37:36Z
2026-03-10T17:40:24Z
small code-style fix
Hiltjo Posthuma
hiltjo@codemadness.org
commit b1bbbf832f4d58fa11002a4deed5bfdbcc6d36a0
parent 0d7d55bb8633594c343e92e3a77d34b2b4249374
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 10 Mar 2026 18:37:36 +0100
small code-style fix
0d7d55bb8633594c343e92e3a77d34b2b4249374
2026-03-09T19:17:36Z
2026-03-10T17:38:43Z
fix: add null check for parent pointer in option handling (false positive)
andrew
sourcehut@lewman.us
commit 0d7d55bb8633594c343e92e3a77d34b2b4249374
parent bc810c876a5d5de1e71796e5579b0c966ca092fd
Author: andrew <sourcehut@lewman.us>
Date: Mon, 9 Mar 2026 12:17:36 -0700
fix: add null check for parent pointer in option handling (false positive)
Prevent null pointer dereference when an <option> tag appears at root
level without a parent <select> element. The code assumed parent would
always exist when processing DisplayOption tags, but malformed or
invalid HTML could have orphan <option> tags at the root level.
Found by Clang static analyzer (core.NullDereference warning).
Added note: this cannot be triggered, because curnode cannot be 0 here and
parent is set (non-NULL). But it should be checked nonetheless.
bc810c876a5d5de1e71796e5579b0c966ca092fd
2025-12-11T19:52:22Z
2025-12-11T19:53:07Z
sync xml.c changes: parse numeric entities more strictly
Hiltjo Posthuma
hiltjo@codemadness.org
commit bc810c876a5d5de1e71796e5579b0c966ca092fd
parent 31fac2476f06b72f3d8bc7ac654cfde4e8452525
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 11 Dec 2025 20:52:22 +0100
sync xml.c changes: parse numeric entities more strictly
31fac2476f06b72f3d8bc7ac654cfde4e8452525
2025-09-21T12:21:30Z
2025-09-21T12:21:30Z
slightly reduce stack size for entities
Hiltjo Posthuma
hiltjo@codemadness.org
commit 31fac2476f06b72f3d8bc7ac654cfde4e8452525
parent 3d6afd123b27f8bbd2544071047ee3d0cce4c8c1
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sun, 21 Sep 2025 14:21:30 +0200
slightly reduce stack size for entities
... rename n to len (consistency).
3d6afd123b27f8bbd2544071047ee3d0cce4c8c1
2025-04-25T09:46:32Z
2025-04-25T09:46:32Z
bump LICENSE year
Hiltjo Posthuma
hiltjo@codemadness.org
commit 3d6afd123b27f8bbd2544071047ee3d0cce4c8c1
parent 5cde25b5150bd0375e9b5800bf3855765830c588
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 25 Apr 2025 11:46:32 +0200
bump LICENSE year
... and tag 0.1
5cde25b5150bd0375e9b5800bf3855765830c588
2024-07-06T11:05:54Z
2024-07-06T11:05:54Z
improve memory usage and allocation
Hiltjo Posthuma
hiltjo@codemadness.org
commit 5cde25b5150bd0375e9b5800bf3855765830c588
parent 72b23084b7c64c298c6b90ae6ad9f53f497cec57
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sat, 6 Jul 2024 13:05:54 +0200
improve memory usage and allocation
Do not realloc when it is not needed (even when it is the same size).
Decrease the greedy allocation increment size for nested nodes also.
Tested, for example, using valgrind and the "add Beej's Guide to Network Programming" HTML page:
https://git.codemadness.org/webdump_tests/commit/837749abc02f28e1584e5f2cf2b274ae1c69d8e6.html
The buffering for link references (-l option) used way too much memory.
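The allocation strategy described above might look roughly like this. It is a sketch under assumptions: `struct nodes`, `nodes_reserve` and the `GROW_STEP` value are hypothetical, and the commit does not state the real increment. The function skips `realloc` entirely when the capacity already suffices and otherwise grows by a small fixed step.

```c
#include <stdint.h>
#include <stdlib.h>

#define GROW_STEP 16 /* assumed increment; the real value is not given */

/* Hypothetical array of nested nodes. */
struct nodes {
	int *items;
	size_t cap; /* number of items allocated */
};

/* Make sure at least `need` items fit.
 * Returns 0 on success, -1 on overflow or allocation failure. */
static int
nodes_reserve(struct nodes *n, size_t need)
{
	size_t newcap;
	int *p;

	if (need <= n->cap)
		return 0; /* enough room: do not call realloc at all */
	if (need > SIZE_MAX / sizeof(*p) - GROW_STEP)
		return -1; /* size computation would overflow */
	newcap = need + GROW_STEP; /* small step, not doubling */
	p = realloc(n->items, newcap * sizeof(*p));
	if (p == NULL)
		return -1;
	n->items = p;
	n->cap = newcap;
	return 0;
}
```

A small fixed step trades a few more `realloc` calls for a lower peak memory footprint, which matches the intent stated in the commit.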
72b23084b7c64c298c6b90ae6ad9f53f497cec57
2024-06-29T16:29:21Z
2024-06-29T16:29:21Z
improve parsing whitespace after end tag names
Hiltjo Posthuma
hiltjo@codemadness.org
commit 72b23084b7c64c298c6b90ae6ad9f53f497cec57
parent a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sat, 29 Jun 2024 18:29:21 +0200
improve parsing whitespace after end tag names
Real site example:
https://www.gnupg.org/gph/en/manual.html
Has HTML such as:
<P
CLASS="COPYRIGHT"
>Copyright © 1999 by <SPAN
CLASS="HOLDER"
>The Free Software Foundation</SPAN
></P
>
...
This incorrectly showed ">" in the end tag as data.
Reported by Jason Hood, thanks!
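A minimal sketch of the fix's idea, assuming a simple string-based parser (webdump's real parser is a streaming XML parser, and `parse_endtag` is a hypothetical name): after reading the end tag name, skip any whitespace, including newlines, before expecting the closing ">".

```c
#include <ctype.h>
#include <stddef.h>

/* Parse an end tag that starts at "</". The tag name is copied into
 * name (truncated to fit). Returns a pointer just past the closing '>'
 * or NULL if the input is not a complete end tag. */
static const char *
parse_endtag(const char *s, char *name, size_t namesiz)
{
	size_t i = 0;

	if (namesiz == 0 || s[0] != '<' || s[1] != '/')
		return NULL;
	s += 2;
	/* the name ends at whitespace or '>' */
	while (*s && *s != '>' && !isspace((unsigned char)*s)) {
		if (i + 1 < namesiz)
			name[i++] = *s;
		s++;
	}
	name[i] = '\0';
	/* whitespace between the name and '>' is not character data */
	while (isspace((unsigned char)*s))
		s++;
	return (*s == '>') ? s + 1 : NULL;
}
```

With the input `"</SPAN\n>The"`, the name is `SPAN` and the returned pointer refers to `The`, so the ">" is no longer treated as data.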
a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39
2024-05-23T18:20:42Z
2024-05-23T18:20:42Z
fix possible regression: set tag defaults also
Hiltjo Posthuma
hiltjo@codemadness.org
commit a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39
parent 115f7e68eeccd7f1030fc631c52bab35692c6973
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 23 May 2024 20:20:42 +0200
fix possible regression: set tag defaults also
115f7e68eeccd7f1030fc631c52bab35692c6973
2024-05-22T17:12:44Z
2024-05-22T17:12:44Z
fix a crash when tag could be uninitialized and not set to a fixed buffer tagname
Hiltjo Posthuma
hiltjo@codemadness.org
commit 115f7e68eeccd7f1030fc631c52bab35692c6973
parent 64010b2be4bc3845ef07db25f8621c7894fe64bb
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 22 May 2024 19:12:44 +0200
fix a crash when tag could be uninitialized and not set to a fixed buffer tagname
Reported by pi31415 when he was testing webdump on a binary ZIP file, thanks!
64010b2be4bc3845ef07db25f8621c7894fe64bb
2024-05-22T16:47:04Z
2024-05-22T16:47:04Z
xmltagend: fix checking the correct tag for the node
Hiltjo Posthuma
hiltjo@codemadness.org
commit 64010b2be4bc3845ef07db25f8621c7894fe64bb
parent 178ee8229bd4e0cf0cb8dae6a979ccb473b9bf10
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 22 May 2024 18:47:04 +0200
xmltagend: fix checking the correct tag for the node
178ee8229bd4e0cf0cb8dae6a979ccb473b9bf10
2024-05-22T16:46:21Z
2024-05-22T16:46:21Z
reduce stack size a bit
Hiltjo Posthuma
hiltjo@codemadness.org
commit 178ee8229bd4e0cf0cb8dae6a979ccb473b9bf10
parent 0f038037edcb9d876ced704462f8daf4f2d2c4b2
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 22 May 2024 18:46:21 +0200
reduce stack size a bit
0f038037edcb9d876ced704462f8daf4f2d2c4b2
2024-05-22T16:45:53Z
2024-05-22T16:45:53Z
bump LICENSE
Hiltjo Posthuma
hiltjo@codemadness.org
commit 0f038037edcb9d876ced704462f8daf4f2d2c4b2
parent 473563a6c16c683af52cb791fbbfdfb997f758bb
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 22 May 2024 18:45:53 +0200
bump LICENSE
and improve a few comments
473563a6c16c683af52cb791fbbfdfb997f758bb
2024-04-27T01:28:09Z
2024-04-27T07:49:00Z
webdump.1: fix copypasted flag description
Lucas de Sena
lucas@seninha.org
commit 473563a6c16c683af52cb791fbbfdfb997f758bb
parent 1232b5b3d77c458704341ac436ff4230a3077007
Author: Lucas de Sena <lucas@seninha.org>
Date: Fri, 26 Apr 2024 22:28:09 -0300
webdump.1: fix copypasted flag description
1232b5b3d77c458704341ac436ff4230a3077007
2023-10-15T11:47:16Z
2023-10-15T11:47:16Z
README: expand README
Hiltjo Posthuma
hiltjo@codemadness.org
commit 1232b5b3d77c458704341ac436ff4230a3077007
parent bff9fbe51c0f5f5ac37a46deca1016bb56834dac
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sun, 15 Oct 2023 13:47:16 +0200
README: expand README
Describe the scope and trade-offs a bit more clearly, because webdump is quite
limited.
bff9fbe51c0f5f5ac37a46deca1016bb56834dac
2023-10-06T09:57:10Z
2023-10-06T09:57:10Z
webdump.1: improve man page
Hiltjo Posthuma
hiltjo@codemadness.org
commit bff9fbe51c0f5f5ac37a46deca1016bb56834dac
parent 030644d3ff71c0708d940f9895e76ab99593f61b
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 6 Oct 2023 11:57:10 +0200
webdump.1: improve man page
030644d3ff71c0708d940f9895e76ab99593f61b
2023-09-27T16:53:56Z
2023-09-27T16:53:56Z
contextual line-wrapping, disabled for now
Hiltjo Posthuma
hiltjo@codemadness.org
commit 030644d3ff71c0708d940f9895e76ab99593f61b
parent 30a42a2ff270ef5e7ff96d8b23ed5ffbd58c665b
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 27 Sep 2023 18:53:56 +0200
contextual line-wrapping, disabled for now
30a42a2ff270ef5e7ff96d8b23ed5ffbd58c665b
2023-09-27T16:53:02Z
2023-09-27T16:53:02Z
show "[IMG]" as a placeholder if alt text is empty
Hiltjo Posthuma
hiltjo@codemadness.org
commit 30a42a2ff270ef5e7ff96d8b23ed5ffbd58c665b
parent 5f17b244e6f5fd6d954cfe58679807fef3ea91e5
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 27 Sep 2023 18:53:02 +0200
show "[IMG]" as a placeholder if alt text is empty
Depending on which options are set.
5f17b244e6f5fd6d954cfe58679807fef3ea91e5
2023-09-22T12:21:28Z
2023-09-22T12:21:45Z
declare a few functions as static
Hiltjo Posthuma
hiltjo@codemadness.org
commit 5f17b244e6f5fd6d954cfe58679807fef3ea91e5
parent 4e69626163451a74e090c1bdbdcc3282236d6b33
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 22 Sep 2023 14:21:28 +0200
declare a few functions as static
4e69626163451a74e090c1bdbdcc3282236d6b33
2023-09-21T21:13:34Z
2023-09-21T21:13:34Z
hide data in <svg> tag
Hiltjo Posthuma
hiltjo@codemadness.org
commit 4e69626163451a74e090c1bdbdcc3282236d6b33
parent ae36c548e48ddea692a87557938441bb7cd54994
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 21 Sep 2023 23:13:34 +0200
hide data in <svg> tag
Noticed on a zdnet.com page/article which has invalid SVG data inside it. This
would show gibberish. Note that the parser still expects somewhat valid
XML/HTML.
In the future maybe this could be handled the same as <script> or <style>.
ae36c548e48ddea692a87557938441bb7cd54994
2023-09-20T16:51:10Z
2023-09-20T16:51:10Z
for the class and id attribute use the first value set
Hiltjo Posthuma
hiltjo@codemadness.org
commit ae36c548e48ddea692a87557938441bb7cd54994
parent 4793272ce07153284318336426796cb7e3c93af4
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 20 Sep 2023 18:51:10 +0200
for the class and id attribute use the first value set
+ small code-style tweaks.
4793272ce07153284318336426796cb7e3c93af4
2023-09-19T18:05:02Z
2023-09-19T18:05:02Z
cleanup code a bit and add some comments
Hiltjo Posthuma
hiltjo@codemadness.org
commit 4793272ce07153284318336426796cb7e3c93af4
parent 589d7d1ed851b5226a4782de8c9f00001f25c599
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 19 Sep 2023 20:05:02 +0200
cleanup code a bit and add some comments
589d7d1ed851b5226a4782de8c9f00001f25c599
2023-09-19T18:04:01Z
2023-09-19T18:04:01Z
strip down tree.h remove unused code and unused macros
Hiltjo Posthuma
hiltjo@codemadness.org
commit 589d7d1ed851b5226a4782de8c9f00001f25c599
parent c0d1a46e3d5e9d291cb731bec2f0511553d87b48
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 19 Sep 2023 20:04:01 +0200
strip down tree.h remove unused code and unused macros
... only RB_INSERT and RB_FIND are used.
c0d1a46e3d5e9d291cb731bec2f0511553d87b48
2023-09-18T17:08:01Z
2023-09-18T17:08:01Z
sync some small XML parser fixes
Hiltjo Posthuma
hiltjo@codemadness.org
commit c0d1a46e3d5e9d291cb731bec2f0511553d87b48
parent 011b4885a533382d98f1aee6cb9619e280c99947
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 18 Sep 2023 19:08:01 +0200
sync some small XML parser fixes
011b4885a533382d98f1aee6cb9619e280c99947
2023-09-18T17:06:03Z
2023-09-18T17:06:03Z
various improvements
Hiltjo Posthuma
hiltjo@codemadness.org
commit 011b4885a533382d98f1aee6cb9619e280c99947
parent 89c9108dc27fe27e0f028f67508a1156ed242d2a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 18 Sep 2023 19:06:03 +0200
various improvements
Improve link references:
- Add RB tree to lookup link references: this uses a stripped-down version of
OpenBSD tree.h
- Add 2 separate linked-lists for the order of visible and hidden links.
- Hidden links are now also deduplicated.
Improve nested nodes and max depth:
- Rework and increase the allowed depth of nodes. Allocate them on the heap.
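The combination of a lookup tree and an ordered list for link references can be sketched as below. Note the substitution: to stay self-contained this uses a plain unbalanced binary search tree instead of the red-black tree from OpenBSD tree.h, and a single list instead of separate visible and hidden lists. All names (`struct linkref`, `linkrefs_get`) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct linkref {
	char *url;
	int id;                       /* reference number, in order of first use */
	struct linkref *left, *right; /* lookup tree (plain BST stand-in) */
	struct linkref *next;         /* insertion-order list */
};

struct linkrefs {
	struct linkref *root; /* tree root for lookup by URL */
	struct linkref *head; /* first link in output order */
	struct linkref *tail; /* last link, for constant-time append */
	int count;
};

/* Return the existing reference for url or add a new one.
 * Returns NULL on allocation failure. */
static struct linkref *
linkrefs_get(struct linkrefs *r, const char *url)
{
	struct linkref **pp = &r->root, *n;
	size_t len;
	int c;

	while (*pp) {
		c = strcmp(url, (*pp)->url);
		if (c == 0)
			return *pp; /* duplicate: reuse the existing number */
		pp = (c < 0) ? &(*pp)->left : &(*pp)->right;
	}
	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	len = strlen(url) + 1;
	if ((n->url = malloc(len)) == NULL) {
		free(n);
		return NULL;
	}
	memcpy(n->url, url, len);
	n->id = ++r->count;
	*pp = n; /* insert into the tree */
	if (r->tail)
		r->tail->next = n; /* append to the ordered list */
	else
		r->head = n;
	r->tail = n;
	return n;
}
```

The tree answers "have we seen this URL?" without scanning every link, while the list preserves the order in which references are printed.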
89c9108dc27fe27e0f028f67508a1156ed242d2a
2023-09-14T20:31:03Z
2023-09-14T20:31:03Z
various improvements
Hiltjo Posthuma
hiltjo@codemadness.org
commit 89c9108dc27fe27e0f028f67508a1156ed242d2a
parent 62884d7b5684e791bb0cd6466f74367d6d71618d
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 14 Sep 2023 22:31:03 +0200
various improvements
- add a unique tagid number per tag. This allows checking by tag number.
- add support for the link reference <frame>, <iframe>, <embed src>.
- improve checking for open optional <p> tags when a block element (such as
<section>) is open.
- check if the base URI using the -b option is absolute.
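One way to check that a base URI is absolute is to test for a leading scheme as defined in RFC 3986 (a letter followed by letters, digits, "+", "-" or ".", then ":"). The sketch below assumes that approach; the commit does not say how webdump performs the check, and `uri_is_absolute` is a hypothetical name.

```c
#include <ctype.h>

/* Return 1 if s starts with an RFC 3986 scheme followed by ':',
 * for example "https:", otherwise 0. */
static int
uri_is_absolute(const char *s)
{
	if (!isalpha((unsigned char)*s))
		return 0; /* a scheme must start with a letter */
	for (s++; isalnum((unsigned char)*s) ||
	    *s == '+' || *s == '-' || *s == '.'; s++)
		;
	return *s == ':';
}
```

Relative references such as `/path` or `//host/x` are rejected, which is the case the -b option needs to guard against.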
62884d7b5684e791bb0cd6466f74367d6d71618d
2023-09-13T18:41:31Z
2023-09-13T18:41:31Z
whoops, check in some related changes from previous commits
Hiltjo Posthuma
hiltjo@codemadness.org
commit 62884d7b5684e791bb0cd6466f74367d6d71618d
parent 8ab6a487c7adcfe44d9d3c07c81a1c07d6dedd2a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:41:31 +0200
whoops, check in some related changes from previous commits
8ab6a487c7adcfe44d9d3c07c81a1c07d6dedd2a
2023-09-13T18:40:20Z
2023-09-13T18:40:20Z
set DisplaySelectMulti for <select> with multiple attribute
Hiltjo Posthuma
hiltjo@codemadness.org
commit 8ab6a487c7adcfe44d9d3c07c81a1c07d6dedd2a
parent bf60f514843dfa9c2cc5d10fa9e7f3978da5cefb
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:40:20 +0200
set DisplaySelectMulti for <select> with multiple attribute
This bitmask allows easy layout changes or logic.
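A bitmask of display types could be sketched like this. Only `DisplaySelectMulti` and `DisplayOption` are named in this log; the other enumerators, all the values and the helper `show_all_options` are assumptions for illustration.

```c
/* Each display property is one bit, so a tag can combine several. */
enum DisplayType {
	DisplayInline      = 1 << 0, /* assumed name */
	DisplayBlock       = 1 << 1, /* assumed name */
	DisplaySelect      = 1 << 2, /* assumed name */
	DisplaySelectMulti = 1 << 3,
	DisplayOption      = 1 << 4
};

/* Layout logic tests a single bit instead of comparing tag names. */
static int
show_all_options(int displaytype)
{
	return (displaytype & DisplaySelectMulti) != 0;
}
```

For example, a `<select multiple>` node would carry `DisplaySelect | DisplaySelectMulti`, and the option rendering code only has to test the one bit.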
bf60f514843dfa9c2cc5d10fa9e7f3978da5cefb
2023-09-13T18:39:42Z
2023-09-13T18:39:42Z
if the <input type="submit|reset"> has no value, use a default one
Hiltjo Posthuma
hiltjo@codemadness.org
commit bf60f514843dfa9c2cc5d10fa9e7f3978da5cefb
parent 29ab23324d260dae10475498bfcabd06a4c9ba48
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:39:42 +0200
if the <input type="submit|reset"> has no value, use a default one
29ab23324d260dae10475498bfcabd06a4c9ba48
2023-09-13T18:38:58Z
2023-09-13T18:38:58Z
fixed nested <dl>
Hiltjo Posthuma
hiltjo@codemadness.org
commit 29ab23324d260dae10475498bfcabd06a4c9ba48
parent bc435d97f57537adbce2b1ddac9f0744a57279ae
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:38:58 +0200
fixed nested <dl>
Noticed on the page, for example:
http://man.openbsd.org/ftp
bc435d97f57537adbce2b1ddac9f0744a57279ae
2023-09-13T18:38:27Z
2023-09-13T18:38:27Z
with the -i and -a option highlight links in ANSI reverse
Hiltjo Posthuma
hiltjo@codemadness.org
commit bc435d97f57537adbce2b1ddac9f0744a57279ae
parent f4f2dc53e082fcbf627d567810c399c306009ea0
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:38:27 +0200
with the -i and -a option highlight links in ANSI reverse
f4f2dc53e082fcbf627d567810c399c306009ea0
2023-09-13T18:37:28Z
2023-09-13T18:37:28Z
initial support for <select> <option>
Hiltjo Posthuma
hiltjo@codemadness.org
commit f4f2dc53e082fcbf627d567810c399c306009ea0
parent 20841145c9fd597e82c3da9dfa7c9d9caf606567
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:37:28 +0200
initial support for <select> <option>
Show the first item, or all items if the multiple attribute is set.
This ignores the actual selected item if <select><option selected>. This would
require keeping state for all the option nodes, which it doesn't do.
20841145c9fd597e82c3da9dfa7c9d9caf606567
2023-09-13T18:36:36Z
2023-09-13T18:36:36Z
support <object data> attribute as a link reference
Hiltjo Posthuma
hiltjo@codemadness.org
commit 20841145c9fd597e82c3da9dfa7c9d9caf606567
parent 7e848a418c711f6857328b5489172a34d44587c8
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:36:36 +0200
support <object data> attribute as a link reference
7e848a418c711f6857328b5489172a34d44587c8
2023-09-13T18:35:17Z
2023-09-13T18:35:17Z
add support for more tags and change the markup and display block-type of some
Hiltjo Posthuma
hiltjo@codemadness.org
commit 7e848a418c711f6857328b5489172a34d44587c8
parent 91d236dab89449465eb123d756a450a17eb4195a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Wed, 13 Sep 2023 20:35:17 +0200
add support for more tags and change the markup and display block-type of some
... also add initial types: Button, Select, SelectMulti and Option.
91d236dab89449465eb123d756a450a17eb4195a
2023-09-12T18:02:57Z
2023-09-12T18:02:57Z
add option for unique link references (-d)
Hiltjo Posthuma
hiltjo@codemadness.org
commit 91d236dab89449465eb123d756a450a17eb4195a
parent 790402682bab675461f2a12879408dd5ad30c90f
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 12 Sep 2023 20:02:57 +0200
add option for unique link references (-d)
... also make link type "a" consistently "link" (also at the bottom
references).
... also flush inline link only if needed
790402682bab675461f2a12879408dd5ad30c90f
2023-09-12T18:01:15Z
2023-09-12T18:01:15Z
do not reset ncells or nbytesline if no newline was emitted
Hiltjo Posthuma
hiltjo@codemadness.org
commit 790402682bab675461f2a12879408dd5ad30c90f
parent 8d29c76012b91bbfaad1feca31b2af4cfbc99032
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 12 Sep 2023 20:01:15 +0200
do not reset ncells or nbytesline if no newline was emitted
Test-case:
<div>
<div><b></b></div>
<p>abc</p>
</div>
This caused an extra indentation due to the nbytesline check in hflush().
8d29c76012b91bbfaad1feca31b2af4cfbc99032
2023-09-12T18:00:18Z
2023-09-12T18:00:18Z
reduce excessive ANSI markup codes using -a
Hiltjo Posthuma
hiltjo@codemadness.org
commit 8d29c76012b91bbfaad1feca31b2af4cfbc99032
parent dc7717417afb10040e4ea5d9472cc1c2658f1c8c
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 12 Sep 2023 20:00:18 +0200
reduce excessive ANSI markup codes using -a
dc7717417afb10040e4ea5d9472cc1c2658f1c8c
2023-09-12T17:59:33Z
2023-09-12T17:59:33Z
clamp indent count to increment to >= 0 just in case
Hiltjo Posthuma
hiltjo@codemadness.org
commit dc7717417afb10040e4ea5d9472cc1c2658f1c8c
parent 17c9e247a8df6c43cdec6bac73410dd35ece683c
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 12 Sep 2023 19:59:33 +0200
clamp indent count to increment to >= 0 just in case
17c9e247a8df6c43cdec6bac73410dd35ece683c
2023-09-12T17:58:39Z
2023-09-12T17:58:39Z
add more block-like tags
Hiltjo Posthuma
hiltjo@codemadness.org
commit 17c9e247a8df6c43cdec6bac73410dd35ece683c
parent 83d4fe1dd6996c779d73406a237c9fd470cda9b6
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Tue, 12 Sep 2023 19:58:39 +0200
add more block-like tags
83d4fe1dd6996c779d73406a237c9fd470cda9b6
2023-09-11T17:10:29Z
2023-09-11T17:10:29Z
fix typo
Hiltjo Posthuma
hiltjo@codemadness.org
commit 83d4fe1dd6996c779d73406a237c9fd470cda9b6
parent 2e32abeb2743e5fce55bdfc1591bb66eedd63a45
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 11 Sep 2023 19:10:29 +0200
fix typo
2e32abeb2743e5fce55bdfc1591bb66eedd63a45
2023-09-11T17:03:25Z
2023-09-11T17:03:25Z
optional tag handling improvements
Hiltjo Posthuma
hiltjo@codemadness.org
commit 2e32abeb2743e5fce55bdfc1591bb66eedd63a45
parent 9f4c3a0a47eb2bb127db5a270dfa27ad368deb6a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 11 Sep 2023 19:03:25 +0200
optional tag handling improvements
Much better handling for the optional tags: <p>, <dd>, <dt>, <dl>.
An example page:
https://www.openbsd.org/policy.html
Some tags to add:
- aside
- menu
- address
- details
Maybe:
- search
- hgroup
9f4c3a0a47eb2bb127db5a270dfa27ad368deb6a
2023-09-11T17:01:30Z
2023-09-11T17:01:30Z
hputchar: fix flag to reset hadnewline and improve comments
Hiltjo Posthuma
hiltjo@codemadness.org
commit 9f4c3a0a47eb2bb127db5a270dfa27ad368deb6a
parent a75a21256774e9ccda82f79ad7989f44bfa81e6a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 11 Sep 2023 19:01:30 +0200
hputchar: fix flag to reset hadnewline and improve comments
This flag is an extra safety, it can probably be removed.
a75a21256774e9ccda82f79ad7989f44bfa81e6a
2023-09-11T17:00:30Z
2023-09-11T17:00:30Z
micro-optimization for counters on indent()
Hiltjo Posthuma
hiltjo@codemadness.org
commit a75a21256774e9ccda82f79ad7989f44bfa81e6a
parent 642198c5b9aabc74e13e4c2e7f044516e76cbf2b
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 11 Sep 2023 19:00:30 +0200
micro-optimization for counters on indent()
642198c5b9aabc74e13e4c2e7f044516e76cbf2b
2023-09-11T16:59:41Z
2023-09-11T16:59:41Z
handle <form> as block element with margin bottom 0
Hiltjo Posthuma
hiltjo@codemadness.org
commit 642198c5b9aabc74e13e4c2e7f044516e76cbf2b
parent 4d0ab293b3f3ecd2e5a491c8e94678811f03e398
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 11 Sep 2023 18:59:41 +0200
handle <form> as block element with margin bottom 0
4d0ab293b3f3ecd2e5a491c8e94678811f03e398
2023-09-11T16:58:55Z
2023-09-11T16:58:55Z
fix leading white-space in <pre>
Hiltjo Posthuma
hiltjo@codemadness.org
commit 4d0ab293b3f3ecd2e5a491c8e94678811f03e398
parent 1d80db038e35ca3778e2df19d00b9be512df185f
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon, 11 Sep 2023 18:58:55 +0200
fix leading white-space in <pre>
Skip first newline only.
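The rule can be sketched as a small helper, assuming the text inside `<pre>` is available as a string. `pre_skip_first_newline` is a hypothetical name, and treating "\r\n" as one newline is an assumption not stated in the commit.

```c
/* Return a pointer past one leading newline of <pre> content.
 * All other leading whitespace is kept as-is. */
static const char *
pre_skip_first_newline(const char *s)
{
	if (s[0] == '\r' && s[1] == '\n')
		return s + 2; /* assumed: CRLF counts as one newline */
	if (s[0] == '\n')
		return s + 1;
	return s;
}
```

Only the first newline directly after the start tag is dropped; a second newline or leading spaces remain part of the preformatted text.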
1d80db038e35ca3778e2df19d00b9be512df185f
2023-09-08T20:34:46Z
2023-09-08T20:34:46Z
just translate all entities
Hiltjo Posthuma
hiltjo@codemadness.org
commit 1d80db038e35ca3778e2df19d00b9be512df185f
parent f541d79e5f8a6f69df9494a1e96bee17b88fb82f
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 22:34:46 +0200
just translate all entities
This increases the binary size a lot though...
f541d79e5f8a6f69df9494a1e96bee17b88fb82f
2023-09-08T13:46:25Z
2023-09-08T13:46:25Z
do not define a non-const in this way
Hiltjo Posthuma
hiltjo@codemadness.org
commit f541d79e5f8a6f69df9494a1e96bee17b88fb82f
parent 16f2855bb159b11fd58bc0ccdf9069c00b0bafaa
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 15:46:25 +0200
do not define a non-const in this way
Add a macro for this case. This fixes a compiler error with tcc.
16f2855bb159b11fd58bc0ccdf9069c00b0bafaa
2023-09-08T13:43:06Z
2023-09-08T13:43:06Z
handpick and add named entity to the shorter named entities list
Hiltjo Posthuma
hiltjo@codemadness.org
commit 16f2855bb159b11fd58bc0ccdf9069c00b0bafaa
parent 62bfd8b37f4b929e5f5a0f06c4cf90e7e07387a9
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 15:43:06 +0200
handpick and add named entity to the shorter named entities list
62bfd8b37f4b929e5f5a0f06c4cf90e7e07387a9
2023-09-08T13:42:10Z
2023-09-08T13:42:10Z
update mailcap example and add it to the man page as well
Hiltjo Posthuma
hiltjo@codemadness.org
commit 62bfd8b37f4b929e5f5a0f06c4cf90e7e07387a9
parent 0626e06482426d2fe329fd9df5b1f6fb3b946e2a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 15:42:10 +0200
update mailcap example and add it to the man page as well
0626e06482426d2fe329fd9df5b1f6fb3b946e2a
2023-09-08T13:33:44Z
2023-09-08T13:33:44Z
small optimization: use the new DisplayType DisplayInput
Hiltjo Posthuma
hiltjo@codemadness.org
commit 0626e06482426d2fe329fd9df5b1f6fb3b946e2a
parent 630f76162a192327a3eecd4fc0adcb9b31cd4504
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 15:33:44 +0200
small optimization: use the new DisplayType DisplayInput
This reduces a string comparison for each node.
630f76162a192327a3eecd4fc0adcb9b31cd4504
2023-09-08T13:05:38Z
2023-09-08T13:05:38Z
improve forms a bit
Hiltjo Posthuma
hiltjo@codemadness.org
commit 630f76162a192327a3eecd4fc0adcb9b31cd4504
parent 0705fb754f00c7866b2cc8cee0739a88a584a2e1
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 15:05:38 +0200
improve forms a bit
- Treat fieldset and legend as block elements.
- Support more types, default or unsupported is "text".
- Show the default selected value for radio and checkboxes.
- Don't show hidden input types.
- Add a DisplayType DisplayInput to check the tag faster.
0705fb754f00c7866b2cc8cee0739a88a584a2e1
2023-09-08T11:09:37Z
2023-09-08T11:09:37Z
improve base URL and <base href /> handling
Hiltjo Posthuma
hiltjo@codemadness.org
commit 0705fb754f00c7866b2cc8cee0739a88a584a2e1
parent 7d4723febabeb679e1980c12b5dfd3b656475b4f
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 13:09:37 +0200
improve base URL and <base href /> handling
- Parse the base URI once and reuse the structure (optimization).
- Once it is parsed it cannot be overwritten again. This matches the browser
more closely.
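The "set once" behaviour for the base URI might be sketched as follows. The `struct base` type and `base_set` function are hypothetical, and the real code parses the URI into a structure rather than storing a plain string.

```c
#include <string.h>

/* Hypothetical holder for the document base URI. */
struct base {
	char uri[256];
	int isset; /* non-zero once a base has been accepted */
};

/* Store href as the base URI only if none was set before.
 * Returns 1 if the value was stored, 0 if it was ignored. */
static int
base_set(struct base *b, const char *href)
{
	size_t len;

	if (b->isset)
		return 0; /* the first base wins, later ones are ignored */
	len = strlen(href);
	if (len == 0 || len >= sizeof(b->uri))
		return 0; /* empty or too long: leave the base unset */
	memcpy(b->uri, href, len + 1);
	b->isset = 1;
	return 1;
}
```

Because the value is stored once, later link resolution can reuse it without parsing the base again for every link.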
7d4723febabeb679e1980c12b5dfd3b656475b4f
2023-09-08T09:29:59Z
2023-09-08T09:29:59Z
flush after writing the URL inline
Hiltjo Posthuma
hiltjo@codemadness.org
commit 7d4723febabeb679e1980c12b5dfd3b656475b4f
parent 6365a78f6c050106e64b281d29d8ef550f131bf1
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 11:29:59 +0200
flush after writing the URL inline
Otherwise the URL could be partially wrapped and incorrectly wrap to a new
line.
Remove a TODO, the flush is really needed there.
6365a78f6c050106e64b281d29d8ef550f131bf1
2023-09-08T09:25:13Z
2023-09-08T09:26:40Z
improve link references, add option to show full URL inline
Hiltjo Posthuma
hiltjo@codemadness.org
commit 6365a78f6c050106e64b281d29d8ef550f131bf1
parent 56ec7ea6c49d79cc3aaf301d2e6040e15d17785a
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 11:25:13 +0200
improve link references, add option to show full URL inline
- fix URL references not being visible when only the -l option is specified
(without -i). Now each option can be specified separately.
- add -I option to show the full URL inline.
56ec7ea6c49d79cc3aaf301d2e6040e15d17785a
2023-09-08T09:07:57Z
2023-09-08T09:07:57Z
selector syntax: document it and add feature to filter on a specific nth node
Hiltjo Posthuma
hiltjo@codemadness.org
commit 56ec7ea6c49d79cc3aaf301d2e6040e15d17785a
parent 94f0ad42fcfbe17b01d9e573a786435d1acc0232
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 11:07:57 +0200
selector syntax: document it and add feature to filter on a specific nth node
94f0ad42fcfbe17b01d9e573a786435d1acc0232
2023-09-08T08:31:33Z
2023-09-08T08:31:33Z
add figure and figcaption: improve display of it
Hiltjo Posthuma
hiltjo@codemadness.org
commit 94f0ad42fcfbe17b01d9e573a786435d1acc0232
parent a30cb9a818546a1e7b651fa33e2f0164079b7bc5
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Fri, 8 Sep 2023 10:31:33 +0200
add figure and figcaption: improve display of it
Indent the figure and use a margin. Handle them as block elements.
a30cb9a818546a1e7b651fa33e2f0164079b7bc5
2023-09-07T16:35:58Z
2023-09-07T16:36:23Z
reduce some newlines before the link references
Hiltjo Posthuma
hiltjo@codemadness.org
commit a30cb9a818546a1e7b651fa33e2f0164079b7bc5
parent 20fb149d1183220e2fee0cbfb6d3ac4f288bc67e
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 7 Sep 2023 18:35:58 +0200
reduce some newlines before the link references
20fb149d1183220e2fee0cbfb6d3ac4f288bc67e
2023-09-07T16:33:52Z
2023-09-07T16:33:52Z
improve documentation
Hiltjo Posthuma
hiltjo@codemadness.org
commit 20fb149d1183220e2fee0cbfb6d3ac4f288bc67e
parent ce2a730d81823f9fc5f1d607296bb4529e9aeef0
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 7 Sep 2023 18:33:52 +0200
improve documentation
ce2a730d81823f9fc5f1d607296bb4529e9aeef0
2023-09-07T16:25:16Z
2023-09-07T16:25:16Z
initial repo
Hiltjo Posthuma
hiltjo@codemadness.org
commit ce2a730d81823f9fc5f1d607296bb4529e9aeef0
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Thu, 7 Sep 2023 18:25:16 +0200
initial repo
Reset development/chaotic hacking history.