webdump, branch HEAD HTML to plain-text converter for webpages 278d829beb658d1eb18dba03c804d4a949e7bd43 2026-03-13T11:52:07Z 2026-03-13T11:52:07Z typo: url -> URL Hiltjo Posthuma hiltjo@codemadness.org commit 278d829beb658d1eb18dba03c804d4a949e7bd43 parent 180713b09a920e1e8a8e26fba3a966aaf0a8bc98 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 13 Mar 2026 12:52:07 +0100 typo: url -> URL 180713b09a920e1e8a8e26fba3a966aaf0a8bc98 2026-03-11T18:04:20Z 2026-03-11T18:04:20Z bump version to 0.2 Hiltjo Posthuma hiltjo@codemadness.org commit 180713b09a920e1e8a8e26fba3a966aaf0a8bc98 parent 5b615707234222a1ecb7c7c637c629d4f45116ff Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 11 Mar 2026 19:04:20 +0100 bump version to 0.2 5b615707234222a1ecb7c7c637c629d4f45116ff 2026-03-10T23:59:58Z 2026-03-11T00:02:18Z fix: reassign internal buffer on realloc Hiltjo Posthuma hiltjo@codemadness.org commit 5b615707234222a1ecb7c7c637c629d4f45116ff parent c455f534c9eaf07f1c1304e74edc245fba57af46 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 11 Mar 2026 00:59:58 +0100 fix: reassign internal buffer on realloc This could point to an old buffer in a different memory region. (realloc does not necessarily allocate in a contiguous memory area). 
Found by Clang ASAN c455f534c9eaf07f1c1304e74edc245fba57af46 2026-03-10T23:16:11Z 2026-03-10T23:16:11Z fix: Value stored to 'datalen' is never read Hiltjo Posthuma hiltjo@codemadness.org commit c455f534c9eaf07f1c1304e74edc245fba57af46 parent a26127c95ac65dc8cc5e1a99884923aa2ed81a04 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 11 Mar 2026 00:16:11 +0100 fix: Value stored to 'datalen' is never read Found by clang-analyzer and cppcheck a26127c95ac65dc8cc5e1a99884923aa2ed81a04 2026-03-10T23:07:46Z 2026-03-10T23:07:46Z code-style: remove temporary variable, no need to initialize it Hiltjo Posthuma hiltjo@codemadness.org commit a26127c95ac65dc8cc5e1a99884923aa2ed81a04 parent 83f319cce4257402c42efb6c7f784136ed19d528 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 11 Mar 2026 00:07:46 +0100 code-style: remove temporary variable, no need to initialize it Removes Unused code Dead assignment warning with clang-analyzer. 83f319cce4257402c42efb6c7f784136ed19d528 2026-03-10T17:38:06Z 2026-03-10T17:40:24Z bump LICENSE year Hiltjo Posthuma hiltjo@codemadness.org commit 83f319cce4257402c42efb6c7f784136ed19d528 parent b1bbbf832f4d58fa11002a4deed5bfdbcc6d36a0 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 10 Mar 2026 18:38:06 +0100 bump LICENSE year b1bbbf832f4d58fa11002a4deed5bfdbcc6d36a0 2026-03-10T17:37:36Z 2026-03-10T17:40:24Z small code-style fix Hiltjo Posthuma hiltjo@codemadness.org commit b1bbbf832f4d58fa11002a4deed5bfdbcc6d36a0 parent 0d7d55bb8633594c343e92e3a77d34b2b4249374 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 10 Mar 2026 18:37:36 +0100 small code-style fix 0d7d55bb8633594c343e92e3a77d34b2b4249374 2026-03-09T19:17:36Z 2026-03-10T17:38:43Z fix: add null check for parent pointer in option handling (false positive) andrew sourcehut@lewman.us commit 0d7d55bb8633594c343e92e3a77d34b2b4249374 parent bc810c876a5d5de1e71796e5579b0c966ca092fd Author: andrew <sourcehut@lewman.us> Date: Mon, 9 Mar 2026 12:17:36 -0700 fix: 
add null check for parent pointer in option handling (false positive) Prevent null pointer dereference when an <option> tag appears at root level without a parent <select> element. The code assumed parent would always exist when processing DisplayOption tags, but malformed or invalid HTML could have orphan <option> tags at the root level. Found by Clang static analyzer (core.NullDereference warning). Added note: this cannot be triggered, because curnode cannot be 0 here and parent is set (non-NULL). But it should be checked nonetheless. bc810c876a5d5de1e71796e5579b0c966ca092fd 2025-12-11T19:52:22Z 2025-12-11T19:53:07Z sync xml.c changes: parse numeric entities more strictly Hiltjo Posthuma hiltjo@codemadness.org commit bc810c876a5d5de1e71796e5579b0c966ca092fd parent 31fac2476f06b72f3d8bc7ac654cfde4e8452525 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 11 Dec 2025 20:52:22 +0100 sync xml.c changes: parse numeric entities more strictly 31fac2476f06b72f3d8bc7ac654cfde4e8452525 2025-09-21T12:21:30Z 2025-09-21T12:21:30Z slightly reduce stack size for entities Hiltjo Posthuma hiltjo@codemadness.org commit 31fac2476f06b72f3d8bc7ac654cfde4e8452525 parent 3d6afd123b27f8bbd2544071047ee3d0cce4c8c1 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Sun, 21 Sep 2025 14:21:30 +0200 slightly reduce stack size for entities ... rename n to len (consistency). 3d6afd123b27f8bbd2544071047ee3d0cce4c8c1 2025-04-25T09:46:32Z 2025-04-25T09:46:32Z bump LICENSE year Hiltjo Posthuma hiltjo@codemadness.org commit 3d6afd123b27f8bbd2544071047ee3d0cce4c8c1 parent 5cde25b5150bd0375e9b5800bf3855765830c588 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 25 Apr 2025 11:46:32 +0200 bump LICENSE year ... 
and tag 0.1 5cde25b5150bd0375e9b5800bf3855765830c588 2024-07-06T11:05:54Z 2024-07-06T11:05:54Z improve memory usage and allocation Hiltjo Posthuma hiltjo@codemadness.org commit 5cde25b5150bd0375e9b5800bf3855765830c588 parent 72b23084b7c64c298c6b90ae6ad9f53f497cec57 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Sat, 6 Jul 2024 13:05:54 +0200 improve memory usage and allocation Do not realloc when it is not needed (even when it is the same size). Decrease the greedy allocation increment size for nested nodes also. Tested for example using valgrind and "add Beej's Guide to Network Programming" HTML page: https://git.codemadness.org/webdump_tests/commit/837749abc02f28e1584e5f2cf2b274ae1c69d8e6.html The buffering for link references (-l option) used way too much memory. 72b23084b7c64c298c6b90ae6ad9f53f497cec57 2024-06-29T16:29:21Z 2024-06-29T16:29:21Z improve parsing whitespace after end tag names Hiltjo Posthuma hiltjo@codemadness.org commit 72b23084b7c64c298c6b90ae6ad9f53f497cec57 parent a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Sat, 29 Jun 2024 18:29:21 +0200 improve parsing whitespace after end tag names Real site example: https://www.gnupg.org/gph/en/manual.html Has HTML such as: <P CLASS="COPYRIGHT" >Copyright &copy; 1999 by <SPAN CLASS="HOLDER" >The Free Software Foundation</SPAN ></P > ... This incorrectly showed ">" in the end tag as data. Reported by Jason Hood, thanks! 
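The end-tag fix above (commit 72b2308) can be illustrated with a minimal sketch. This is not webdump's actual parser: the function name `parse_endtag` and its signature are hypothetical, and it only shows the idea of tolerating whitespace, including newlines, between the end tag name and the closing `>`.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not webdump's real code): parse an end tag such as
 * "</SPAN\n>" starting at s. Copies the tag name into name (at most
 * namesiz - 1 bytes) and returns a pointer just past the closing '>',
 * or NULL if s does not hold a complete end tag.
 */
static const char *
parse_endtag(const char *s, char *name, size_t namesiz)
{
	size_t len = 0;

	if (namesiz == 0 || s[0] != '<' || s[1] != '/')
		return NULL;
	s += 2;

	/* the tag name runs until whitespace, '>' or the end of input */
	while (*s && *s != '>' && !isspace((unsigned char)*s)) {
		if (len + 1 < namesiz)
			name[len++] = *s;
		s++;
	}
	name[len] = '\0';

	/* skip whitespace (including newlines) between the name and '>' */
	while (isspace((unsigned char)*s))
		s++;

	if (len == 0 || *s != '>')
		return NULL;
	return s + 1;
}
```

With this approach the `>` of `</SPAN >` is consumed as part of the tag, so it is never emitted as text data, which is the symptom the commit describes.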
a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39 2024-05-23T18:20:42Z 2024-05-23T18:20:42Z fix possible regression: set tag defaults also Hiltjo Posthuma hiltjo@codemadness.org commit a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39 parent 115f7e68eeccd7f1030fc631c52bab35692c6973 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 23 May 2024 20:20:42 +0200 fix possible regression: set tag defaults also 115f7e68eeccd7f1030fc631c52bab35692c6973 2024-05-22T17:12:44Z 2024-05-22T17:12:44Z fix a crash when tag could be uninitialized and not set to a fixed buffer tagname Hiltjo Posthuma hiltjo@codemadness.org commit 115f7e68eeccd7f1030fc631c52bab35692c6973 parent 64010b2be4bc3845ef07db25f8621c7894fe64bb Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 22 May 2024 19:12:44 +0200 fix a crash when tag could be uninitialized and not set to a fixed buffer tagname Reported by pi31415 when he was testing webdump on a binary ZIP file, thanks! 64010b2be4bc3845ef07db25f8621c7894fe64bb 2024-05-22T16:47:04Z 2024-05-22T16:47:04Z xmltagend: fix checking the correct tag for the node Hiltjo Posthuma hiltjo@codemadness.org commit 64010b2be4bc3845ef07db25f8621c7894fe64bb parent 178ee8229bd4e0cf0cb8dae6a979ccb473b9bf10 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 22 May 2024 18:47:04 +0200 xmltagend: fix checking the correct tag for the node 178ee8229bd4e0cf0cb8dae6a979ccb473b9bf10 2024-05-22T16:46:21Z 2024-05-22T16:46:21Z reduce stack size a bit Hiltjo Posthuma hiltjo@codemadness.org commit 178ee8229bd4e0cf0cb8dae6a979ccb473b9bf10 parent 0f038037edcb9d876ced704462f8daf4f2d2c4b2 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 22 May 2024 18:46:21 +0200 reduce stack size a bit 0f038037edcb9d876ced704462f8daf4f2d2c4b2 2024-05-22T16:45:53Z 2024-05-22T16:45:53Z bump LICENSE Hiltjo Posthuma hiltjo@codemadness.org commit 0f038037edcb9d876ced704462f8daf4f2d2c4b2 parent 473563a6c16c683af52cb791fbbfdfb997f758bb Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: 
Wed, 22 May 2024 18:45:53 +0200 bump LICENSE and improve a few comments 473563a6c16c683af52cb791fbbfdfb997f758bb 2024-04-27T01:28:09Z 2024-04-27T07:49:00Z webdump.1: fix copypasted flag description Lucas de Sena lucas@seninha.org commit 473563a6c16c683af52cb791fbbfdfb997f758bb parent 1232b5b3d77c458704341ac436ff4230a3077007 Author: Lucas de Sena <lucas@seninha.org> Date: Fri, 26 Apr 2024 22:28:09 -0300 webdump.1: fix copypasted flag description 1232b5b3d77c458704341ac436ff4230a3077007 2023-10-15T11:47:16Z 2023-10-15T11:47:16Z README: expand README Hiltjo Posthuma hiltjo@codemadness.org commit 1232b5b3d77c458704341ac436ff4230a3077007 parent bff9fbe51c0f5f5ac37a46deca1016bb56834dac Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Sun, 15 Oct 2023 13:47:16 +0200 README: expand README Describe the scope and trade-offs a bit more clearly, because webdump is quite limited. bff9fbe51c0f5f5ac37a46deca1016bb56834dac 2023-10-06T09:57:10Z 2023-10-06T09:57:10Z webdump.1: improve man page Hiltjo Posthuma hiltjo@codemadness.org commit bff9fbe51c0f5f5ac37a46deca1016bb56834dac parent 030644d3ff71c0708d940f9895e76ab99593f61b Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 6 Oct 2023 11:57:10 +0200 webdump.1: improve man page 030644d3ff71c0708d940f9895e76ab99593f61b 2023-09-27T16:53:56Z 2023-09-27T16:53:56Z contextual line-wrapping, disabled for now Hiltjo Posthuma hiltjo@codemadness.org commit 030644d3ff71c0708d940f9895e76ab99593f61b parent 30a42a2ff270ef5e7ff96d8b23ed5ffbd58c665b Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 27 Sep 2023 18:53:56 +0200 contextual line-wrapping, disabled for now 30a42a2ff270ef5e7ff96d8b23ed5ffbd58c665b 2023-09-27T16:53:02Z 2023-09-27T16:53:02Z show "[IMG]" as a placeholder if alt text is empty Hiltjo Posthuma hiltjo@codemadness.org commit 30a42a2ff270ef5e7ff96d8b23ed5ffbd58c665b parent 5f17b244e6f5fd6d954cfe58679807fef3ea91e5 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 27 Sep 2023 18:53:02 +0200 show 
"[IMG]" as a placeholder if alt text is empty Depending on which options are set. 5f17b244e6f5fd6d954cfe58679807fef3ea91e5 2023-09-22T12:21:28Z 2023-09-22T12:21:45Z declare a few functions as static Hiltjo Posthuma hiltjo@codemadness.org commit 5f17b244e6f5fd6d954cfe58679807fef3ea91e5 parent 4e69626163451a74e090c1bdbdcc3282236d6b33 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 22 Sep 2023 14:21:28 +0200 declare a few functions as static 4e69626163451a74e090c1bdbdcc3282236d6b33 2023-09-21T21:13:34Z 2023-09-21T21:13:34Z hide data in <svg> tag Hiltjo Posthuma hiltjo@codemadness.org commit 4e69626163451a74e090c1bdbdcc3282236d6b33 parent ae36c548e48ddea692a87557938441bb7cd54994 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 21 Sep 2023 23:13:34 +0200 hide data in <svg> tag Noticed on a zdnet.com page/article which has invalid SVG data inside it. This would show gibberish. Note that the parser still expects somewhat valid XML/HTML. In the future maybe this could be handled the same as <script> or <style>. ae36c548e48ddea692a87557938441bb7cd54994 2023-09-20T16:51:10Z 2023-09-20T16:51:10Z for the class and id attribute use the first value set Hiltjo Posthuma hiltjo@codemadness.org commit ae36c548e48ddea692a87557938441bb7cd54994 parent 4793272ce07153284318336426796cb7e3c93af4 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 20 Sep 2023 18:51:10 +0200 for the class and id attribute use the first value set + small code-style tweaks. 
4793272ce07153284318336426796cb7e3c93af4 2023-09-19T18:05:02Z 2023-09-19T18:05:02Z cleanup code a bit and add some comments Hiltjo Posthuma hiltjo@codemadness.org commit 4793272ce07153284318336426796cb7e3c93af4 parent 589d7d1ed851b5226a4782de8c9f00001f25c599 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 19 Sep 2023 20:05:02 +0200 cleanup code a bit and add some comments 589d7d1ed851b5226a4782de8c9f00001f25c599 2023-09-19T18:04:01Z 2023-09-19T18:04:01Z strip down tree.h remove unused code and unused macros Hiltjo Posthuma hiltjo@codemadness.org commit 589d7d1ed851b5226a4782de8c9f00001f25c599 parent c0d1a46e3d5e9d291cb731bec2f0511553d87b48 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 19 Sep 2023 20:04:01 +0200 strip down tree.h remove unused code and unused macros ... only RB_INSERT and RB_FIND are used. c0d1a46e3d5e9d291cb731bec2f0511553d87b48 2023-09-18T17:08:01Z 2023-09-18T17:08:01Z sync some small XML parser fixes Hiltjo Posthuma hiltjo@codemadness.org commit c0d1a46e3d5e9d291cb731bec2f0511553d87b48 parent 011b4885a533382d98f1aee6cb9619e280c99947 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 18 Sep 2023 19:08:01 +0200 sync some small XML parser fixes 011b4885a533382d98f1aee6cb9619e280c99947 2023-09-18T17:06:03Z 2023-09-18T17:06:03Z various improvements Hiltjo Posthuma hiltjo@codemadness.org commit 011b4885a533382d98f1aee6cb9619e280c99947 parent 89c9108dc27fe27e0f028f67508a1156ed242d2a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 18 Sep 2023 19:06:03 +0200 various improvements Improve link references: - Add RB tree to lookup link references: this uses a stripped-down version of OpenBSD tree.h - Add 2 separate linked-lists for the order of visible and hidden links. - Hidden links are now also deduplicated. Improve nested nodes and max depth: - Rework and increase the allowed depth of nodes. Allocate them on the heap. 
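The link-reference design in commit 011b488 (a tree for lookup plus a linked list for output order) can be sketched as follows. To stay self-contained, a plain unbalanced binary search tree stands in for the RB tree from OpenBSD tree.h, and a single list is used instead of the two (visible and hidden) the commit mentions. All names here (`struct linkref`, `struct linkrefs`, `linkref_get`) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical link reference: one node lives in both the tree and the list. */
struct linkref {
	char *url;
	size_t num;                    /* reference number, in order of first use */
	struct linkref *left, *right;  /* lookup tree, keyed on url */
	struct linkref *next;          /* insertion order, for printing */
};

struct linkrefs {
	struct linkref *root;          /* tree root */
	struct linkref *head, *tail;   /* ordered list */
	size_t count;
};

/*
 * Return the reference for url, creating it when it was not seen before.
 * Returns NULL when memory allocation fails.
 */
static struct linkref *
linkref_get(struct linkrefs *refs, const char *url)
{
	struct linkref **pp = &refs->root, *ref;
	size_t len;
	int r;

	/* walk the tree: an existing node means the URL is a duplicate */
	while (*pp) {
		r = strcmp(url, (*pp)->url);
		if (r == 0)
			return *pp;
		pp = (r < 0) ? &(*pp)->left : &(*pp)->right;
	}

	if (!(ref = calloc(1, sizeof(*ref))))
		return NULL;
	len = strlen(url) + 1;
	if (!(ref->url = malloc(len))) {
		free(ref);
		return NULL;
	}
	memcpy(ref->url, url, len);
	ref->num = ++refs->count;

	*pp = ref; /* link into the tree */
	if (refs->tail)
		refs->tail->next = ref; /* append to the ordered list */
	else
		refs->head = ref;
	refs->tail = ref;

	return ref;
}
```

The tree answers "was this URL seen before?" without scanning every reference, while the list preserves the order in which references are printed at the bottom of the output. A balanced tree such as the RB tree avoids the worst case of this sketch, where sorted input degrades the lookup to a linear scan.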
89c9108dc27fe27e0f028f67508a1156ed242d2a 2023-09-14T20:31:03Z 2023-09-14T20:31:03Z various improvements Hiltjo Posthuma hiltjo@codemadness.org commit 89c9108dc27fe27e0f028f67508a1156ed242d2a parent 62884d7b5684e791bb0cd6466f74367d6d71618d Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 14 Sep 2023 22:31:03 +0200 various improvements - add a unique tagid number per tag. This allows checking by tag number. - add support for the link reference <frame>, <iframe>, <embed src>. - improve checking for open optional <p> tags when a block element (such as <section>) is open. - check if the base URI using the -b option is absolute. 62884d7b5684e791bb0cd6466f74367d6d71618d 2023-09-13T18:41:31Z 2023-09-13T18:41:31Z whoops, check in some related changes from previous commits Hiltjo Posthuma hiltjo@codemadness.org commit 62884d7b5684e791bb0cd6466f74367d6d71618d parent 8ab6a487c7adcfe44d9d3c07c81a1c07d6dedd2a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:41:31 +0200 whoops, check in some related changes from previous commits 8ab6a487c7adcfe44d9d3c07c81a1c07d6dedd2a 2023-09-13T18:40:20Z 2023-09-13T18:40:20Z set DisplaySelectMulti for <select> with multiple attribute Hiltjo Posthuma hiltjo@codemadness.org commit 8ab6a487c7adcfe44d9d3c07c81a1c07d6dedd2a parent bf60f514843dfa9c2cc5d10fa9e7f3978da5cefb Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:40:20 +0200 set DisplaySelectMulti for <select> with multiple attribute This bitmask allows easy layout changes or logic.
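The bitmask idea from commit 8ab6a48 can be sketched like this. Only the names `DisplaySelectMulti`, `DisplayOption` and `DisplayInput` appear in this log; `DisplaySelect` and `DisplayButton` are inferred names, and every numeric value, both helper functions, and the choice to combine `DisplaySelect` with `DisplaySelectMulti` are assumptions made for illustration.

```c
/* Hypothetical flag values; the real values in webdump may differ. */
enum DisplayType {
	DisplayInput       = 1 << 0,
	DisplayButton      = 1 << 1,
	DisplaySelect      = 1 << 2,
	DisplaySelectMulti = 1 << 3,
	DisplayOption      = 1 << 4
};

/* Derive the display type for a <select>; has_multiple reflects the attribute. */
static int
select_displaytype(int has_multiple)
{
	int t = DisplaySelect;

	if (has_multiple)
		t |= DisplaySelectMulti; /* add the flag, keep DisplaySelect set */
	return t;
}

/* Return non-zero if every option of the select should be shown. */
static int
shows_all_options(int displaytype)
{
	return (displaytype & DisplaySelectMulti) != 0;
}
```

Because each type occupies its own bit, a layout decision becomes a single bitwise test on the node's display type instead of a comparison of tag names.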
bf60f514843dfa9c2cc5d10fa9e7f3978da5cefb 2023-09-13T18:39:42Z 2023-09-13T18:39:42Z if the <input type="submit|reset"> has no value, use a default one Hiltjo Posthuma hiltjo@codemadness.org commit bf60f514843dfa9c2cc5d10fa9e7f3978da5cefb parent 29ab23324d260dae10475498bfcabd06a4c9ba48 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:39:42 +0200 if the <input type="submit|reset"> has no value, use a default one 29ab23324d260dae10475498bfcabd06a4c9ba48 2023-09-13T18:38:58Z 2023-09-13T18:38:58Z fixed nested <dl> Hiltjo Posthuma hiltjo@codemadness.org commit 29ab23324d260dae10475498bfcabd06a4c9ba48 parent bc435d97f57537adbce2b1ddac9f0744a57279ae Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:38:58 +0200 fixed nested <dl> Noticed on the page, for example: http://man.openbsd.org/ftp bc435d97f57537adbce2b1ddac9f0744a57279ae 2023-09-13T18:38:27Z 2023-09-13T18:38:27Z with the -i and -a option highlight links in ANSI reverse Hiltjo Posthuma hiltjo@codemadness.org commit bc435d97f57537adbce2b1ddac9f0744a57279ae parent f4f2dc53e082fcbf627d567810c399c306009ea0 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:38:27 +0200 with the -i and -a option highlight links in ANSI reverse f4f2dc53e082fcbf627d567810c399c306009ea0 2023-09-13T18:37:28Z 2023-09-13T18:37:28Z initial support for <select> <option> Hiltjo Posthuma hiltjo@codemadness.org commit f4f2dc53e082fcbf627d567810c399c306009ea0 parent 20841145c9fd597e82c3da9dfa7c9d9caf606567 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:37:28 +0200 initial support for <select> <option> Show the first item, or all items if the multiple attribute is set. This ignores the actual selected item if <select><option selected>. That would require keeping state for all the option nodes, which it doesn't do. 
20841145c9fd597e82c3da9dfa7c9d9caf606567 2023-09-13T18:36:36Z 2023-09-13T18:36:36Z support <object data> attribute as a link reference Hiltjo Posthuma hiltjo@codemadness.org commit 20841145c9fd597e82c3da9dfa7c9d9caf606567 parent 7e848a418c711f6857328b5489172a34d44587c8 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:36:36 +0200 support <object data> attribute as a link reference 7e848a418c711f6857328b5489172a34d44587c8 2023-09-13T18:35:17Z 2023-09-13T18:35:17Z add support for more tags and change the markup and display block-type of some Hiltjo Posthuma hiltjo@codemadness.org commit 7e848a418c711f6857328b5489172a34d44587c8 parent 91d236dab89449465eb123d756a450a17eb4195a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Wed, 13 Sep 2023 20:35:17 +0200 add support for more tags and change the markup and display block-type of some ... also add initial types: Button, Select, SelectMulti and Option. 91d236dab89449465eb123d756a450a17eb4195a 2023-09-12T18:02:57Z 2023-09-12T18:02:57Z add option for unique link references (-d) Hiltjo Posthuma hiltjo@codemadness.org commit 91d236dab89449465eb123d756a450a17eb4195a parent 790402682bab675461f2a12879408dd5ad30c90f Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 12 Sep 2023 20:02:57 +0200 add option for unique link references (-d) ... also make link type "a" consistently "link" (also at the bottom references). ... 
also flush inline link only if needed 790402682bab675461f2a12879408dd5ad30c90f 2023-09-12T18:01:15Z 2023-09-12T18:01:15Z do not reset ncells or nbytesline if no newline was emitted Hiltjo Posthuma hiltjo@codemadness.org commit 790402682bab675461f2a12879408dd5ad30c90f parent 8d29c76012b91bbfaad1feca31b2af4cfbc99032 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 12 Sep 2023 20:01:15 +0200 do not reset ncells or nbytesline if no newline was emitted Test-case: <div> <div><b></b></div> <p>abc</p> </div> This caused an extra indentation due to the nbytesline check in hflush(). 8d29c76012b91bbfaad1feca31b2af4cfbc99032 2023-09-12T18:00:18Z 2023-09-12T18:00:18Z reduce excessive ANSI markup codes using -a Hiltjo Posthuma hiltjo@codemadness.org commit 8d29c76012b91bbfaad1feca31b2af4cfbc99032 parent dc7717417afb10040e4ea5d9472cc1c2658f1c8c Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 12 Sep 2023 20:00:18 +0200 reduce excessive ANSI markup codes using -a dc7717417afb10040e4ea5d9472cc1c2658f1c8c 2023-09-12T17:59:33Z 2023-09-12T17:59:33Z clamp indent count to increment to >= 0 just in case Hiltjo Posthuma hiltjo@codemadness.org commit dc7717417afb10040e4ea5d9472cc1c2658f1c8c parent 17c9e247a8df6c43cdec6bac73410dd35ece683c Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 12 Sep 2023 19:59:33 +0200 clamp indent count to increment to >= 0 just in case 17c9e247a8df6c43cdec6bac73410dd35ece683c 2023-09-12T17:58:39Z 2023-09-12T17:58:39Z add more block-like tags Hiltjo Posthuma hiltjo@codemadness.org commit 17c9e247a8df6c43cdec6bac73410dd35ece683c parent 83d4fe1dd6996c779d73406a237c9fd470cda9b6 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Tue, 12 Sep 2023 19:58:39 +0200 add more block-like tags 83d4fe1dd6996c779d73406a237c9fd470cda9b6 2023-09-11T17:10:29Z 2023-09-11T17:10:29Z fix typo Hiltjo Posthuma hiltjo@codemadness.org commit 83d4fe1dd6996c779d73406a237c9fd470cda9b6 parent 2e32abeb2743e5fce55bdfc1591bb66eedd63a45 Author: Hiltjo 
Posthuma <hiltjo@codemadness.org> Date: Mon, 11 Sep 2023 19:10:29 +0200 fix typo 2e32abeb2743e5fce55bdfc1591bb66eedd63a45 2023-09-11T17:03:25Z 2023-09-11T17:03:25Z optional tag handling improvements Hiltjo Posthuma hiltjo@codemadness.org commit 2e32abeb2743e5fce55bdfc1591bb66eedd63a45 parent 9f4c3a0a47eb2bb127db5a270dfa27ad368deb6a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 11 Sep 2023 19:03:25 +0200 optional tag handling improvements Much better handling for the optional tags: <p>, <dd>, <dt>, <dl>. An example page: https://www.openbsd.org/policy.html Some tags to add: - aside - menu - address - details Maybe: - search - hgroup 9f4c3a0a47eb2bb127db5a270dfa27ad368deb6a 2023-09-11T17:01:30Z 2023-09-11T17:01:30Z hputchar: fix flag to reset hadnewline and improve comments Hiltjo Posthuma hiltjo@codemadness.org commit 9f4c3a0a47eb2bb127db5a270dfa27ad368deb6a parent a75a21256774e9ccda82f79ad7989f44bfa81e6a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 11 Sep 2023 19:01:30 +0200 hputchar: fix flag to reset hadnewline and improve comments This flag is an extra safety, it can probably be removed. 
a75a21256774e9ccda82f79ad7989f44bfa81e6a 2023-09-11T17:00:30Z 2023-09-11T17:00:30Z micro optimization for counters on indent() Hiltjo Posthuma hiltjo@codemadness.org commit a75a21256774e9ccda82f79ad7989f44bfa81e6a parent 642198c5b9aabc74e13e4c2e7f044516e76cbf2b Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 11 Sep 2023 19:00:30 +0200 micro optimization for counters on indent() 642198c5b9aabc74e13e4c2e7f044516e76cbf2b 2023-09-11T16:59:41Z 2023-09-11T16:59:41Z handle <form> as block element with margin bottom 0 Hiltjo Posthuma hiltjo@codemadness.org commit 642198c5b9aabc74e13e4c2e7f044516e76cbf2b parent 4d0ab293b3f3ecd2e5a491c8e94678811f03e398 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 11 Sep 2023 18:59:41 +0200 handle <form> as block element with margin bottom 0 4d0ab293b3f3ecd2e5a491c8e94678811f03e398 2023-09-11T16:58:55Z 2023-09-11T16:58:55Z fix leading white-space in <pre> Hiltjo Posthuma hiltjo@codemadness.org commit 4d0ab293b3f3ecd2e5a491c8e94678811f03e398 parent 1d80db038e35ca3778e2df19d00b9be512df185f Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Mon, 11 Sep 2023 18:58:55 +0200 fix leading white-space in <pre> Skip first newline only. 1d80db038e35ca3778e2df19d00b9be512df185f 2023-09-08T20:34:46Z 2023-09-08T20:34:46Z just translate all entities Hiltjo Posthuma hiltjo@codemadness.org commit 1d80db038e35ca3778e2df19d00b9be512df185f parent f541d79e5f8a6f69df9494a1e96bee17b88fb82f Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 22:34:46 +0200 just translate all entities This increases the binary size a lot though... 
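The size trade-off in commit 1d80db0 is easier to see with a sketch of a table-driven entity lookup. This is an assumption about the general technique, not webdump's code: the table, `entitycmp` and `entity_lookup` are hypothetical, and only six entities are listed. The full HTML list has over two thousand named character references, and each row adds bytes to the binary.

```c
#include <stdlib.h>
#include <string.h>

struct entity {
	const char *name;
	const char *utf8;
};

/* Tiny hand-picked subset, sorted by name so that bsearch() works. */
static const struct entity entities[] = {
	{ "amp",  "&" },
	{ "copy", "\xc2\xa9" },
	{ "gt",   ">" },
	{ "lt",   "<" },
	{ "nbsp", "\xc2\xa0" },
	{ "quot", "\"" },
};

/* bsearch() comparison: key is the name searched for, elem a table row. */
static int
entitycmp(const void *key, const void *elem)
{
	return strcmp((const char *)key, ((const struct entity *)elem)->name);
}

/* Return the UTF-8 text for a named entity (without '&' and ';'), or NULL. */
static const char *
entity_lookup(const char *name)
{
	const struct entity *e;

	e = bsearch(name, entities, sizeof(entities) / sizeof(entities[0]),
	            sizeof(entities[0]), entitycmp);
	return e ? e->utf8 : NULL;
}
```

A hand-picked short list, as in the earlier commit 16f2855, keeps the table small but leaves rarer entities untranslated; the full table trades binary size for completeness.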
f541d79e5f8a6f69df9494a1e96bee17b88fb82f 2023-09-08T13:46:25Z 2023-09-08T13:46:25Z do not define a non-const in this way Hiltjo Posthuma hiltjo@codemadness.org commit f541d79e5f8a6f69df9494a1e96bee17b88fb82f parent 16f2855bb159b11fd58bc0ccdf9069c00b0bafaa Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 15:46:25 +0200 do not define a non-const in this way Add a macro for this case. This fixes a compiler error with tcc. 16f2855bb159b11fd58bc0ccdf9069c00b0bafaa 2023-09-08T13:43:06Z 2023-09-08T13:43:06Z handpick and add named entity to the shorter named entities list Hiltjo Posthuma hiltjo@codemadness.org commit 16f2855bb159b11fd58bc0ccdf9069c00b0bafaa parent 62bfd8b37f4b929e5f5a0f06c4cf90e7e07387a9 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 15:43:06 +0200 handpick and add named entity to the shorter named entities list 62bfd8b37f4b929e5f5a0f06c4cf90e7e07387a9 2023-09-08T13:42:10Z 2023-09-08T13:42:10Z update mailcap example and add it to the man page as well Hiltjo Posthuma hiltjo@codemadness.org commit 62bfd8b37f4b929e5f5a0f06c4cf90e7e07387a9 parent 0626e06482426d2fe329fd9df5b1f6fb3b946e2a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 15:42:10 +0200 update mailcap example and add it to the man page as well 0626e06482426d2fe329fd9df5b1f6fb3b946e2a 2023-09-08T13:33:44Z 2023-09-08T13:33:44Z small optimization: use the new DisplayType DisplayInput Hiltjo Posthuma hiltjo@codemadness.org commit 0626e06482426d2fe329fd9df5b1f6fb3b946e2a parent 630f76162a192327a3eecd4fc0adcb9b31cd4504 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 15:33:44 +0200 small optimization: use the new DisplayType DisplayInput This reduces a string comparison for each node. 
630f76162a192327a3eecd4fc0adcb9b31cd4504 2023-09-08T13:05:38Z 2023-09-08T13:05:38Z improve forms a bit Hiltjo Posthuma hiltjo@codemadness.org commit 630f76162a192327a3eecd4fc0adcb9b31cd4504 parent 0705fb754f00c7866b2cc8cee0739a88a584a2e1 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 15:05:38 +0200 improve forms a bit - Treat fieldset and legend as block elements. - Support more types, default or unsupported is "text". - Show the default selected value for radio and checkboxes. - Don't show hidden input types. - Add a DisplayType DisplayInput to check the tag faster. 0705fb754f00c7866b2cc8cee0739a88a584a2e1 2023-09-08T11:09:37Z 2023-09-08T11:09:37Z improve base URL and <base href /> handling Hiltjo Posthuma hiltjo@codemadness.org commit 0705fb754f00c7866b2cc8cee0739a88a584a2e1 parent 7d4723febabeb679e1980c12b5dfd3b656475b4f Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 13:09:37 +0200 improve base URL and <base href /> handling - Parse the base URI once and reuse the structure (optimization). - Once it is parsed it cannot be overwritten again. This matches the browser more closely. 7d4723febabeb679e1980c12b5dfd3b656475b4f 2023-09-08T09:29:59Z 2023-09-08T09:29:59Z flush after writing the URL inline Hiltjo Posthuma hiltjo@codemadness.org commit 7d4723febabeb679e1980c12b5dfd3b656475b4f parent 6365a78f6c050106e64b281d29d8ef550f131bf1 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 11:29:59 +0200 flush after writing the URL inline Otherwise the URL could be partially wrapped and incorrectly wrap to a new line. Remove a TODO, the flush is really needed there. 
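The base URL handling from commit 0705fb7 above (parse once, reuse the structure, never overwrite) can be sketched as follows, together with the absolute-URI check that commit 89c9108 mentions for the -b option. The structure fields, buffer sizes and the function `baseuri_set` are hypothetical, and the parsing is deliberately simplified to "proto://host".

```c
#include <string.h>

/* Hypothetical parsed base URI; field names and sizes are assumptions. */
struct baseuri {
	char proto[16];
	char host[256];
	int isset;
};

/*
 * Set the base URI once. Later calls (for example a second <base href>)
 * are ignored. Only absolute URIs of the form "proto://host..." are
 * accepted. Returns 1 if the base was set by this call, 0 otherwise.
 */
static int
baseuri_set(struct baseuri *b, const char *uri)
{
	const char *sep, *host;
	size_t protolen, hostlen;

	if (b->isset)
		return 0; /* already parsed: do not overwrite */
	if (!(sep = strstr(uri, "://")))
		return 0; /* not absolute */

	protolen = (size_t)(sep - uri);
	host = sep + 3;
	hostlen = strcspn(host, "/?#"); /* host ends at path, query or fragment */
	if (protolen == 0 || protolen >= sizeof(b->proto) ||
	    hostlen == 0 || hostlen >= sizeof(b->host))
		return 0;

	memcpy(b->proto, uri, protolen);
	b->proto[protolen] = '\0';
	memcpy(b->host, host, hostlen);
	b->host[hostlen] = '\0';
	b->isset = 1;
	return 1;
}
```

Parsing once means every relative link can be resolved against the stored fields without re-parsing the base string for each link.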
6365a78f6c050106e64b281d29d8ef550f131bf1 2023-09-08T09:25:13Z 2023-09-08T09:26:40Z improve link references, add option to show full URL inline Hiltjo Posthuma hiltjo@codemadness.org commit 6365a78f6c050106e64b281d29d8ef550f131bf1 parent 56ec7ea6c49d79cc3aaf301d2e6040e15d17785a Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 11:25:13 +0200 improve link references, add option to show full URL inline - fix URL references not being visible when only the -l option is specified (without -i). Now each option can be specified separately. - add -I option to show full URL option inline. 56ec7ea6c49d79cc3aaf301d2e6040e15d17785a 2023-09-08T09:07:57Z 2023-09-08T09:07:57Z selector syntax: document it and add feature to filter on a specific nth node Hiltjo Posthuma hiltjo@codemadness.org commit 56ec7ea6c49d79cc3aaf301d2e6040e15d17785a parent 94f0ad42fcfbe17b01d9e573a786435d1acc0232 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 11:07:57 +0200 selector syntax: document it and add feature to filter on a specific nth node 94f0ad42fcfbe17b01d9e573a786435d1acc0232 2023-09-08T08:31:33Z 2023-09-08T08:31:33Z add figure and figcaption: improve display of it Hiltjo Posthuma hiltjo@codemadness.org commit 94f0ad42fcfbe17b01d9e573a786435d1acc0232 parent a30cb9a818546a1e7b651fa33e2f0164079b7bc5 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Fri, 8 Sep 2023 10:31:33 +0200 add figure and figcaption: improve display of it Indent the figure and use a margin. Handle them as block elements. 
a30cb9a818546a1e7b651fa33e2f0164079b7bc5 2023-09-07T16:35:58Z 2023-09-07T16:36:23Z reduce some newlines before the link references Hiltjo Posthuma hiltjo@codemadness.org commit a30cb9a818546a1e7b651fa33e2f0164079b7bc5 parent 20fb149d1183220e2fee0cbfb6d3ac4f288bc67e Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 7 Sep 2023 18:35:58 +0200 reduce some newlines before the link references 20fb149d1183220e2fee0cbfb6d3ac4f288bc67e 2023-09-07T16:33:52Z 2023-09-07T16:33:52Z improve documentation Hiltjo Posthuma hiltjo@codemadness.org commit 20fb149d1183220e2fee0cbfb6d3ac4f288bc67e parent ce2a730d81823f9fc5f1d607296bb4529e9aeef0 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 7 Sep 2023 18:33:52 +0200 improve documentation ce2a730d81823f9fc5f1d607296bb4529e9aeef0 2023-09-07T16:25:16Z 2023-09-07T16:25:16Z initial repo Hiltjo Posthuma hiltjo@codemadness.org commit ce2a730d81823f9fc5f1d607296bb4529e9aeef0 Author: Hiltjo Posthuma <hiltjo@codemadness.org> Date: Thu, 7 Sep 2023 18:25:16 +0200 initial repo Reset development/chaotic hacking history.