tshocco - tomb - the crypto undertaker
git clone git://parazyd.org/tomb.git
---
tshocco (16974B)
---
1 #!/bin/sh
2 # **shocco** is a quick-and-dirty, literate-programming-style documentation
3 # generator written for and in __POSIX shell__. It borrows liberally from
4 # [Docco][do], the original Q&D literate-programming-style doc generator.
5 #
6 # `shocco(1)` reads shell scripts and produces annotated source documentation
7 # in HTML format. Comments are formatted with Markdown and presented
8 # alongside syntax highlighted code so as to give an annotation effect. This
9 # page is the result of running `shocco` against [its own source file][sh].
10 #
11 # shocco is built with `make(1)` and installs under `/usr/local` by default:
12 #
13 # git clone git://github.com/rtomayko/shocco.git
14 # cd shocco
15 # make
16 # sudo make install
17 # # or just copy 'shocco' wherever you need it
18 #
19 # Once installed, the `shocco` program can be used to generate documentation
20 # for a shell script:
21 #
22 # shocco shocco.sh
23 #
24 # The generated HTML is written to `stdout`.
25 #
26 # [do]: http://jashkenas.github.com/docco/
27 # [sh]: https://github.com/rtomayko/shocco/blob/master/shocco.sh#commit
28
29 # Usage and Prerequisites
30 # -----------------------
31
32 # The most important line in any shell program.
33 set -e
34
35 # There are a lot of different ways to do usage messages in shell scripts.
36 # This is my favorite: you write the usage message in a comment --
37 # typically right after the shebang line -- *BUT*, use a special comment prefix
38 # like `#/` so that it's easy to pull these lines out.
39 #
40 # This also illustrates one of shocco's corner features. Only comment lines
41 # padded with a space are considered documentation. A `#` followed by any
42 # other character is considered code.
43 #
44 #/ Usage: shocco [-t <title>] [<source>]
45 #/ Create literate-programming-style documentation for shell scripts.
46 #/
47 #/ The shocco program reads a shell script from <source> and writes
48 #/ generated documentation in HTML format to stdout. When <source> is
49 #/ '-' or not specified, shocco reads from stdin.
50
51 # This is the second part of the usage message technique: `grep` yourself
52 # for the usage message comment prefix and then cut off the first few
53 # characters so that everything lines up.
54 expr -- "$*" : ".*--help" >/dev/null && {
55 grep '^#/' <"$0" | cut -c4-
56 exit 0
57 }
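# The usage-comment technique can be tried in isolation. The sketch below is
# only an illustration: it writes a throwaway script (the `usage-demo` name
# and its greeting are made up for this example) that carries its usage text
# in `#/` comments, then asks it for `--help`. It assumes `mktemp(1)` is
# available.

```shell
#!/bin/sh
# Hypothetical demo script; only the `#/` lines and the --help check
# mirror the technique described above.
demo=$(mktemp "${TMPDIR:-/tmp}/usage-demo.XXXXXX")
cat > "$demo" <<'EOF'
#!/bin/sh
#/ Usage: demo [<file>]
#/ Print a greeting.
expr -- "$*" : ".*--help" >/dev/null && {
    grep '^#/' <"$0" | cut -c4-
    exit 0
}
echo "hello"
EOF

# Ask the demo script for its usage message and capture it.
usage_text=$(sh "$demo" --help)
printf '%s\n' "$usage_text"
rm -f "$demo"
```

# `cut -c4-` drops the `#/ ` leader, so only the message text is printed.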
58
59 # A custom title may be specified with the `-t` option. We use the filename
60 # as the title if none is given.
61 test "$1" = '-t' && {
62 title="$2"
63 shift;shift
64 }
65
66 # Next argument should be the `<source>` file. Grab it, and use its basename
67 # as the title if none was given with the `-t` option.
68 file="${1:--}"
69 : ${title:=$(basename "$file")}
70
71 # These are replaced with the full paths to real utilities by the
72 # configure/make system.
73 MARKDOWN='/usr/bin/markdown_py'
74 PYGMENTIZE='/usr/bin/pygmentize'
75
76 # On GNU systems, csplit doesn't elide empty files by default:
77 CSPLITARGS=$( (csplit --version 2>/dev/null | grep -i gnu >/dev/null) && echo "--elide-empty-files" || true )
78
79 # We're going to need a `markdown` command to run comments through. This can
80 # be [Gruber's `Markdown.pl`][md] (included in the shocco distribution) or
81 # [Discount][ds]'s super fast `markdown(1)` in C. Try to figure out if either is
82 # available and then bail if we can't find anything.
83 #
84 # [md]: http://daringfireball.net/projects/markdown/
85 # [ds]: http://www.pell.portland.or.us/~orc/Code/discount/
86 command -v "$MARKDOWN" >/dev/null || {
87 if command -v Markdown.pl >/dev/null
88 then MARKDOWN='Markdown.pl'
89 elif test -f "$(dirname "$0")/Markdown.pl"
90 then MARKDOWN="perl $(dirname "$0")/Markdown.pl"
91 else echo "$(basename $0): markdown command not found." 1>&2
92 exit 1
93 fi
94 }
95
96 # Check that [Pygments][py] is installed for syntax highlighting.
97 #
98 # This is a fairly hefty prerequisite. Eventually, I'd like to fall back
99 # on a simple non-highlighting preformatter when Pygments isn't available. For
100 # now, just bail out if we can't find the `pygmentize` program.
101 #
102 # [py]: http://pygments.org/
103 command -v "$PYGMENTIZE" >/dev/null || {
104 echo "$(basename $0): pygmentize command not found." 1>&2
105 exit 1
106 }
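# Both prerequisite checks rely on `command -v`, which exits non-zero when a
# program cannot be found. A minimal sketch of the pattern follows; `have` is
# a hypothetical helper name and is not part of shocco.

```shell
#!/bin/sh
# `have` is a hypothetical wrapper around `command -v`.
have () {
    command -v "$1" >/dev/null 2>&1
}

# `sh` should exist on any system able to run this script.
if have sh
then sh_found=yes
else sh_found=no
fi

# A deliberately bogus name, assumed not to be installed anywhere.
if have no-such-tool-for-this-demo
then bogus_found=yes
else bogus_found=no
fi

echo "sh: $sh_found, bogus: $bogus_found"
```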
107
108 # Work and Cleanup
109 # ----------------
110
111 # Make sure we have a `TMPDIR` set. The `:=` parameter expansion assigns
112 # the value if `TMPDIR` is unset or null.
113 : ${TMPDIR:=/tmp}
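# A small sketch of how `:=` behaves in the three possible cases. `DEMO_DIR`
# is a made-up variable used only for illustration.

```shell
#!/bin/sh
# DEMO_DIR is hypothetical; it stands in for TMPDIR.
unset DEMO_DIR
: ${DEMO_DIR:=/tmp}          # unset: the default is assigned
first=$DEMO_DIR

DEMO_DIR=
: ${DEMO_DIR:=/var/tmp}      # null (empty): the default is assigned too
second=$DEMO_DIR

DEMO_DIR=/scratch
: ${DEMO_DIR:=/tmp}          # already set: the value is left alone
third=$DEMO_DIR

echo "$first $second $third"
```

# The leading `:` is the null command; it exists only so the expansion has
# somewhere to happen without running anything.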
114
115 # Create a temporary directory for doing work. Use `mktemp(1)` if
116 # available; but, since `mktemp(1)` is not POSIX specified, fall back on naive
117 # (and insecure) temp dir generation using the program's basename and pid.
118 : ${WORK:=$(
119 if command -v mktemp 1>/dev/null 2>&1
120 then
121 mktemp -d "$TMPDIR/$(basename $0).XXXXXXXXXX"
122 else
123 dir="$TMPDIR/$(basename $0).$$"
124 mkdir "$dir"
125 echo "$dir"
126 fi
127 )}
128
129 # We want to be absolutely sure we're not going to do something stupid like
130 # use `.` or `/` as a work dir. Better safe than sorry.
131 { test -z "$WORK" || test "$WORK" = '/' || test "$WORK" = '.'; } && {
132 echo "$(basename $0): could not create a temp work dir." 1>&2
133 exit 1
134 }
135
136 # We're about to create a ton of shit under our `$WORK` directory. Register
137 # an `EXIT` trap that cleans everything up. This guarantees we don't leave
138 # anything hanging around unless we're killed with a `SIGKILL`.
139 trap 'rm -rf "$WORK"' 0
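# The cleanup behaviour of an `EXIT` trap (signal number `0`) can be observed
# with a subshell. This is a sketch under the assumption that `mktemp -d` is
# available; the `trap-demo` directory name and the `scratch` file are made up.

```shell
#!/bin/sh
# Create a throwaway work directory for the demonstration.
work=$(mktemp -d "${TMPDIR:-/tmp}/trap-demo.XXXXXX")

# The subshell registers an EXIT trap, creates a file, and exits non-zero.
(
    trap 'rm -rf "$work"' 0
    touch "$work/scratch"
    exit 3
)
status=$?

# The trap should have removed the directory when the subshell exited.
if test -d "$work"
then cleaned=no
else cleaned=yes
fi
echo "exit status: $status, cleaned up: $cleaned"
```

# The trap runs on the way out without changing the exit status.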
140
141 # Preformatting
142 # -------------
143 #
144 # Start out by applying some light preformatting to the `<source>` file to
145 # make the code and doc formatting phases a bit easier. The result of this
146 # pipeline is written to a temp file under the `$WORK` directory so we can
147 # take a few passes over it.
148
149 # Get a pipeline going with the `<source>` data. We write a single blank
150 # line at the end of the file to make sure we have an equal number of code/comment
151 # pairs.
152
153 # Folding.el support: turn {{{ folds }}} into titles -jrml
154 (cat "$file" \
155 | sed -e 's/^# {{{/# #/' -e 's/^# }}}.*/# --------------/' \
156 | awk '
157 /function.*\(\) {$/ { print "# ### " $2; print $0; next }
158 /\(\) {$/ { print "# ### " $1; print $0; next }
159 {print $0}' \
160 && printf "\n\n# \n\n") |
161
162 # We want the shebang line and any code preceding the first comment to
163 # appear as the first code block. This inverts the normal flow of things.
164 # Usually, we have comment text followed by code; in this case, we have
165 # code followed by comment text.
166 #
167 # Read the first code and docs headers and flip them so the first docs block
168 # comes before the first code block.
169 (
170 lineno=0
171 codebuf=;codehead=
172 docsbuf=;docshead=
173 while read -r line
174 do
175 # Issue a warning if the first line of the script is not a shebang
176 # line. This can screw things up and wreck our attempt at
177 # flip-flopping the two headings.
178 lineno=$(( $lineno + 1 ))
179 test $lineno = 1 && ! expr "$line" : "#!.*" >/dev/null &&
180 echo "$(basename $0): $file:1 [warn] shebang! line missing." 1>&2
181
182 # Accumulate comment lines into `$docsbuf` and code lines into
183 # `$codebuf`. Only lines matching `/^#(?: |$)/` are considered doc
184 # lines.
185 if expr "$line" : '# ' >/dev/null || test "$line" = "#"
186 then docsbuf="$docsbuf$line
187 "
188 else codebuf="$codebuf$line
189 "
190 fi
191
192 # If we have stuff in both `$docsbuf` and `$codebuf`, it means
193 # we're at some kind of boundary. If `$codehead` isn't set, we're at
194 # the first comment/doc line, so store the buffer to `$codehead` and
195 # keep going. If `$codehead` *is* set, we've crossed into another code
196 # block and are ready to output both blocks and then straight pipe
197 # everything by `exec`'ing `cat`.
198 if test -n "$docsbuf" -a -n "$codebuf"
199 then
200 if test -n "$codehead"
201 then docshead="$docsbuf"
202 docsbuf=""
203 printf "%s" "$docshead"
204 printf "%s" "$codehead"
205 echo "$line"
206 exec cat
207 else codehead="$codebuf"
208 codebuf=
209 fi
210 fi
211 done
212
213 # We made it to the end of the file without a single comment line, or
214 # there was only a single comment block ending the file. Output our
215 # docsbuf or a fake comment and then the codebuf or codehead.
216 echo "${docsbuf:-#}"
217 echo "${codebuf:-"$codehead"}"
218 ) |
219
220 # Remove comment leader text from all comment lines. Then prefix all
221 # comment lines with "DOCS" and interpreted / code lines with "CODE".
222 # The stream text might look like this after moving through the `sed`
223 # filters:
224 #
225 # CODE #!/bin/sh
226 # CODE #/ Usage: shocco <file>
227 # DOCS Docco for and in POSIX shell.
228 # CODE
229 # CODE PATH="/bin:/usr/bin"
230 # CODE
231 # DOCS Start by numbering all lines in the input file...
232 # ...
233 #
234 # Once we pass through `sed`, save this off in our work directory so
235 # we can take a few passes over it.
236 sed -n '
237 s/^/:/
238 s/^:[ ]\{0,\}# /DOCS /p
239 s/^:[ ]\{0,\}#$/DOCS /p
240 s/^:/CODE /p
241 ' > "$WORK/raw"
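# To see the tagging step on its own, the same `sed` program can be fed a few
# sample lines. The input lines here are invented for illustration.

```shell
#!/bin/sh
# Tag five sample lines the same way the pipeline above does.
tagged=$(printf '%s\n' '#!/bin/sh' '# A doc line.' '#' 'set -e' '#/ usage' |
sed -n '
    s/^/:/
    s/^:[ ]\{0,\}# /DOCS /p
    s/^:[ ]\{0,\}#$/DOCS /p
    s/^:/CODE /p
')
printf '%s\n' "$tagged"
```

# The temporary `:` prefix ensures each line matches only one rule: once a
# line has been rewritten to start with `DOCS`, the final `CODE` rule can no
# longer match it. Note that `#!` and `#/` lines end up tagged as `CODE`.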
242
243 # Now that we've read and formatted our input file for further parsing,
244 # change into the work directory. The program will finish up in there.
245 cd "$WORK"
246
247 # First Pass: Comment Formatting
248 # ------------------------------
249
250 # Start a pipeline going on our preformatted input.
251 # Replace all CODE lines with entirely blank lines. We're not interested
252 # in code right now, other than knowing where comments end and code begins,
253 # and where code ends and comments begin.
254 sed 's/^CODE.*//' < raw |
255
256 # Now squeeze multiple blank lines into a single blank line.
257 #
258 # __TODO:__ `cat -s` is not POSIX and doesn't squeeze lines on BSD. Use
259 # the sed line squeezing code mentioned in the POSIX `cat(1)` manual page
260 # instead.
261 cat -s |
262
263 # At this point in the pipeline, our stream text looks something like this:
264 #
265 # DOCS Now that we've read and formatted ...
266 # DOCS change into the work directory. The rest ...
267 # DOCS in there.
268 #
269 # DOCS First Pass: Comment Formatting
270 # DOCS ------------------------------
271 #
272 # Blank lines represent code segments. We want to replace all blank lines
273 # with a dividing marker and remove the "DOCS" prefix from docs lines.
274 sed '
275 s/^$/##### DIVIDER/
276 s/^DOCS //' |
277
278 # The current stream text is suitable for input to `markdown(1)`. It takes
279 # our doc text with embedded `DIVIDER`s and outputs HTML.
280 $MARKDOWN |
281
282 # Now this is where shit starts to get a little crazy. We use `csplit(1)` to
283 # split the HTML into a bunch of individual files. The files are named
284 # as `docs0000`, `docs0001`, `docs0002`, ... Each file includes a single
285 # doc *section*. These files will sit here while we take a similar pass over
286 # the source code.
287 (
288 csplit -sk \
289 $CSPLITARGS \
290 -f docs \
291 -n 4 \
292 - '/<h5>DIVIDER<\/h5>/' '{9999}' \
293 2>/dev/null ||
294 true
295 )
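# The blank-out, squeeze, and divide steps of the first pass can be sketched
# on a tiny made-up stream. This assumes a `cat` that supports `-s`, which,
# as noted above, is not guaranteed by POSIX.

```shell
#!/bin/sh
# Two doc lines separated by a run of code lines (sample data).
divided=$(printf '%s\n' 'DOCS one' 'CODE a' 'CODE b' 'DOCS two' |
sed 's/^CODE.*//' |
cat -s |
sed '
    s/^$/##### DIVIDER/
    s/^DOCS //')
printf '%s\n' "$divided"
```

# However many code lines sit between two doc blocks, they collapse into a
# single `##### DIVIDER` line, which Markdown later renders as an `<h5>`.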
296
297
298 # Second Pass: Code Formatting
299 # ----------------------------
300 #
301 # This is exactly like the first pass but we're focusing on code instead of
302 # comments. We use the same basic technique to separate the two and isolate
303 # the code blocks.
304
305 # Get another pipeline going on our preformatted input file.
306 # Replace DOCS lines with blank lines.
307 sed 's/^DOCS.*//' < raw |
308
309 # Squeeze multiple blank lines into a single blank line.
310 cat -s |
311
312 # Replace blank lines with a `DIVIDER` marker and remove prefix
313 # from `CODE` lines.
314 sed '
315 s/^$/# DIVIDER/
316 s/^CODE //' |
317
318 # Now pass the code through `pygmentize` for syntax highlighting. We tell it
319 # that the input is `sh` and that we want HTML output.
320 $PYGMENTIZE -l sh -f html -O encoding=utf8 |
321
322 # Post filter the pygments output to remove partial `<pre>` blocks. We add
323 # these back in at each section when we build the output document.
324 sed '
325 s/<div class="highlight"><pre>//
326 s/^<\/pre><\/div>//' |
327
328 # Again with the `csplit(1)`. Each code section is written to a separate
329 # file, this time with a `code` prefix. There should be the same number
330 # of `codeXXXX` files as there are `docsXXXX` files.
331 (
332 DIVIDER='/<span class="c"># DIVIDER</span>/'
333 csplit -sk \
334 $CSPLITARGS \
335 -f code \
336 -n 4 - \
337 "$DIVIDER" '{9999}' \
338 2>/dev/null ||
339 true
340 )
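# A reduced sketch of the `csplit(1)` step, run on made-up input in a scratch
# directory. It assumes GNU-style behaviour: once the pattern stops matching,
# `csplit` reports an error, and `-k` keeps the pieces written so far. The
# `part` prefix is chosen only for this example.

```shell
#!/bin/sh
# Work in a throwaway directory so the pieces are easy to clean up.
demo=$(mktemp -d "${TMPDIR:-/tmp}/csplit-demo.XXXXXX")
cd "$demo"

# Split at every DIVIDER line; the expected "no match" failure on the last
# repetition is silenced, just as in the pipelines above.
printf '%s\n' 'intro' 'DIVIDER' 'first' 'DIVIDER' 'second' |
csplit -sk -f part -n 4 - '/DIVIDER/' '{9999}' 2>/dev/null || true

count=$(ls part[0-9]* | wc -l | tr -d ' ')
second_piece=$(cat part0001)
echo "$count pieces"
printf '%s\n' "$second_piece"

cd / && rm -rf "$demo"
```

# Each piece after the first begins with its own divider line, which is why
# the recombining step later rewrites the dividers instead of removing them.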
341
342 # At this point, we have separate files for each docs section and separate
343 # files for each code section.
344
345 # HTML Template
346 # -------------
347
348 # Create a function for applying the standard [Docco][do] HTML layout, using
349 # [jashkenas][ja]'s gorgeous CSS for styles. Wrapping the layout in a function
350 # lets us apply it elsewhere simply by piping in a body.
351 #
352 # [ja]: http://github.com/jashkenas/
353 # [do]: http://jashkenas.github.com/docco/
354 layout () {
355 cat <<HTML
356 <!DOCTYPE html>
357 <html>
358 <head>
359 <meta http-equiv='content-type' content='text/html;charset=utf-8'>
360 <title>$1</title>
361 <link rel=stylesheet href="docco.css">
362 <link rel=stylesheet href="style.css">
363 <link rel=stylesheet href="public/stylesheets/normalize.css">
364 </head>
365 <body>
366 <div id=container>
367 <div id=background></div>
368 <table cellspacing=10 cellpadding=10>
369 <thead>
370 <tr>
371 <th class=docs><h1>$1</h1></th>
372 <th class=code></th>
373 </tr>
374 </thead>
375 <tbody>
376 <tr><td class='docs'>$(cat)</td><td class='code'></td></tr>
377 </tbody>
378 </table>
379 </div>
380 </body>
381 </html>
382 HTML
383 }
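# The trick that makes `layout` pipe-friendly is the `$(cat)` inside the
# here-document: it is expanded by the shell using the function's own stdin.
# A stripped-down sketch, where `wrap` is a hypothetical stand-in for
# `layout`:

```shell
#!/bin/sh
# `wrap` is hypothetical: it takes a title as $1 and a body on stdin.
wrap () {
    cat <<HTML
<html>
<head><title>$1</title></head>
<body>$(cat)</body>
</html>
HTML
}

# Pipe a body in and capture the wrapped page.
page=$(echo '<p>Hello</p>' | wrap 'Demo')
printf '%s\n' "$page"
```

# The outer `cat` reads the here-document, while the inner `cat` drains the
# pipe, so the two never compete for the same input.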
384
385 # Recombining
386 # -----------
387
388 # Alright, we have separate files for each docs section and separate
389 # files for each code section. We've defined a function to wrap the
390 # results in the standard layout. All that's left to do now is put
391 # everything back together.
392
393 # Before starting the pipeline, decide the order in which to present the
394 # files. If `code0000` is empty, it should appear first so the remaining
395 # files are presented `docs0000`, `code0001`, `docs0001`, and so on. If
396 # `code0000` is not empty, `docs0000` should appear first so the files
397 # are presented `docs0000`, `code0000`, `docs0001`, `code0001` and so on.
398 #
399 # Ultimately, this means that if `code0000` is empty, the `-r` option
400 # should not be provided with the final `-k` option group to `sort(1)` in
401 # the pipeline below.
402 if stat -c"%s" /dev/null >/dev/null 2>/dev/null ; then
403 # GNU stat
404 [ "$(stat -c"%s" "code0000")" = 0 ] && sortopt="" || sortopt="r"
405 else
406 # BSD stat
407 [ "$(stat -f"%z" "code0000")" = 0 ] && sortopt="" || sortopt="r"
408 fi
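# The effect of `sortopt` on the final ordering can be checked without any
# real files. In this sketch, `order` is a hypothetical helper and the four
# names are sample data; `LC_ALL=C` is set only to keep the comparison
# predictable.

```shell
#!/bin/sh
# Sort four sample names using $1 in the role of `sortopt` ("" or "r").
order () {
    printf '%s\n' code0000 code0001 docs0000 docs0001 |
    LC_ALL=C sort -n -k"1.5" -k"1.1$1" |
    paste -s -d ' ' -
}

code_first=$(order "")
docs_first=$(order "r")
echo "sortopt='':  $code_first"
echo "sortopt='r': $docs_first"
```

# The first key groups names by number; the second key decides whether
# `code` or `docs` leads within each group.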
409
410 # Start the pipeline with a simple list of the split-out temp filenames. One file
411 # per line.
412 ls -1 docs[0-9]* code[0-9]* 2>/dev/null |
413
414 # Now sort the list of files by the *number* first and then by the type. The
415 # list will look something like this when `sort(1)` is done with it:
416 #
417 # docs0000
418 # code0000
419 # docs0001
420 # code0001
421 # docs0002
422 # code0002
423 # ...
424 #
425 sort -n -k"1.5" -k"1.1$sortopt" |
426
427 # And if we pass those files to `cat(1)` in that order, it concatenates them
428 # in exactly the way we need. `xargs(1)` reads from `stdin` and passes each
429 # line of input as a separate argument to the program given.
430 #
431 # We could also have written this as:
432 #
433 # cat $(ls -1 docs* code* | sort -n -k1.5 -k1.1r)
434 #
435 # I like to keep things to a simple flat pipeline when possible, hence the
436 # `xargs` approach.
437 xargs cat |
438
439
440 # Run a quick substitution on the embedded dividers to turn them into table
441 # rows and cells. This also wraps each code block in a `<div class=highlight>`
442 # so that the CSS kicks in properly.
443 {
444 DOCSDIVIDER='<h5>DIVIDER</h5>'
445 DOCSREPLACE='</pre></div></td></tr><tr><td class=docs>'
446 CODEDIVIDER='<span class="c"># DIVIDER</span>'
447 CODEREPLACE='</td><td class=code><div class=highlight><pre>'
448 sed "
449 s@${DOCSDIVIDER}@${DOCSREPLACE}@
450 s@${CODEDIVIDER}@${CODEREPLACE}@
451 "
452 } |
453
454 # Pipe our recombined HTML into the layout and let it write the result to
455 # `stdout`.
456 layout "$title"
457
458 # More
459 # ----
460 #
461 # **shocco** is the third tool in a growing family of quick-and-dirty,
462 # literate-programming-style documentation generators:
463 #
464 # * [Docco][do] - The original. Written in CoffeeScript and generates
465 # documentation for CoffeeScript, JavaScript, and Ruby.
466 # * [Rocco][ro] - A port of Docco to Ruby.
467 #
468 # If you like this sort of thing, you may also find Knuth's massive body of
469 # work on literate programming interesting:
470 #
471 # * [Knuth: Literate Programming][kn]
472 # * [Literate Programming on Wikipedia][wi]
473 #
474 # [ro]: http://rtomayko.github.com/rocco/
475 # [do]: http://jashkenas.github.com/docco/
476 # [kn]: http://www-cs-faculty.stanford.edu/~knuth/lp.html
477 # [wi]: http://en.wikipedia.org/wiki/Literate_programming
478
479 # Copyright (C) [Ryan Tomayko <tomayko.com/about>](http://tomayko.com/about)<br>
480 # This is Free Software distributed under the MIT license.
481 :