[HN Gopher] Parse, Don't Validate and Type-Driven Design in Rust
___________________________________________________________________
Parse, Don't Validate and Type-Driven Design in Rust
Author : todsacerdoti
Score : 226 points
Date : 2026-02-21 19:40 UTC (20 hours ago)
HTML web link (www.harudagondi.space)
TEXT w3m dump (www.harudagondi.space)
| dang wrote:
| Recent and related: _Parse, Don't Validate (2019)_ -
| https://news.ycombinator.com/item?id=46960392 - Feb 2026 (172
| comments)
|
| also:
|
| _Parse, Don't Validate - Some C Safety Tips_ -
| https://news.ycombinator.com/item?id=44507405 - July 2025 (73
| comments)
|
| _Parse, Don't Validate (2019)_ -
| https://news.ycombinator.com/item?id=41031585 - July 2024 (102
| comments)
|
| _Parse, don't validate (2019)_ -
| https://news.ycombinator.com/item?id=35053118 - March 2023 (219
| comments)
|
| _Parse, Don't Validate (2019)_ -
| https://news.ycombinator.com/item?id=27639890 - June 2021 (270
| comments)
|
| _Parsix: Parse Don't Validate_ -
| https://news.ycombinator.com/item?id=27166162 - May 2021 (107
| comments)
|
| _Parse, Don't Validate_ -
| https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230
| comments)
|
| _Parse, Don't Validate_ -
| https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4
| comments)
|
| (p.s. these links are just to satisfy extra-curious readers - no
| criticism is intended! I add this because people sometimes assume
| otherwise)
| jaggederest wrote:
| You can go even further with this in other languages, with things
| like dependent typing - which can assert (among other interesting
| properties) that, for example, something like
| get_elem_at_index(array, index)
|
| cannot ever have index outside the bounds of the array, but
| checked statically at compilation time - and this is the key,
| without knowing a priori what the length of array is.
|
| "In Idris, a length-indexed vector is Vect n a (length n is in
| the type), and a valid index into length n is Fin n ('a natural
| number strictly less than n')."
|
| Similar tricks work with division that might result in inf/-inf,
| to prevent them from typechecking, and more subtle implications
| in e.g. higher order types and functions
| VorpalWay wrote:
| How does that work? If the length of the array is read from
| stdin for example, it would be impossible to know it at compile
| time. Presumably this is limited somehow?
| jaggederest wrote:
| If the length is read from outside the program it's an IO
| operation, not a static variable, but there are generally
| runtime checks in addition to the type system. Usually you
| solve this as in the article, with a constructor that checks
| it - so you'd have something like "Invalid option: length = 5
| must be within 0-4" when you tried to create the Fin n from
| the passed in value
| ratorx wrote:
| It doesn't have to be a compile time constant. An alternative
| is to prove that when you are calling the function the index
| is always less than the size of the vector (a dynamic
| constraint). You may be able to assert this by having a
| separate function on the vector that returns a constrained
| value (eg. n < v.len()).
| mdm12 wrote:
| One option is dependent pairs, where one value of the pair
| (in this example) would be the length of the array and the
| other value is a type which depends on that same value (such
| as Vector n T instead of List T).
|
| Type-Driven Development with Idris[1] is a great introduction
| for dependently typed languages and covers methods such as
| these if you're interested (and Edwin Brady is a great
| teacher).
|
| [1] https://www.manning.com/books/type-driven-development-
| with-i...
| marcosdumay wrote:
| If you check that the value is inside the range, and execute
| some different code if it's not, then congratulations, you
| now know at compile time that the number you will read from
| stdin is in the right range.
| dernett wrote:
| Not sure about Idris, but in Lean `Fin n` is a struct that
| contains a value `i` and a proof that `i < n`. You can read
| in the value `n` from stdin and then you can do `if h : i <
| n` to have a compile-time proof `h` that you can use to
| construct a `Fin n` instance.
| smj-edison wrote:
| I've heard this can be a bit of a pain in practice, is that
| true? I can imagine it could slow me down to construct a
| proof of an invariant every time I call a function (if I
| understand correctly).
| jaggederest wrote:
| I haven't worked significantly with Lean but I'm toying
| with my own dependently typed language. Generally you
| only have to construct a proof once, much like a type or
| function, and then you reuse it. Also, depending on the
| language there are rewriting semantics ("elaboration")
| that let you do mathematical transformations to make two
| statements equivalent and then reuse the standardized
| proof.
| rq1 wrote:
| Imagine you read a value from stdin and parse it as:
|
| Maybe Int
|
| So your program splits into two branches:
|
| 1. Nothing branch: you failed to obtain an Int.
|
| There is no integer to use as an index, so you can't even
| attempt a safe lookup into something like Vect n a.
|
| 2. Just i branch: you do have an Int called i.
|
| But an Int is not automatically a valid index for Vect n a,
| because vectors are indexed by Fin n (a proof carrying
| "bounded natural").
|
| So inside the Just i branch, you refine further:
|
| 3. Try to turn the runtime integer i into a value of type Fin
| n.
|
| There are two typical shapes of this step:
|
| * Checked conversion returning Maybe (Fin n)
|
| If the integer is in range, you get Just (fin : Fin n).
| Otherwise Nothing.
|
| * Checked conversion returning evidence (proof) that it's in
| range
|
| For example: produce k : Nat plus a proof like k < n (or LTE
| (S k) n), and then you can construct Fin n from that
| evidence.
|
| (But it's basically the same: you end up with a "Maybe
| LTE...".)
|
| Now if you also have a vector: xs : Vect n a
|
| ... the n in Fin n and the n in Vect n a are the same n
| (that's what "unifies" means here: the types line up), so you
| can do: index fin xs : a
|
| And crucially:
|
| there is no branch in which you can call index without having
| constructed the Fin n first, so out-of-bounds access is
| unrepresentable (it's not "checked later", it's "cannot be
| expressed").
|
| And within _that_ branch of the program, you have a proof of
| Fin n.
|
| Said differently: you don't get "compile-time knowledge of
| i"; you get a compile-time _guarantee_ that whatever value
| you ended up with satisfies a predicate.
|
| Concretely: you run a runtime check i < n. _ONCE_
|
| If it fails, you're in a branch where you do not have Fin n.
|
| If it succeeds, you construct fin : Fin n at runtime (it's a
| value, you can't get around that), but its type encodes the
| invariant "in bounds"/check was done somewhere in this
| branch.
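The check-once, carry-the-proof flow described above can be approximated in Rust. This is only a rough sketch under stated assumptions: the names `Fin`, `index` and `parse_index` are hypothetical, and `N` is a const generic, so the length must be known at compile time. That is weaker than Idris or Lean, where `n` can itself be a runtime value.

```rust
/// An index that has passed the runtime check `i < N` exactly once.
/// The field is private, so outside this module `new` is the only way in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fin<const N: usize>(usize);

impl<const N: usize> Fin<N> {
    /// The single runtime check: returns `None` when `i` is out of range.
    pub fn new(i: usize) -> Option<Self> {
        if i < N {
            Some(Fin(i))
        } else {
            None
        }
    }
}

/// Lookup with no failure case in its signature: the `N` in `Fin<N>` and the
/// `N` in `[T; N]` must be the same, so the index is known to be in bounds.
pub fn index<T, const N: usize>(xs: &[T; N], idx: Fin<N>) -> &T {
    // Cannot panic as long as `Fin` values are only built through `Fin::new`.
    &xs[idx.0]
}

/// Untrusted text -> integer -> bounded index, mirroring the
/// `Maybe Int` then `Maybe (Fin n)` refinement steps.
pub fn parse_index<const N: usize>(input: &str) -> Option<Fin<N>> {
    let i: usize = input.trim().parse().ok()?;
    Fin::new(i)
}
```

Callers branch once on the `Option` returned by `parse_index`; inside the `Some` branch, `index` needs no further check.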
| esafak wrote:
| I wish dependent types were more common :(
| satvikpendem wrote:
| Rust has some libraries that can do dependent typing too, based
| on macros. For example: https://youtube.com/watch?v=JtYyhXs4t6w
|
| Which refers to https://docs.rs/anodized/latest/anodized/
| hmry wrote:
| Very cool and practical, but specs aren't dependent typing.
| (I actually think specs are probably more useful than
| dependent types for most people)
|
| Dependent typing requires:
|
| - Generic types that can take runtime values as parameters,
| e.g. [u8; user_input]
|
| - Functions where the type of one parameter depends on the
| runtime value of another parameter, e.g. fn f(len: usize,
| data: [u8; len])
|
| - Structs/tuples where the type of one field depends on the
| runtime value of another field, e.g. struct Vec { len: usize,
| data: [u8; self.len] }
| cmovq wrote:
| Dividing a float by zero is usually perfectly valid. It has
| predictable outputs, and for some algorithms like collision
| detection this property is used to remove branches.
| woodruffw wrote:
| I think "has predictable outputs" is less valuable than "has
| expected outputs" for most workloads. Dividing by zero almost
| always reflects an unintended state, so proceeding with the
| operation means compounding the error state.
|
| (This isn't to say it's always wrong, but that having it be an
| error state by default seems very reasonable to me.)
| noitpmeder wrote:
| This reminds me a bit of a recent publication by Stroustrup about
| using concepts... in C++ to validate integer conversions
| automatically where necessary.
|
| https://www.stroustrup.com/Concept-based-GP.pdf
|
| {
|     Number<unsigned int> ii = 0;
|     Number<char> cc = '0';
|     ii = 2;     // OK
|     ii = -2;    // throws
|     cc = i;     // OK if i is within cc's range
|     cc = -17;   // OK if char is signed; otherwise throws
|     cc = 1234;  // throws if a char is 8 bits
| }
| strawhatguy wrote:
| The alternative is one type, with many functions that can operate
| on that type.
|
| Like how clojure basically uses maps everywhere and the whole
| standard library allows you to manipulate them in various ways.
|
| The main problem with the many-type approach is ending up with
| several identical or, worse, similar types, all incompatible.
| packetlost wrote:
| I don't really get why this is getting flagged, I've found this
| to be true but more of a trade off than a pure benefit. It also
| is sort of beside the point: you always need to parse inputs
| from external, usually untrusted, sources.
| doublesocket wrote:
| Agree with this. Mismatching types are generally an indicator
| of an underlying issue with the code, not the language
| itself. These are areas where AI can be helpful in flagging
| potential problems.
| fiddlerwoaroof wrote:
| Yeah, there's something of a tension between the Perlis quote
| "It is better to have 100 functions operate on one data
| structure than 10 functions on 10 data structures" and Parse,
| don't validate.
|
| The way I've thought about it, though, is that it's possible to
| design a program well either by encoding your important
| invariants in your types or in your functions (especially
| simple functions). In dynamically typed languages like Clojure,
| my experience is that there's a set of design practices that
| have a lot of the same effects as "Parse, Don't Validate"
| without statically enforced types. And, ultimately, it's a
| question of mindset which style you prefer.
| strawhatguy wrote:
| There's probably a case for both. Core logic might benefit
| from hard types deep in the bowels of an unchanging engine.
|
| The real world often changes though, and more often than not
| the code has to adapt, regardless of how elegantly our systems
| are designed.
| fiddlerwoaroof wrote:
| Coalton ( https://coalton-lang.github.io ) is the sort of
| thing I like: a Haskell-style language hosted inside a very
| dynamic one with good interop.
| strawhatguy wrote:
| Yes it's quite the blend!
| marcosdumay wrote:
| There are more than two alternatives, since functions can
| operate on more than one type.
| Rygian wrote:
| This sounds like the "stringly typed language" mockery of some
| languages. How is it actually different?
| Kinrany wrote:
| It's not an alternative.
|
| Start with a more dynamic type, do stuff that doesn't care
| about the shape, parse into a more precise type, do stuff that
| relies on the additional invariants, drop back into the more
| dynamic type again.
| slopinthebag wrote:
| I find a balance is important. You can do nominal typing in a
| structural type system with branding, and you can _kinda_ do
| structural typing in a nominal type system, but it's not as
| ergonomic. But you should probably end up doing a mix of both.
| sam0x17 wrote:
| btw the "quoth" crate makes it really really easy to implement
| scannerless parsing in Rust for arbitrary syntax; I use it on
| many of my projects
| IshKebab wrote:
| Interesting looking crate. You don't seem to have any examples
| at all though so I wouldn't say it makes it easy!
| hutao wrote:
| Note that the division-by-zero example used in this article is
| not the best example to demonstrate "Parse, Don't Validate,"
| because it relies on encapsulation. The principle of "Parse,
| Don't Validate" is best embodied by functions that transform
| untrusted data into some data type which is _correct by
| construction_.
|
| Alexis King, the author of the original "Parse, Don't Validate"
| article, also published a follow-up, "Names are not type safety"
| [0] clarifying that the "newtype" pattern (such as hiding a
| nonzero integer in a wrapper type) provides weaker guarantees than
| correctness by construction. Her original "Parse, Don't Validate"
| article also includes the following caveat:
|
| > Use abstract datatypes to make validators "look like" parsers.
| Sometimes, making an illegal state truly unrepresentable is just
| plain impractical given the tools Haskell provides, such as
| ensuring an integer is in a particular range. In that case, use
| an abstract newtype with a smart constructor to "fake" a parser
| from a validator.
|
| So, an abstract data type that protects its inner data is really
| a "validator" that tries to resemble a "parser" in cases where
| the type system itself cannot encode the invariant.
|
| The article's second example, the non-empty vec, is a better
| example, because it encodes within the type system the invariant
| that one element must exist. The crux of Alexis King's article is
| that programs should be structured so that functions return data
| types designed to be correct by construction, akin to a parser
| transforming less-structured data into more-structured data.
|
| [0] https://lexi-lambda.github.io/blog/2020/11/01/names-are-
| not-...
| Sharlin wrote:
| Even the newtype-based "parse, don't validate" is tremendously
| useful in practice, though. The big thing is that if you have a
| bare string, you don't know "where it's been". It doesn't carry
| with it information whether it's already been validated. Even
| if a newtype can't provide you full correctness by
| construction, it's vastly easier to be convinced of the
| validity of an encapsulated value compared to a naked one.
|
| For full-on parse-don't-validate, you essentially need a
| dependent type system. As a more light-weight partial solution,
| Rust has been prototyping pattern types, which are types
| constrained by patterns. For instance a range-restricted
| integer type could be simply spelled `i8 is 0..100`, or a
| nonempty slice as `[T] is [_, ..]`. Such a feature would
| certainly make correctness-by-construction easier in many
| cases.
|
| The non-empty list implemented as a (T, Vec<T>) is, btw, a nice
| example of the clash between practicality and theoretical
| purity. It can't offer you a slice (consecutive view) of its
| elements without storing the first element twice (which
| requires immutability and that T: Clone, unlike normal Vec<T>),
| which makes it fairly useless as a vector. It's okay if you
| consider it just an abstract list with a more restricted
| interface.
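To make the head-plus-tail representation concrete, here is a minimal sketch of a non-empty list. The type `NonEmpty` and its methods are hypothetical names chosen for illustration, not the article's code. It shows both the benefit (`first` needs no `Option`) and the limitation raised above (no contiguous slice is available, only iteration).

```rust
/// A list with at least one element, stored as a head plus a possibly empty tail.
#[derive(Debug, Clone, PartialEq)]
pub struct NonEmpty<T> {
    head: T,
    tail: Vec<T>,
}

impl<T> NonEmpty<T> {
    /// Parsing step: an empty `Vec` is rejected here, once.
    pub fn from_vec(mut v: Vec<T>) -> Option<Self> {
        if v.is_empty() {
            return None;
        }
        // `remove(0)` shifts the remaining elements (O(n)); fine for a sketch.
        let head = v.remove(0);
        Some(NonEmpty { head, tail: v })
    }

    /// No `Option` in the return type: a first element always exists.
    pub fn first(&self) -> &T {
        &self.head
    }

    /// Always at least 1.
    pub fn len(&self) -> usize {
        1 + self.tail.len()
    }

    /// Iteration is easy, but there is no `as_slice`: the head and the tail
    /// live in separate allocations, so they are not contiguous in memory.
    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        std::iter::once(&self.head).chain(self.tail.iter())
    }
}
```

An alternative design wraps a single `Vec<T>` behind a private field and checks non-emptiness in the constructor; that keeps slices available, at the cost of relying on encapsulation rather than on the shape of the data.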
| rhdunn wrote:
| It's also useful to wrap/tag IDs in structured types. That
| makes it easier to avoid errors when there are multiple type
| parameters such as in the Microsoft graph API.
| humkieufj wrote:
| Tragic how Rusties and HN apparently like to try to murder
| critics through svvvatting. Rusties and HN cannot be said
| to have neither souls nor conscience.
| spockz wrote:
| Coming from Haskell, I loved Agda 2 as a dependent type
| language. Is there any newer or more mainstream language that
| has added dependent types?
| doctor_phil wrote:
| Idris is slightly more mainstream I would say, but not
| wildly so. If you like the Haskell interop then I'd
| recommend staying with Agda.
|
| Scala 3 is much more mainstream and has path dependent
| types. I've only used Scala 2, and there the boilerplate
| for dependent types was frustrating imo, but I've heard it's
| better in 3.
| throw_await wrote:
| TypeScript has something that can be used as dependent
| types, but it wasn't intended as a language feature, so the
| syntax is not as ergonomic as Agda:
| https://www.hacklewayne.com/dependent-types-in-typescript-
| se...
| rapnie wrote:
| You can also search for "make invalid states
| impossible/unrepresentable" [0] to find more info on related
| practices. See "domain modeling made functional" [1] as a nice
| example.
|
| [0] https://geeklaunch.io/blog/make-invalid-states-
| unrepresentab...
|
| [1] https://www.youtube.com/watch?v=2JB1_e5wZmU
| hutao wrote:
| The phrasing that I hear more often is "make illegal states
| unrepresentable"; both the submitted article and Alexis
| King's original article use this phrase. At least according
| to https://fsharpforfunandprofit.com/posts/designing-with-
| types..., it originates from Yaron Minsky (a programmer at
| Jane Street who is prominent in the OCaml community).
|
| EDIT: Parent comment was edited to amend the
| "impossible/unrepresentable" wording
| rapnie wrote:
| Yes, sorry. I wanted to add some resources so the comment
| wouldn't be too vague, and found the better phrasing on the way.
| CodeBit26 wrote:
| I agree, 'correct by construction' is the ultimate goal here.
| Using types like NonZeroU32 is a great simple example, but the
| real power comes when you design your entire domain logic so
| that the compiler acts as your gatekeeper. It shifts the mental
| load from run-time debugging to design-time thinking.
| fph wrote:
| The article quickly mentions implementing addition:
|
| ```
|
| impl Add for NonZeroF32 { ... }
|
| impl Add<f32> for NonZeroF32 { ... }
|
| impl Add<NonZeroF32> for f32 { ... }
|
| ```
|
| What type would it return though?
| alfons_foobar wrote:
| Would have to be F32, no? I cannot think of any way to enforce
| "non-zero-ness" of the result without making it return an
| optional Result<NonZeroF32>, and at that point we are basically
| back to square one...
| MaulingMonkey wrote:
| > Would have to be F32, no?
|
| Generally yes. `NonZeroU32::saturating_add(self, other: u32)`
| is able to return `NonZeroU32` though! ( https://doc.rust-
| lang.org/std/num/type.NonZeroU32.html#metho... )
|
| > I cannot think of any way to enforce "non-zero-ness" of the
| result without making it return an optional
| Result<NonZeroF32>, and at that point we are basically back
| to square one...
|
| `NonZeroU32::checked_add(self, other: u32)` basically does
| this, although I'll note it returns an `Option` instead of a
| `Result` ( https://doc.rust-
| lang.org/std/num/type.NonZeroU32.html#metho... ), leaving you
| to `.ok_or(...)` or otherwise handle the edge case to your
| heart's content. Niche, but occasionally what you want.
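A short sketch of the two standard-library methods mentioned above. `NonZeroU32::saturating_add` and `NonZeroU32::checked_add` are real `std` methods; the wrapper functions `add_clamped` and `add_or_err` and the error message are hypothetical names used only for illustration.

```rust
use std::num::NonZeroU32;

/// Adding a plain `u32` to a non-zero `u32` and clamping at `u32::MAX`
/// can never produce zero, so the result type stays `NonZeroU32`.
pub fn add_clamped(a: NonZeroU32, b: u32) -> NonZeroU32 {
    a.saturating_add(b)
}

/// `checked_add` reports overflow as `None`; `ok_or` converts that
/// `Option` into a `Result` with an error value of the caller's choosing.
pub fn add_or_err(a: NonZeroU32, b: u32) -> Result<NonZeroU32, &'static str> {
    a.checked_add(b).ok_or("overflow in NonZeroU32 addition")
}
```

Neither method needs a zero check on the result: with unsigned operands the sum can only grow, so overflow is the only edge case.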
| alfons_foobar wrote:
| > `NonZeroU32::saturating_add(self, other: u32)` is able to
| return `NonZeroU32` though!
|
| I was confused at first how that could work, but then I
| realized that of course, with _unsigned_ integers this
| works fine because you cannot add a negative number...
| WJW wrote:
| I imagine it would be something like Option<NonZeroF32>, since
| -2.0 + 2.0 would violate the constraints at runtime. This gets
| us the Option handling problem back.
|
| I think the article would have been better with
| NonZeroPositiveF32 as the example type, since then addition
| would be safe.
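The suggestion of a strictly positive float type can be sketched as follows. `PositiveF32` is a hypothetical type, not one from the article or from `std`. The sketch assumes that `+inf` counts as a valid positive value, since the sum of two large positive floats can overflow to infinity.

```rust
use std::ops::Add;

/// An `f32` that is strictly greater than zero and never NaN.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct PositiveF32(f32);

impl PositiveF32 {
    /// Smart constructor: `x > 0.0` is false for NaN, zero, and negatives.
    pub fn new(x: f32) -> Option<Self> {
        if x > 0.0 {
            Some(PositiveF32(x))
        } else {
            None
        }
    }

    /// Unwrap back to a plain `f32`.
    pub fn get(self) -> f32 {
        self.0
    }
}

impl Add for PositiveF32 {
    type Output = PositiveF32;

    /// Closed under addition: the sum of two positive floats is at least as
    /// large as either operand, so it is positive (possibly `+inf`).
    fn add(self, rhs: Self) -> PositiveF32 {
        PositiveF32(self.0 + rhs.0)
    }
}
```

Subtraction would not be closed in the same way, so a `Sub` implementation would have to return `Option<PositiveF32>` or a plain `f32`.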
| the__alchemist wrote:
| The examples in question propagate complexity throughout related
| code. I think this is a case I see frequently in Rust of using
| too many abstractions, with their associated complexity.
|
| I would just (as a default; the situation varies)... validate
| prior to the division and handle as appropriate.
|
| The analogous situation I encounter frequently is indexing, e.g.
| checking if the index is out of bounds. Similar idea; check;
| print or display an error, then fail that computation without
| crashing the program. Usually an indication of some bug, which
| can be tracked down. Or, if it's a frequently indexed array, use
| a `get` method (canonical in Rust's core library) on whatever
| struct owns the array. It returns an Option.
|
| I do think either the article's approach or validating is better
| than runtime crashes! There are many patterns in programming.
| Using types in this way is something I see a lot of in OSS Rust,
| but it is not my cup of tea. Not heinous in this case, but I
| think not worth it.
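For comparison, the validate-at-the-use-site style described above might look like this. The slice method `get` is the real `std` API and returns an `Option`; the function `element_at` and its error message are hypothetical.

```rust
/// Look up an element and report failure as an error value instead of
/// panicking, so the caller can log it and abandon just this computation.
pub fn element_at(data: &[i32], index: usize) -> Result<i32, String> {
    data.get(index)
        .copied()
        .ok_or_else(|| format!("index {index} out of bounds for length {}", data.len()))
}
```

No new types are introduced; the cost is that every call site has to handle the `Result` again.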
|
| This is the key to this article's philosophy, near the bottom:
|
| > I love creating more types. Five million types for everyone
| please.
| unixpickle wrote:
| The `try_roots` example here is actually a _counterexample_ to
| the author's main argument. They explicitly ignore the "negative
| discriminant" case. What happens if we consider it?
|
| If we take their "parse" approach, then the types of the
| arguments a, b, and c have to somehow encode the constraint `b^2
| - 4*a*c >= 0`. This would be a total mess--I can't think of any
| clean way to do this in Rust. It makes _much_ more sense to
| simply return an Option and do the validation within the
| function.
|
| In general, I think validation is often the best way to solve the
| problem. The only counterexample, which the author fixates on in
| the post, is when one particular value is constrained in a clean,
| statically verifiable way. Most of the time, validation is used
| to check (possibly complex) interactions between multiple values,
| and "parsing" isn't at all convenient.
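The validate-inside-the-function alternative argued for above could be sketched like this. The name `real_roots` and its exact signature are assumptions for illustration; the article's own `try_roots` may be shaped differently.

```rust
/// Real roots of a*x^2 + b*x + c = 0, with all validation done internally.
/// Returns `None` when `a` is zero or the discriminant is negative or NaN.
pub fn real_roots(a: f64, b: f64, c: f64) -> Option<(f64, f64)> {
    if a == 0.0 {
        return None; // not a quadratic
    }
    // The constraint involves all three inputs at once, which is why it is
    // awkward to push into the type of any single argument.
    let disc = b * b - 4.0 * a * c;
    if disc.is_nan() || disc < 0.0 {
        return None; // no real roots
    }
    let sqrt_disc = disc.sqrt();
    Some(((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)))
}
```

The caller still has to handle one `Option`, but no wrapper types are needed for `a`, `b`, or `c`.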
| sbszllr wrote:
| I was thinking a similar thing when reading the article. Often,
| the validity of the input depends on the interaction between
| some of them.
|
| Sure, we can follow the advice of creating types that represent
| only valid states but then we end up with `fn(a: A, b: B, c: C)`
| transformed into `fn(abc: ValidABC)`.
| AxiomLab wrote:
| This exact philosophy is why I started treating UI design systems
| like compilers.
|
| Instead of validating visual outputs after the fact (like linting
| CSS or manual design reviews), you parse the constraints upfront.
| If a layout component is strictly typed to only accept discrete
| grid multiples, an arbitrary 13px margin becomes a compile error,
| not a subjective design debate. It forces determinism.
| Garlef wrote:
| curious: what kind of tooling would you use here?
| qsera wrote:
| >"Parse, Don't Validate,"
|
| Ah, that pretentious little blog post that was taken up by many
| as gospel, for some mysterious reason...
|
| This tells me that any idea, even a stupid (or, in this case,
| obvious) one, can go mainstream if you rhyme it right.
|
| That blog post is like a "modern art" painting by some famous
| author that totally looks like a child has flung some paint at a
| wall, but every person who looks at it tries so hard to find some
| meaning that everyone ends up finding "something".
|
| In fact the top comment right now expresses that their
| interpretation of the blog post is different from the author's...
|
| Amusing!
| throw310822 wrote:
| I only half understand this stuff, but all this encapsulation
| of values so that they are guaranteed to remain valid across
| manipulations... isn't it called Object Oriented Programming?
| mrkeen wrote:
| Was the original blog post wrong?
| MarcLore wrote:
| This pattern maps beautifully to API design too. Instead of
| validating a raw JSON request body and hoping you checked
| everything, you parse it into a well-typed struct upfront. Every
| downstream function then gets guaranteed-valid data without
| redundant checks. The Rust ecosystem makes this almost painless
| with serde + custom deserializers. I've seen codebases cut their
| error-handling surface area by 60% just by moving from validate-
| then-use to parse-at-the-boundary.
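A dependency-free sketch of parsing at the boundary. The comment above mentions serde; this version instead uses the standard `TryFrom` trait so that it compiles without external crates. `RawSignup`, `Signup`, `SignupError` and `greeting` are hypothetical names, and the email check is deliberately simplistic.

```rust
use std::convert::TryFrom;

/// Loosely typed input, as it might look right after JSON decoding.
pub struct RawSignup {
    pub email: String,
    pub age: i64,
}

/// Validated domain type: fields are private, so code outside this module
/// can only obtain one through `try_from`.
#[derive(Debug, PartialEq)]
pub struct Signup {
    email: String,
    age: u8,
}

#[derive(Debug, PartialEq)]
pub enum SignupError {
    InvalidEmail,
    AgeOutOfRange,
}

impl TryFrom<RawSignup> for Signup {
    type Error = SignupError;

    fn try_from(raw: RawSignup) -> Result<Self, Self::Error> {
        // Illustration only: real email validation is more involved.
        if !raw.email.contains('@') {
            return Err(SignupError::InvalidEmail);
        }
        // Narrowing i64 -> u8 rejects negative and implausibly large ages.
        let age = u8::try_from(raw.age).map_err(|_| SignupError::AgeOutOfRange)?;
        Ok(Signup { email: raw.email, age })
    }
}

/// Downstream code takes `Signup` and does not re-check anything.
pub fn greeting(s: &Signup) -> String {
    format!("Welcome, {} ({})", s.email, s.age)
}
```

All error handling is concentrated in the one `try_from` call at the edge of the system; functions like `greeting` have no failure path.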
| barnacs wrote:
| Every time you introduce a type for a "value invariant" you lose
| compatibility and force others to make cumbersome type
| conversions.
|
| To me, invalid values are best expressed with optional error
| returns along with the value that are part of the function
| signature. Types are best used to only encode information about
| the hierarchy of structures composed of primitive types. They
| help define and navigate the representation of composite things
| as opposed to just having dynamic nested maps of arbitrary
| strings.
| mrkeen wrote:
| > They help define and navigate the representation of composite
| things as opposed to just having dynamic nested maps of
| arbitrary strings.
|
| What would you say to someone who thinks that nested maps of
| arbitrary strings have maximum compatibility, and using types
| forces others to make cumbersome type conversions?
| barnacs wrote:
| If the fields of a structure or the string keys of an untyped
| map don't match then you don't have compatibility either way.
| The same is not true for restricting the set of valid values.
|
| edit: To put it differently: To possibly be compatible with
| the nested "Circle" map, you need to know it is supposed to
| have a "Radius" key that is supposed to be a float. Type
| definitions just make this explicit. But just because your
| "Radius" can't be 0, you shouldn't make it incompatible with
| everything else operating on floats in general.
| ubixar wrote:
| C# gets close to this with records + pattern matching. F#
| discriminated unions are even better for this, with algebraic data
| types built right in. A Result<'T,'Error> makes invalid states
| unrepresentable without any ceremony. C# records/matching works
| for now, but native DUs will make it even nicer.
___________________________________________________________________
(page generated 2026-02-22 16:00 UTC)