hngopher.com/1/live/items/47103931

       [HN Gopher] Parse, Don't Validate and Type-Driven Design in Rust
       ___________________________________________________________________
        
       Parse, Don't Validate and Type-Driven Design in Rust
        
       Author : todsacerdoti
       Score  : 226 points
       Date   : 2026-02-21 19:40 UTC (20 hours ago)
        
  HTML web link (www.harudagondi.space)
  TEXT w3m dump (www.harudagondi.space)
        
       | dang wrote:
       | Recent and related: _Parse, Don 't Validate (2019)_ -
       | https://news.ycombinator.com/item?id=46960392 - Feb 2026 (172
       | comments)
       | 
       | also:
       | 
       |  _Parse, Don't Validate - Some C Safety Tips_ -
       | https://news.ycombinator.com/item?id=44507405 - July 2025 (73
       | comments)
       | 
       |  _Parse, Don 't Validate (2019)_ -
       | https://news.ycombinator.com/item?id=41031585 - July 2024 (102
       | comments)
       | 
       |  _Parse, don 't validate (2019)_ -
       | https://news.ycombinator.com/item?id=35053118 - March 2023 (219
       | comments)
       | 
       |  _Parse, Don 't Validate (2019)_ -
       | https://news.ycombinator.com/item?id=27639890 - June 2021 (270
       | comments)
       | 
       |  _Parsix: Parse Don 't Validate_ -
       | https://news.ycombinator.com/item?id=27166162 - May 2021 (107
       | comments)
       | 
       |  _Parse, Don't Validate_ -
       | https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230
       | comments)
       | 
       |  _Parse, Don 't Validate_ -
       | https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4
       | comments)
       | 
       | (p.s. these links are just to satisfy extra-curious readers - no
       | criticism is intended! I add this because people sometimes assume
       | otherwise)
        
       | jaggederest wrote:
       | You can go even further with this in other languages, with things
       | like dependent typing - which can assert (among other interesting
       | properties) that, for example, something like
       | get_elem_at_index(array, index)
       | 
       | cannot ever have index outside the bounds of the array, but
       | checked statically at compilation time - and this is the key,
       | without knowing a priori what the length of array is.
       | 
       | "In Idris, a length-indexed vector is Vect n a (length n is in
       | the type), and a valid index into length n is Fin n ('a natural
       | number strictly less than n')."
       | 
       | Similar tricks work with division that might result in inf/-inf,
       | to prevent them from typechecking, and more subtle implications
       | in e.g. higher order types and functions
        
         | VorpalWay wrote:
         | How does that work? If the length of the array is read from
         | stdin for example, it would be impossible to know it at compile
         | time. Presumably this is limited somehow?
        
           | jaggederest wrote:
           | If the length is read from outside the program it's an IO
           | operation, not a static variable, but there are generally
           | runtime checks in addition to the type system. Usually you
           | solve this as in the article, with a constructor that checks
           | it - so you'd have something like "Invalid option: length = 5
           | must be within 0-4" when you tried to create the Fin n from
           | the passed in value
        
           | ratorx wrote:
           | It doesn't have to be a compile time constant. An alternative
           | is to prove that when you are calling the function the index
           | is always less than the size of the vector (a dynamic
           | constraint). You may be able to assert this by having a
           | separate function on the vector that returns a constrained
           | value (eg. n < v.len()).
        
           | mdm12 wrote:
           | One option is dependent pairs, where one value of the pair
           | (in this example) would be the length of the array and the
           | other value is a type which depends on that same value (such
           | as Vector n T instead of List T).
           | 
           | Type-Driven Development with Idris[1] is a great introduction
           | for dependently typed languages and covers methods such as
           | these if you're interested (and Edwin Brady is a great
           | teacher).
           | 
           | [1] https://www.manning.com/books/type-driven-development-
           | with-i...
        
           | marcosdumay wrote:
           | If you check that the value is inside the range, and execute
           | some different code if it's not, then congratulations, you
           | now know at compile time that the number you will read from
           | stdin is in the right range.
        
           | dernett wrote:
           | Not sure about Idris, but in Lean `Fin n` is a struct that
           | contains a value `i` and a proof that `i < n`. You can read
           | in the value `n` from stdin and then you can do `if h : i <
           | n` to have a compile-time proof `h` that you can use to
           | construct a `Fin n` instance.
        
             | smj-edison wrote:
             | I've heard this can be a bit of a pain in practice, is that
             | true? I can imagine it could slow me down to construct a
             | proof of an invariant every time I call a function (if I
             | understand correctly).
        
               | jaggederest wrote:
               | I haven't worked significantly with lean but I'm toying
               | with my own dependently typed language. generally you
               | only have to construct a proof once, much like a type or
               | function, and then you reuse it. Also, depending on the
               | language there are rewriting semantics ("elaboration")
               | that let you do mathematical transformations to make two
               | statements equivalent and then reuse the standardized
               | proof.
        
           | rq1 wrote:
           | Imagine you read a value from stdin and parse it as:
           | 
           | Maybe Int
           | 
           | So your program splits into two branches:
           | 
           | 1. Nothing branch: you failed to obtain an Int.
           | 
           | There is no integer to use as an index, so you can't even
           | attempt a safe lookup into something like Vect n a.
           | 
           | 2. Just i branch: you do have an Int called i.
           | 
           | But an Int is not automatically a valid index for Vect n a,
           | because vectors are indexed by Fin n (a proof carrying
           | "bounded natural").
           | 
           | So inside the Just i branch, you refine further:
           | 
           | 3. Try to turn the runtime integer i into a value of type Fin
           | n.
           | 
           | There are two typical shapes of this step:
           | 
           | * Checked conversion returning Maybe (Fin n)
           | 
           | If the integer is in range, you get Just (fin : Fin n).
           | Otherwise Nothing.
           | 
           | Checked conversion returning evidence (proof) that it's in
           | range
           | 
           | For example: produce k : Nat plus a proof like k < n (or LTE
           | (S k) n), and then you can construct Fin n from that
           | evidence.
           | 
           | (But it's the same basically, you end up with a "Maybe
           | LTE..."
           | 
           | Now if you also have a vector: xs : Vect n a
           | 
           | ... the n in Fin n and the n in Vect n a are the same n
           | (that's what "unifies" means here: the types line up), so you
           | can do: index fin xs : a
           | 
           | And crucially:
           | 
           | there is no branch in which you can call index without having
           | constructed the Fin n first, so out-of-bounds access is
           | unrepresentable (it's not "checked later", it's "cannot be
           | expressed").
           | 
           | And within _that_ branch of the program, you have a proof of
           | Fin n.
           | 
           | Said differently: you don't get "compile-time knowledge of
           | i"; you get a compile-time _guarantee_ that whatever value
           | you ended up with satisfies a predicate.
           | 
           | Concretely: you run a runtime check i < n. _ONCE_
           | 
           | If it fails, you're in a branch where you do not have Fin n.
           | 
           | If it succeeds, you construct fin : Fin n at runtime (it's a
           | value, you can't get around that), but its type encodes the
           | invariant "in bounds"/check was done somewhere in this
           | branch.
        
         | esafak wrote:
         | I wish dependent types were more common :(
        
         | satvikpendem wrote:
         | Rust has some libraries that can do dependent typing too, based
         | on macros. For example: https://youtube.com/watch?v=JtYyhXs4t6w
         | 
         | Which refers to https://docs.rs/anodized/latest/anodized/
        
           | hmry wrote:
           | Very cool and practical, but specs aren't dependent typing.
           | (I actually think specs are probably more useful than
           | dependent types for most people)
           | 
           | Dependent typing requires:
           | 
           | - Generic types that can take runtime values as parameters,
           | e.g. [u8; user_input]
           | 
           | - Functions where the type of one parameter depends on the
           | runtime value of another parameter, e.g. fn f(len: usize,
           | data: [u8; len])
           | 
           | - Structs/tuples where the type of one field depends on the
           | runtime value of another field, e.g. struct Vec { len: usize,
           | data: [u8; self.len] }
        
       | cmovq wrote:
       | Dividing a float by zero is usually perfectly valid. It has
       | predictable outputs, and for some algorithms like collision
       | detection this property is used to remove branches.
        
         | woodruffw wrote:
         | I think "has predictable outputs" is less valuable than "has
         | expected outputs" for most workloads. Dividing by zero almost
         | always reflects an unintended state, so proceeding with the
         | operation means compounding the error state.
         | 
         | (This isn't to say it's always wrong, but that having it be an
         | error state by default seems very reasonable to me.)
        
       | noitpmeder wrote:
       | This reminds me a bit of a recent publication by Stroustrup about
       | using concepts... in C++ to validate integer conversions
       | automatically where necessary.
       | 
       | https://www.stroustrup.com/Concept-based-GP.pdf                 {
       | Number<unsigned int> ii = 0;          Number<char> cc = '0';
       | ii = 2; // OK          ii = -2; // throws          cc = i; // OK
       | if i is within cc's range          cc = -17; // OK if char is
       | signed; otherwise throws          cc = 1234; // throws if a char
       | is 8 bits       }
        
       | strawhatguy wrote:
       | The alternative is one type, with many functions that can operate
       | on that type.
       | 
       | Like how clojure basically uses maps everywhere and the whole
       | standard library allows you to manipulate them in various ways.
       | 
       | The main problem with the many type approach is several same it
       | worse similar types, all incompatible.
        
         | packetlost wrote:
         | I don't really get why this is getting flagged, I've found this
         | to be true but more of a trade off than a pure benefit. It also
         | is sort of besides the point: you always need to parse inputs
         | from external, usually untrusted, sources.
        
           | doublesocket wrote:
           | Agree with this. Mismatching types are generally an indicator
           | of an underlying issue with the code, not the language
           | itself. These are areas AI can be helpful flagging potential
           | problems.
        
         | fiddlerwoaroof wrote:
         | Yeah, there's something of a tension between the Perlis quote
         | "It is better to have 100 functions operate on one data
         | structure than 10 functions on 10 data structures" and Parse,
         | don't validate.
         | 
         | The way I've thought about it, though, is that it's possible to
         | design a program well either by encoding your important
         | invariants in your types or in your functions (especially
         | simple functions). In dynamically typed languages like Clojure,
         | my experience is that there's a set of design practices that
         | have a lot of the same effects as "Parse, Don't Validate"
         | without statically enforced types. And, ultimately, it's a
         | question of mindset which style you prefer.
        
           | strawhatguy wrote:
           | There's probably a case for both. Core logic might benefit
           | from hard types deep in the bowels of unchanging engine.
           | 
           | The real world often changes though, and more often than not
           | the code has to adapt, regardless of how elegant are systems
           | are designed.
        
             | fiddlerwoaroof wrote:
             | Coalton ( https://coalton-lang.github.io ) is the sort of
             | thing I like: a Haskell-style language hosted inside a very
             | dynamic one with good interop.
        
               | strawhatguy wrote:
               | Yes it's quite the blend!
        
         | marcosdumay wrote:
         | There are more than two alternatives, since functions can
         | operate in more than one type.
        
         | Rygian wrote:
         | This sounds like the "stringly typed language" mockery of some
         | languages. How is it actually different?
        
         | Kinrany wrote:
         | It's not an alternative.
         | 
         | Start with a more dynamic type, do stuff that doesn't care
         | about the shape, parse into a more precise type, do stuff that
         | relies on the additional invariants, drop back into the more
         | dynamic type again.
        
         | slopinthebag wrote:
         | I find a balance is important. You can do nominal typing in a
         | structural type system with branding, and you can _kinda_ do
         | structural typing in a nominal type system, but it 's not as
         | ergonomic. But you should probably end up doing a mix of both.
        
       | sam0x17 wrote:
       | btw the "quoth" crate makes it really really easy to implement
       | scannerless parsing in rust for arbitrary syntax, use it on many
       | of my projects
        
         | IshKebab wrote:
         | Interesting looking crate. You don't seem to have any examples
         | at all though so I wouldn't say it makes it easy!
        
       | hutao wrote:
       | Note that the division-by-zero example used in this article is
       | not the best example to demonstrate "Parse, Don't Validate,"
       | because it relies on encapsulation. The principle of "Parse,
       | Don't Validate" is best embodied by functions that transform
       | untrusted data into some data type which is _correct by
       | construction_.
       | 
       | Alexis King, the author of the original "Parse, Don't Validate"
       | article, also published a follow-up, "Names are not type safety"
       | [0] clarifying that the "newtype" pattern (such as hiding a
       | nonzero integer in a wrapper type) provide weaker guarantees than
       | correctness by construction. Her original "Parse, Don't Validate"
       | article also includes the following caveat:
       | 
       | > Use abstract datatypes to make validators "look like" parsers.
       | Sometimes, making an illegal state truly unrepresentable is just
       | plain impractical given the tools Haskell provides, such as
       | ensuring an integer is in a particular range. In that case, use
       | an abstract newtype with a smart constructor to "fake" a parser
       | from a validator.
       | 
       | So, an abstract data type that protects its inner data is really
       | a "validator" that tries to resemble a "parser" in cases where
       | the type system itself cannot encode the invariant.
       | 
       | The article's second example, the non-empty vec, is a better
       | example, because it encodes within the type system the invariant
       | that one element must exist. The crux of Alexis King's article is
       | that programs should be structured so that functions return data
       | types designed to be correct by construction, akin to a parser
       | transforming less-structured data into more-structured data.
       | 
       | [0] https://lexi-lambda.github.io/blog/2020/11/01/names-are-
       | not-...
        
         | Sharlin wrote:
         | Even the newtype-based "parse, don't validate" is tremendously
         | useful in practice, though. The big thing is that if you have a
         | bare string, you don't know "where it's been". It doesn't carry
         | with it information whether it's already been validated. Even
         | if a newtype can't provide you full correctness by
         | construction, it's vastly easier to be convinced of the
         | validity of an encapsulated value compared to a naked one.
         | 
         | For full-on parse-don't-validate, you essentially need a
         | dependent type system. As a more light-weight partial solution,
         | Rust has been prototyping pattern types, which are types
         | constrained by patterns. For instance a range-restricted
         | integer type could be simply spelled `i8 is 0..100`, or a
         | nonempty slice as `[T] is [_, ..]`. Such a feature would
         | certainly make correctness-by-construction easier in many
         | cases.
         | 
         | The non-empty list implemented as a (T, Vec<T>) is, btw, a nice
         | example of the clash between practicality and theoretical
         | purity. It can't offer you a slice (consecutive view) of its
         | elements without storing the first element twice (which
         | requires immutability and that T: Clone, unlike normal Vec<T>),
         | which makes it fairly useless as a vector. It's okay if you
         | consider it just an abstract list with a more restricted
         | interface.
        
           | rhdunn wrote:
           | It's also useful to wrap/tag IDs in structured types. That
           | makes it easier to avoid errors when there are multiple type
           | parameters such as in the Microsoft graph API.
        
             | humkieufj wrote:
             | Tragic how Rusties and HN apparently like to try to murder
             | critics through svvvatting. Rusties and HN cannot be said
             | to have neither souls nor conscience.
        
           | spockz wrote:
           | Coming from Haskell, I loved Agda 2 as a dependent type
           | language. Is there any newer or more mainstream language that
           | has added dependent types?
        
             | doctor_phil wrote:
             | Idris is slightly more mainstream I would say, but not
             | wildy so. If you like the Haskell interop then I'd
             | recommend staying with Agda.
             | 
             | Scala 3 is much more mainstream and has path dependent
             | types. I've only used Scala 2, and there the boilerplate
             | for dependent types was frustrating imo, but I've heard its
             | better in 3.
        
             | throw_await wrote:
             | Typescript has something that can be used as dependent
             | types, but it wasn't intended as a language feature, so the
             | Syntax is not as ergonomic as Agda:
             | https://www.hacklewayne.com/dependent-types-in-typescript-
             | se...
        
         | rapnie wrote:
         | You can also search for "make invalid states
         | impossible/unrepresentable" [0] to find more info on related
         | practices. See "domain modeling made functional" [0] as a nice
         | example
         | 
         | [0] https://geeklaunch.io/blog/make-invalid-states-
         | unrepresentab...
         | 
         | [1] https://www.youtube.com/watch?v=2JB1_e5wZmU
        
           | hutao wrote:
           | The phrasing that I hear more often is "make illegal states
           | unrepresentable"; both the submitted article and Alexis
           | King's original article use this phrase. At least according
           | to https://fsharpforfunandprofit.com/posts/designing-with-
           | types..., it originates from Yaron Minsky (a programmer at
           | Jane Street who is prominent in the OCaml community).
           | 
           | EDIT: Parent comment was edited to amend the
           | "impossible/unrepresentable" wording
        
             | rapnie wrote:
             | Yes, sorry. I thought to add some resources to it, or it
             | would be a too vague comment and found the better phrasing.
        
         | CodeBit26 wrote:
         | I agree, 'correct by construction' is the ultimate goal here.
         | Using types like NonZeroU32 is a great simple example, but the
         | real power comes when you design your entire domain logic so
         | that the compiler acts as your gatekeeper. It shifts the mental
         | load from run-time debugging to design-time thinking.
        
       | fph wrote:
       | The article quickly mentions implementing addition:
       | 
       | ```
       | 
       | impl Add for NonZeroF32 { ... }
       | 
       | impl Add<f32> for NonZeroF32 { ... }
       | 
       | impl Add<NonZeroF32> for f32 { ... }
       | 
       | ```
       | 
       | What type would it return though?
        
         | alfons_foobar wrote:
         | Would have to be F32, no? I cannot think of any way to enforce
         | "non-zero-ness" of the result without making it return an
         | optional Result<NonZeroF32>, and at that point we are basically
         | back to square one...
        
           | MaulingMonkey wrote:
           | > Would have to be F32, no?
           | 
           | Generally yes. `NonZeroU32::saturating_add(self, other: u32)`
           | is able to return `NonZeroU32` though! ( https://doc.rust-
           | lang.org/std/num/type.NonZeroU32.html#metho... )
           | 
           | > I cannot think of any way to enforce "non-zero-ness" of the
           | result without making it return an optional
           | Result<NonZeroF32>, and at that point we are basically back
           | to square one...
           | 
           | `NonZeroU32::checked_add(self, other: u32)` basically does
           | this, although I'll note it returns an `Option` instead of a
           | `Result` ( https://doc.rust-
           | lang.org/std/num/type.NonZeroU32.html#metho... ), leaving you
           | to `.map_err(...)` or otherwise handle the edge case to your
           | heart's content. Niche, but occasionally what you want.
        
             | alfons_foobar wrote:
             | > `NonZeroU32::saturating_add(self, other: u32)` is able to
             | return `NonZeroU32` though!
             | 
             | I was confused at first how that could work, but then I
             | realized that of course, with _unsigned_ integers this
             | works fine because you cannot add a negative number...
        
         | WJW wrote:
         | I imagine it would be something like Option<NonZeroF32>, since
         | -2.0 + 2.0 would violate the constraints at runtime. This gets
         | us the Option handling problem back.
         | 
         | I think the article would have been better with
         | NonZeroPositiveF32 as the example type, since then addition
         | would be safe.
        
       | the__alchemist wrote:
       | The examples in question propagate complexity throughout related
       | code. I think this is a case I see frequently in Rust of using
       | too many abstractions, and its associated complexities.
       | 
       | I would just (as a default; the situation varies)... validate
       | prior to the division and handle as appropriate.
       | 
       | The analogous situation I encounter frequently is indexing, e.g.
       | checking if the index is out of bounds. Similar idea; check;
       | print or display an error, then fail that computation without
       | crashing the program. Usually an indication of some bug, which
       | can be tracked down. Or, if it's an array frequently indexed, use
       | a (Canonical for Rust's core) `get` method on the whatever struct
       | owns the array. It returns an Option.
       | 
       | I do think either the article's approach, or validating is better
       | than runtime crashes! There are many patterns in programming.
       | Using Types in this way is something I see a lot of in OSS rust,
       | but it is not my cup of tea. Not heinous in this case, but I
       | think not worth it.
       | 
       | This is the key to this article's philosophy, near the bottom:
       | 
       | > I love creating more types. Five million types for everyone
       | please.
        
       | unixpickle wrote:
       | The `try_roots` example here is actually a _counterexample_ to
       | the author's main argument. They explicitly ignore the "negative
       | discriminant" case. What happens if we consider it?
       | 
       | If we take their "parse" approach, then the types of the
       | arguments a, b, and c have to somehow encode the constraint `b^2
       | - 4 _a_ c >= 0`. This would be a total mess--I can't think of any
       | clean way to do this in Rust. It makes _much_ more sense to
       | simply return an Option and do the validation within the
       | function.
       | 
       | In general, I think validation is often the best way to solve the
       | problem. The only counterexample, which the author fixates on in
       | the post, is when one particular value is constrained in a clean,
       | statically verifiable way. Most of the time, validation is used
       | to check (possibly complex) interactions between multiple values,
       | and "parsing" isn't at all convenient.
        
         | sbszllr wrote:
         | I was thinking a similar thing when reading the article. Often,
         | the validity of the input depends on the interaction between
         | some of them.
         | 
         | Sure, we can follow the advice of creating types that represent
         | only valid states but then we end up with `fn(a: A, b: B, c: C)
         | transformed into `fn(abc: ValidABC)`
        
       | AxiomLab wrote:
       | This exact philosophy is why I started treating UI design systems
       | like compilers.
       | 
       | Instead of validating visual outputs after the fact (like linting
       | CSS or manual design reviews), you parse the constraints upfront.
       | If a layout component is strictly typed to only accept discrete
       | grid multiples, an arbitrary 13px margin becomes a compile error,
       | not a subjective design debate. It forces determinism.
        
         | Garlef wrote:
         | curious: what kind of tooling would you use here?
        
       | qsera wrote:
       | >"Parse, Don't Validate,"
       | 
       | Ah, that pretentious little blog post that was taken up by many
       | as gospel, for some mysterious reason...
       | 
       | This tells me that any idea, even if stupid (or obvious in this
       | case) ones, can go mainstream, if you rhyme it right.
       | 
       | That blog post is like a "modern art" painting by some famous
       | author that totally look a child has flingned some paint to a
       | wall, but every person who looks at it tries so hard to find some
       | meaning that every one end up finding "something"..
       | 
       | In fact the top comment right now express that their
       | interpretation of blog post is different from the author's...
       | 
       | Amusing!
        
         | throw310822 wrote:
         | I only half understand this stuff, but all this encapsulation
         | of values so that they are guaranteed to remain valid across
         | manipulations... isn't it called Object Oriented Programming?
        
         | mrkeen wrote:
         | Was the original blog post wrong?
        
       | MarcLore wrote:
       | This pattern maps beautifully to API design too. Instead of
       | validating a raw JSON request body and hoping you checked
       | everything, you parse it into a well-typed struct upfront. Every
       | downstream function then gets guaranteed-valid data without
       | redundant checks. The Rust ecosystem makes this almost painless
       | with serde + custom deserializers. I've seen codebases cut their
       | error-handling surface area by 60% just by moving from validate-
       | then-use to parse-at-the-boundary.
        
       | barnacs wrote:
       | Every time you introduce a type for a "value invariant" you lose
       | compatibility and force others to make cumbersome type
       | conversions.
       | 
       | To me, invalid values are best expressed with optional error
       | returns along with the value that are part of the function
       | signature. Types are best used to only encode information about
       | the hierarchy of structures composed of primitive types. They
       | help define and navigate the representation of composite things
       | as opposed to just having dynamic nested maps of arbitrary
       | strings.
        
         | mrkeen wrote:
         | > They help define and navigate the representation of composite
         | things as opposed to just having dynamic nested maps of
         | arbitrary strings.
         | 
         | What would you say to someone who thinks that nested maps of
         | arbitrary strings have maximum compatibility, and using types
         | forces others to make cumbersome type conversions?
        
           | barnacs wrote:
           | If the fields of a structure or the string keys of an untyped
           | map don't match then you don't have compatibility either way.
           | The same is not true for restricting the set of valid values.
           | 
           | edit: To put it differently: To possibly be compatible with
           | the nested "Circle" map, you need to know it is supposed to
           | have a "Radius" key that is supposed to be a float. Type
           | definitions just make this explicit. But just because your
           | "Radius" can't be 0, you shouldn't make it incompatible with
           | everything else operating on floats in general.
        
       | ubixar wrote:
       | C# gets close to this with records + pattern matching, F#
       | discriminated unions are even better for this with algebraic data
       | types built right in. A Result<'T,'Error> makes invalid states
       | unrepresentable without any ceremony. C# records/matching works
       | for now, but native DUs will make it even nicer.
        
       ___________________________________________________________________
       (page generated 2026-02-22 16:00 UTC)