codevoid.de/1/hn/comments_47110473.gph

        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
  HTML Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
  HTML   What is a database transaction?
       
       
        rmunn wrote 14 hours 23 min ago:
        One way to think about transactions, as I wrote in an earlier comment,
        would be to think of them as being like snapshots in a copy-on-write
        filesystem like btrfs or zfs. But another way to think of them is being
        like Git branches.
        
        When you BEGIN a transaction, you're creating a branch in Git. Everyone
        else continues to work on the master branch, perhaps making their own
        branches (transactions) off of it while you're working. Every UPDATE
        command you run inside the transaction is a commit pushed to your
        branch. If you do a ROLLBACK, then you're deleting the branch unmerged,
        and its changes will be discarded without ever ending up in the master
        branch. But if you instead do a COMMIT, then that's a `git merge`
        command, and your changes will be merged into the master branch. If
        they merge cleanly, then all is well. If they do NOT merge cleanly,
        because someone else merged their own branch (committed their own
        transaction) that touched the same files that you touched (updated rows
        in the same table), then the DB will go through the file line by line
        (go through the table row by row) to try to get a clean merge. If it
        can successfully merge both changes without conflict, great. If it
        can't, then what happens depends on the transaction settings you chose.
        You can, when you start the transaction, tell the DB "If this doesn't
        merge cleanly, roll it back". Or you can say "If this doesn't merge
        cleanly, I don't care, just make sure it gets merged and I don't care
        if the conflict resolution ends up picking the "wrong" value, because
        for my use case there is no wrong value." This is like using "READ
        UNCOMMITTED" vs "SERIALIZABLE" transaction settings (isolation levels):
        you would use "READ UNCOMMITTED" if you don't care about merge
        conflicts in this particular table, and just want a quick merge. You
        would use "SERIALIZABLE" for tables with data that must, MUST, be
        correct, e.g. account balances. And there are two more levels in
        between for subtle differences in your use case's requirements.
        
        As with my previous comment, this is probably obvious to 98.5% of
        people here. But maybe it'll help someone get that "ah-ha!" moment and
        understand transactions better.
       
        shalabhc wrote 21 hours 26 min ago:
        For all interested in this topic, I highly recommend the book Designing
        Data Intensive Applications [1] .
        
        It goes into not only different isolation levels, but also some
        ambiguity in the traditional ACID definition.
        
        I believe a 2nd edition is imminent.
        
  HTML  [1]: https://www.goodreads.com/book/show/23463279-designing-data-in...
       
          vismit2000 wrote 11 hours 13 min ago:
          Second edition is available now:
          
  HTML    [1]: https://www.oreilly.com/library/view/designing-data-intensiv...
       
        interlocutor wrote 1 day ago:
        A lot of database tools these days prioritize instant sharing of
        updates over transactions and ACID properties. Example: Airtable. As
        soon as you update a field the update shows up on your coworkers screen
        who also has the same table open. The downside of this is that Airtable
        doesn't do transactions. And the downside of not doing transactions is
        potentially dangerous data inconsistencies. More about that here:
        
  HTML  [1]: https://visualdb.com/blog/concurrencycontrol/
       
          gtowey wrote 15 hours 39 min ago:
          Not so subtle product promo while taking a swipe at your competition.
       
        zzzeek wrote 1 day ago:
        I think this is a great post to have but I'm going to make a critical
        usability suggestion:
        
        * the videos should have "pause" and a "step at a time" control *
        
        Even at the "half speed", without a deep knowledge of the context, the
        videos move way too fast for me to read the syntax that's invoking and
        line it up with the data on the left side.    I (and im definitely not
        the only one) need to be able to sit on one step and stare at the whole
        thing without the latent anxiety of the state changing before I've had
        a chance to grok the whole thing.
        
        this has nothing to do with familiarity with the concepts (read my
        profile).   I literally need time to read all the words and connect
        them together mentally (ooh, just noticed this is pseudo-SQL syntax
        also, e.g. "select id=4", that probably added some load for me) without
        worrying they're going to change before watching things move.
        
        please add a step-at-a-time button!
       
          bddicken wrote 23 hours 47 min ago:
          I appreciate this feedback, and then you read through it with enough
          rigor to notice.
       
            johanyc wrote 1 hour 45 min ago:
            I second it. At the very least a pause button is needed.
       
        MHordecki wrote 1 day ago:
        Iâve found this article lacking. Like some other articles in this
        space, it introduces isolation levels through the lens of the phenomena
        described in the SQL standard, but I find that thereâs a different,
        more intuitive approach.
        
        I think itâs more tractable to define this problem space starting
        from the concept of (strict) serializability, which is really a
        generalization of the concept of thread safety. Every software engineer
        has an intuitive understanding of it. Lack of serializability can lead
        to execution-dependent behavior, which usually results in
        hard-to-diagnose bugs. Thus, all systems should strive towards
        serializability, and the database can be a tool in achieving it.
        
        Various non-serializable levels of database transaction isolation are
        relaxations of the serializability guarantee, where the database no
        longer enforces the guarantee and itâs up to the database user to
        ensure it through other means.
        
        The isolation phenomena are a useful tool for visualizing various
        corner cases of non-serializability, but they are not inherently tied
        to it. It's possible to achieve serializability while observing all of
        the SQL phenomena. For example, a Kubernetes cluster with
        carefully-written controllers can be serializable.
       
          lateforwork wrote 1 day ago:
          Most RDBMSs offer serializable isolation if you need it. Often you
          don't need it. The downside of using serializable isolation
          unnecessarily is reduced concurrency and throughput due to increased
          coordination between transactions.
       
          bddicken wrote 1 day ago:
          Author here. This is good feedback.
          
          The combination of transactions, isolation levels, and MVCC is such a
          huge undertaking to cover all at once, specially when comparing how
          it's done across multiple DBs which I attempted here. Always a
          balance between technical depth, accessibility to people with less
          experience, and not letting it turn into an hour-long read.
       
            jaxr wrote 1 day ago:
            I love the work planetscale does on keeping this type of content
            accurate yet accessible. Keep it up!
       
          Rapzid wrote 1 day ago:
           [1] More notation, more citations, more better.
          
  HTML    [1]: https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
       
            peterclary wrote 1 day ago:
            Looks like the author is geoblocking in protest of the UK Online
            Safety Act (and fair enough).
       
            bddicken wrote 1 day ago:
            Notation is useful. Citations are nice for further reading. But I
            don't agree more of this makes for a better article!
       
          mika6996 wrote 1 day ago:
          Then recommend a better explanation?
       
        rishabhaiover wrote 1 day ago:
        It's an absolute pleasure reading planetscale blogs. I'm curious about
        what tool is used to make these visualizations?
       
          bddicken wrote 1 day ago:
          Author here. Thank you! These visuals are built with js + gsap ( [1]
          )
          
  HTML    [1]: https://gsap.com
       
       
   DIR <- back to front page