codevoid.de/1/hn/comments_48492822.gph

        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
  HTML Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
  HTML   The only scalable delete in Postgres is DROP TABLE
       
       
        nicman23 wrote 12 min ago:
        funnily enough i had a broken mysql / mariadb install that drop would
        not work at all - pinned a core to 100% - but delete would be instant
        in 1M rows lmao
       
        elnerd wrote 6 hours 51 min ago:
        Iâm using Postgres for my DNS log service. I only store data for 90
        days. To delete data, my strategy is to use partitions based on month.
        At the start of every month, I drop one partition.
        
        I am not sure of this is the best way to do this, but it works for me.
       
        insumanth wrote 7 hours 3 min ago:
        This mostly applies to almost any database.
        
        Deletes are Writes and Writes are resource intensive. This is more
        prominent on databases like Elastic Search.
        
        When I was tasked to delete millions of (old) documents, it overloaded
        the cluster and almost brought it down. Only scalable solution was to
        split the index and drop the whole index.
       
        znnajdla wrote 8 hours 9 min ago:
        Why are databases so hard?
       
          DarkUranium wrote 2 hours 11 min ago:
          Because of the guarantees they provide.
          
          Storing some data in a binary file isn't very hard.
          Making it so that you can do quick lookups on it (indexes) and
          implementing joins in a sane way is kinda hard, but easy compared to
          the real problem:
          
          Ensuring ACID (in the case of "traditional" databases). I.e.
          Atomicity, Consistency, Isolation, Durability.
          
          You need to protect against data corruption in the event of failure,
          all while guaranteeing atomic operations at the user level
          concurrently (in most production DBs; SQLite is a notable exception
          in that it fully serializes writes --- but it can get away with this
          because it's an embedded database with the primary use case of a
          single-process writer).
          And the entire thing must land on a known good state at the end of
          all of those concurrent transactions.
          
          ... and they must do it all while maintaining good performance, and
          sometimes on a combination of filesystem + hardware that's actively
          hostile towards the idea of data integrity (e.g. hidden RAM caches in
          disk or RAID controllers that don't flush on power loss ---
          thankfully, those are getting rarer, or so I've come to understand).
       
        Syzygies wrote 10 hours 20 min ago:
        I can't believe I'm the first to Rick roll this thread with the most
        famous XKCD comic of all time:
        
  HTML  [1]: https://xkcd.com/327/
       
          TurdF3rguson wrote 9 hours 42 min ago:
          I can because this thread is not about sql injection.
       
            missingdays wrote 1 hour 38 min ago:
            But there's DROP TABLE in both the title and the comic, so it must
            be relevant
       
        buremba wrote 11 hours 39 min ago:
        Materialized tables are useful for time-series or sharding-like
        use-cases. You essentially offload the work to INSERT time to locate
        the data into relevant buckets/sub-tables that you can DROP later.
        
        We use materialized views for append-only timeseries data for [1] and
        the retention policies define how we DROP the tables so we don't
        DELETE/UPDATE any rows in the tables.
        
        The long term storage is Iceberg on S3 that's ingested via Postgresql
        replication, suitable for OLAP use-cases. Postgresql only stores the
        dimensional OLTP data the users can update and the hot append-only
        event data.
        
  HTML  [1]: https://lobu.ai
       
        ankitml wrote 12 hours 38 min ago:
        This is directionally correct approach. Deleting a large chunk of rows,
        in a large table does lead to unpredictable-bad behaviour for a while
        until those dead tuples are handled.
        
        I have used a very similar strategy by forking repack client [1] This
        works out of the box with rds/cloudsql etc.
        
  HTML  [1]: https://github.com/reorg/pg_repack/pull/326
       
        flossly wrote 14 hours 45 min ago:
        Years ago I heard Oracle's db had an edge on PG when it comes to
        DELETEs. I guess that's still the case...
       
          asgeirn wrote 2 hours 15 min ago:
          Also ALTER TABLE DROP PARTITION
       
          mike_hearn wrote 2 hours 35 min ago:
          Oracle has ALTER TABLE [name] MOVE ONLINE INCLUDING ROWS
          [predicates]. This is basically what you want - it keeps only the
          data matching the predicate, it's efficient and it works without
          disrupting the app. That might be the edge you heard about.
       
        saltcured wrote 17 hours 18 min ago:
        DROP DATABASE, for when a bunch of calls to DROP TABLE seems like too
        much overhead...
       
          mike_hock wrote 7 hours 58 min ago:
          pg_dropcluster for when a bunch of calls to DROP DATABASE seems like
          too much overhead.
       
        cowthulhu wrote 20 hours 16 min ago:
        IMO, needing to clear out an entire table is an indicator that
        something has gone wrong with your design.
        
        Don't get me wrong, I've definitely done it before, but it's in the
        same bucket as VACUUM for me... high impact interventions used to fix a
        mistake I made, not "course of business" actions.
       
          baq wrote 18 hours 24 min ago:
          You should run vacuum as often as possible in Postgres if youâre
          doing anything other than INSERTs, this is a design tradeoff in
          Postgres itself. Itâs the reason autovacuum exists and why tuning
          it is so important for performance; nothing wrong with doing a VACUUM
          ANALYZE after finishing a large DML batch job.
       
            TurdF3rguson wrote 9 hours 41 min ago:
            what does as often as possible mean? It will auto-vacuum when it's
            idle, right? Why not just let it do that?
       
              baq wrote 5 hours 5 min ago:
              What do you mean by âidleâ?
              
              If you mean your database is seeing extended periods of no
              updates to a table, you still want to vacuum, maybe even vacuum
              full if you know when traffic stopped and for how long to get the
              best possible read performance.
              
              If you have quiet periods in both reads and writes, enjoy the
              luxury of having an unused database to operate.
       
        inigyou wrote 20 hours 26 min ago:
        This website appears to be hard blocking my IP address (drops all
        packets( so I can't read the article.
       
        twotwotwo wrote 20 hours 41 min ago:
        Years ago work was bit by the analogous thing in MySQL. Like it usually
        does, it took a chain of events:
        
        - We wrote a cronjob to periodically DELETE for a retention policy on a
        table we'd just created. Most senior person on the team reviewed it,
        looked fine.
        
        - Unusually for us, we prioritize QA'ing a different feature for
        release, delaying the release of this cronjob and a bunch of other
        code.
        
        - During that delay, the new table accumulated many times more rows to
        be deleted than we'd expected during review.
        
        - Release happens. All looks well since the initial delete wasn't a
        migration and cronjob hasn't run yet; engineer doing the release signs
        off.
        
        - Cronjob runs, deleting hundreds of millions of rows quickly.
        
        - Next day, replica lag's high and MySQL's transaction history is very
        high. MySQL keeps transaction history around until purge threads have
        visited all the affected pages on disk.
        
        - The bad cluster conditions last for days and lead to other problems.
        
        This omits detail and the 'noise' of everything else we were watching.
        But it gets across how the code and MySQL behaved.
        
        Like most exciting events, it led to multiple changes to avoid a
        repeat. For retention policies, our new approach was one at the end of
        PlanetScale's post, to partition and drop old partitions. Transitioning
        to this from a huge unpartitioned table can be fun!
        
        If a table is append-only and already huge, with lots of rows already
        past the retention threshold, you might only copy the rows to be kept
        to the new partitioned table: copy what you can, lock tables, do a last
        catch-up copy and swap tables. (Roughly the blog's 'performant one-off
        delete'.)
        
        If the table's merely kind of big, gh-ost or such could allow you to
        ALTER without causing lag, locking, etc.
        
        At a scale below that, you could run a slow incremental 'nibble' delete
        while watching server stats, and a step below that, plain ALTERs or
        DELETEs are fine.
        
        Using partitioning has fun bits, too. In MySQL, the partition key has
        to be part of any unique index, understandably. But you have to keep
        that in mind when you're using INSERT..ON DUPLICATE KEY UPDATE and
        relying on uniqueness to trigger the update. Things stay interesting!
        
        I hear Vitess shops like PlanetScale usually don't run multi-terabyte
        myqsld instances in the first place: even when physical nodes are big,
        they run many smaller mysqlds on them. That wouldn't make all this
        fully irrelevant--huge deletes would still sometimes be worse than
        copy-swap-drop--but it does seem real handy for taming issues that tend
        to worsen with mysqld size, like replication lag. All to say, little
        bit jelly of their setup over there!
       
          DarkUranium wrote 1 hour 55 min ago:
          One thing I did a while ago was to make deletes part of inserts, to
          amortize the cost.
          
          The main reason was to avoid a separate cron job, but it had other
          benefits (and downsides) too. Something like:
          
              DELETE FROM foo WHERE expires_at < now() LIMIT 10;
              INSERT INTO foo .....;
          
          Note the LIMIT: it ensures the latency stays under control even if
          we've suddenly hit 50k rows that need deleting.
          
          And by deleting (up to) 10 each time we insert one, it ensures
          obsolete things will eventually get deleted.
          
          Obviously, this isn't viable when the deletions must happen due to
          strict policy (e.g. legal compliance) since it can't ensure when
          things get deleted, just that they eventually do.
          IIRC, in my case, I used it for a password reset tokens table.
          There's no legal issue there and keeping expired ones around is fine
          as long as the code also checks `expires_at` to make sure it's still
          valid (which would be a good practice regardless, for defense in
          depth).
       
        foreigner wrote 20 hours 42 min ago:
        Surprisingly to remove small numbers of rows in multiple tables (e.g.
        cleanup between automated tests), DELETE is often faster than TRUNCATE!
        It's counterintuitive but just measure it for yourself and see. Note
        you can DELETE from multiple tables in one statement using CTEs, and
        that way you don't need to think about foreign key dependency order.
       
        xenator wrote 20 hours 44 min ago:
        Deleting whole file is faster then deleting rows in file.
       
        smithcheck4 wrote 20 hours 49 min ago:
        I have been using TRUNCATE allm the way, and is fast as very good.
       
        saisrirampur wrote 21 hours 5 min ago:
        Partially true but too much of a blanket statement and clickbaity.
        
        DELETE with well-tuned autovacuum works pretty well. Have seen it work
        at TBs scale with no hicuups. If DELETEs are large, we used to
        recommend customers to follow that with a manual VACUUM for table to
        reclaim space right away for future rows.
        
        DROP TABLE can be risky, it requires an ACCESS EXCLUSIVE LOCK and if
        its waiting, it blocks all other statements following it, because of
        how lock queues work in Postgres. And you cannot keep doing high
        concurrent DROP TABLEs to run your large scale CRUD app.
       
          saisrirampur wrote 18 hours 35 min ago:
          Separately, this is one of the Postgres autovacuum tuning blog that
          I've ever read. Have seen it work across many customers and it is
          also simple to decipher and implement.
          
  HTML    [1]: https://www.citusdata.com/blog/2022/07/28/debugging-postgres...
       
          CharlieDigital wrote 19 hours 29 min ago:
          > And you cannot keep doing high concurrent DROP TABLEs to run your
          large scale CRUD app
          
          In this kind of use case/design, I would assume it would make use of
          partitions to make this more palatable in which case it would seem
          that you would bypass this issue of "high concurrent DROP TABLE". 
          Large scale CRUD app just points to recent-ish partitions.  Old
          partitions are either going to be low or on access and can be dropped
          easily or transformed/transferred into some long term/cold storage.
       
            saisrirampur wrote 18 hours 39 min ago:
            (Most) CRUD/OLTP applications don't delete data by timestamp; they
            delete by primary key. For those workloads, DROP TABLE (or dropping
            a partition) isn't a palatable option.
            
            The entire premise here is really about time-series workloads where
            most operations are based on a timestamp. In those apps partition
            dropping has been a standard and recommended retention strategy for
            years. That's precisely why extensions like pg_partman and
            TimescaleDB exist. Given that context, the title feels more
            clickbaity, and could easily mislead readers into thinking this
            applies broadly to OLTP systems when it doesn't;
       
              parthdesai wrote 13 min ago:
              > (Most) CRUD/OLTP applications don't delete data by timestamp;
              they delete by primary key. For those workloads, DROP TABLE (or
              dropping a partition) isn't a palatable option.
              
              UUID v7 to the rescue!
       
        aftbit wrote 21 hours 10 min ago:
        TIL about pg_partman
        
  HTML  [1]: https://github.com/pgpartman/pg_partman/blob/development/doc/p...
       
        levkk wrote 21 hours 13 min ago:
        CRUD apps don't usually delete in bulk. It's also hard to structure
        partitions in a way that doesn't wipe out months of important business
        data -- this is why teams often ETL their DB into Snowflake/ClickHouse
        and only then drop partitions. That makes it hard for the app to use
        that data again.
        
        The better approach is either to change your storage engine (e.g.
        OrioleDB is working on adding the undo log to Pg), or to shard which
        distributes the vacuum load across multiple servers.
       
          sgarland wrote 20 hours 50 min ago:
          They should be performing bulk deletions, due to GDPR: âData must
          be stored for the shortest time possible.â Unless you have some
          kind of rolling cron checking every few minutes (and even then,
          depending on your scale, that may well be considered bulk), that
          generally resolves to something like daily or weekly deletions.
       
        jandrewrogers wrote 21 hours 14 min ago:
        This generalizes to most (all?) databases. Selective deletion is
        largely an unsolved problem at scale in databases to the extent it
        doesn't release the deleted resources. Under the hood databases try to
        turn this into selective resource truncation, which scales much better,
        but in most cases that is not possible without careful design of your
        data model.
        
        Similarly, you often have to remind devs that in many databases an
        UPDATE is just an INSERT + DELETE, with all of the scaling issues
        implied.
       
          mike_hearn wrote 2 hours 53 min ago:
          RocksDB and other LSM tree backed databases do have cheap deletes and
          updates, although you could argue that's because they make everything
          else expensive. If you have spare cores it can be a good trade
          though.
       
        crazygringo wrote 21 hours 15 min ago:
        Only by a weird definition of "scalable". The first sentence says:
        
        > Counterintuitively, large DELETEs add work to the database.
        
        There is nothing counterintuitive about this. It takes just as much
        work to delete a row as it takes to insert a row. Why wouldn't it?
        Obviously you have to do almost all the same operations: write a log,
        write the deletion, update indices, replicate it, etc.
        
        And yes, it's a well-known trick for all major relational databases
        (not just Postgres) that if you want to delete 90% of rows from a large
        table, it's much faster to just copy the rows you want to keep to a new
        table, run DROP TABLE on the old table, and rename the new table to the
        old table. Since DROP TABLE is ~instantaneous, mainly involving
        table-level metadata.
        
        DELETE scales just fine, in the sense that if you are constantly
        inserting and deleting individual rows, DELETE scales the same as
        INSERT.
        
        Basic database functionality is designed around the assumption of lots
        of small transactions. Whenever you have to do something involving
        millions of rows at once, you generally need to investigate solutions
        that work well in "bulk". E.g. loading rows directly from a file rather
        than with SQL, adding indices only after the data has been loaded
        rather than before, disabling foreign key checks on large operations
        (if you know by design that the keys are valid)... and yes, taking
        advantage of DROP TABLE instead of DELETE. This doesn't mean small
        transactions aren't scalable, it just means bulk operations are
        qualitatively different and benefit from their own solutions. And
        DELETE is no different from INSERT in this regard.
       
          globular-toast wrote 5 hours 12 min ago:
          > It takes just as much work to delete a row as it takes to insert a
          row. Why wouldn't it?
          
          Because your data structure/algorithm supports fast deletes? File
          systems support deleting entire directories instantly. I'm not aware
          of any fundamental reason why DELETE in a SQL database must take as
          long as an insert?
       
          umpalumpaaa wrote 9 hours 29 min ago:
          Does this âDrop hackâ work well with foreign keys, triggers and
          constraints?
       
          coldtea wrote 16 hours 7 min ago:
          >There is nothing counterintuitive about this. It takes just as much
          work to delete a row as it takes to insert a row. Why wouldn't it?
          
          Because e.g. DROP also effectively deletes the rows but takes way way
          less work.
       
          Retr0id wrote 20 hours 29 min ago:
          > if you are constantly inserting and deleting individual rows,
          DELETE scales the same as INSERT
          
          Technically correct, but for a small table with a high churn rate,
          the performance characteristics may be surprising in that the "n" in
          most big-O calculations includes all inserts since the last VACUUM,
          not the actual number of resident rows.
       
          winterbloom wrote 20 hours 35 min ago:
          how does that solution work if the table that is dropped has foreign
          key constraints?
       
            crazygringo wrote 20 hours 31 min ago:
            That's why I said:
            
            > disabling foreign key checks on large operations
            
            And you have to know that, according to your business logic, what
            you're doing is safe.
       
          setr wrote 21 hours 3 min ago:
          > It takes just as much work to delete a row as it takes to insert a
          row. Why wouldn't it? Obviously you have to do almost all the same
          operations: write a log, write the deletion, update indices,
          replicate it, etc.
          
          It takes far more work to delete/update than insert. My recent
          example is updating ~2TB of text data was about 40x slower than
          inserting 12TB (was trying to correct some large text truncation that
          occurred during migration into PG, ended up being faster to redo).
       
            crazygringo wrote 20 hours 38 min ago:
            > It takes far more work to delete/update than insert.
            
            Updating rows of text data is going to be more work, because
            variable-length text can't be updated in-place. So in terms of
            allocating space, it's more like a delete plus an insert. That's
            not surprising. (An in-place update that doesn't touch indices is
            generally going to be faster than an insert, though.)
            
            I'm not aware of instances where a delete is "far more work" than
            an equivalent insert though. That's not the general case, and I'm
            having a hard time thinking of any situations where that would be
            true.
       
              fidotron wrote 19 hours 39 min ago:
              > I'm not aware of instances where a delete is "far more work"
              than an equivalent insert though. That's not the general case,
              and I'm having a hard time thinking of any situations where that
              would be true.
              
              Transactionally across related items with constraints it can
              explode fast.
              
              If you've ever used FoundationDB this rapidly becomes the
              defining PITA due to the transaction size limits.
              Adding/inserting/updates are all far more predictably bounded.
       
                tremon wrote 17 hours 47 min ago:
                But in that case, you need to compare like-for-like with the
                situation where you need to insert all the prerequisite rows
                too. You can't just compare a delete cascade with a single
                insert where all the foreign keys are already satisfied.
       
                  fidotron wrote 15 hours 42 min ago:
                  The whole problem with the delete cascade is you can't tell
                  how big it will be until you have entered the transaction to
                  do it. An insert you either know or it will fail and you can
                  retry.
       
                    tremon wrote 15 hours 1 min ago:
                    That's true, but now you have moved the goalposts. The
                    original claim upthread was "it takes just as much work to
                    delete a row as it takes to insert a row", not "it's hard
                    to predict the performance of a delete with cascade
                    effects". And the obvious rebuttal to that is that it's
                    equally hard to provide an upper bound for the runtime of a
                    single insert: an application cannot control the other
                    processes running on the database, some of which may delay,
                    interfere with or even invalidate your query and you must
                    account for that. A delete operation is just as much "it
                    might fail and you can retry" as an insert, or the database
                    you're working with isn't ACID-compliant.
       
                      fidotron wrote 14 hours 56 min ago:
                      > And the obvious rebuttal to that is that it's equally
                      hard to provide an upper bound for the runtime of a
                      single insert
                      
                      This is precisely where you're going wrong. The insert is
                      upper boundable in advance (you know the set of
                      everything you might potentially have to insert), the
                      delete isn't because you don't know what's in the db
                      until you look.
                      
                      I strongly recommend poking around with Foundation for
                      this, because it becomes clear that this problem is the
                      defining flaw with the way they tried to architect with
                      layers, to the point they have a queuing system for
                      processing large jobs of this type.
       
                        ElectricalUnion wrote 14 hours 3 min ago:
                        > The insert is upper boundable in advance
                        
                        A concurrent DML happening then suddenly your MERGE
                        INTO WHEN NOT MATCHED INSERT/INSERT INTO SELECT is way
                        larger that you thought? I thought "some workloads can
                        suddenly be way larger that I expected" was supposed to
                        be a thing in all non-trivial DML.
       
              pixl97 wrote 19 hours 59 min ago:
              Not directly database related, but when it comes to writing files
              on disks, deletes on SSDs can be rather expensive because of the
              delete block size vs a simple write.
       
                consp wrote 2 hours 27 min ago:
                Doesn't the block simply get marked as deleted and only wiped
                on a write?
       
              gopalv wrote 20 hours 2 min ago:
              > So in terms of allocating space, it's more like a delete plus
              an insert.
              
              Unless you're using zHeap, you have a narrow Heap-only-Tuples
              scenario where the indexes stay the same. TOAST kinda helps
              there, if the update is off the tuple area itself. The original
              zHeap docs have a lot of detail about why an UNDO log can help
              with long running transactions from the past etc.
              
              That is a postgresql specific thing though. Mysql indexes were
              created with the idea of different storage engines in mind, so
              Mysql doesn't suffer from the index update ovehead on
              update/delete the same way.
              
              Uber had a long blog post about switching to Mysql from Postgres
              for wide tables with hundreds of indexes. The HN entry is still
              there[1], but I can't read the original post now.
              
              As a side note, I've used postgres partitions to the same effect
              to drop old data periodically - detach and then drop the
              partition instead of a direct DELETE (similar tricks in HBase
              existed). [1] -
              
  HTML        [1]: https://news.ycombinator.com/item?id=10894047
       
                stuartd wrote 12 hours 20 min ago:
                > The HN entry is still there, but I can't read the original
                post now.
                
                The post on IA -
                
  HTML          [1]: https://web.archive.org/web/20160304013342/https://eng...
       
          echoangle wrote 21 hours 13 min ago:
          > And yes, it's a well-known trick for all major relational databases
          (not just Postgres) that if you want to delete 90% of rows from a
          large a table, it's much faster to just copy the rows you want to
          keep to a new table, run DROP TABLE on the old table, and rename the
          new table to the old table.
          
          Dumb question but why does the optimizer not just do that in secret
          then? Seems like something that should be detectable with some
          heuristics.
       
            layer8 wrote 18 hours 49 min ago:
            Because dropping a table effectively requires an exclusive lock on
            the table during that whole operation, affecting parallel
            transactions.
       
            crazygringo wrote 20 hours 33 min ago:
            Because what do you do if rows are being inserted in the original
            table, while the new table is having rows copied over? You'll get
            missing rows.
            
            You can only do the DROP TABLE trick if you know nothing else is
            writing to the table at the same time. You know if that's the case,
            according to your business logic. The database has no idea.
            
            The DROP TABLE trick effectively bypasses all the normal guarantees
            of data consistency. This is why it's so fast. But you have to know
            that that's a safe thing to do for your data.
       
              nostrademons wrote 19 hours 43 min ago:
              There are ways the DB could recover the data consistency
              guarantees, eg. keeping a log of operations that came in while
              the table was being copied over and then applying the relevant
              ones afterwards.
              
              The tricky part is that the latency characteristics of these
              operations would be pretty surprising and unintuitive.    It has
              the same problems as virtual memory and mark/sweep GC; sometimes,
              depending on system state and things that other threads are
              doing, an unrelated operation might block for very long time
              periods and give you huge user-visible pauses.    It's often better
              to force these expensive operations to be explicit so that the
              developer has to think through the latency & consistency
              implications and make the tradeoffs they want.
       
                convolvatron wrote 18 hours 32 min ago:
                except in this particular case, as long as you don't exhaust
                resources, mvcc kind of lets you get away with making a
                transactionally consistent copy under the covers without
                blocking anyone, since its a big-ol read.
       
            Retr0id wrote 20 hours 45 min ago:
            Maintaining the expected observable behaviours would get
            complicated if queries (especially other updates) against the same
            table are happening concurrently.
       
            sgarland wrote 20 hours 55 min ago:
            I assume partly because that would be extremely surprising
            behavior, and depending on the RDBMS and version, could introduce
            unexpected stalls. For example, MySQL < 8.0.23 scans the entire
            buffer pool to clear pages that were dropped, which can take a long
            time on large instances. There is / was a similar issue with its
            adaptive hash index, which AFAIK wasnât ever fixed, though
            AHIâs default being shifted to OFF in 8.4 is a workaround, in a
            very hacky way.
       
            mordae wrote 21 hours 6 min ago:
            It drops dependents.
       
        pstuart wrote 21 hours 16 min ago:
        Yep, partitions are the way to go there.
       
          awinter-py wrote 21 hours 8 min ago:
          ^ this
          
          been exploring clickhouse and while it is definitely not a general
          purpose DB, for time-series shaped data that can survive some insert
          latency, the automatic partition-based TTL is very nice and, at least
          so far, requires zero attention to maintain
          
          which I guess is solved by `pg_partman` at the bottom of the post
       
        sgarland wrote 21 hours 27 min ago:
        The same is true to a lesser extent in MySQL / MariaDB. It does better
        since it doesnât do oldest-to-newest tuple chains, but itâs still
        adding non-trivial work to the DB, much of which is effectively wasted
        if you donât care about the visibility of the deleted (or soon-to-be
        deleted) tuples to other transactions.
        
        I sincerely hope that Planetscaleâs efforts succeed long-term to
        shift devsâ understanding and acceptance of RDBMS operations. Their
        blog posts and docs are generally quite good. IME, devs (and even
        ops-ish teams) simply do not care about all of this, and will create
        elaborate bespoke tooling to run DELETEs in bulk, because they either
        donât understand the capabilities of the database, or donât want to
        deal with the [minor] increased complexity that a partitioned schema
        brings, and will happily pay the extra cost / latency for deletions.
       
          kro wrote 17 hours 45 min ago:
          mysql/maria also lets you turn off/down the isolation level for
          queries if you know the guarantees aren't needed, to speed things up.
          I think postgres does not have that option.
       
            alternbet25676 wrote 14 hours 52 min ago:
            Postgres does support changing the isolation level at query,
            session and config:
            
  HTML      [1]: https://www.postgresql.org/docs/current/sql-set-transactio...
       
       
   DIR <- back to front page