hngopher.com/1/live/items/47648538

       [HN Gopher] I imported the full Linux kernel git history into pgit
       ___________________________________________________________________
        
       I imported the full Linux kernel git history into pgit
        
       Author : ImGajeed76
       Score  : 161 points
       Date   : 2026-04-05 12:08 UTC (4 days ago)
        
  HTML web link (oseifert.ch)
  TEXT w3m dump (oseifert.ch)
        
       | gurjeet wrote:
       | Technically correct title would be: s/Kernel into/Kernel Git
       | History into/
       | Pgit: I Imported the Linux Kernel Git History into PostgreSQL
        
         | worldsayshi wrote:
         | Wow that has a very different meaning from what I thought.
        
       | JodieBenitez wrote:
       | Read the title and immediately thought "what a weird way to solve
       | the performance loss with kernel 7..." The mind tricking itself
       | :)
        
       | tombert wrote:
       | If I recall correctly, the Fossil SCM uses SQLite under the
       | covers for a lot of its stuff.
       | 
       | Obviously that's not surprising considering its creator, but
       | hearing that was kind of the first time I had ever considered
       | that you could translate something like Git semantics to a
       | relational database.
       | 
       | I haven't played with Pgit...though I kind of think that I should
       | now.
        
         | gjvc wrote:
         | "If I recall correctly, the Fossil SCM uses SQLite under the
         | covers for a lot of its stuff."
         | 
         | a fossil repository file is a .sqlite file yes
        
           | tombert wrote:
            | Makes sense, I haven't used the software in quite a while.
        
           | ptdorf wrote:
           | So SQLite is versioned in SQLite.
        
             | yjftsjthsd-h wrote:
             | Yep:) To be fair, I expect git to be stored in git,
             | mercurial to be in mercurial, and... Actually now I wonder
             | how svn/cvs are developed/versioned.
        
               | deepsun wrote:
                | SVN in SVN for sure, it's a well made product. The market
                | just didn't like its architecture/UX, which dictates what
                | features are available.
               | 
               | CVS is not much different from copying files around, so
               | would not be surprised if they copied the files around to
               | mimic what CVS does. CVS revolutionized how we think of
                | code versioning, so its main contribution is to the
               | processes, not the architecture/features.
        
               | vidarh wrote:
               | The market did like it just fine until Git came around.
               | It just had a very brief moment in the sun....
        
               | tombert wrote:
               | My first software job, I was a junior person, and every
               | Friday, we would have The Merge, where we'd merge every
               | SVN branch into trunk. We always spoke of it like it was
               | this dreadful proper noun, like Voldemort or something.
               | 
               | The junior engineers were the ones doing this, and
               | generally my entire day would be spent fixing merge
               | conflicts. Usually they were easy to resolve, but
               | occasionally I'd hit one that would take me a very long
               | time (it didn't help that I was still pretty
               | inexperienced and consequently these things were just
               | sort of inherently harder for me). I just assumed that
                | this was the way that the world was until I found
                | `git-svn`.
               | 
               | `git-svn` made a task that often took an entire day take
               | something like 45 minutes, usually much less. It was like
               | a light shining down from heaven; I absolutely hated
               | doing The Merge, and this just made it mostly a _solved_
               | problem.
               | 
               | After that job, I sort of drew a soft line in the sand
               | that I will not work with SVN again, because at that
               | point I knew that merging could be less terrible. I
               | wasn't necessarily married to git in particular, but I
               | knew that whatever the hell it was that SVN was doing, I
               | didn't like it.
        
         | anitil wrote:
         | The sqlite project actually benefited from this dogfooding.
         | Interestingly recursive CTEs [0] were added to sqlite due to
         | wanting to trace commit history [1]
         | 
         | [0] https://sqlite.org/lang_with.html#recursive_query_examples
         | 
         | [1] https://fossil-scm.org/forum/forumpost/5631123d66d96486 -
         | My memory was roughly correct, the title of the discussion is
         | 'Is it possible to see the entire history of a renamed file?'
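
A minimal sketch of the kind of query a recursive CTE makes possible: walking
parent links to collect every ancestor of a commit. The `parents` table, its
columns, and the commit ids below are hypothetical and chosen only for
illustration; this is not Fossil's actual schema.

```python
import sqlite3


def build_parent_db():
    """Create an in-memory database with a tiny, made-up commit graph."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (child, parent) edge.
    conn.execute("CREATE TABLE parents (child_id TEXT, parent_id TEXT)")
    conn.executemany(
        "INSERT INTO parents VALUES (?, ?)",
        [("d", "b"), ("d", "c"), ("b", "a"), ("c", "a")],  # d merges b and c
    )
    return conn


def ancestors(conn, commit_id):
    """Return commit_id plus every commit reachable through parent links."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestry(id) AS (
            SELECT ?                      -- start from the given commit
            UNION                         -- UNION drops already-seen commits
            SELECT p.parent_id
            FROM parents AS p
            JOIN ancestry AS a ON p.child_id = a.id
        )
        SELECT id FROM ancestry
        """,
        (commit_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(ancestors(build_parent_db(), "d"))
```

Using `UNION` rather than `UNION ALL` matters here: commit `a` is reachable
through both `b` and `c`, and the deduplication keeps it from being visited
twice.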
        
           | anitil wrote:
            | Oh, and of course, the discussion board is itself hosted in a
           | sqlite file!
        
         | 20after4 wrote:
         | When you import a repository into Phabricator, it parses
         | everything into a MySQL database. That's how it manages to
         | support multiple version control systems seamlessly as well as
         | providing a more straightforward path to implementing all of
         | the web-based user interface around repo history.
        
         | adastra22 wrote:
         | Git was a (poor) imitation of the monotone DVCS, which stored
         | its data in sqlite.
        
           | xeubie wrote:
           | True, git poorly imitated monotone's performance problems.
        
       | niobe wrote:
       | Very cool
        
       | tonnydourado wrote:
       | That was an informative post but Jesus Christ on a bicycle, rein
       | in the LLM a bit. The whole thing was borderline painful to read,
       | with so many "GPTisms" I almost bailed out a couple of times. If
       | you're gonna use this stuff to write for you, at least *try* to
       | make it match a style of your own.
        
         | vidarh wrote:
         | To add a tip on _how_ to make it match your own style: You can
         | get decently far by pointing it to a page or so of your own
         | writing, and simply tell it to review the post section by
         | section and edit it to match the tone and style of the example.
          | It's not perfect by any means, but it will tend to edit out
         | the type of language you're not likely to use, so really to
         | make it sound less LLM-like, almost any writing sample from a
         | human author works.
        
           | mplanchard wrote:
           | You can also just write it.
           | 
           | I'd much rather read someone's imperfect writing than the
           | soulless regression-to-the-mean that LLMs produce. If you're
           | not a native speaker or don't have confidence in your
           | writing, I'd urge you to first ask for an edit by another
           | human, but if that's not an option, to be extremely firm in
           | your LLM prompting to just have it fix issues of grammar,
           | spelling, etc.
        
             | vidarh wrote:
             | Almost nobody recognises well written AI texts. I've seen
             | plenty of AI written text pass right by people who are sure
             | they can always tell. It takes very little, because the
             | vast majority of AI writing you spot involves people doing
             | nothing to make it clean up the style.
        
             | erichanson wrote:
             | "soulless regression-to-the-mean", damn that's quote of the
             | day.
        
         | darkwater wrote:
         | 100% agreed. Maybe this inner reaction will disappear over the
         | years of being exposed to the GPT writing style, or maybe LLMs
          | will be "smarter" in this regard, and be able to use
         | different styles even by default. But I had the same exact
         | feelings as you reading this piece.
        
           | vidarh wrote:
           | It's really simple to fix by asking an LLM to apply a style
            | from a sample, so my guess is a lot of products will build in
            | style selection, and some providers will add more aggressive
           | rules in their system prompts over time.
        
             | mplanchard wrote:
             | It's not even just about the style. It's a matter of
             | respect for your readers. If you can't be bothered to take
             | the time to write it, why on earth should I care enough to
             | take the time to read it?
        
               | vidarh wrote:
               | If the content has value, I could not care less.
        
             | jillesvangurp wrote:
             | I would recommend using guard rails to guide tone,
             | phrasing, etc. This helps prevent whole categories of bad
             | phrasing. It also helps if you provide good inputs for what
             | you actually want to write about and don't rely too much on
             | it just filling empty space with word soup. And iterate on
             | both the guard rails and the text.
        
               | multjoy wrote:
               | Or, you know, just write it yourself.
        
             | darkwater wrote:
              | Yes, but you need a style first :) But in the case of TFA's
              | author, he actually had a few other blog posts that don't
              | feel LLM-generated to use as an example, I agree.
        
               | vidarh wrote:
               | But for plenty of applications it doesn't need to be your
               | _personal_ style. It only needs to be your personal style
               | if you want to present it as your own writing. Otherwise
                | it just matters that it's well written. A catalogue of
               | styles would work well for lots of uses.
        
               | 47282847 wrote:
                | "Rewrite in a style appealing to Hacker News users
                | critical of AI slop".
        
         | consp wrote:
         | I stopped at "pgit handled it.". The tldr was appreciated
          | though as now I don't have to sift through the LLM bloat.
        
         | mplanchard wrote:
         | I did bail out because of this, despite being pretty interested
         | in the content. I love reading, but I cannot stand LLM
         | "writing" output, and few things are important enough for me to
         | force myself through the misery of ingesting ChatGPT "prose." I
         | only made it to the second section of this one.
        
       | spit2wind wrote:
       | > only a handful of VCS besides git have ever managed a full
       | import of the kernel's history. Fossil (SQLite-based, by the
       | SQLite team) never did.
       | 
       | I find this hard to believe. I searched the Fossil forums and
       | found no mention of such an attempt (and failure). Unfortunately,
       | I don't have a computer handy to verify or disprove. Is there any
       | evidence for this claim?
        
         | gritzko wrote:
          | I was giving students an assignment to import a git repo into
         | fossil and the other way around. git was a tad faster, but not
         | dramatically.
        
         | ImGajeed76 wrote:
         | i did look into this before writing the post. there's a fossil-
         | users mailing list post by Isaac Jurado where he reported that
         | importing Django took ~20 minutes and importing glibc on a 16GB
         | machine had to be interrupted after a couple of hours. he
         | explicitly warned against trying the linux kernel. the largest
         | documented import on the fossil site itself was NetBSD pkgsrc
         | (~550MB) which already showed scaling issues. so "never did" is
         | fair - not because anyone tried and failed, but because it was
         | known to be impractical and explicitly discouraged.
        
       | corbet wrote:
       | I hate to blow our own horn, but I'm gonna...if you are
       | interested in seeing this kind of kernel-development data mining,
       | fully human-written, LWN posts it every development cycle. The
       | 6.17 version (https://lwn.net/Articles/1038358/) included the
       | buggiest commit and much surrounding material. See our kernel
       | index (https://lwn.net/Kernel/Index/#Releases) for information on
       | every kernel release since 2.6.20.
       | 
       | Or see LWN on Monday for the 7.0 version :)
        
         | ImGajeed76 wrote:
         | Thanks! LWN's development cycle reports are incredible and were
         | actually an inspiration. The goal here wasn't to replace that
         | kind of expert analysis but to show what becomes possible when
         | you can just write SQL against the raw history. Your reports
         | add the context and understanding that no database query can
         | provide.
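
As a rough illustration of "just write SQL against the raw history", the
sketch below ranks authors by commit count. The `commits` table, its columns,
and the sample rows are hypothetical and are not pgit's real schema; SQLite
stands in for PostgreSQL only so the example stays self-contained.

```python
import sqlite3


def build_commit_db():
    """Create an in-memory database with a few made-up commit rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per commit.
    conn.execute("CREATE TABLE commits (id TEXT PRIMARY KEY, author TEXT)")
    conn.executemany(
        "INSERT INTO commits VALUES (?, ?)",
        [
            ("c1", "alice"), ("c2", "alice"), ("c3", "alice"),
            ("c4", "bob"),
            ("c5", "carol"), ("c6", "carol"),
        ],
    )
    return conn


def top_authors(conn, limit=3):
    """Return (author, commit_count) pairs, most active authors first."""
    return conn.execute(
        """
        SELECT author, COUNT(*) AS n
        FROM commits
        GROUP BY author
        ORDER BY n DESC, author ASC   -- author name breaks ties predictably
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    for author, count in top_authors(build_commit_db()):
        print(author, count)
```

The same `GROUP BY` shape extends to other questions, such as commits per
year or per subsystem, provided the import stores the relevant columns.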
        
       | anonair wrote:
       | I wish one day tools like gitlab and forgejo would ditch filesystem
       | storage for git repos and put everything in a SQL DB. I'm tired of
       | replicating files for DR
        
       ___________________________________________________________________
       (page generated 2026-04-09 23:02 UTC)