        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (unofficial)
  HTML Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
  HTML   Coccinelle: Source-to-source transformation tool
       
       
        mingodad wrote 11 hours 18 min ago:
        There is also Sparse [1], which can help with refactoring C code.
        
  HTML  [1]: https://sparse.docs.kernel.org/
       
        harel wrote 14 hours 57 min ago:
        What an interesting and very apt choice of name!
        
        Edit: I feel like it needs the addition that Coccinelle was the stage
        name of Jacqueline Charlotte Dufresnoy (born 1931), a French performer
        and probably the first celebrity to undergo a transition from male to
        female.
       
        peterfirefly wrote 1 day ago:
        The best thing Julia Lawall ever did!
        
        Not the hardest, not the thing with the most sophisticated theories
        behind it, not the thing that helped her academic career the most...
        but definitely the best and the most useful.
        
        There must be a lot of other academics who could do things that are
        less theoretical but more useful than what they normally do.
        
        There must be a lot of undervalued academics who in effect are punished
        for doing things that are useful without requiring quite as much deep
        theory as their fields can muster.
        
        I'm glad she did something that she wasn't really rewarded for and I'm
        sad that the academic reward functions are so off.
       
          astahlx wrote 1 day ago:
          I can only agree. It is great work; I met Julia on several occasions
          where we other academics tried to push our formal methods stuff for
          checking properties of the Linux kernel. Ours worked too, but in a
          far more complicated way, very resource-intensive, and less
          effective than Julia’s work.
       
        clarabennett26 wrote 1 day ago:
        coccinelle's one of those tools that's stupid powerful once it clicks,
        but man the learning curve is steep. i used it to migrate like ~200
        call sites when we tweaked an internal API signature in a big C
        codebase - doing that by hand would've been a multi-day slog. the
        semantic patch language feels kinda weird at first, but it catches edge
        cases regex stuff just misses, like matching through macro expansions
        and all that
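        
        To illustrate the general point about matching on syntax rather than
        on text, here is a minimal sketch. It is not Coccinelle or SmPL: it
        uses Python's standard ast module as a stand-in, and the names
        old_api, new_api and the added flags argument are hypothetical.

```python
import ast

# Hypothetical input: one call site split across two lines, plus a string
# that merely mentions the old name and must not be rewritten.
SOURCE = '''
result = old_api(compute(a, b),
                 c)
note = "old_api(x, y) is deprecated"
'''


class AddFlagsArgument(ast.NodeTransformer):
    """Rename calls to old_api and append a flags=0 keyword argument."""

    def visit_Call(self, node):
        # Visit children first so nested calls are handled as well.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == "old_api":
            node.func.id = "new_api"
            node.keywords.append(
                ast.keyword(arg="flags", value=ast.Constant(value=0))
            )
        return node


def migrate(source: str) -> str:
    """Parse the source, rewrite matching call sites, and print it back."""
    tree = AddFlagsArgument().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or newer


if __name__ == "__main__":
    print(migrate(SOURCE))
```

        The call that spans two lines is rewritten, while the string literal
        that only mentions old_api is left alone. A line-based regex would
        likely get at least one of those two cases wrong.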
       
        VorpalWay wrote 1 day ago:
        Not the same level of sophistication, but ast-grep [1] allows this for
        far more languages, since it is based on the tree-sitter parser
        library. I have used it with some success on C++. Of course it only
        works on the AST level, and C++ famously needs types for correct
        parsing, so it sometimes falls short (also on macros).
        
  HTML  [1]: https://ast-grep.github.io/
       
          gritzko wrote 1 day ago:
          I am working on AST level revision control and yes, macros make life
          difficult. On the other hand, merging/diffing on the AST level is
          fun.
          
  HTML    [1]: https://replicated.wiki/blog/partI.html
       
            VorpalWay wrote 1 day ago:
            I found that font extremely hard to read for some reason (on my
            phone). So I gave up. Maybe due to you using a monospace font for
            non-code?
            
            But I believe smalltalk represented code as functions in a database
            somehow, so maybe that is worth looking at.
       
              gritzko wrote 1 day ago:
              That is JetBrains Mono. HN is using Verdana, I believe; that one
              is recommended as the most usable common font.
              
              Smalltalk is ancient. I would say, Unison is an interesting
              recent experiment, and there are others. But, I am interested in
              universal revision control, any language.
       
              cpeterso wrote 1 day ago:
              > So I gave up.
              
              Try your mobile browser’s reader view mode.
       
        zabzonk wrote 1 day ago:
        I thought this was a misspelled article about Kokinelli, the Greek red
        wine, fairly accurately described here: [1]
        
        I used to drink this stuff back in the late 1960s, when my Dad was an
        RAF pilot based in Cyprus and I was about 15. You had to take it with a
        Sprite mixer if you wanted to retain your teeth.
        
        It would be a good name for a project, though.
        
  HTML  [1]: https://www.arrse.co.uk/wiki/Kokinelli
       
          adonovan wrote 21 hours 53 min ago:
          They are cognate: cochineal is a red dye derived from a red insect a
          little bit like a ladybug.
       
        pm215 wrote 1 day ago:
        I think Coccinelle is a really cool tool, but I find its documentation
        totally incomprehensible for some reason. I've read through it multiple
        times, but I always end up having to find some preexisting script that
        does what I want, or else to blunder around trying different variations
        at random until something works, which is frustrating.
       
        twic wrote 1 day ago:
        See also OpenRewrite: [1]
        
        And i assume any large organisation running a monorepo has some vaguely
        equivalent tooling for making mass changes. Have any of them published
        about that?
        
  HTML  [1]: https://github.com/openrewrite/rewrite
       
          karlding wrote 15 hours 42 min ago:
          You can write automated refactoring with clang tools if you need
          AST-level knowledge across your project (or monorepo).
          
          I’m not sure if there are other public examples leveraging this,
          but Chromium has this document [1] which has a few examples. And
          there’s also the clang-tidy docs [2].
          
  HTML    [1]: https://chromium.googlesource.com/chromium/src/+/80a6fc33dee...
  HTML    [2]: https://releases.llvm.org/21.1.0/tools/clang/tools/extra/doc...
       
          conartist6 wrote 1 day ago:
          This is a business that I suspect may not survive BABLR.
          
          > Moderne's build plugins allow for LSTs to be serialized to disk.
          This makes the process of consuming and editing large quantities of
          them much more efficient. OpenRewrite's build plugins, on the other
          hand, store everything in memory and need to be reparsed every time
          there is a change.
          
          So yeah I'm giving away open standards to everyone for free that do
          the thing they expect people to pay them for...
       
            rzzzt wrote 1 day ago:
            What's BABLR?
       
              cstrahan wrote 1 day ago:
              [1]
              
              > The next-gen LR parser framework for creating elegant and
              efficient language tools
              
              > BABLR is a new kind of thing that does not quite fit into any
              category of things that has existed before it. In purpose it is
              made to be an instrument of code literacy -- a unified toolchain
              for software developers that supports a new generation of richly
              visual interfaces for coding. In form BABLR is a collection of
              scripts and virtual machines written in plain Javascript that run
              in almost any modern web browser. BABLR is also a community and
              an ecosystem, including a small but rapidly growing collection of
              ready-to-use parsers for popular languages.
              
  HTML        [1]: https://bablr.org/
       
                twic wrote 1 day ago:
                At first blush, everything about this sounds like overly
                ambitious vapourware. Is there a reason to think this is going
                to deliver? People involved, what's already shipped, etc?
                
                I particularly loved this from their roadmap:
                
                > Completed
                
                > Shift operation
                
                > Enables LR parsing of expressions like 2+2
                
                Being able to parse 2 + 2 is definitely good!
                
                And their thoughts on testing:
                
                > How our project reaches production stability is a process
                that often surprises people. We don't write a lot of tests for
                example, and we often don't do much testing before we ship
                releases. Instead we test exhaustively after we ship releases,
                which is the only way we know of knowing for sure that the
                product we shipped does what we think it does. [...] We also
                don't (usually) practice TDD. If you look at the number of
                tests we have, it likely won't seem like it's anywhere near
                enough to keep a project of this size stable! The secret sauce
                here is that our key invariants aren't written in our test
                files, they're baked into the core of the implementation. Every
                time you use the code, you're essentially testing it. To gain
                confidence in our core, we simply try to use it to do a lot of
                real work.
                
                Man, why did i not think of that, i could have got out of
                writing so many tests if i'd just baked the invariants into the
                core of the implementation!
       
                  conartist6 wrote 1 day ago:
                  It's not too hard to verify my central claim here which is
                  that we're giving away what they charge money for. Their
                  serialization format is secret, proprietary. Ours, CSTML, is
                  open: [1]. Their free product makes you re-parse the entire
                  project with every code change you make. Ours is built with
                  copy-on-write immutable data structures so that you can
                  always build new things without losing old ones. Our way, you
                  can compose fragments of trees together with new code into
                  new trees like you're playing with lego bricks.
                  
  HTML            [1]: https://docs.bablr.org/guides/cstml
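                  
                  The copy-on-write idea described above can be sketched as
                  follows. This is a generic illustration of immutable trees
                  with structural sharing, not BABLR's or CSTML's actual
                  implementation; Node and replace_at are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """An immutable tree node; children is a tuple so it cannot be mutated."""
    label: str
    children: Tuple["Node", ...] = ()


def replace_at(root: Node, path: Tuple[int, ...], new_subtree: Node) -> Node:
    """Return a new tree with the node at path replaced by new_subtree.

    Only the nodes along the path are copied; every other subtree is
    shared with the original tree, which stays valid and unchanged.
    """
    if not path:
        return new_subtree
    index, rest = path[0], path[1:]
    updated = replace_at(root.children[index], rest, new_subtree)
    children = root.children[:index] + (updated,) + root.children[index + 1:]
    return Node(root.label, children)


if __name__ == "__main__":
    left = Node("call", (Node("f"), Node("x")))
    right = Node("call", (Node("g"), Node("y")))
    old = Node("module", (left, right))

    new = replace_at(old, (1, 1), Node("z"))

    print(old.children[1].children[1].label)   # label in the old tree
    print(new.children[1].children[1].label)   # label in the new tree
    print(new.children[0] is old.children[0])  # is the left subtree shared?
```

                  Both trees remain usable after the edit, and the untouched
                  left subtree is the same object in each, so keeping old
                  versions around costs only the copied path.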
       
                  conartist6 wrote 1 day ago:
                  In this case the tool is meant to parse programming
                  languages, so once I write some parser grammars every valid
                  code file in existence is a test case. Seen that way I have
                  more test cases than I know what to do with.
                  
                  We've come a ways from 2 + 2. This week my goal is to feed
                  our own whole codebase through the JS parser, and I should be
                  able to. I managed to parse a few hundred lines of real JS
                  last week before running into Automatic Semicolon Insertion
                  trouble that I needed to tinker with the core to fix.
                  
                  While I get that our low profile smacks of vapor, we actually
                  have working packages published: bablr and @bablr/cli. I'd
                  consider them to be beta quality right now, having gone
                  through many previous releases that I'd only consider
                  alpha-quality, and even more releases before that.
       
              conartist6 wrote 1 day ago:
              The mission is the same as OpenRewrite: parse and transform any
              code.
       
        twic wrote 1 day ago:
        According to [1]:
        
        > Nevertheless, detecting the holding of locks requires a careful and
        occasionally interprocedural analysis of the source code, and the other
        conditions, such as "in a completion handler", are not formally defined
        and require study of multiple files.
        
        > Due to the complexity of the conditions governing the choice of new
        argument for usb_submit_urb, 71 of the 158 calls to this function were
        initially transformed incorrectly to use GFP_KERNEL instead of
        GFP_ATOMIC.
        
        Okay, but how does Coccinelle help? Is it able to do this careful and
        not formally defined analysis? Or does it automate the undifferentiated
        heavy lifting and so make it easier for humans to do it?
        
  HTML  [1]: https://coccinelle.gitlabpages.inria.fr/website/ce.html
       
        eqvinox wrote 1 day ago:
        It's a bit of a disservice to call it "The Linux kernel's"; it's its
        own project that just happens to be used on the Linux kernel quite a
        bit.  It doesn't originate there or belong to the kernel or anything
        like that.
       
          dang wrote 16 hours 57 min ago:
          Ok, we've removed the Linux kernel from the title above.
       
        conartist6 wrote 1 day ago:
        I forgot about Coccinelle.
        
        I think semantic patching is an idea whose time has come though. I'm
        making a more modern set of tools for source-to-source transformation
        that will work with any desired languages as the input and output.
       
          anon111332142 wrote 9 hours 42 min ago:
          int result = ask_chatgpt("1 + 1");
       
          fweimer wrote 1 day ago:
          Those tools exist, but you have to pay by the token. I'm not sure if
          they scale financially to large code bases such as the Linux kernel.
          They are far more accessible than Coccinelle or Perl, though.
       
            eqvinox wrote 1 day ago:
            Honestly, I'd rather use Coccinelle, where I understand exactly what
            it does, when it does it and why it does it…
       
              conartist6 wrote 1 day ago:
              I would also rather use a tool that I trust than delegate the
              task to an unreliable third party.
              
              But to the person bringing up AI, you don't have to choose one or
              the other! Models use tools. Good tools for people are usually
              also good tools for models. The problem models have in learning
              to use tools like Coccinelle effectively is that there are too
              many of the tools and not enough documentation for each tool. If
              there were a unified, standard platform, however, then many
              humans would start to gain abilities through fluent tool use and
              enough of those people would write docs and blog posts. Where
              people lead, models follow without doubt. Once a large enough
              corpus of writing existed documenting a single platform the
              models would also be fluent, just like they are fluent in JS and
              React because of how large the web platform is.
       
       