       COMMENT PAGE FOR:
  HTML   Speech and Language Processing (3rd ed. draft)
       
       
        jll29 wrote 1 hour 15 min ago:
        One can feel for the authors, it's such a struggle to write a textbook
         in a time when NeurIPS gets 20000 submissions and ACL has 6500
         registered attendees (as of August '25), and every day, dozens of
         relevant arXiv pre-prints appear.
        
        Controversial opinion (certainly the publisher would disagree with me):
        I would not take out older material, but arrange it by properties like
         explanatory power/transparency/interpretability, generative capacity,
        robustness, computational efficiency, and memory footprint.
        For each machine learning method, an example NLP model/application
        could be shown to demonstrate it.
        
        Naive Bayes is way too useful to downgrade it to an appendix position.
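
         (As a minimal sketch of the "method plus example application"
         pairing suggested above: Naive Bayes on a toy text-classification
         task, assuming scikit-learn is available; the tiny training set
         below is invented purely for illustration.)

           from sklearn.feature_extraction.text import CountVectorizer
           from sklearn.naive_bayes import MultinomialNB
           from sklearn.pipeline import make_pipeline

           # Toy sentiment data, invented for illustration only.
           texts = ["great book, very clear", "dull and outdated",
                    "clear and useful", "confusing chapters"]
           labels = ["pos", "neg", "pos", "neg"]

           # Bag-of-words features + multinomial Naive Bayes classifier.
           model = make_pipeline(CountVectorizer(), MultinomialNB())
           model.fit(texts, labels)

           print(model.predict(["very clear and useful"]))  # ['pos'] on this toy data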
        
        It may also make sense to divide the book into timeless material (Part
         I: what's a morpheme? what's a word sense?) and (Part II:) methods and
        datasets that change every decade.
        
        This is the broadest introductory book for beginners and a must-read;
        like the ACL family of conferences it is (nowadays) more of an NLP book
        (i.e., on engineering applications) than a computational linguistics
        (i.e., modeling/explaining how language-based communication works)
        book.
       
        ivape wrote 1 hour 36 min ago:
        Were NLP people able to cleanly transition? I'm assuming the field is
         completely dead. They may actually be patient zero of the LLM-driven
        unemployment outbreak.
       
        languagehacker wrote 1 hour 59 min ago:
        Good old Jurafsky and Martin. Got to meet Dan Jurafsky when he visited
        UT back in '07 or so -- cool guy.
        
         This one and Manning and Schütze's "Dice Book" (Foundations of
        Statistical Natural Language Processing) were what got me into
        computational linguistics, and eventually web development.
       
        MarkusQ wrote 2 hours 14 min ago:
        Latecomers to the field may be tempted to write this off as antiquated
        (though updated to cover transformers, attention, etc.) but a better
        framing would be that it is _grounded_.  Understanding the range of
        related approaches is key to understanding the current dominant
        paradigm.
       
        mfalcon wrote 2 hours 21 min ago:
        I was eagerly waiting for a chapter on semantic similarity as I was
        using Universal Sentence Encoder for paraphrase detection, then LLMs
        showed up before that chapter :).
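
         (For context, a rough sketch of embedding-based paraphrase
         detection of the kind described above; embed is a hypothetical
         stand-in for a sentence encoder such as the Universal Sentence
         Encoder, and the 0.8 threshold is an arbitrary illustration.)

           import numpy as np

           def cosine_similarity(a, b):
               # Cosine of the angle between two embedding vectors.
               return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

           def is_paraphrase(sent_a, sent_b, embed, threshold=0.8):
               # embed maps a list of sentences to an array of vectors,
               # e.g. a loaded sentence encoder; the threshold would
               # normally be tuned on labeled paraphrase pairs.
               vec_a, vec_b = embed([sent_a, sent_b])
               return cosine_similarity(vec_a, vec_b) >= threshold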
       
        brandonb wrote 2 hours 44 min ago:
        I learned speech recognition from the 2nd edition of Jurafsky's book
        (2008). The field has changed so much it sometimes feels
         unrecognizable. Instead of hidden Markov models, Gaussian mixture
         models, tri-phone state trees, finite state transducers, and so on,
        nearly the whole stack has been eaten from the inside out by neural
        networks.
        
        But, there's benefit to the fact that deep learning is now the "lingua
        franca" across machine learning fields. In 2008, I would have struggled
        to usefully share ideas with, say, a researcher working on computer
        vision.
        
        Now neural networks act as a shared language across ML, and ideas can
        much more easily flow across speech recognition, computer vision, AI in
        medicine, robotics, and so on. People can flow too, e.g., Dario Amodei
        got his start working on Baidu's DeepSpeech model and now runs
        Anthropic.
        
        Makes it a very interesting time to work in applied AI.
       
          roadside_picnic wrote 1 hour 24 min ago:
          In addition to all this, I also feel we have been getting so much
          progress so fast down the NN path that we haven't really had time to
          take a breath and understand what's going on.
          
           When you work closely with transformers for a while you do start to
           see things reminiscent of old-school NLP pop up: decoder-only LLMs
           are really just fancy Markov chains with a very powerful/sophisticated
           state representation, "Attention" looks a lot like learning kernels
           for various tweaks on kernel smoothing, etc.
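
           (A rough NumPy sketch of that second analogy, not from the book:
           single-query dot-product attention written as a kernel-weighted
           average in the Nadaraya-Watson style; the names and dimensions
           are made up for illustration.)

             import numpy as np

             def softmax(x):
                 e = np.exp(x - x.max())
                 return e / e.sum()

             def attention_as_smoothing(query, keys, values, d_k):
                 # Kernel weights: softmax of scaled dot products, playing the
                 # role of a (learned) similarity kernel between query and keys.
                 weights = softmax(keys @ query / np.sqrt(d_k))
                 # Kernel smoothing: a weighted average of the values.
                 return weights @ values

             d_k = 8
             rng = np.random.default_rng(0)
             q = rng.normal(size=d_k)        # one query vector
             K = rng.normal(size=(5, d_k))   # five keys
             V = rng.normal(size=(5, d_k))   # five values
             print(attention_as_smoothing(q, K, V, d_k))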
          
          Oddly, I almost think another AI winter (or hopefully just an AI cool
          down) would give researchers and practitioners alike a chance to
          start exploring these models more closely. I'm a bit surprised how
          few people really spend their time messing with the internals of
           these things, and every time they do, something interesting seems to
           come out of it. But currently nobody I know in this space, from
           researchers to product folks, seems to have time to catch their
           breath, let alone really reflect on the state of the field.
       
          ForceBru wrote 2 hours 30 min ago:
          > Gaussian mixture models
          
          In what fields did neural networks replace Gaussian mixtures?
       
            brandonb wrote 2 hours 9 min ago:
            The acoustic model of a speech recognizer used to be a GMM, which
             mapped a pre-processed acoustic signal vector (generally MFCCs,
             Mel-frequency cepstral coefficients) to an HMM state.
            
            Now those layers are neural nets, so acoustic pre-processing, GMM,
            and HMM are all subsumed by the neural network and trained
            end-to-end.
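
             (A toy sketch of the older GMM-per-state scoring step, assuming
             scikit-learn; the random "MFCC frames" and dimensions are
             invented for illustration, whereas a real recognizer would
             train each GMM on frames aligned to that HMM state.)

               import numpy as np
               from sklearn.mixture import GaussianMixture

               n_mfcc, n_states = 13, 3
               rng = np.random.default_rng(0)

               # One GMM per HMM state, fit on random frames purely for illustration.
               state_gmms = [
                   GaussianMixture(n_components=2).fit(rng.normal(size=(200, n_mfcc)))
                   for _ in range(n_states)
               ]

               frame = rng.normal(size=(1, n_mfcc))   # one pre-processed acoustic frame
               log_likes = [gmm.score(frame) for gmm in state_gmms]
               print("most likely HMM state:", int(np.argmax(log_likes)))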
            
            One early piece of work here was DeepSpeech2 (2015):
            
  HTML      [1]: https://arxiv.org/pdf/1512.02595
       
              ForceBru wrote 28 min ago:
              Interesting, thanks!
       
       