# Web Scraping Thing

I'm pretty into the folk music of England, Ireland, Scotland, and Wales. Especially some of the better bits of the '60s folk revival (Anne Briggs, Planxty, the occasional Pentangle track--though they can get a little too fusiony for me). I found out, just this week, about an old blog/podcast called "A Folk Song a Day", in which a guy recorded a song each day for a year and posted it, along with a little bit of text. He tends toward a more traditional style and, I think, does a good job of presenting the material.

The website is very reminiscent of an old "blogspot" or "blogger" website. It has no way to see which song was done on a given day of the year (this was all done back in the 20-teens). So you can look at the song list (which is in alphabetical order) or just page through the entries, five to a page. I was not quite satisfied with this approach. Nor did I want to see all of the "design", images, and comments on the pages. So I decided that I would scrape it.

I started by loading the alphabetical index page, viewing source, and copying out the list of links. A quick find and replace and I had a comma-separated list of double-quoted (string) URLs. I put this into an array literal (`[365]string`) in a Go source file.

A little research let me write four quick regexes: one for the song title, one for the month/day, one for the `src` attribute of the `source` element inside the `audio` element (basically, the link to the audio file), and one for all of the text that accompanied the post--but without the comments, navs, etc. I then set up a struct to hold those items and gave it the type name `Page`, then created a `[365]Page` array called `Pages`. I used goroutines and channels to grab each page, use the regexes to parse things out, add the items to a `Page`, and add the page to the `Pages` array. Once that finished I dumped it out to a json file and the program ended. All in all it took twenty or thirty minutes.
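A minimal sketch of that scraping step might look something like the following. It is not the original program: the four regex patterns are hypothetical stand-ins (the real ones depend on the site's markup), the names `parsePage`, `scrapeAll`, `fetchHTTP`, and `first` are made up for illustration, and it uses a slice rather than a fixed `[365]Page` array so it works with any number of URLs.

```go
package main

import (
	"io"
	"net/http"
	"regexp"
	"strings"
	"sync"
)

// Page holds the four items pulled out of each post.
type Page struct {
	Title string `json:"title"`
	Date  string `json:"date"`
	Audio string `json:"audio"`
	Text  string `json:"text"`
}

// Hypothetical patterns: assumptions about the markup, not the real site's HTML.
var (
	titleRe = regexp.MustCompile(`(?s)<h1[^>]*>(.*?)</h1>`)
	dateRe  = regexp.MustCompile(`(?s)<time[^>]*>(.*?)</time>`)
	audioRe = regexp.MustCompile(`(?s)<audio[^>]*>.*?<source[^>]*\ssrc="([^"]+)"`)
	textRe  = regexp.MustCompile(`(?s)<div class="entry"[^>]*>(.*?)</div>`)
)

// first returns the first capture group of re in s, trimmed, or "" if no match.
func first(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

// parsePage runs the four regexes over one page of HTML.
func parsePage(html string) Page {
	return Page{
		Title: first(titleRe, html),
		Date:  first(dateRe, html),
		Audio: first(audioRe, html),
		Text:  first(textRe, html),
	}
}

// fetchHTTP downloads a URL and returns the response body as a string.
func fetchHTTP(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// scrapeAll fetches every URL in its own goroutine and collects the parsed
// pages over a channel, keeping them in the same order as urls.
func scrapeAll(urls []string, fetch func(string) (string, error)) []Page {
	type result struct {
		idx  int
		page Page
	}
	results := make(chan result)
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			html, err := fetch(u)
			if err != nil {
				return // a failed page is left as a zero-valued Page
			}
			results <- result{i, parsePage(html)}
		}(i, u)
	}
	// Close the channel once every goroutine has finished.
	go func() { wg.Wait(); close(results) }()

	pages := make([]Page, len(urls))
	for r := range results {
		pages[r.idx] = r.page
	}
	return pages
}
```

Calling `scrapeAll(urls, fetchHTTP)` and handing the result to `json.MarshalIndent` from `encoding/json` would produce the kind of json dump described above. Passing the fetch function in as a parameter is just a convenience so the parsing can be tried against canned HTML without hitting the network. Note that this fires every request at once; for a few hundred pages on a small site, a short delay or a worker limit would be kinder.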
Next up was to move the json file to a new folder and add an `index.php` file. In that file I check the current day and month (according to the server it is running on). I then load the json file as an associative array in PHP and search it for a "Page" with the matching date string. I then render a template with just those four items (plus an h1 and some info text linking back to the original project). I added a teensy bit of CSS to make it look nice enough (and uncluttered) and voila! I can now go to a page on my website and get a song each day. They still host the audio; I just stream it from them.

What a nice, easy, and fun project. It is the type of thing I would have done in Python with bs4 at some point... but I have not used Python in so long, and I can write Go code in my sleep, so that is where I landed. Plus, that way I only needed the standard library and did not have to deal with much else.

If anyone is interested, here is the original website: http://www.afolksongaday.com/

Here is the page on my website that shows the song for the current month/day: https://sloum.colorfield.space/etc/fsad/

If it is up your alley, definitely try to support the original, but I admit a strong personal preference for my presentation method.