Brew Scraper: A tale of pain and code

Drew Thompson
6 min read · Sep 29, 2020

Welcome to my first blog! This may be my first post, but it’s going to be a doozy!

For my Full Stack Web Development class at Flatiron School, I was told to build a CLI gem that used a scraper. The big kicker: I could do it on practically anything I wanted. So, being the brew-head that I am, I decided to make a scraper that collects brewing ingredients and lets you browse them all, along with descriptions and prices if you wish.

Well, first things first: I had to choose a target to scrape. The unfortunate victim was northernbrewer.com. (Pic for reference so you don’t have to click on the link.)

Wow! What a clean-looking website, and even better, it had a whole section of ingredients to scrape! So, without doing any further research on the website, I went about creating my gem. Big mistake.

According to the project briefing, this project should take 3–7 days on average. My cocky self thought it would take 3. How wrong I was.

To start, I did a pretty cut-and-dry Bundler setup:
1. $ bundle gem brew_scraper (Creates the gem skeleton through Bundler. Awesome doodad.)
2. $ chmod +x bin/brew_scraper (Makes it executable as ./bin/brew_scraper)
3. Touch various .rb files (ingredients, cli, and scraper were my three)

Now, I would like to say that was pretty cut and dry, but to be honest I was so used to the in-house IDE and the learn submit command that it took me several days to learn how to use Git and GitHub. On top of that, when I ran Bundler it created a folder inside the main directory, so I eventually had to learn how to move that with
$(______) mv . A very useful tool for moving an entire directory, or pieces of it if you modify the command. (Pic of the state as just described.)

Just a pic of the repository state nothing crazy

Now I could launch into how I required pry, open-uri, and nokogiri in my files and what the purpose of that is, but I’m sure if you’re reading any other Portfolio Projects or are taking a similar class you already know what they do.

If you happened to run into this post with a really basic, almost caveman understanding of them: Nokogiri parses a document, in this case the page’s HTML, into a structure you can search with CSS selectors, and open-uri lets you open a URL as if it were a file. Handy, handy.

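Now back to it. The only truly important thing to note is that you can combine Nokogiri and open-uri in one line, which saves a lot of unnecessary code.
I.e. Nokogiri::HTML(open(url_you_want_to_scrape)) Cool, no? (On Ruby 3.0 and later, open-uri no longer patches the plain open, so you would write URI.open there instead.)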

So, excited or not, I’m sure you want to see the code from my scraper. Well, here it is:

Wow, very cool, is it not? Well, it was a real pain to make, and it involves a lot of little tricks, which I think is the really cool part of the project. Why does it need to loop? What is this class << BrewScraper::Scraper… business? And why the break if items.empty?

Well, let’s get into that! First things first: if you noticed, this is a class calling on another class. But wait, where is BrewScraper itself located? It lives in its own .rb file, which calls upon environment.rb, which contains all these other little classes. Kind of convoluted, but it means I only have to call upon brew_scraper.rb to use any of them from anywhere. What’s with the << ? The class << syntax opens an object’s singleton class, and methods defined inside it belong to the class itself, much like static methods in other languages. This effectively allows me to call BrewScraper::Scraper.insert_method_here without having to create a new instance. Wonderful little trick, considering I don’t need to make a new Scraper every time. If you browse my files you’ll see this in cli.rb as well.
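Here’s a minimal sketch of that singleton trick on its own. The scrape_ingredients method and its return value are made up for illustration; this is not the actual scraper code, just the shape of it.

```ruby
# Hypothetical skeleton mirroring the names in the post.
module BrewScraper
  class Scraper
    # Everything defined inside `class << self` lives on the class itself.
    class << self
      def scrape_ingredients
        'scraping...' # placeholder body, the real method would do the scrape
      end
    end
  end
end

# No Scraper.new needed; the method is called straight on the class.
puts BrewScraper::Scraper.scrape_ingredients
```

Writing def self.scrape_ingredients would do the same job for a single method; class << self is mostly handy when there are several class-level methods in a row.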

Other little tricks in this class are the loop and the break. The Northern Brewer website has multiple pages of ingredients that I needed to access. I first tried to work around this using a while loop, but ultimately found out that the pages go on infinitely! So rather than hard-coding the points a while loop would iterate between, I just had the whole program loop until there was nothing left to import from the scrape, hence the break if items.empty?. I won’t take credit for the idea, as the inspiration came from a good friend of mine when I was showing him the problem.
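The loop-until-empty idea can be sketched without touching the network. In this sketch FAKE_PAGES, fetch_items, and scrape_all are all stand-ins I invented; in the real gem the fetch would presumably be a Nokogiri query against each page URL.

```ruby
# Invented sample data: page number => items found on that page.
FAKE_PAGES = {
  1 => %w[Cascade Centennial],
  2 => %w[Citra]
}.freeze

# Stand-in for the real page fetch; unknown pages come back empty.
def fetch_items(page)
  FAKE_PAGES.fetch(page, [])
end

def scrape_all
  results = []
  page = 1
  loop do
    items = fetch_items(page)
    break if items.empty? # an empty page means we ran out of ingredients

    results.concat(items)
    page += 1
  end
  results
end

p scrape_all # prints ["Cascade", "Centennial", "Citra"]
```

The nice part is that the loop never needs to know how many pages exist; it just keeps asking for the next one until a page comes back with nothing on it.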

Well that’s enough about the scraper, let’s have a looksy at the ingredients class.

There’s really nothing crazy to look at; it’s just your standard class setup. The class initialization, though? That was a major mess. Trying to find the specific CSS selectors was a bit of a pain, considering their website has a million and one divs, classes, and a hrefs. You’d think it would be easy, but this caused at least one day of hair-pulling because of the results I was getting. Something not pictured here is self.clear, which was instrumental in making the app look pretty. Also, as a warning to anyone and everyone coding: don’t forget what you’ve put in your code. I left an erroneous test in this file that confused me for a good hour…
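For anyone who hasn’t seen a “standard class setup” like this, here is a rough sketch. The attribute names are guesses based on what the gem displays (names, prices, descriptions), and I’m assuming self.clear simply empties the stored list so a fresh scrape doesn’t show duplicates; the real file may well differ.

```ruby
# Hypothetical Ingredient class; attribute names are assumptions.
class Ingredient
  attr_reader :name, :price, :description

  @all = [] # class-level list of every ingredient created so far

  class << self
    attr_reader :all

    # Assumed behavior: wipe the list before a new scrape.
    def clear
      @all.clear
    end
  end

  def initialize(name:, price:, description:)
    @name = name
    @price = price
    @description = description
    self.class.all << self # register each new ingredient
  end
end

Ingredient.new(name: 'Cascade Hops', price: '$2.99', description: 'Citrusy')
puts Ingredient.all.size
Ingredient.clear
puts Ingredient.all.size
```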

Alright, the last big important file to look at is cli.rb.

That is one fatty file, is it not? Mostly just text, nothing crazy to look at. Just a bunch of functions reliant on each other and looping ad infinitum if you wish. While this in and of itself isn’t super interesting, it was around this time that I found out about the awe-inspiring gem RuboCop. I’ll be damned if that isn’t one of the coolest gems out there. It does all sorts of cleanup on your files, and it’s helped me get used to some better practices, such as using single quotes for any string that doesn’t need #{} interpolation, or %w for arrays of words. I really suggest anyone check it out if you haven’t already.
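Those two habits look roughly like this in practice. The variable names and values are just examples, not lines from the gem.

```ruby
# Single quotes when there is nothing to interpolate.
name = 'Cascade'

# Double quotes only where #{} interpolation is actually needed.
label = "Hop: #{name}"

# %w builds an array of words without all the quotes and commas.
categories = %w[hops grains yeast]

puts label
puts categories.join(', ')
```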

That’s about it. Brew Scraper 1.0.0 is born, and I intend to refine it and make it bigger and better down the line until I’m content with it. While this is a project for a class, it unintentionally coincided with something I wanted to do anyway, so cheers to serendipity! Anywho, I look forward not only to continuing this project but also to whatever projects come next!

On a side note, here’s a video of the gem in action.
