Learning Regex

For the past several days, I have been learning how to automate the mundane tasks that my colleagues and I do at work. That is how I came across web scraping. I didn't know there was a specific term for what I was trying to achieve.

Web scraping helps with data collection jobs where the data is publicly available and you would otherwise just copy and paste it from one place to another.

Sometimes my job involves the tedious activity of copy-pasting from the web into an Excel sheet. So I decided to learn web scraping and let the computer do the brainless work, while I do the work that is good food for my brain.

As I started to learn more about web scraping, I discovered that my existing knowledge of HTML/CSS would also help in getting the desired result. This got me interested. I also discovered that web scraping involves extensive use of Python and its libraries, along with a little bit of Regex (Regular Expression) knowledge. So, to cover the prerequisites, I took up Python at Learn Python the Hard Way and Regex at RegexOne. Both are really great and engaging resources for beginners.

Thankfully, I was able to complete the regex lessons in one day. I would like to share some of the solutions I came up with for the Regex practice exercises at RegexOne.

Problem     My Solution
Problem 1   (-|)\d+(\.\d+[e]\d+|\.\d+|,\d+\.\d+|)[^p]$
Problem 2   (\d{3})
Problem 3   (\w+\.?\w+)
Problem 4   <(\w+)
Problem 5   (\w+)\.(jpg|png|gif)$
Problem 6   \s+(.+)
Problem 7   \w\/(\w+)?\( \w+\)\:\s+at widget.List.(\w+)\((\w+.java):(\w+)
Problem 8   (\w+)://(\w+\.?-?\w+\.?\w+):?(\d+)?
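
To see a few of these patterns in action, here is a minimal Python sketch using the standard `re` module. The patterns for Problems 4, 5 and 8 are copied from the table above. The helper name `first_groups` and the sample strings are my own assumptions for illustration, not the RegexOne test data.

```python
import re

# Patterns copied verbatim from the table (Problems 4, 5 and 8).
TAG_NAME = re.compile(r"<(\w+)")
IMAGE_FILE = re.compile(r"(\w+)\.(jpg|png|gif)$")
URL_PARTS = re.compile(r"(\w+)://(\w+\.?-?\w+\.?\w+):?(\d+)?")


def first_groups(pattern, text):
    """Return the captured groups of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.groups() if match else None


# Hypothetical sample inputs, chosen only to illustrate each pattern.
print(first_groups(TAG_NAME, "<div class='note'>hello</div>"))        # tag name
print(first_groups(IMAGE_FILE, "favicon.gif"))                        # file name and extension
print(first_groups(IMAGE_FILE, "notes.txt"))                          # not an image, so None
print(first_groups(URL_PARTS, "https://example.com:8080/index.html")) # protocol, host, port
```

When a URL has no port, the last group of the Problem 8 pattern is simply `None`, because both the colon and the digits are optional.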

Some of these solutions differ from the ones originally posted on the website. Seeing alternative patterns that achieve the same objective may help in understanding RegEx better.

Credits: http://twiki.org/cgi-bin/view/Codev/TWikiPresentation2013x03x07

I really loved the learning experience at this website, and I would recommend it to beginners who want to learn RegEx. [RegExOne]

Additional resources for learning Regex include cheat sheets, comics, and videos available on the internet. Check out more resources here.
