For the past several days, I have been learning how to automate the mundane tasks that my colleagues and I do at work. That is how I came across web scraping. I didn't know there was a specific term for what I was trying to achieve.
Web scraping helps with data collection jobs where the data is publicly available and you just need to copy it from one place to another.
Sometimes my job involves the tedious activity of copy-pasting from the web into an Excel sheet. So I decided to learn web scraping and let the computer do the brainless work, while I focus on work that is good food for my brain.
As I started to learn more about web scraping, I discovered that my existing knowledge of HTML/CSS would also help in getting the desired result. This got me interested. I also discovered that web scraping involves extensive use of Python and its libraries, along with a little bit of Regex (regular expression) knowledge. So, to fulfill these prerequisites, I took up Python at Learn Python the Hard Way and Regex at RegexOne. Both are really great and engaging resources for beginners.
Thankfully, I was able to complete the regex lessons in one day. I would like to share some of the solutions I came up with for the Regex practice exercises at RegexOne:
Problem | My Solution |
---|---|
Problem 1 | (-|)\d+(\.\d+[e]\d+|\.\d+|,\d+\.\d+|)[^p]$ |
Problem 2 | (\d{3}) |
Problem 3 | (\w+\.?\w+) |
Problem 4 | <(\w+) |
Problem 5 | (\w+)\.(jpg|png|gif)$ |
Problem 6 | \s+(.+) |
Problem 7 | \w\/(\w+)?\( \w+\)\:\s+at widget.List.(\w+)\((\w+.java):(\w+) |
Problem 8 | (\w+)://(\w+\.?-?\w+\.?\w+):?(\d+)? |
Some of these solutions differ from the ones originally posted on the website. Seeing alternative patterns that achieve the same objective may help in understanding RegEx better.
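If you want to try patterns like these outside the RegexOne site, Python's built-in `re` module is one way to do it. Below is a minimal sketch that uses my Problem 4 and Problem 5 patterns from the table. The function names and the sample strings are my own, made up for illustration, and are not the exact test cases from RegexOne.

```python
import re

# Patterns copied from the table above (Problems 4 and 5).
TAG_NAME = re.compile(r"<(\w+)")
IMAGE_FILE = re.compile(r"(\w+)\.(jpg|png|gif)$")


def tag_name(html_snippet):
    """Return the name of the first opening tag, or None if there is none."""
    match = TAG_NAME.search(html_snippet)
    return match.group(1) if match else None


def image_name_parts(filename):
    """Return (base name, extension) for an image file name, else None."""
    match = IMAGE_FILE.search(filename)
    return match.groups() if match else None


if __name__ == "__main__":
    # Illustrative samples only (assumed, not taken from the exercises).
    for sample in ["img0912.jpg", "favicon.gif", "img0912.jpg.tmp", "access.lock"]:
        print(sample, "->", image_name_parts(sample))
    print(tag_name("<a href='https://example.com'>Link</a>"))
```

I used `search` rather than `match` here because `match` only looks at the start of the string, while these patterns are meant to find something anywhere in the text (or, for the image pattern, at the end thanks to `$`).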

I really loved the learning experience at this website, and I would recommend it to beginners who want to learn RegEx. [RegExOne]
Additional resources for learning Regex include cheat sheets, comics, and videos available on the internet. Check out more resources here.