Web crawlers come in different shapes and sizes, and are also known as web spiders, bots or robots, indexers, or web scutters. These bots are automated scripts which browse through websites on the internet in a systematic way. Crawlers consume resources on the systems they visit, and often do so without approval. They are sometimes associated with a negative context: for example, the OWASP Zed Attack spider can be used by hackers to map out an entire website and get an accurate idea of all its functionality in a matter of minutes. In a similar manner, the same kind of tool can be used by cyber-security specialists to help reveal the vulnerabilities of a site before they are exploited, but that is another story. Apart from this, spiders can also be used for data mining, web scraping, and website indexation, so that the search engines we use return the most relevant results to us.

There are lots of open source applications available on the web which can do website crawling. Why use JMeter to do it, then? you may ask. The quick answer is that, in certain contexts, JMeter is more suitable for this task than other applications might be.

The testing context where I needed one is something I ran into recently: a live website was rewritten completely and made available on a new staging domain. The problem was that the initial site was really old and had a very large number of pages. The task was to check that all the relevant resources from the initial site were still available on the new site. So, let's say the initial site had a particular page; the task was to verify that requesting the same path on the new site would lead me to a valid resource. If the page didn't respond with 404 and the response was instead 200 or 301 (a redirect), the test was a success. OK, so, one page down, a few thousand more to check. This was in no way something to do manually, so I started looking for ways to automate the task. My plan was simple: get a list of all URLs from the initial site, extract only the path of each URL, and then build requests in JMeter using the new domain (the staging environment) plus the extracted paths. The first idea was to crawl through the initial site using the crawler Xenu, since the first thing I needed was a list of all the subpages of the site.
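Before wiring this into JMeter, the plan above can be sketched in plain Python to make the logic concrete: take the crawler's URL list, keep only the path of each URL, replay it against the new domain, and flag anything that comes back 404. The file name, staging domain, and function names below are all assumptions for illustration; they are not from the original setup.

```python
# Hypothetical sketch of the check described in the article.
# "xenu_export.txt" and STAGING are placeholders, not real values.
from urllib.parse import urlparse
from urllib.request import Request, urlopen
from urllib.error import HTTPError

STAGING = "https://staging.example.com"  # assumed staging domain

def paths_from_xenu_export(lines):
    """Extract just the path (and query string) from each crawled URL."""
    for line in lines:
        parsed = urlparse(line.strip())
        if parsed.path:
            yield parsed.path + (("?" + parsed.query) if parsed.query else "")

def check(path):
    """Return the HTTP status the staging site gives for this path."""
    req = Request(STAGING + path, method="HEAD")
    try:
        # urlopen follows 301/302 redirects automatically, so a final 200
        # also covers the redirect-to-a-valid-resource success case.
        return urlopen(req, timeout=10).status
    except HTTPError as e:
        return e.code

# Usage: flag anything the new site no longer serves.
# with open("xenu_export.txt") as urls:
#     broken = [p for p in paths_from_xenu_export(urls) if check(p) == 404]
```

In JMeter itself, the equivalent would be an HTTP Request sampler whose server name points at the staging host, fed the extracted paths from a CSV Data Set Config, with an assertion on the response code.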