BrandVerity provides automated brand protection for online content. To do so, we regularly crawl websites on behalf of our clients to ensure content compliance with brand and regulatory guidelines. This means that our crawlers go out and scan the web, similar to what Googlebot and other crawlers do, and then report back to us what they’ve found. We then analyze this data for our customers to make sure compliance standards are met.
Our goal is to provide the most up-to-date content collection to our customers. An integral part of that is working together to ensure we can crawl as many pages as they need each visit without overwhelming anyone’s system.
You have been identified as a partner of one or more of our clients, so we are crawling your site on their behalf. If you think this is an error, please contact us at firstname.lastname@example.org.
To ensure successful crawls, please whitelist our crawler on your website, in any bot detection software you may be using, and in robots.txt if you otherwise prevent bots from crawling your site.
You can whitelist BrandVerity crawlers by allowing user agents that identify with the following string:
This is the most robust solution to guarantee that we can crawl your site successfully.
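As a rough way to confirm that a robots.txt file lets the crawler through while still blocking other bots, the sketch below uses Python's standard `urllib.robotparser`. The token `BrandVerity` is an assumption taken from the example user agent on this page, and the robots.txt content and `example.com` URL are hypothetical; please confirm the exact string with BrandVerity before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks all bots except one allowed token.
# "BrandVerity" is an assumed token, not a confirmed identifier.
ROBOTS_TXT = """\
User-agent: BrandVerity
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Pass the product token, not the full browser-style string:
    # robotparser matches on the text before the first "/".
    print(is_allowed(ROBOTS_TXT, "BrandVerity/1.0", "https://example.com/offers"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot/2.0", "https://example.com/offers"))
```

This only checks robots.txt rules. Firewalls and bot detection software are configured separately.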
You can most easily identify us by this part of our user agent string:
To help you identify our user agent in your logs, this is an example of BrandVerity's user agent:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:41.0) Gecko/20100101 Firefox/55.0 BrandVerity/1.0 (http://www.brandverity.com/why-is-brandverity-visiting-me)
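If you want to pick these requests out of your access logs, a minimal sketch could look like the following. It assumes the `BrandVerity/<version>` token seen in the example above is the distinguishing part of the string; the function name and pattern are illustrative, not an official matching rule.

```python
import re

# Assumption: the "BrandVerity/<version>" token from the example user agent
# is what distinguishes the crawler from an ordinary browser.
BRANDVERITY_TOKEN = re.compile(r"\bBrandVerity/\d+(?:\.\d+)*")

EXAMPLE_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:41.0) "
    "Gecko/20100101 Firefox/55.0 BrandVerity/1.0 "
    "(http://www.brandverity.com/why-is-brandverity-visiting-me)"
)


def is_brandverity(user_agent: str) -> bool:
    """Return True if the user agent string carries the BrandVerity token."""
    return BRANDVERITY_TOKEN.search(user_agent) is not None


if __name__ == "__main__":
    print(is_brandverity(EXAMPLE_UA))
    print(is_brandverity("Mozilla/5.0 (X11; Linux x86_64) Firefox/55.0"))
```

Keep in mind that any client can send an arbitrary user agent string, so a match identifies the claimed sender rather than proving it.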
Yes. When we do full-page, top-level browser crawls to review content on behalf of our customers, we respect robots.txt. However, we then follow some of the links on those pages to see where they go, and that step does not currently respect robots.txt. We follow those links only to verify the destination URL; we do not gather any resources or look at any content of the landing page.
If our crawling is causing any problems for your site, please contact email@example.com. We are more than happy to work with you to adjust our crawl behavior.
BrandVerity’s product identifier is listed as a bot on the IAB/ABC International Spiders and Bots List. Google Analytics uses this list to filter out bot traffic automatically, so as long as you have Google Analytics set to exclude bot traffic, BrandVerity will be filtered out. If you use a different analytics solution, you can ask the vendor whether they filter out such traffic, or whether they can specifically filter out BrandVerity traffic using the BrandVerity product identifier in the user agent string. You can also subscribe to access the IAB list directly.
BrandVerity’s default crawl rate is a maximum of 34 pages per 30 seconds on a given domain. Additionally, for each page, our crawlers follow a selection of links. To minimize the impact on your servers, these link-follow requests do not fetch page resources. Our crawlers also cache crawl data, so for most domains they rarely reach the request limit.
This rate works well for many publishers, but we are able to adjust that rate if you need us to.
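To put the default limit in capacity-planning terms, the arithmetic below converts 34 pages per 30 seconds into per-minute and per-hour ceilings. It is only a sketch: it assumes the limit is sustained for the whole period, and it does not count the additional link-follow requests.

```python
# Default limit stated above: at most 34 pages per 30 seconds per domain.
MAX_PAGES = 34
WINDOW_SECONDS = 30


def max_pages_in(seconds: float) -> float:
    """Upper bound on full-page crawls in `seconds`, if the limit were sustained."""
    return MAX_PAGES * seconds / WINDOW_SECONDS


if __name__ == "__main__":
    print(max_pages_in(60))    # 68.0 pages per minute
    print(max_pages_in(3600))  # 4080.0 pages per hour
```

These figures are upper bounds under that assumption, not a prediction of actual traffic.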
If you are already blocking BrandVerity crawlers, it is important to unblock those IP addresses. BrandVerity can provide you with a list of IP addresses that appear to be blocked currently if you aren’t already aware of them.
Yes. If you would like us to add a string that is unique to your domain, we can include it in addition to our standard user agent. Simply send BrandVerity the unique string you’d like us to include.
Yes. If you would like us to send custom headers when crawling your domain, simply send BrandVerity the header names and values you’d like us to use.
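On your side, an agreed custom header could then be checked before a request is treated as coming from the crawler. The sketch below is one possible approach; the header name `X-Crawler-Token` and its value are hypothetical placeholders for whatever you arrange with BrandVerity.

```python
import hmac
from typing import Mapping

# Hypothetical header name and value; use whatever you agree on with BrandVerity.
HEADER_NAME = "X-Crawler-Token"
HEADER_VALUE = "example-shared-value"


def has_expected_header(
    headers: Mapping[str, str],
    name: str = HEADER_NAME,
    value: str = HEADER_VALUE,
) -> bool:
    """Return True if the request headers contain the agreed name and value."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): val for key, val in headers.items()}
    received = lowered.get(name.lower(), "")
    # Constant-time comparison avoids leaking the value through timing.
    return hmac.compare_digest(received.encode(), value.encode())


if __name__ == "__main__":
    print(has_expected_header({"x-crawler-token": "example-shared-value"}))
    print(has_expected_header({"User-Agent": "Mozilla/5.0"}))
```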
If you have any questions about our crawling, please don't hesitate to reach out to us at firstname.lastname@example.org.