Our Google crawler collects whatever Google returns for a given search term. If the term has many ads or other sponsored content associated with it, those items appear in the search results alongside the results for your query and end up in the crawled data.
To reduce unwanted ads and duplicate data gathered from Google Search, try using Google power searching techniques (https://sites.google.com/view/togetherlearning/learn/digitalliteracy/powersearching) to build the input you give the crawler. A more specific query can help you get results that match your search more closely.
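As a rough illustration of building such an input, the sketch below assembles a query string from a few common Google search operators: quotes for an exact phrase, a leading minus sign to exclude a word, and `site:` to restrict results to one domain. The function name `build_query`, its parameters, and the example values are hypothetical and not part of the crawler. The sketch only produces the text you would paste in as the search term.

```python
def build_query(terms, exact_phrase=None, exclude=(), site=None):
    """Assemble a Google query string from common search operators.

    Hypothetical helper for illustration; the crawler itself only needs
    the resulting string as its search term.
    """
    parts = []
    if exact_phrase:
        # Quotes ask Google to match the phrase exactly.
        parts.append(f'"{exact_phrase}"')
    # Plain keywords are added unchanged.
    parts.extend(terms)
    # A leading minus sign excludes results containing that word.
    parts.extend(f"-{word}" for word in exclude)
    if site:
        # site: restricts results to a single domain.
        parts.append(f"site:{site}")
    return " ".join(parts)


query = build_query(
    ["reviews"],
    exact_phrase="electric bike",
    exclude=["buy", "coupon"],
    site="example.org",
)
print(query)  # "electric bike" reviews -buy -coupon site:example.org
```

Excluding commercial words such as "buy" narrows the results toward informational pages, but it may not remove every sponsored entry, so it is worth checking a sample of the crawled data after changing the query.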