all 12 comments

[–][deleted]  (2 children)

[deleted]

    [–]In-the-clouds[S]  (1 child)

    You're right. Google is caching your page, so it gets past the Cloudflare captcha. How can Googlebot do that?

    [–]xoenix  (0 children)

    Some Saidit user managed to script their way around Cloudflare, so I suspect Google could do it too.

    [–]Jiminy  (4 children)

    I just tested this by searching some keywords from your subject plus the word saidit, and this thread shows up, so Google can still read the site. I'm sure Cloudflare actually pays Google to show its sites anyway.

    [–]In-the-clouds[S]  (3 children)

    Thanks for testing. I also confirmed that Googlebot is crawling the website.

    Cloudflare claims to prevent bots from accessing the website. We must prove that we are human by clicking a checkbox. (That's enough of a test?) Googlebot... is a bot. How does Googlebot pass this test? Or, how does Cloudflare know that it's Googlebot and let it through the front gate?

    My limited understanding: Cloudflare checks the IP address, the user agent string, and the ASN (Autonomous System Number).
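
    For what it's worth, here is a minimal sketch of how a site could tell a real Googlebot from something that just copies its user agent string. It is not Cloudflare's actual logic, which I have not seen. It follows the check Google documents for site owners: reverse DNS on the IP, confirm the hostname belongs to Google, then forward DNS to confirm it maps back to the same IP. The function name and the injectable lookup parameters are my own, added so the logic can be tried without network access. An ASN check is left out because the Python standard library has no ASN lookup.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_googlebot(ip, user_agent,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True only if the request claims to be Googlebot and DNS agrees.

    Hypothetical helper: the lookup functions are parameters so that they
    can be replaced with fakes. The default forward lookup is IPv4 only.
    """
    # Step 1: the user agent is trivial to fake, so it is only a first filter.
    if "googlebot" not in user_agent.lower():
        return False
    try:
        # Step 2: reverse DNS, IP -> hostname, which must be a Google host.
        hostname = reverse_lookup(ip)[0]
        if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # Step 3: forward DNS, hostname -> IPs, which must include the
        # original IP. This stops someone who controls their own reverse
        # DNS record from simply naming their host "x.googlebot.com".
        addresses = forward_lookup(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (no record, lookup failed).
        return False
    return ip in addresses
```

    Calling `is_verified_googlebot(ip, user_agent)` with the address and header of an incoming request would do live DNS lookups. A bot that only sets its user agent to "Googlebot" fails at step 2 or step 3, which is why the user agent string alone proves nothing.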

    [–]Jiminy  (2 children)

    It's so easy to program a bot to click a box, so the checkbox is definitely not there to stop bots. I'm sure, though, that Cloudflare lets Google do what it wants and in fact pays them to.

    [–]In-the-clouds[S]  (1 child)

    I cannot confirm that Cloudflare pays Google for anything.

    [–]Jiminy  (0 children)

    Yeah, I just mean that it's well known that Google has companies pay to be in their search algorithms. It's advertising. They're being sued right now because they do antitrust things for it and lie to companies. Cloudflare is kind of a big company, so I'm just assuming they pay to advertise as well.

    [–][deleted]  (1 child)

    [removed]

      [–]dissidentrhetoric  (0 children)

      They should block search engines, so that spammers waste their time posting.

      [–]package  (0 children)

      Crawling isn't really a thing anymore; every site is locking down so they can license their datasets.