Web Connection
BINGBOT ACTIVITY
  Mark Randle
  All
  Mar 4, 2017 @ 04:15am

Hi All

Checking the logs for one of my customers, I have noticed an increase in activity from unusual addresses. I looked up the IP addresses and they appear to be bingbot.

The strange thing is that they are using URLs, including querystrings, which are only generated from behind a logon, i.e. when the user is logged in.

Fortunately, every call into the system has a check for a logged-in user before actioning.

My big issue here is that these URLs can only have come from extracting links from a user's browser. Are they really doing this now rather than just crawling?

I suppose what we now need to watch for are any calls that don't check for a valid user before actioning, as Microsoft could be triggering all sorts of actions!

Has anyone seen this happening, and is there any guidance on blocking it?

Thanks

Mark

re: BINGBOT ACTIVITY
  Rick Strahl
  Mark Randle
  Mar 4, 2017 @ 01:27pm

Mark,

I don't think that's what's happening. Although Microsoft surely has access to browser history and navigation from Edge and IE, I doubt they share that information. People would be pretty incensed about that, and it's easy to catch as it has to go out over a network connection. I think we'd know about it, and I seriously doubt that's where those links come from.

My guess is there's a leak in the application somewhere with a back link into the private links, or perhaps somebody shared links somewhere public.

+++ Rick ---

re: BINGBOT ACTIVITY
  Mark Randle
  Rick Strahl
  Mar 5, 2017 @ 04:23am

Hi Rick

That was my initial thought also.

However, when I look at the Session records and request log, the IP address shows Microsoft Bingbot as the owner.

Also, looking at the Browser in the Request, I am seeing this:

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

My main concern is that these aren't random hits to any page. They are fully formed requests that are only possible by clicking links in pages that would have been returned only to a signed-in user.

I just can't get my head round how this is possible unless earlier user activity is being replayed. I don't think it's transferring from the client PC.

The weird thing is that if it came from an IP address belonging to a random user, rather than the couple of IP addresses registered to Microsoft, I might accept foul play.

Looking yesterday, the particular client was closed, yet hundreds of requests were fired at the server, all from Microsoft IP addresses.
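
Since a user-agent string is easy to fake, one way to confirm that an address really is Bingbot is a reverse DNS lookup followed by a forward lookup. The following is a minimal Python sketch, not anything from Web Connection itself. It assumes that genuine Bingbot hosts resolve to names ending in search.msn.com (as Bing's own verification guidance describes); the function name and the two lookup parameters are hypothetical, added so the check can be tried without network access.

```python
import socket


def is_verified_bingbot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a *.search.msn.com host
    and that host resolves back to the same ip (IPv4 only)."""
    # Default to real DNS lookups; callers may pass their own functions.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        # Assumption: genuine Bingbot hostnames end in .search.msn.com
        if not host.lower().rstrip(".").endswith(".search.msn.com"):
            return False
        # The forward lookup guards against a forged reverse (PTR) record.
        return ip in forward_lookup(host)
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False
```

Calling is_verified_bingbot() with one of the logged addresses performs the real lookups. If it returns False, the requests are only claiming to be Bingbot.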

Still confused!

Thanks

Mark

re: BINGBOT ACTIVITY
  Rick Strahl
  Mark Randle
  Mar 5, 2017 @ 02:14pm

The first thing to do is add a robots.txt that excludes any virtuals you don't want included in robot searches.

If it's a legitimate Microsoft bot, it'll respect that, as will other major search engines.
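
As a rough illustration of the robots.txt suggestion, the Python sketch below parses a small rule set with the standard library's urllib.robotparser and checks which URLs a compliant crawler may fetch. The /private/ virtual and the sample URLs are hypothetical placeholders for whatever sits behind the logon. Bear in mind that robots.txt is advisory only, so the logged-in check on each request still has to stay in place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers out of the /private/ virtual.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url):
    """Return True if a compliant crawler named agent may fetch url."""
    return parser.can_fetch(agent, url)
```

With these rules, allowed("bingbot", "/private/report.wc?id=1") is refused while allowed("bingbot", "/index.htm") is permitted.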

+++ Rick ---

re: BINGBOT ACTIVITY
  Harvey Mushman
  Mark Randle
  Mar 5, 2017 @ 07:15pm

It sounds to me like Rick said: someone posted the URL of the page to some public space. Now Bing is trying to learn all it can from the address. The suggestion of adding a robots.txt is a very good next step.

re: BINGBOT ACTIVITY
  Michael Hogan (Ideate Hosting)
  Mark Randle
  Mar 6, 2017 @ 07:36am

In addition to what Rick suggested (which you should DEFINITELY do first), I have found that many browser helper plug-ins share user-accessed pages with their respective search engines. I got bitten by that behavior years ago, when one of my customers complained that their 'private' documents were accessible via a search engine.

I replaced all direct links to files with a Web Connection process that checks credentials before sending files.
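
The idea of gating file delivery behind a credential check can be sketched roughly as follows. This is a Python illustration only, not the actual Web Connection code. The function name, the docs_root folder, and the user_is_logged_in flag are all hypothetical stand-ins for whatever the real application uses.

```python
from pathlib import Path


def resolve_download(docs_root, requested_name, user_is_logged_in):
    """Return the path of the file to send, or None if the request
    must be refused."""
    # Check credentials first, so a leaked or crawled link gets nothing.
    if not user_is_logged_in:
        return None
    root = Path(docs_root).resolve()
    target = (root / requested_name).resolve()
    # Refuse names that escape the document folder (e.g. "../secret.txt").
    if root not in target.parents:
        return None
    # Only hand back real files; missing files and directories are refused.
    return target if target.is_file() else None
```

The request handler would send the file only when a path comes back, and would otherwise redirect to the logon page or return an error. Because the documents are never linked to directly, a crawler that picks up the URL gets nothing without a valid session.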
