Walter Watts
Archon
Posts: 1571 Reputation: 8.61
Just when I thought I was out-they pull me back in
virus: Anyone else seen this behavior on Google
« on: 2004-03-19 21:37:16 »
Google is using some new procedures in its caching. Picking their cached page instead of the real one was always my default, safe choice, but now you can't always trust it. Clicking on the cached page might take you to the "real" site, and all the nasty behavior that can entail. I'm sure they're doing this to try to trim their storage: by their own figures they index around 400 terabytes of data at the lowest estimate and as much as 700 terabytes (depending on exactly what they can "deep crawl"). See below.
Damn them!
Anyone else seen this behavior on Google?
Walter
---------------------------------------------------------------------------
... So when, in the middle of this last, Google claimed to be "caching" my pages, and yet the "cached" versions were able to show my images being loaded, I realised they had made an important change in the way they handle data ...

By their own figures they are currently indexing around 400 terabytes of data at the lowest estimate and as much as 700 terabytes (depending on exactly what they can "deep crawl") ... Treating this data with whatever routines they run (for example via msql or whatever) is not too complex, even if they are running hugely discriminatory algos. However, it is very, very costly in processing power, and up to now the data was indexed, stored and ranked "off net", with the dance reflecting the reintroduction of the treated data into the "publicly accessible index", which is what we call "Google". (I know this horribly simplifies what happens, but otherwise it will get too esoteric for this forum.)

From a practical point of view it would be simpler to at least store the data to be treated in situ (where it already is, on your website's server). That is, why make a copy on Google's hard drives when it can simply treat all spidered sites as "in RAM", by spidering much more intensively and more frequently, and by using more spiders, each with its own function? (Again, I'm simplifying horribly.) This would require less outlay by Google and would actually result in very much faster updates, as it is effectively now "ranking" on the fly ...

Sites with pure HTML would not notice that their "cached" page was now "hotlinked" into their server, and neither would sites using standard Java etc. Side routines (PHP, msql etc.) wouldn't be affected either, as it's not "writing to disc" when it comes by ...

However, where this gets really interesting is that up until now you couldn't build pages in "Flash" etc., because the "bot" couldn't see them and would just skate blindly over the top of them and probably not index the page at all ...
If it's doing what I think it is, it may not now care whether you coded in "Flash", as long as there is the basic minimum of HTML to get you a position ...

I don't have a page currently running .swf ... does anyone? When you click on your "cached" page in Google, do you see your movie? If so, it must be "hotlinked" to your page in real time, using the movie player "you" have installed on your machine to show the movie ... OK.

If this is the case, people searching will shift relatively quickly to the pages which are "interactive", and eventually Google and the other engines will notice the diversion in traffic and rerank accordingly ... Maybe those of us with "picture" or "multimedia" sites will see the difference?
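For anyone who wants to try the check asked about above on their own pages, here is a rough sketch in Python. It takes the HTML of a saved "cached" copy and lists which images, movies, or scripts would actually be fetched from the original server when a browser opens it. The function name hotlinked_resources, the tag list, and the example hosts are illustrative assumptions, not anything Google documents, and it only inspects the markup rather than reproducing what a browser does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tags that load an external resource, and the attribute holding its URL.
# This is an illustrative subset, not a complete list.
RESOURCE_ATTRS = {"img": "src", "embed": "src", "script": "src", "object": "data"}


class ResourceCollector(HTMLParser):
    """Collects resource URLs and any <base href> found in a page."""

    def __init__(self):
        super().__init__()
        self.base = None   # value of the first <base href>, if the page has one
        self.refs = []     # raw (possibly relative) resource URLs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href") and self.base is None:
            self.base = attrs["href"]
        elif tag in RESOURCE_ATTRS and attrs.get(RESOURCE_ATTRS[tag]):
            self.refs.append(attrs[RESOURCE_ATTRS[tag]])


def hotlinked_resources(cached_html, cache_url, origin_host):
    """Return the resource URLs in cached_html that resolve to origin_host.

    cache_url is the address the cached copy was served from; relative
    links resolve against it unless the page declares a <base href>.
    """
    parser = ResourceCollector()
    parser.feed(cached_html)
    base = parser.base or cache_url
    resolved = [urljoin(base, ref) for ref in parser.refs]
    return [url for url in resolved if urlparse(url).hostname == origin_host]
```

If the list comes back non-empty for your own host name, the "cached" copy is only holding the HTML, and the pictures or .swf are being pulled live from your server each time someone views it.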
from Google's group: google.public.support.general