Mysterious Google non-indexing
-
We have a problem with google indexing (not!) the 2003 Congress on Evolutionary Computation web pages (www.cs.adfa.edu.au/cec_2003). That is, they don't appear in google at all. This is despite the pages having been in existence since May 2002, and being fairly heavily linked (www.alltheweb.com returns these pages as the second highest-ranked result for 'cec 2003', and finds 68 external links, including from two top-level domains, www.wcci2002.org and www.cec2003.org).

Can anyone explain:
- why we aren't getting indexed
- most important: what we can do to fix it

Possibly relevant information:

Our whole school web site (www.cs.adfa.edu.au) is not getting indexed very nicely: it is actually indexed under csadfa.cs.adfa.edu.au instead. Some years ago, this was the primary address of our site, but this was fixed quite a while ago, though until recently a query to csadfa.cs.adfa.edu.au would return that as the address (now it will be returned as a redirect to www.cs.adfa.edu.au).

Anyway, the current status is: if you query google with www.cs.adfa.edu.au, it returns 'no information available'; if you query csadfa.cs.adfa.edu.au, it finds the page, but then a query to link:csadfa.cs.adfa.edu.au returns zero links (alltheweb finds around 800 external links to www.cs.adfa.edu.au, so it _ought_ to have a reasonable PR).

We _know_ the pages are being explored, because we can see the googlebot probes. We _think_ that the problems may have arisen because:
- csadfa.cs.adfa.edu.au being the original site, www.cs.adfa.edu.au, having identical information, was treated as a duplicate site
- maybe the redirect from csadfa.cs.adfa.edu.au to www.cs.adfa.edu.au is too recent to have influenced google's indexing yet
- perhaps we will have to specifically request removal of csadfa.cs.adfa.edu.au from google to get things to eventually sort themselves out?
- we're guessing that the zero links to csadfa.cs.adfa.edu.au are because all the 800 external links reference www.cs.adfa.edu.au; since this isn't even indexed, the link counts are lost

What constitutes a reasonable answer:
- a clear explanation of why www.cs.adfa.edu.au/cec_2003 isn't getting indexed, together with either
  - an explanation of how to fix it, or
  - an explanation of why it can't be fixed
- I'd certainly like to know whether we need to request removal of all of csadfa.cs.adfa.edu.au from google to fix up the general problem, but I don't believe this is the only cause of www.cs.adfa.edu.au/cec_2003's non-indexing (because csadfa.cs.adfa.edu.au/cec_2003 isn't being indexed either), so I don't believe that this would count as an answer to the question
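For anyone wanting to look at the redirect described above from the outside, here is a minimal Python sketch, offered as an illustration only. The helper names `head` and `describe` are hypothetical, and the sketch assumes plain HTTP on port 80. It asks a host for one page without following redirects and reports the status code and `Location` header, which is roughly what a crawler's first request would see.

```python
import http.client


def head(host, path="/"):
    """Send one HEAD request and return (status, Location header or None).

    http.client does not follow redirects, so this reports the first
    response exactly as the server sent it.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path, headers={"User-Agent": "redirect-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Turn a status code and Location header into a one-line summary."""
    if status in (301, 308):
        return f"permanent redirect to {location}"
    if status in (302, 303, 307):
        return f"temporary redirect to {location}"
    if 200 <= status < 300:
        return "served directly, no redirect"
    return f"unexpected status {status}"


# Example use (needs network access; the hosts may no longer respond):
# for host in ("csadfa.cs.adfa.edu.au", "www.cs.adfa.edu.au"):
#     print(host, describe(*head(host, "/cec_2003")))
```

Comparing the two hosts, and the same path with and without a trailing slash, shows whether each request is answered directly or redirected, and whether the redirect is permanent or temporary.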
-
Answer:
Hi, cec2003!

There are certainly some things that you can do to get your site indexed by Search Engines, including Google. HOWEVER, ***your first priority should be to get the University website's Redirect problem straightened out***.

First, I took a quick look at the links that you provided above. I viewed your portal page, and did a "View Source" to open Notepad with the html code for this page. I also clicked on each of the links to the "full" and "text only" versions to take a quick look at those.

After that, one of the first things I always do when I analyze a website to suggest improvements is to run it through the Search Engine Spider Simulator at SearchEngineWorld:
http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

Plugging your URL http://www.cs.adfa.edu.au/cec_2003 into the Simulator provides some REALLY disturbing results.

--------------------------------------------------------------
Spider url: http://www.cs.adfa.edu.au/cec_2003
Spider title: Congress on Evolutionary Computation 2003
Spider meta desc: No description available.
Spider meta keywords:
Spider Text: The Congress on Evolutionary Computation, one of the leading international conferences in the field, will be held in Canberra, Australia, 8th - 12th December 2003. CEC 2003 logo Enter full website Enter text only website The full website requres a CSS aware browser. Recommended oldest browsers: Internet Explorer v6.0, Netscape v6.1, Mozilla v1.1, Opera v6.0
--------------------------------------------------------------
Spidered Links
Link
http://www.cs.adfa.edu.au/home.html
http://www.cs.adfa.edu.au/textonly.html
--------------------------------------------------------------

This is really strange! The html code showing in my Notepad window shows that you have specified a Description and Keywords, so why doesn't the Simulator see them?
<meta name="description" content=" The Congress on Evolutionary Computation, one of the leading international conferences in the field, will be held in Canberra, Australia, 8th - 12th December 2003." />
<meta name="keywords" content="evolutionary computation, conference, computer science, adfa" />

Do you suppose that the Simulator is being redirected to an earlier version of your page somewhere???

Now, this is even weirder -- the two links listed by the Simulator:

http://www.cs.adfa.edu.au/home.html
http://www.cs.adfa.edu.au/textonly.html

are NOT the same URLs as the two links on your page when I "mouseover" them!

http://www.cs.adfa.edu.au/cec_2003/home.html
http://www.cs.adfa.edu.au/cec_2003/textonly.html

Now since the links specified in your html source code are relative links:

<a href="home.html">
<a href="textonly.html">

it seems clear that the Simulator thinks that the Home page it is looking at has the URL:

http://www.cs.adfa.edu.au

and NOT:

http://www.cs.adfa.edu.au/cec_2003 !

After dozens of uses, this is the first time I have EVER seen the Simulator have this kind of a problem. It seems apparent that there is something going seriously wrong with the Redirect for your University's website.

While human requests via a browser show the site correctly, what the Googlebot sees is, according to the Simulator, a page with no description, no keywords, the text:

"The Congress on Evolutionary Computation, one of the leading international conferences in the field, will be held in Canberra, Australia, 8th - 12th December 2003. CEC 2003 logo Enter full website Enter text only website The full website requres a CSS aware browser. Recommended oldest browsers: Internet Explorer v6.0, Netscape v6.1, Mozilla v1.1, Opera v6.0"

and 2 DEAD links.

No wonder you're not getting indexed. The Googlebot doesn't think that you have any content on your site.
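As a rough illustration of what a spider does with the portal page, here is a minimal Python sketch. It is not the Simulator's actual code. The `PageSummary` class, the `summarize` function and the `SAMPLE` markup are hypothetical; only the two meta tags and the two relative links are taken from the page source quoted above. It also shows one possible reason for the odd link targets, offered as an assumption rather than a diagnosis: a relative link resolved against a base URL with no trailing slash replaces the last path segment.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical reconstruction of the portal page: only the meta tags and
# the two relative links come from the quoted source.
SAMPLE = """<html><head>
<title>Congress on Evolutionary Computation 2003</title>
<meta name="description" content=" The Congress on Evolutionary Computation, one of the leading international conferences in the field, will be held in Canberra, Australia, 8th - 12th December 2003." />
<meta name="keywords" content="evolutionary computation, conference, computer science, adfa" />
</head><body>
<a href="home.html">Enter full website</a>
<a href="textonly.html">Enter text only website</a>
</body></html>"""


class PageSummary(HTMLParser):
    """Collect the title, named meta tags and link targets of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.hrefs = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Store each named meta tag with surrounding whitespace trimmed.
            self.meta[attrs["name"].lower()] = (attrs.get("content") or "").strip()
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(html, base_url):
    """Return (title, meta dict, links resolved against base_url)."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    links = [urljoin(base_url, href) for href in parser.hrefs]
    return parser.title.strip(), parser.meta, links


# Resolve the same relative links against the URL with and without a
# trailing slash and print the resulting absolute URLs.
for base in ("http://www.cs.adfa.edu.au/cec_2003",
             "http://www.cs.adfa.edu.au/cec_2003/"):
    title, meta, links = summarize(SAMPLE, base)
    print(base, "->", links)
```

With the base http://www.cs.adfa.edu.au/cec_2003 the links resolve to http://www.cs.adfa.edu.au/home.html and http://www.cs.adfa.edu.au/textonly.html, the same targets the Simulator listed; with a trailing slash they resolve under /cec_2003/. Whether the Simulator, or the server's redirect, actually treats the slash this way is not something this sketch can confirm.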
Before you do anything else, you need to have a conference with your University's web administrator, and have them figure out why this is happening. Because you state that "Our whole school web site (www.cs.adfa.edu.au) is not getting indexed very nicely: it is actually indexed under csadfa.cs.adfa.edu.au instead", I would have to say that this problem is not just bolloxing up your Conference site, it is bolloxing up ALL the University's sites, making it a VERY serious problem that needs to be resolved as quickly as possible.

Temporary fixes to help bridge the gap until your Redirect problem is resolved:

Instead of your portal page, try submitting the URL http://www.cs.adfa.edu.au/cec_2003/home.html to Google. The Search Engine Spider Simulator seems to be able to crawl that page correctly, and the links to your other pages all appear to be valid.

Also, try replacing the relative links in http://www.cs.adfa.edu.au/cec_2003

<a href="home.html">
<a href="textonly.html">

with absolute links:

<a href="http://www.cs.adfa.edu.au/cec_2003/home.html">
<a href="http://www.cs.adfa.edu.au/cec_2003/textonly.html">

(While you're at it, correct "The full website requres a CSS aware browser." to "requires".)

Once you have gotten that pesky redirect problem resolved (and verified that it's resolved by getting the correct results from the Simulator), here are some things that you can do to help your site get indexed as quickly as possible:

According to Google, 12 sites that it has indexed have links to your Conference Page:
http://www.google.com/search?q=%22%2Bwww.cs.adfa.edu.au/cec_2003%22&num=100&hl=en&lr=&ie=UTF-8&oe=UTF-8&c2coff=1&safe=off&filter=0
and their Page Rank scores are: 7,2,5,6,5,3,3,2,2,4,4,0

This is EXCELLENT. Google will index your site for "backward links" (links from other sites) if you have 3 or more such sites with a Page Rank of 4 or higher -- and you have 6 such sites going for you right now!
Furthermore, if you can get the Googlebot to "see" your site, you should end up with a good Page Rank because of this, which will place you high in Google's Search Results.

Submit your site to the Googlebot again:
http://www.google.com/addurl.html

Now, you should also submit your site to be listed in the DMOZ ODP (Directory Mozilla - Open Directory Project) at http://www.dmoz.org . There are two reasons for this:

1) Because DMOZ is humanly edited (each URL is personally checked out for legitimacy before being added to the Directory), a lot of crap that makes it into Search Engine Results never makes it into DMOZ. For this reason, many people rely heavily on the ODP as a guide to quality websites, so you really want to be listed in it.

2) Also for that reason, the Googlebot uses (among other things) the ODP as a kind of "checklist" for what sites to crawl when it indexes the web. So if you can get listed in the ODP, your chances of being indexed -- and of obtaining a high Google Page Rank -- are GREATLY increased.

My guess as to where you would choose to be submitted for cataloging would be:

Computers > Computer Science > Conferences > 2003
http://dmoz.org/cgi-bin/add.cgi?where=Computers/Computer_Science/Conferences/2003

You mentioned that your site is doing well on AllTheWeb ( http://www.alltheweb.com ).
If you have not already been submitted and listed in the following Search Engines, you may wish to do that as well:

HotBot:
http://ldbreg.lycos.com/cgi-bin/mayaLogin?m_PR=29&m_CBURL=http://insite.lycos.com/searchservices/lite?step1.asp

AltaVista:
http://addurl.altavista.com/addurl/new

Zeal (LookSmart/MSN free submission w/free registration):
http://www.zeal.com/users/register.jhtml

For more information on developing a Google-friendly website, I recommend that you study the information in Google's Help Department:
http://www.google.com/webmasters

Guidelines
http://www.google.com/webmasters/guidelines.html

Facts & Fiction (myths dispelled)
http://www.google.com/webmasters/facts.html

Search Engine Optimization (SEO)
http://www.google.com/webmasters/seo.html

Frequently Asked Questions (FAQ)
http://www.google.com/webmasters/faq.html

User Support Discussion Forum
http://groups.google.com/groups?q=google.public.support.general

Another fabulous resource is the forum at WebmasterWorld.com:
http://www.webmasterworld.com
and at Search Engine World:
http://www.searchengineworld.com

I encourage you to visit these sites and learn more about making your site attractive and friendly to Search Engines.

Before Rating my Answer, if you have any questions about this information, please post a Request for Clarification, and I will be glad to see what I can do for you.

I hope that this Answer provides exactly the information that you needed! Best wishes for a quick resolution to your Redirect problem, and greatly increased traffic to your Conference website!

Regards,
aceresearcher
cec2003-ga at Google Answers