To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots (usually ). When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed and will instruct the robot as to which pages are not to be crawled. As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled. Pages typically prevented from being crawled include login specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.
So: how to proceed? On the one hand, SEO best practices recommend that you include relevant keywords in a number of high-attention areas on your site, everywhere from the titles and body text of your pages to your URLs to your meta tags to your image file names. On the other hand, successfully optimized websites tend to have thousands or even millions of keywords. You can't very well craft a single, unique page for every one of your keywords; at the same time, you can't try to cram everything onto a handful of pages with keyword stuffing and expect to rank for every individual keyword. It just doesn't work that way.
OBSERVATION – You can have the content and the links – but if your site falls short on even a single user satisfaction signal (even if it is picked up by the algorithm, and not a human reviewer) then your rankings for particular terms could collapse – OR – rankings can be held back – IF Google thinks your organisation, with its resources, or ‘reputation, should be delivering a better user experience to users.
Page and Brin founded Google in 1998. Google attracted a loyal following among the growing number of Internet users, who liked its simple design. Off-page factors (such as PageRank and hyperlink analysis) were considered as well as on-page factors (such as keyword frequency, meta tags, headings, links and site structure) to enable Google to avoid the kind of manipulation seen in search engines that only considered on-page factors for their rankings. Although PageRank was more difficult to game, webmasters had already developed link building tools and schemes to influence the Inktomi search engine, and these methods proved similarly applicable to gaming PageRank. Many sites focused on exchanging, buying, and selling links, often on a massive scale. Some of these schemes, or link farms, involved the creation of thousands of sites for the sole purpose of link spamming.
Sometimes I think if your titles are spammy, your keywords are spammy, and your meta description is spammy, Google might stop right there – even they probably will want to save bandwidth at some time. Putting a keyword in the description won’t take a crap site to number 1 or raise you 50 spots in a competitive niche – so why optimise for a search engine when you can optimise for a human? – I think that is much more valuable, especially if you are in the mix already – that is – on page one for your keyword.
But Google isn’t the only reason why keywords are important. Actually, it’s less important, because you should always focus on the user: on your visitors and potential clients. With SEO you want people to land on your website when using a certain search term or keyword. You need to get into the heads of your audience and use the words they use when they are searching.
Repeat this exercise for as many topic buckets as you have. And remember, if you're having trouble coming up with relevant search terms, you can always head on over to your customer-facing colleagues -- those who are in Sales or Service -- and ask them what types of terms their prospects and customers use, or common questions they have. Those are often great starting points for keyword research.
Google is a link-based search engine. Google doesn’t need content to rank pages but it needs content to give to users. Google needs to find content and it finds content by following links just like you do when clicking on a link. So you need first to make sure you tell the world about your site so other sites link to yours. Don’t worry about reciprocating to more powerful sites or even real sites – I think this adds to your domain authority – which is better to have than ranking for just a few narrow key terms.
QUOTE: “Tell visitors clearly that the page they’re looking for can’t be found. Use language that is friendly and inviting. Make sure your 404 page uses the same look and feel (including navigation) as the rest of your site. Consider adding links to your most popular articles or posts, as well as a link to your site’s home page. Think about providing a way for users to report a broken link. No matter how beautiful and useful your custom 404 page, you probably don’t want it to appear in Google search results. In order to prevent 404 pages from being indexed by Google and other search engines, make sure that your webserver returns an actual 404 HTTP status code when a missing page is requested.” Google, 2018
QUOTE: “I’ve got a slide here where I show I think 8 different URLs you know every single one of these URLs could return completely different content in practice we as humans whenever we look at ‘www.example.com’ or just regular ‘example.com’ or example.com/index or example.com/home.asp we think of it as the same page and in practice it usually is the same page so technically it doesn’t have to be but almost always web servers will return the same content for like these 8 different versions of the URL so that can cause a lot of problems in search engines if rather than having your backlinks all go to one page instead it’s split between (the versions) and it’s a really big headache….how do people fix this well …. the canonical link element” Matt Cutts, Google
It's important to check that you have a mix of head terms and long-tail terms because it'll give you a keyword strategy that's well balanced with long-term goals and short-term wins. That's because head terms are generally searched more frequently, making them often (not always, but often) much more competitive and harder to rank for than long-tail terms. Think about it: Without even looking up search volume or difficulty, which of the following terms do you think would be harder to rank for?
Use common sense – Google is a search engine – it is looking for pages to give searchers results, 90% of its users are looking for information. Google itself WANTS the organic results full of information. Almost all websites will link to relevant information content so content-rich websites get a lot of links – especially quality links. Google ranks websites with a lot of links (especially quality links) at the top of its search engines so the obvious thing you need to do is ADD A LOT of INFORMATIVE CONTENT TO YOUR WEBSITE.
However, that’s totally impractical for established sites with hundreds of pages, so you’ll need a tool to do it for you. For example, with SEMRush, you can type your domain into the search box, wait for the report to run, and see the top organic keywords you are ranking for. Or, use their keyword position tracking tool to track the exact keywords you’re trying to rank for.
Google and Bing use a crawler (Googlebot and Bingbot) that spiders the web looking for new links to find. These bots might find a link to your homepage somewhere on the web and then crawl and index the pages of your site if all your pages are linked together. If your website has an XML sitemap, for instance, Google will use that to include that content in its index. An XML sitemap is INCLUSIVE, not EXCLUSIVE. Google will crawl and index every single page on your site – even pages out with an XML sitemap.
I think it makes sense to have unique content as much as possible on these pages but it’s not not going to like sync the whole website if you don’t do that we don’t penalize a website for having this kind of deep duplicate content and kind of going back to the first thing though with regards to doorway pages that is something I definitely look into to make sure that you’re not running into that so in particular if this is like all going to the same clinic and you’re creating all of these different landing pages that are essentially just funneling everyone to the same clinic then that could be seen as a doorway page or a set of doorway pages on our side and it could happen that the web spam team looks at that and says this is this is not okay you’re just trying to rank for all of these different variations of the keywords and the pages themselves are essentially all the same and they might go there and say we need to take a manual action and remove all these pages from search so that’s kind of one thing to watch out for in the sense that if they are all going to the same clinic then probably it makes sense to create some kind of a summary page instead whereas if these are going to two different businesses then of course that’s kind of a different situation it’s not it’s not a doorway page situation.”
The reality in 2019 is that if Google classifies your duplicate content as THIN content, or MANIPULATIVE BOILER-PLATE or NEAR DUPLICATE ‘SPUN’ content, then you probably DO have a severe problem that violates Google’s website performance recommendations and this ‘violation’ will need ‘cleaned’ up – if – of course – you intend to rank high in Google.
Google will INDEX perhaps 1000s of characters in a title… but I don’t think anyone knows exactly how many characters or words Google will count AS a TITLE TAG when determining RELEVANCE OF A DOCUMENT for ranking purposes. It is a very hard thing to try to isolate accurately with all the testing and obfuscation Google uses to hide it’s ‘secret sauce’. I have had ranking success with longer titles – much longer titles. Google certainly reads ALL the words in your page title (unless you are spamming it silly, of course).
It’s important to note that entire websites don’t rank for keywords — pages do. With big brands, we often see the homepage ranking for many keywords, but for most websites this isn’t usually the case. Many websites receive more organic traffic to pages other than the homepage, which is why it’s so important to diversify your website’s pages by optimizing each for uniquely valuable keywords.
In March 2006, KinderStart filed a lawsuit against Google over search engine rankings. KinderStart's website was removed from Google's index prior to the lawsuit, and the amount of traffic to the site dropped by 70%. On March 16, 2007, the United States District Court for the Northern District of California (San Jose Division) dismissed KinderStart's complaint without leave to amend, and partially granted Google's motion for Rule 11 sanctions against KinderStart's attorney, requiring him to pay part of Google's legal expenses.
Did you know that nearly 60% of the sites that have a top ten Google search ranking are three years old or more? Data from an Ahrefs study of two million pages suggests that very few sites less than a year old achieve that ranking. So if you’ve had your site for a while, and have optimized it using the tips in this article, that’s already an advantage.
Moz Keyword Explorer - Input a keyword in Keyword Explorer and get information like monthly search volume and SERP features (like local packs or featured snippets) that are ranking for that term. The tool extracts accurate search volume data by using live clickstream data. To learn more about how we're producing our keyword data, check out Announcing Keyword Explorer.
QUOTE: “Keyword Stuffed” Main Content Pages may be created to lure search engines and users by repeating keywords over and over again, sometimes in unnatural and unhelpful ways. Such pages are created using words likely to be contained in queries issued by users. Keyword stuffing can range from mildly annoying to users, to complete gibberish. Pages created with the intent of luring search engines and users, rather than providing meaningful MC to help users, should be rated Lowest.” Google Search Quality Evaluator Guidelines 2017
Some page titles do better with a call to action – a call to action which reflects exactly a searcher’s intent (e.g. to learn something, or buy something, or hire something. THINK CAREFULLY before auto-generating keyword phrase footprints across a site using boiler-plating and article spinning techniques. Remember this is your hook in search engines, if Google chooses to use your page title in its search snippet, and there is a lot of competing pages out there in 2019.
I do not obsess about site architecture as much as I used to…. but I always ensure my pages I want to be indexed are all available from a crawl from the home page – and I still emphasise important pages by linking to them where relevant. I always aim to get THE most important exact match anchor text pointing to the page from internal links – but I avoid abusing internals and avoid overtly manipulative internal links that are not grammatically correct, for instance..
Try and get links within page text pointing to your site with relevant, or at least, natural looking, keywords in the text link – not, for instance, in blogrolls or site-wide links. Try to ensure the links are not obviously “machine generated” e.g. site-wide links on forums or directories. Get links from pages, that in turn, have a lot of links to them, and you will soon see benefits.
Being ‘relevant’ comes down to keywords & key phrases – in domain names, URLs, Title Elements, the number of times they are repeated in text on the page, text in image alt tags, rich markup and importantly in keyword links to the page in question. If you are relying on manipulating hidden elements on a page to do well in Google, you’ll probably trigger spam filters. If it is ‘hidden’ in on-page elements – beware relying on it too much to improve your rankings.
Some search engines have also reached out to the SEO industry, and are frequent sponsors and guests at SEO conferences, webchats, and seminars. Major search engines provide information and guidelines to help with website optimization. Google has a Sitemaps program to help webmasters learn if Google is having any problems indexing their website and also provides data on Google traffic to the website. Bing Webmaster Tools provides a way for webmasters to submit a sitemap and web feeds, allows users to determine the "crawl rate", and track the web pages index status.