Webmasters and content providers began optimizing websites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all webmasters only needed to submit the address of a page, or URL, to the various engines which would send a "spider" to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed. The process involves a search engine spider downloading a page and storing it on the search engine's own server. A second program, known as an indexer, extracts information about the page, such as the words it contains, where they are located, and any weight for specific words, as well as all links the page contains. All of this information is then placed into a scheduler for crawling at a later date.
QUOTE: “Content which is copied, but changed slightly from the original. This type of copying makes it difficult to find the exact matching original source. Sometimes just a few words are changed, or whole sentences are changed, or a “find and replace” modification is made, where one word is replaced with another throughout the text. These types of changes are deliberately done to make it difficult to find the original source of the content. We call this kind of content “copied with minimal alteration.” Google Search Quality Evaluator Guidelines March 2017
Don’t be a website Google won’t rank – What Google classifies your site as – is perhaps the NUMBER 1 Google ranking factor not often talked about – whether it Google determines this algorithmically or eventually, manually. That is – whether it is a MERCHANT, an AFFILIATE, a RESOURCE or DOORWAY PAGE, SPAM, or VITAL to a particular search – what do you think Google thinks about your website? Is your website better than the ones in the top ten of Google now? Or just the same? Ask, why should Google bother ranking your website if it is just the same, rather than why it would not because it is just the same…. how can you make yours different. Better.
QUOTE: “Over time, we’ve seen sites try to maximize their “search footprint” without adding clear, unique value. These doorway campaigns manifest themselves as pages on a site, as a number of domains, or a combination thereof. To improve the quality of search results for our users, we’ll soon launch a ranking adjustment to better address these types of pages. Sites with large and well-established doorway campaigns might see a broad impact from this change.” Google 2015
In a few cases, see what happens if you make more risky changes. I’m working with a website that wasn’t even in the top 100 positions for many of its 20 strategic keywords. Based on some data, it looked like the client’s sweet spot for keywords may be in the 10 to 30 range for average search value. We targeted one phrase with 700 searches a month. It’s now ranking No. 12 on Google after making two sets of SEO changes on one page. Ultimately, the client may need a new page to grab a spot among the top 10 positions.
QUOTE: “Ultimately, you just want to have a really great site people love. I know it sounds like a cliché, but almost [all of] what we are looking for is surely what users are looking for. A site with content that users love – let’s say they interact with content in some way – that will help you in ranking in general, not with Panda. Pruning is not a good idea because with Panda, I don’t think it will ever help mainly because you are very likely to get Panda penalized – Pandalized – because of low-quality content…content that’s actually ranking shouldn’t perhaps rank that well. Let’s say you figure out if you put 10,000 times the word “pony” on your page, you rank better for all queries. What Panda does is disregard the advantage you figure out, so you fall back where you started. I don’t think you are removing content from the site with potential to rank – you have the potential to go further down if you remove that content. I would spend resources on improving content, or, if you don’t have the means to save that content, just leave it there. Ultimately people want good sites. They don’t want empty pages and crappy content. Ultimately that’s your goal – it’s created for your users.” Gary Illyes, Google 2017
A breadcrumb is a row of internal links at the top or bottom of the page that allows visitors to quickly navigate back to a previous section or the root page. Many breadcrumbs have the most general page (usually the root page) as the first, leftmost link and list the more specific sections out to the right. We recommend using breadcrumb structured data markup28 when showing breadcrumbs.
To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots (usually ). When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed and will instruct the robot as to which pages are not to be crawled. As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled. Pages typically prevented from being crawled include login specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.
Being ‘relevant’ comes down to keywords & key phrases – in domain names, URLs, Title Elements, the number of times they are repeated in text on the page, text in image alt tags, rich markup and importantly in keyword links to the page in question. If you are relying on manipulating hidden elements on a page to do well in Google, you’ll probably trigger spam filters. If it is ‘hidden’ in on-page elements – beware relying on it too much to improve your rankings.
QUOTE: “Supplementary Content contributes to a good user experience on the page, but does not directly help the page achieve its purpose. SC is created by Webmasters and is an important part of the user experience. One common type of SC is navigation links which allow users to visit other parts of the website. Note that in some cases, content behind tabs may be considered part of the SC of the page.” Google Search Quality Evaluator Guidelines 2017
Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag or index files in engines like ALIWEB. Meta tags provide a guide to each page's content. Using metadata to index pages was found to be less than reliable, however, because the webmaster's choice of keywords in the meta tag could potentially be an inaccurate representation of the site's actual content. Inaccurate, incomplete, and inconsistent data in meta tags could and did cause pages to rank for irrelevant searches.[dubious – discuss] Web content providers also manipulated some attributes within the HTML source of a page in an attempt to rank well in search engines. By 1997, search engine designers recognized that webmasters were making efforts to rank well in their search engine, and that some webmasters were even manipulating their rankings in search results by stuffing pages with excessive or irrelevant keywords. Early search engines, such as Altavista and Infoseek, adjusted their algorithms to prevent webmasters from manipulating rankings.