
Friday, September 26, 2014

SEO: How to Identify Low Quality Links

Links are the lifeblood of organic search. But the quality of those links can boost or kill a site’s rankings. This article suggests methods for determining the quality of inbound links to your site. At the end of the article, I’ve attached a downloadable Excel spreadsheet to help you evaluate those links.
Importance of Links

Search engine algorithms have traditionally relied heavily on links as a measure of a site’s worthiness to rank. After all, links are, essentially, digital endorsements from the linking site as to the value of the site to which it is linking.

Google was founded on this concept of links indicating value. In addition to the relevance signals that other engines used in their algorithms, Google added PageRank, a method of calculating ranking value similar to the way that citations in the scientific community can indicate the value of a piece of research.

When site owners began creating artificial methods of increasing the number of links pointing to their sites to improve their rankings, the search engines retaliated with link quality measures. Google’s Penguin algorithm is one such algorithmic strike intended to remove the ranking benefit sites can derive from poor quality links.

What Makes a Low Quality Link?

Unfortunately, the definition of a poor quality link is murky. Poor quality links come from poor quality sites. Poor quality sites tend to break the guidelines set by the search engines. Those guidelines increasingly require that sites have unique content that real people would get real value from. That’s pretty subjective coming from companies (search engines) whose algorithms are based on rules and data.

The “unique” angle is easy to ascertain: If the content on a site is scraped, borrowed, or lightly repurposed, it is not unique. If the site is essentially a mashup of information available from many other sources with no additional value added, it is not unique. Thus, if links come from a site that does not have unique content — i.e., a site considered low quality — those links would be low quality as well.

Search engines can identify unique content easily because they have records of every bit of content they’ve crawled. Comparing bits and bytes to find copies is just a matter of computing power and time. For site owners, it’s more difficult and requires manual review of individual sites.
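
To make the idea concrete, here is a minimal sketch of one way duplicate content can be detected at small scale, using word shingles and Jaccard similarity. This is an illustration only, not how any search engine actually does it at web scale, and the example strings are invented.

```python
# Minimal sketch of duplicate-content detection via word shingles.
# Illustration only; search engines use far more sophisticated (and
# proprietary) techniques at web scale.

def shingles(text, n=5):
    """Return the set of n-word shingles in a piece of text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(text_a, text_b, n=5):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0).
    A high score suggests scraped or lightly repurposed content."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

original = "Links are the lifeblood of organic search and link quality matters."
copy = "Links are the lifeblood of organic search and link quality matters a lot."
print(round(similarity(original, copy, n=3), 2))  # roughly 0.67 here
```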

There are other known indicators of low-quality sites as well, such as overabundance of ads at the top of the page, interlinking with low-quality sites, and presence of keyword stuffing and other spam tactics. Again, many of these indicators are difficult to analyze in any scalable fashion. They remain confusing to site owners.

In the absence of hard data to measure link and site quality in a scalable way, search engine optimization professionals can use a variety of data sources that may correlate with poor site quality. Examining those data sources together can identify which sites are likely to cause link quality issues for your site’s link profile.

Data such as Google toolbar PageRank, Alexa rankings, Google indexation and link counts, and other automatable data points are unreliable at best for determining quality. In most cases, I wouldn’t even bother looking at some of these data points. However, because link quality data and SEO performance metrics for other sites are not available publicly, we need to make do with what we can collect.

These data should be used to identify potential low-quality sites and links, but not as an immediate indicator of which sites to disavow or request link removal. As we all know, earning links is hard even when you have high quality content, especially for new sites. It’s very possible that some of the sites that look poor quality based on the data signals we’ll be collecting are really just new high-quality sites, or sites that haven’t done a good job of promoting themselves yet.

While a manual review is still the only way to determine site and link quality, these data points can help determine which sites should be flagged for manual review.

A couple of reports can provide a wealth of information to sort and correlate. Receiving poor marks in several of the data types could indicate a poor quality site.

Google Webmaster Tools reports the top 1,000 domains that link to pages on your site. “Links” refers to the total number of links that domain has created pointing to any page on your site. “Linked Pages” refers to the number of pages that domain has linked to. So a domain may link to 10 pages on your site, but those links are on every page of its own site. If the linking site has 100 pages, that’s 1,000 “links” to 10 “linked pages.”

You can also download the report that shows a large sample of the exact pages linking to your site. In some cases the links come from domains not listed in the Link Domain Report, so you may want to add the domains from this report as well.

Red flags. Generally, higher numbers of “links” and “linked pages” indicate that the domain is a poor-quality site.
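
One way to operationalize this red flag, using the 1,000-links-to-10-pages example above, is to look at the ratio of “links” to “linked pages” for each domain in the report. The sketch below assumes you have exported the report to a list of records; the field names and the threshold are hypothetical.

```python
# Hypothetical red-flag check on the linking-domains report. A very high
# ratio of "links" to "linked pages" usually means sitewide (footer or
# sidebar) links, which deserve a closer look. The threshold is arbitrary.

def flag_sitewide_linkers(domains, ratio_threshold=50):
    """domains: list of dicts with 'domain', 'links', and 'linked_pages' keys."""
    flagged = []
    for d in domains:
        linked_pages = max(d["linked_pages"], 1)  # avoid division by zero
        ratio = d["links"] / linked_pages
        if ratio >= ratio_threshold:
            flagged.append((d["domain"], round(ratio, 1)))
    return flagged

report = [
    {"domain": "example-blog.com", "links": 12, "linked_pages": 4},
    {"domain": "spammy-directory.net", "links": 1000, "linked_pages": 10},
]
print(flag_sitewide_linkers(report))  # [('spammy-directory.net', 100.0)]
```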

The SEO Tools plugin turns Excel into an SEO data collector, enabling you to enter formulas that gather data from various websites.

What to use. For link quality I typically use the following metrics; a rough scoring sketch follows the list.
  • Home page Google PageRank. Shows Google toolbar PageRank, which is only updated every three months and may not show accurate data, but it is useful as a relative comparison. Higher numbers are better.
  • Google indexation. The number of pages Google reports as indexed for the domain. The count reported by Google is widely believed to be a fraction of the actual number, but it’s useful as a relative comparison. It’s the same as doing a site:domain.com search. Higher numbers are better.
  • Google link count. The number of links pointing to a domain according to Google. Wildly underreported, but just barely useful as a relative comparison. Same as doing a link:domain.com search. Higher numbers are better.
  • Alexa Reach. The number of Alexa toolbar users that visit the domain in a day. Higher numbers are better.
  • Alexa Link Count. The number of links to the domain according to Alexa’s data. Higher numbers are better.
  • Wikipedia entries. The number of times the domain is mentioned in Wikipedia. Higher numbers are better.
  • Facebook Likes. The number of Facebook Likes for the domain. Higher numbers are better.
  • Twitter count. The number of Twitter mentions for the domain. Higher numbers are better.
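
Once these metrics are collected, a crude way to combine them is to count how many signals look weak for each domain and flag the worst offenders for manual review. The sketch below is a rough illustration: the field names and thresholds are invented, and a high count should trigger a manual look, never an automatic disavow.

```python
# Hypothetical scoring of the metrics listed above. Column names and
# thresholds are illustrative; the idea is simply "low marks across several
# signals = flag for manual review".

REVIEW_CHECKS = {
    "pagerank":        lambda v: v <= 1,     # toolbar PageRank
    "google_indexed":  lambda v: v < 50,     # pages indexed (site: count)
    "google_links":    lambda v: v < 10,     # link: count
    "alexa_reach":     lambda v: v < 100,    # daily toolbar visitors
    "alexa_links":     lambda v: v < 10,
    "wikipedia":       lambda v: v == 0,
    "facebook_likes":  lambda v: v < 5,
    "twitter_count":   lambda v: v < 5,
}

def review_score(metrics):
    """Count how many signals look weak for one domain's metrics dict."""
    return sum(1 for key, is_weak in REVIEW_CHECKS.items()
               if is_weak(metrics.get(key, 0)))

domain_metrics = {"pagerank": 0, "google_indexed": 12, "google_links": 3,
                  "alexa_reach": 40, "alexa_links": 2, "wikipedia": 0,
                  "facebook_likes": 1, "twitter_count": 0}
print(review_score(domain_metrics))  # 8 -> strong candidate for manual review
```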

Cautions. Every cell in the spreadsheet will execute a query to another server. If you have many rows of data, this plugin can cause Excel to stop responding, and you’ll have to force-quit it from your task manager. I recommend the following steps; a script-based alternative is sketched after the list.
  • Turn on manual calculation in the Formulas menu: Formulas > Calculation > Calculate Manually. This prevents Excel from executing the formulas every time you press enter, and will save a lot of time and frustration. Formulas will only execute when you save the document or click Calculate Now in the aforementioned options menu.
  • Paste the formulas down one column at a time in groups of 50 to 100. It seems to respond better when the new formulas are all of the same type (only Alexa Reach data, for example) than if you try to execute multiple types of data queries at once.
  • Use Paste Special. When a set of data is complete, copy it and do a Paste Special right over the same cells. That removes the formulas so they don’t have to execute again. I’d leave the formulas in the top row so you don’t have to recreate them all if you need to add more domains later.
  • Use a PC if you can, because Apple computers tend to stall out more quickly with this plugin.
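
If Excel keeps locking up, the same principle — small batches, one metric at a time, with pauses — can be applied outside Excel with a short script and the results pasted back into the spreadsheet. This is an alternative workflow, not part of the plugin; the fetch function below is a stand-in for whatever lookup you use.

```python
# Alternative to hand-pasting formulas: the same "small batches, one metric
# at a time" principle applied in a script. fetch_metric is a stand-in for
# whatever lookup you use; the batch size and pause are the point.

import time

def collect_in_batches(domains, fetch_metric, batch_size=50, pause_seconds=10):
    """Call fetch_metric(domain) for each domain, pausing between batches
    so you don't hammer the data source (or your own machine)."""
    results = {}
    for start in range(0, len(domains), batch_size):
        for domain in domains[start:start + batch_size]:
            try:
                results[domain] = fetch_metric(domain)
            except Exception:
                results[domain] = None  # keep going; fill gaps later
        if start + batch_size < len(domains):
            time.sleep(pause_seconds)
    return results

# Usage with a dummy lookup, just to show the shape of the call:
print(collect_in_batches(["a.com", "b.com", "c.com"], lambda d: len(d),
                         batch_size=2, pause_seconds=0))
```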

Manual Quality Review

If a site has high numbers in the Google Webmaster Tools reports and low numbers in the SEO Tools data, it should be manually checked to determine if it’s a poor quality site, sending poor-quality links your way. The following are the quality signals I use for manually reviewing link quality.
  • Trust. Would you visit this site again? Do you feel confident about buying from the site or relying on its advice? Would you recommend it to your friends? If not, it’s probably low quality.
  • Source. Is this site a source of unique information or products? Does this site pull all of its content from other sites via APIs? Is it scraping its content from other sites with or without a link back to the source site? Does it feel like something you could get from a thousand other sites? If so, it’s probably low quality.
  • Ad units in first view. How many paid ad units are visible when you load the page? More than one? Or if it’s only one, does it dominate the page? If you weren’t paying close attention would it be possible to confuse the ads with unpaid content? If so, it’s probably low quality.
  • Use Searchmetrics. Enter the domain in Searchmetrics’ search box to get search and social visibility, rankings, competitors, and more. It’s free, with an option to subscribe for many more features. I’ve included this in the manual review section because you have to paste each domain in separately. It does, however, provide a balancing analytical approach to the subjective nature of manual review.

Finally, when reviewing sites manually, don’t bother clicking around the site to review multiple pages. If one page is poor quality, it’s likely that they all are. In particular, the home page of a site typically represents the quality of the entire site. Download this Excel spreadsheet to help organize and evaluate links to your site.

Thursday, August 21, 2014

Google’s Push For HTTPS Is More About PR Than Search Quality

Is it worth it for webmasters to switch to HTTPS in light of Google's recent announcement?

Earlier this month, Google announced that its search ranking algorithm will now consider whether a site is HTTPS. Does this mean you should now go out and make the switch to HTTPS, or is this just political jousting with no real search relevance on Google’s part?
What Is HTTPS, Anyway?
HTTPS stands for Hypertext Transfer Protocol Secure. It’s a variant of HTTP, the protocol used to transfer web pages across the internet. The difference (the “S”) is that HTTPS adds a layer of security by encrypting the data.

A normal website is accessed by putting http:// before the domain name, such as http://facebook.com. If the site supports HTTPS, the URL will look like https://facebook.com. Typically, browsers will add a padlock icon and will highlight the address bar in green when a site uses HTTPS.
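
If you want to check a site programmatically rather than eyeballing the address bar, a quick sketch like the one below will tell you whether a domain answers over HTTPS and whether plain HTTP redirects to it. It assumes the third-party requests library; facebook.com is just the example domain from the text.

```python
# Quick check of whether a domain serves HTTPS and whether plain HTTP
# redirects to it. Requires the third-party "requests" library.

import requests

def https_status(domain):
    """Return whether the domain serves HTTPS and whether HTTP redirects to it."""
    info = {"https_ok": False, "http_redirects_to_https": False}
    try:
        info["https_ok"] = requests.get(f"https://{domain}", timeout=10).ok
    except requests.RequestException:
        pass  # no usable HTTPS (SSL error, timeout, etc.)
    try:
        final_url = requests.get(f"http://{domain}", timeout=10,
                                 allow_redirects=True).url
        info["http_redirects_to_https"] = final_url.startswith("https://")
    except requests.RequestException:
        pass
    return info

print(https_status("facebook.com"))
# e.g. {'https_ok': True, 'http_redirects_to_https': True}
```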

The Push For Security
Over the past few years, Google has pushed for improved security on its site as well as sites in general across the internet, and for good reason. Between the NSA’s spying and routine security breaches that pilfer millions of passwords from popular sites, it’s not a bad idea for a company like Google to take security seriously.

We saw the beginnings of this a few years ago when Google began encrypting search referral terms for logged-in users, which led to a lot of frustration for marketers who no longer had access to keyword data in their analytics packages. This frustration was compounded late last year when Google moved to 100% secure search — whether searchers were logged in to Google or not.

Now, we see another step toward security with Google announcing a potential rankings boost for sites that run HTTPS.

How Will This Impact Your Rankings?
Several years ago, Google announced that site speed would be considered a ranking factor in its search algorithm. As a result, many sites rushed to improve their site load time. While users certainly appreciated the speed improvement, hardly anyone noticed a direct impact to their rankings. Why was that?

Page Speed is what’s called a “modifier.” If two web pages have very similar quality and relevance scores, Google considers which page loads faster as the deciding factor in which ranks higher. The loading speed of the page modifies the ranking score only ever so slightly.

Similarly, HTTPS looks to be a modifier, from what I’ve seen. Ninety-nine percent of searches will happen without HTTPS even being looked at; but, in those rare cases when two search results are otherwise “equal,” HTTPS might push one over the edge for the higher ranking.
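
To illustrate what a tie-breaking modifier means in practice — and only as a toy model, not Google’s actual algorithm — imagine relevance doing almost all of the work and HTTPS adding a tiny bonus that only matters when two pages are otherwise equal:

```python
# Toy illustration of the "modifier" idea described above: relevance does
# almost all the work, and HTTPS only breaks near-ties. The scores and the
# tiny bonus are invented for illustration.

def rank(pages, tie_breaker_bonus=0.001):
    """pages: list of (url, relevance_score, uses_https)."""
    return sorted(pages,
                  key=lambda p: p[1] + (tie_breaker_bonus if p[2] else 0),
                  reverse=True)

results = [
    ("http://site-a.com/page",  0.912, False),
    ("https://site-b.com/page", 0.912, True),   # identical relevance: HTTPS wins
    ("http://site-c.com/page",  0.840, False),  # lower relevance: bonus irrelevant
]
for url, *_ in rank(results):
    print(url)
```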

This Is About Politics, Not Search Quality
Google has a phrase it likes to use: “HTTPS Everywhere.” In fact, that’s the title it gave a session at this year’s I/O conference. The idea is that if every site implemented HTTPS, the web would be that much more secure; but it’s a red herring. Here’s why:

HTTPS only protects against a very limited number of site vulnerabilities, specifically wiretapping and man-in-the-middle type attacks – in other words, spying. It makes the NSA’s job of tracking and spying on internet users more difficult, but it doesn’t protect against denial-of-service attacks or against hackers exploiting scripting, server, or database vulnerabilities.

Essentially, HTTPS is useful for sites that collect and transmit personal information. Banks, e-commerce sites, even social networks need to have HTTPS in place to make sure consumers’ sensitive information is protected.

For all the blogs, news sites, brand brochure-type sites, or any information site that doesn’t require a member login, HTTPS is useless. It’s like the post office telling you that all your mail needs to be written in secret code. That’s fine for the military, but do your Christmas greeting cards really need to be written in unbreakable secret code? Probably not. It’s just as pointless to require HTTPS on sites that do not transfer sensitive information.

That’s why it doesn’t make sense for Google to consider using HTTPS as a ranking signal for the majority of sites and queries. If used at all, it will always be a very lightweight signal used on a very narrow set of queries, acting only as a tie breaker between two identically ranked pages.

No, this announcement is not about search quality. It’s about Google trying to get back at the NSA for making it look bad during the PRISM scandal, and it is doing this under the guise of a social cause — internet privacy under the “HTTPS Everywhere” banner.

It’s a classic “greater good” story. Google says HTTPS will be a ranking signal so that everyone runs out and switches to HTTPS. What they’re not saying is that this change will only affect a minuscule number of sites. For everyone else, they’ve wasted time and energy switching to HTTPS for no reason – but that’s okay, because it serves the greater good of improving privacy for the internet as a whole.

What Should I Do?
What should you do about switching to HTTPS? When in doubt, do what’s best for users.

If you run a site in the e-commerce, financial, search, social networking or related fields, you should already be running HTTPS on it. In fact, if your site utilizes a member login or any type of shopping cart, you should really switch to HTTPS.

On the other hand, if you’re running a blog, brochure site, news site, or any sort of information site where users don’t provide you with any personal information, I would recommend not using HTTPS. It costs money; it takes resources to implement; it slows down your site; it’s not needed; and skipping it won’t hurt your rankings.

Long story short: if you make the switch, do it for the users and not because Google said it’s a ranking signal, because it really isn’t.