Tutorial: WebPageTest Private Instance with Google Cloud Compute (GCP)

Bulk testing isn’t available in the public WebPageTest offering.
I’ve put together a quick tutorial that you can reuse to create your own instance, for a price starting from $1 per month if you are careful 🙂

1/ Set up the main server

  • Ubuntu 16.04 LTS
  • 1-2 CPU
  • persistent (non persistent ok only for agents)
  • SSD – 10 GB
  • Add your public SSH key
  • Allow HTTP traffic
  • Create instance
  • Connect over SSH from the Google Cloud Console and paste:
    • bash <(curl -s https://raw.githubusercontent.com/WPO-Foundation/wptserver-install/master/gce_ubuntu.sh)
  • When prompted, leave the value empty and hit Enter
  • Save the location key
  • Save the wpt_data metadata

2/ Set up the agent template

  • Launch Google Cloud Shell and paste:
  • bash <(curl -s https://raw.githubusercontent.com/WPO-Foundation/wptagent-install/master/gce_image.sh)
  • Leave wpt-agent as is.
  • Paste the wpt_data metadata saved when creating the server

3/ Start an agent in the desired location

  • Click “Create an instance”
  • “New instance from VM” (left panel)
  • Select wpt-agent
  • Continue
  • Name = informational.
    • Use something that helps identify the area where the agent will be hosted
  • Allow HTTP traffic (I did not test without this option)
  • Create the instance

4/ Start a test

  • Access http://$Public_IP_of_The_Server/
  • Select test location
  • If no test location is listed, restart the main server (not the agent)
  • Bulk testing is now available.
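
For scripted bulk runs, tests can also be queued over HTTP instead of through the UI. Here is a minimal sketch, assuming the instance exposes the usual WebPageTest `runtest.php` endpoint with its `url`, `location` and `f=json` parameters. The server address, the page list and the location ID are placeholders, and the helper name `build_runtest_urls` is made up for this example.

```python
from urllib.parse import urlencode


def build_runtest_urls(server, pages, location):
    """Return one runtest.php URL per page to test (nothing is sent here)."""
    base = server.rstrip("/") + "/runtest.php"
    return [
        base + "?" + urlencode({"url": page, "location": location, "f": "json"})
        for page in pages
    ]


# Placeholder values: replace with your server IP, pages and location ID
urls = build_runtest_urls(
    "http://203.0.113.10/",
    ["https://example.com/", "https://example.com/contact"],
    "us-east1",
)
for url in urls:
    print(url)
```

Each printed URL can then be requested with any HTTP client to start one test.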

How much does it cost?

Traffic cost is close to $0 / month, except under heavy usage.

If you never stop or delete the server and the agent, it will cost:

  • Agent: $15 / month in a US-based datacenter ($0.021 / hour)
  • Main server: $50 / month in a US-based datacenter ($0.068 / hour)
    => $65 / month
  • For enterprise needs, with daily monitoring:
    • We cover all regions available
    • The main server & agents are never stopped
    • 25 regions x $18 on average
    • 1 main server at $50
    • a bit of storage + backup
    • Then add variable cost.
    • => Around $600 / month
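
The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: the 730 hours per month is my assumption (365 days x 24 h / 12), and the hourly rates are the ones quoted above, not live GCP prices.

```python
HOURS_PER_MONTH = 730  # assumption: 365 days * 24 h / 12 months


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost of an instance that is never stopped, for one month."""
    return hourly_rate * hours


agent = monthly_cost(0.021)    # hourly agent rate quoted above
server = monthly_cost(0.068)   # hourly main server rate quoted above
small_setup = agent + server

# Enterprise case: 25 regions at $18 on average, plus one $50 main server.
# Storage, backup and variable costs come on top of this base.
enterprise_base = 25 * 18 + 50

print(f"agent: ${agent:.2f}, server: ${server:.2f}, total: ${small_setup:.2f}")
print(f"enterprise base: ${enterprise_base}")
```

The enterprise base comes to $500, so the $600 estimate leaves roughly $100 for storage, backup and variable costs.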

Official documentation is here: https://docs.webpagetest.org/private-instances/gce_server/

Add column with domain name to CSV file with Python

A script to add a column containing only the domain name to an existing CSV file. It extracts the domain from a column containing a URL.

It works with .co.uk and other country code top-level domains.

Just replace “5” with the index of the column containing the URL (indexes start at 0).

Also don’t forget to adjust the delimiter. The script is set up for semicolons on both input and output; change the delimiter to the one you need.

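import csv
import tldextract

# newline='' lets the csv module handle line endings itself
with open('input.csv', 'r', newline='') as csvinput:
    with open('output.csv', 'w', newline='') as csvoutput:
        writer = csv.writer(csvoutput, delimiter=';')
        reader = csv.reader(csvinput, delimiter=';')

        # Copy the header row; the new column name "domain" is an assumption
        header = next(reader)
        writer.writerow(header + ['domain'])

        for row in reader:
            # Column of URL is #5 (zero-based index)
            ext = tldextract.extract(row[5])
            # e.g. "example.co.uk" for "https://www.example.co.uk/page"
            writer.writerow(row + [ext.domain + '.' + ext.suffix])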
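
If you are not sure which delimiter a file uses, the standard library can guess it from a sample. This is a small sketch using `csv.Sniffer`; the helper name `detect_delimiter`, the candidate list and the sample rows are assumptions made for this example.

```python
import csv


def detect_delimiter(sample, candidates=";,\t"):
    """Guess the delimiter of a CSV sample among a few candidates."""
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


# Made-up sample rows, in the same shape as the input file
sample = (
    "id;title;url\n"
    "1;Home;https://www.example.co.uk/\n"
    "2;Blog;https://blog.example.com/post\n"
)
print(repr(detect_delimiter(sample)))
```

The detected value can then be passed as `delimiter=` to both `csv.reader` and `csv.writer`.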


Deep diving in Search Console data

Below is a list of interesting resources to play with Search Console data.

I’ve put in bold what makes each dashboard somewhat different.

Feel free to comment to complete this list.


Using Google Sheets

Search Console Explorer Sheet by Hannah Rampton

Using R

Average position per Continent analysis by Pascal Schmidt


Correlations for Clicks, Impressions, and Positions


Finding the LCP node with Chrome DevTools

The official documentation about Largest Contentful Paint is very interesting and explicit, but it misses one thing: how do you identify the largest node / block / image / text?

Chrome DevTools allows you to find which node you should optimize. Simply follow the steps below.

  • Open Chrome
  • Open the page you want to find the LCP block on
  • Open Chrome DevTools
    • If you are on a PC, press Ctrl+Shift+C
    • If you are on a Mac, press Cmd+Option+C
  • Follow the steps recorded in the video, or the written steps below if you need more details.
  • Open the “Performance” tab, between “Network” and “Memory”
The horizontal menu with the Performance tab
  • Two options to record all the events and other data
    • Option #1 – Record, reload and stop manually
      • Click on the Record icon, reload the page manually by clicking on the icon next to the URL, then come back to DevTools and click on the blue “Stop” button
    • Option #2 – Automatic record, reload and stop
      • Click on the Reload icon
Three buttons: Record, Record->Reload->Stop and Clear results

Which option should you choose?

It depends… Sometimes you will have to interact with the page to get the LCP. In this case you should record manually.
  • The LCP tag should be in the Timings row. This section is below Frames and Interactions. If you don’t see it immediately, try scrolling below Interactions.
  • Click on LCP; it will show you which element is considered the LCP
  • Click on the related node (if present); it will take you directly to the node in the source code of the page

I hope it is clear enough. If you have any questions, feel free to contact me on Twitter.

Google and Canonical URL

In this post I will change the canonical URL in the <head>, just to see how Google and Googlebot behave once the URL is discovered in the tag.

The canonical tag looks like this:

<link rel="canonical" href="https://www.url-to-crawl.com" /> 

I’ve changed the URL in this example on purpose, in order to make sure that Googlebot discovers the URL only through the rel canonical tag.
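
To double-check what a crawler would find in the page, the canonical URL can be read from the HTML with the Python standard library. This is a minimal sketch, not what Googlebot actually runs; the class and function names are made up for this example.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


page = '<head><link rel="canonical" href="https://www.url-to-crawl.com" /></head>'
print(find_canonical(page))
```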

I’ll keep you posted with the results!

Edit, 15 days later: Googlebot didn’t crawl the URL set in the canonical!

A slow page, just to check the impact on logs

It took a certain time to load this page, didn’t it?

It probably took a bit more than 10 seconds.

That’s normal: the idea is to check how Google SERPs behave with this slow page.

I’ve added a PHP code snippet that makes this page take at least 10 seconds to load, using the “Insert PHP Code Snippet” WP plugin.

FYI, this is how it looks in the WP UI:

Well, I had to add content to make this test work, so here it is!

To find this page during the test, the magic word is: drumblebassseo .

I’ve copied and pasted a few sentences below, found randomly on the web, just to give credibility to this page:

The classic editor does not solve this issue nor does the tinymce help. I’m embedding HTML from Amazon. Amazon gives HTML for products for their affiliates.

And also this one:

A few people were expressing their frustration with inserting Amazon compliant images in WordPress.  The problem seems simple enough.  Take an image from Amazon and put it on your website.

Edit: I’ve removed the snippet that slowed down the page. I’ll check the results now!

Impact of image search on logs for SEO

I recently noticed a huge difference between the traffic reported in Search Console and what I could find in the log files using the Kelogs log analyzer. My first hypothesis is that Google Images could be at the origin of this difference, with a preload in the background. My second hypothesis is that the first result in Google web search is preloaded if the page is usually too slow.

Protocol to check the first hypothesis:

I’ll try to get the image below indexed for this website, even though it is hosted on Amazon, and I will check the impact on the log files live when I search for it in Google.

Let’s come back in a few days, when Googlebot will have indexed this page and the image is displayed in Google Images (I’ll use “site:quentinadt.com” to check it).

So here is the picture of someone swimming:

A woman crawling

To check the second hypothesis, I’ll make a page which is very slow but ranks on a specific unique keyword.
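
The live check on the log files could be sketched like this in Python. It assumes the server writes the common “combined” access log format and that a Googlebot user agent or a Google referrer is enough to classify a hit (user agents can be spoofed). The sample lines are made up, and this is not how Kelogs works internally.

```python
import re
from collections import Counter

# Request, status, size, referrer and user agent of a "combined" log line
LOG_RE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_hits(lines):
    """Count hits per (source, path); source is a rough classification."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match is None:
            continue  # not a combined-format line
        if "googlebot" in match["agent"].lower():
            source = "googlebot"
        elif "google." in match["referrer"].lower():
            source = "google-referrer"
        else:
            source = "other"
        counts[(source, match["path"])] += 1
    return counts


# Made-up sample lines for illustration only
sample_log = [
    '66.249.66.1 - - [10/Feb/2020:10:00:00 +0000] "GET /slow-page/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Feb/2020:10:00:05 +0000] "GET /slow-page/ HTTP/1.1" 200 5120 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '66.249.66.1 - - [10/Feb/2020:10:01:00 +0000] "GET /image.jpg HTTP/1.1" 200 20480 "-" "Googlebot-Image/1.0"',
]
counts = count_hits(sample_log)
for (source, path), hits in sorted(counts.items()):
    print(source, path, hits)
```

Comparing the “google-referrer” counts with the clicks reported in Search Console for the same URL and period is one way to spot the kind of gap described above.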

Test of Indexation of images in background by Google

In this post, there is a background image. It is a picture of my 3-year-old MacBook Pro 2016, which is already dying… The screen is sometimes unusable, and most of the time there is just one vertical line.

The idea is to check whether or not Google will index an image which is accessible only from a CSS background declaration.

John Mueller, from Google, in 2018:

And as far as I know we don’t use CSS images at all for image search.

Source: https://www.seroundtable.com/google-image-search-css-25068.html

Let’s see if it is still true in 2020 with the move to mobile-first indexing, evergreen Chrome, JavaScript rendering, etc.

Results of this SEO test in a few days!

Edit: One month after…

The image is still not indexed.
Current conclusion is that background images aren’t indexable by default.