Parsely-Page

Specifying meta information in the page is the second step of the integration. If you don’t have the tracker code running yet, see the documentation on installing the tracker code first.

You can add the meta information to your pages in three different ways:

  • JSON-LD (recommended) - the metadata is included in a script tag, which simplifies escaping of the content (most of the time the serving framework’s JSON converter will suffice). Please note: our crawlers don’t run JavaScript, so the contents have to be in the actual source of the page.
  • repeated meta tags (alternative) - if the CMS you are using has a way to provide page information as meta tags in the page header, then this might be a more convenient option.
  • parsely-page (deprecated) - legacy format where the meta information is stored as JSON in a single meta tag. It has been replaced by JSON-LD. The crawlers will keep supporting this format, but new projects should prefer JSON-LD.

The parsely-page meta tag is used to reliably identify information about the page, such as the author(s), publishing date, title, and the section it belongs to. Add the tag to the <head> element of the tracked pages. The content attribute should hold an object (dictionary) serialized to JSON.

Example:

<meta name='parsely-page'
      content='{"title": "Zipf\u0027s Law of the Internet: Explaining Online Behavior",
                "link": "http://blog.parsely.com/post/57821746552",
                "image_url": "http://blog.parsely.com/inline_mra670hTvL1qz4rgp.png",
                "type": "post",
                "post_id": "57821746552",
                "pub_date": "2013-08-15T13:00:00Z",
                "section": "Programming",
                "authors": ["Alan Alexander Milne"],
                "tags": ["statistics","zipf","internet","behavior"]
               }'>

Note

For readability, the value of the content attribute in the example above is indented and each field is on its own line. Do not copy this layout as-is: in a production environment, the value of the content attribute should be on a single line.

Field description

Field      Description
title      Post or page title (article headline).
link       Canonical URL for the post/page. For page groups like galleries, it should always point to the main article.
image_url  URL of the image associated with the post/page.
type       Page type: “post”, “frontpage” or “sectionpage”.
post_id    String that uniquely identifies this post. When omitted, it falls back to the canonical link.
pub_date   Publication date, as an ISO 8601 string in the UTC timezone (e.g. 2013-08-15T13:00:00Z).
section    Section the page belongs to (e.g. A+E, Politics).
authors    List of post authors.
tags       List of tags associated with this post.
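
The field descriptions above can be turned into a quick pre-publish check. The following is a minimal sketch, not part of Parse.ly's tooling: the function name check_parsely_page is hypothetical, and the strict timestamp pattern is an assumption modeled on the example value (ISO 8601 itself allows other spellings). It deliberately does not mark any field as required, since the table only states that post_id may be omitted.

```python
from datetime import datetime

# Page types listed in the field table.
ALLOWED_TYPES = {"post", "frontpage", "sectionpage"}
# Fields the table describes as lists.
LIST_FIELDS = ("authors", "tags")


def check_parsely_page(meta):
    """Return a list of problems found in a parsely-page dictionary.

    An empty list means this sketch found nothing to complain about.
    Only fields that are present are checked.
    """
    problems = []

    page_type = meta.get("type")
    if page_type is not None and page_type not in ALLOWED_TYPES:
        problems.append(f"type: unexpected value {page_type!r}")

    pub_date = meta.get("pub_date")
    if pub_date is not None:
        try:
            # Assumption: timestamps look like the example, 2013-08-15T13:00:00Z.
            datetime.strptime(pub_date, "%Y-%m-%dT%H:%M:%SZ")
        except (TypeError, ValueError):
            problems.append(f"pub_date: not a UTC timestamp: {pub_date!r}")

    for name in LIST_FIELDS:
        value = meta.get(name)
        if value is None:
            continue
        if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
            problems.append(f"{name}: expected a list of strings")

    return problems
```

Calling check_parsely_page on the dictionary from the example above should return an empty list, while a dictionary with "authors" given as a plain string would be reported.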

Technical Caveats

Escape single and double quotes in JSON item values. Because the content attribute is delimited by single quotes, a single quote inside a value must be replaced with its JSON Unicode escape, \u0027. Double quotes must be escaped with a backslash, like this: \".
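
The escaping rules above can be applied mechanically when the tag is generated server-side. Below is a minimal sketch in Python, assuming the page metadata is already available as a dictionary; the helper name parsely_page_tag is hypothetical. It relies on json.dumps to escape double quotes and to emit a single line, and then replaces single quotes with \u0027. It handles only the characters mentioned in this section.

```python
import json


def parsely_page_tag(meta):
    """Render a dictionary as a single-line parsely-page meta tag."""
    # json.dumps escapes double quotes as \" and, without an indent
    # argument, produces no line breaks.
    payload = json.dumps(meta)
    # Single quotes would end the content='...' attribute early, so they
    # are replaced with the JSON Unicode escape. In JSON output a single
    # quote can only occur inside a string, so this stays valid JSON.
    payload = payload.replace("'", "\\u0027")
    return f"<meta name='parsely-page' content='{payload}'>"


if __name__ == "__main__":
    print(parsely_page_tag({
        "title": "Zipf's Law of the Internet: Explaining Online Behavior",
        "type": "post",
        "authors": ["Alan Alexander Milne"],
    }))
```

Most template engines and frameworks offer their own JSON and attribute escaping; where one is available, prefer it over string replacement.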

Values in parsely-page will appear literally inside Parse.ly Dash. String values supplied here, specifically title, authors, and section, will appear in Parse.ly analytics exactly as they are specified in the tag. As a result, make sure to use proper capitalization and specify the values as you expect them to appear.

The parsely-page metadata tag cannot be loaded asynchronously. The Parse.ly crawler does not execute JavaScript, so it must be able to read the metadata tag from the response to a single GET request.
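
One way to check this locally is to parse the raw HTML of a page, exactly as the server returned it, without running any scripts. The sketch below uses only Python's standard library and is not a description of how the Parse.ly crawler works; the names ParselyPageFinder and find_parsely_page are hypothetical. If it finds nothing in the raw source, the tag is probably being injected by JavaScript.

```python
import json
from html.parser import HTMLParser


class ParselyPageFinder(HTMLParser):
    """Collect the first parsely-page meta tag found in static HTML."""

    def __init__(self):
        super().__init__()
        self.metadata = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.metadata is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == "parsely-page":
            # Raises ValueError if the content attribute is not valid JSON.
            self.metadata = json.loads(attributes.get("content") or "{}")


def find_parsely_page(html_source):
    """Return the parsed metadata dictionary, or None if no tag is present."""
    finder = ParselyPageFinder()
    finder.feed(html_source)
    return finder.metadata
```

Pass it the body of a single GET response for the page (for example, the output saved from a command-line HTTP client) rather than the DOM as seen in a browser's developer tools.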

Content behind paywalls

If you have content that is accessible only after logging in, coordinate with the Parse.ly support team to arrange a special login account for our crawler. The credentials for this account will only be used by the crawler and will not be shared with anyone else.