Content Analysis Fields

Answer some questions to help select fields

If we take the traditional view of a content inventory or audit, we have rows representing each page (so each row has a unique URL) and then we have columns for things like the meta description or crawl depth. These columns are the different fields we have available to us in our content analysis.

①. Define what you are trying to accomplish.

Your content analysis needs to be grounded in your analysis goal.

Examples: Plan Digital Transformation, Test Content Hypothesis, Provide Better Bid.

②. Define your analysis approach.

Size and complexity of your digital presence

Size and complexity of your digital presence should drive your content analysis approach.

Your approach

Content analysis does not necessarily mean opening up a spreadsheet. Before diving in, you should define your basic approach to the analysis.

Examples: Brute Force; Sample, Rules, Repeat; Quick Take.

③. Select fields toward your goal, grounded in your prioritized list of questions you want answered.

Although you can use this database however you like, in general we recommend that you build up a list of fields that will be useful for your analysis. To do so, click the heart next to any field name. After you have hearted some fields, you can see an analysis of your list at My List ♥, at which point you can move on to step ④: start iterating on your analysis, starting with the basics.

Audience ♡

Audience the content is actually written to target (whether or not it is supposed to).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Audit Comments ♡

Free-text notes and findings recorded during a content audit. Used to capture observations, issues, and recommendations for individual pages or content assets.

General Usefulness:

Traditional Ease of Automation:

Compare with other Quality fields.

Author ♡

The person(s) who wrote the content. This may differ from whoever published or crafted the page.

General Usefulness:

Traditional Ease of Automation:

Compare with other Org fields.

Bucket ♡

When planning a transformation, it can be useful to bucket similar content (often grouping content that will be treated similarly to explain the situation to stakeholders).

General Usefulness:

Traditional Ease of Automation:

Consider instead: Disposition

[Category] Revenue ♡

Particularly useful for product or property pages, this field represents the revenue generated by this product or property (regardless of whether it was directly generated by digital channels or not).

General Usefulness:

Traditional Ease of Automation:

Compare with other Org fields.

Content Type ♡

Content Type (semantic type of content, such as Product Page or Event) is usually an extremely effective way to group and look for patterns across a digital presence.

General Usefulness:

Traditional Ease of Automation:

Compare with other Category fields.

Crawl Depth ♡

How many links the crawler needed to follow to get to this item.
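
As a minimal sketch of how this value can be derived, assume the crawler explores breadth-first from a single start page, so depth is the fewest links followed to reach each item. The `links` graph and the `crawl_depths` name below are hypothetical, and a real crawler may count depth differently.

```python
from collections import deque


def crawl_depths(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first search: depth is the fewest links followed from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph: page -> pages it links to.
links = {
    "/": ["/blog/", "/about/"],
    "/blog/": ["/blog/post-1/", "/"],
    "/blog/post-1/": ["/about/"],
}
depths = crawl_depths(links, "/")
```

Pages that never appear in the result were not reachable by links from the start page.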

General Usefulness:

Traditional Ease of Automation:

Consider instead: [IA] Depth

Date Last Updated ♡

The date the page content was last modified or updated. Often available in HTTP headers or CMS metadata. Distinct from Date Published, which records when the content was first made public.
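
When the date comes from HTTP headers, the usual source is the `Last-Modified` response header. The sketch below assumes you already have the response headers as a plain dictionary; the `date_last_updated` name is hypothetical, and many servers omit the header or report the render time rather than the true edit time.

```python
from datetime import date
from email.utils import parsedate_to_datetime
from typing import Optional


def date_last_updated(headers: dict[str, str]) -> Optional[date]:
    """Return the Last-Modified date from response headers, if present and valid."""
    for name, value in headers.items():
        if name.lower() == "last-modified":  # header names are case-insensitive
            try:
                return parsedate_to_datetime(value).date()
            except (TypeError, ValueError):
                return None  # header present but not a parseable HTTP date
    return None


example = date_last_updated({"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"})
```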

General Usefulness:

Traditional Ease of Automation:

Compare with other Basic fields.

Date Published ♡

Date the content was originally published. This is frequently a useful factor in deciding what content can be culled.

General Usefulness:

Traditional Ease of Automation:

Compare with other Quality fields.

Disposition ♡

This is the treatment a piece of content will get during a transformation.

General Usefulness:

Traditional Ease of Automation:

Consider instead: Effort

Division ♡

The organizational division (or department, vice presidency, company, etc.) that owns the page.

General Usefulness:

Traditional Ease of Automation:

Compare with other Org fields.

Effort ♡

Expected manual effort to transform the content item.

General Usefulness:

Traditional Ease of Automation:

Compare with other Decision fields.

File Format ♡

File format (as opposed to content type) is the actual format of the file as delivered by the web server (PDF, HTML, etc.). This is especially useful for sites with a large amount of non-HTML content.

General Usefulness:

Traditional Ease of Automation:

Consider instead: File Group

File Group ♡

On particularly complex digital presences, there may be so many file formats that seeing all of them in charts or presentations is confusing. File Group groups the file formats, for instance "Data or Spreadsheet" to capture CSV and Excel files.
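
A simple lookup table is usually enough to derive File Group from File Format. In the sketch below, only the "Data or Spreadsheet" group comes from the description above; the other group names and the `file_group` helper are hypothetical and would be tailored to your own inventory.

```python
# Only "Data or Spreadsheet" is taken from the field description;
# the remaining groups are illustrative assumptions.
FILE_GROUPS = {
    "CSV": "Data or Spreadsheet",
    "XLS": "Data or Spreadsheet",
    "XLSX": "Data or Spreadsheet",
    "DOC": "Document",
    "DOCX": "Document",
    "PDF": "Document",
    "HTML": "Web Page",
}


def file_group(file_format: str) -> str:
    """Map a file format to its group, falling back to "Other"."""
    return FILE_GROUPS.get(file_format.strip().upper(), "Other")
```

An explicit "Other" fallback makes unmapped formats visible, so the table can be extended as new formats turn up.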

General Usefulness:

Traditional Ease of Automation:

Compare with other Basic fields.

Folder1 ♡

Folder1 is the first "folder" in the path, such as "blog" in "test.com/blog/". This is often an effective proxy for site section.
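
Folder1 can be derived from the URL alone. The following sketch uses Python's standard `urllib.parse`; the `folder1` name is hypothetical, and it assumes a final path segment without a trailing slash is a file rather than a folder.

```python
from urllib.parse import urlsplit


def folder1(url: str) -> str:
    """Return the first folder in the URL path, or "" if there is none."""
    if "://" not in url:
        url = "//" + url  # lets urlsplit treat "test.com/blog/" as host + path
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    if not path.endswith("/"):
        segments = segments[:-1]  # assumption: the last segment is a file
    return segments[0] if segments else ""
```

Under this assumption `test.com/blog/` yields `blog`, while a page at the site root such as `test.com/index.html` yields an empty Folder1.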

General Usefulness:

Traditional Ease of Automation:

Consider instead: Site Section

H1 Count ♡

Count of `<h1>` elements on the page.
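
One way to compute this count, sketched with Python's standard `html.parser` (the `H1Counter` and `h1_count` names are hypothetical). It counts `<h1>` start tags in the delivered HTML, so headings injected later by JavaScript would not be seen.

```python
from html.parser import HTMLParser


class H1Counter(HTMLParser):
    """Counts <h1> start tags; HTMLParser lowercases tag names for us."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.count += 1


def h1_count(html: str) -> int:
    parser = H1Counter()
    parser.feed(html)
    parser.close()
    return parser.count
```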

General Usefulness:

Traditional Ease of Automation:

Compare with other Technical fields.

Has [Problem] ♡

Yes or no, does this piece of content have this specific problem? The actual field name would depend on your situation, such as "Has Wall of Text".

General Usefulness:

Traditional Ease of Automation:

Compare with other Quality fields.

[IA] Depth ♡

The depth from the perspective of the main navigational structures, for instance the Breadcrumb Depth.

General Usefulness:

Traditional Ease of Automation:

Compare with other Category fields.

Images Without Alt (count) ♡

Count of `<img>` elements with missing or empty `alt` attributes.
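
A minimal sketch of this count using Python's standard `html.parser` (the class and function names are hypothetical). It treats an absent `alt`, a valueless `alt`, and `alt=""` all as "missing or empty".

```python
from html.parser import HTMLParser


class MissingAltCounter(HTMLParser):
    """Counts <img> elements whose alt attribute is absent or empty."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing <img ... /> by the default handler.
        if tag == "img" and not dict(attrs).get("alt"):
            self.count += 1


def images_without_alt(html: str) -> int:
    parser = MissingAltCounter()
    parser.feed(html)
    parser.close()
    return parser.count
```

Note that an empty `alt` is valid markup for purely decorative images, so a non-zero count is a prompt for review rather than proof of a defect.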

General Usefulness:

Traditional Ease of Automation:

Compare with other Technical fields.

Landmark Count ♡

Count of HTML5 landmark regions (`nav`, `main`, `header`, `footer`, `aside`) or elements with landmark ARIA roles.
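
The sketch below counts each element once if it is one of the five listed tags or carries a landmark ARIA role. The role list is the set of ARIA landmark roles; the class and function names are hypothetical, and the sketch does not apply the finer rule that `header` and `footer` nested inside sectioning elements are not landmarks.

```python
from html.parser import HTMLParser

LANDMARK_TAGS = {"nav", "main", "header", "footer", "aside"}
LANDMARK_ROLES = {
    "banner", "complementary", "contentinfo", "form",
    "main", "navigation", "region", "search",
}


class LandmarkCounter(HTMLParser):
    """Counts elements that are landmark tags or have a landmark role."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        roles = (dict(attrs).get("role") or "").lower().split()
        if tag in LANDMARK_TAGS or any(r in LANDMARK_ROLES for r in roles):
            self.count += 1  # counted once even if both tag and role match


def landmark_count(html: str) -> int:
    parser = LandmarkCounter()
    parser.feed(html)
    parser.close()
    return parser.count
```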

General Usefulness:

Traditional Ease of Automation:

Compare with other Technical fields.

MIME Type ♡

The technical content type reported by the web server, which the web browser uses to determine how to display it.
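
The MIME type normally arrives in the `Content-Type` response header, often with parameters such as a charset. As a rough sketch, the header can be reduced to the bare MIME type and then mapped to a friendlier File Format; the mapping table and the `file_format` name are hypothetical and deliberately incomplete.

```python
# Illustrative mapping only; extend it for the types found in your inventory.
MIME_TO_FORMAT = {
    "text/html": "HTML",
    "application/pdf": "PDF",
    "text/csv": "CSV",
    "application/vnd.ms-excel": "XLS",
}


def file_format(content_type_header: str) -> str:
    """Strip parameters such as charset, then look up the file format."""
    mime = content_type_header.split(";", 1)[0].strip().lower()
    return MIME_TO_FORMAT.get(mime, "Unknown")
```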

General Usefulness:

Traditional Ease of Automation:

Consider instead: File Format

Meta Description ♡

The meta description. This is of limited use, aside from simply discovering what pages do not have a meta description (and therefore require one).

General Usefulness:

Traditional Ease of Automation:

Compare with other Technical fields.

Meta Keywords ♡

Meta keywords. In most cases, very limited value. More precise meta tags (for instance topics) are usually far more useful if they exist.

General Usefulness:

Traditional Ease of Automation:

Consider instead: Topic

Near Text Duplicate ♡

Is there a near text duplicate of the page? If so, what is the URL for that near duplicate?
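
One common way to detect near duplicates is to compare overlapping word sequences ("shingles") with Jaccard similarity. This is a sketch of that general technique, not necessarily how any particular tool fills this field; the function names, the shingle size of 3, and the 0.6 threshold are all assumptions to tune.

```python
from typing import Optional


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Set of overlapping k-word sequences from the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a: set, b: set) -> float:
    """Shared shingles divided by all distinct shingles."""
    return len(a & b) / len(a | b)


def near_text_duplicate(
    pages: dict[str, str], url: str, threshold: float = 0.6
) -> Optional[str]:
    """Return the URL of the most similar other page at or above threshold."""
    target = shingles(pages[url])
    best_url, best_score = None, threshold
    for other_url, text in pages.items():
        if other_url == url:
            continue
        score = jaccard(target, shingles(text))
        if score >= best_score:
            best_url, best_score = other_url, score
    return best_url


# Hypothetical page texts keyed by URL.
pages = {
    "/a": "our store opens at nine every weekday morning",
    "/b": "our store opens at nine every weekday evening",
    "/c": "contact the press office for media enquiries",
}
```

Comparing every page against every other page grows quadratically, so larger inventories typically need a scalable approximation such as MinHash.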

General Usefulness:

Traditional Ease of Automation:

Compare with other Quality fields.

Nielsen H1 Visibility ♡

Does the page keep users informed about where they are and what is going on? Look for breadcrumbs, clear page titles, current-section highlighting in navigation, and timely feedback for user actions.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H1 Visibility Rationale ♡

LLM-written rationale for the Nielsen Heuristic 1 (H1 Visibility) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H1 Visibility Score (LLM) ♡

LLM-assigned Nielsen Heuristic 1 (H1 Visibility) score (0-3). Used as input to the calculated Nielsen H1 Visibility field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H10 Help & Docs ♡

Final per-heuristic score for Nielsen Heuristic 10 (H10 Help & Docs). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).
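
The type coercion is not spelled out here, so the following is only a sketch of what such a step might look like: raw LLM output is turned into a float between 0 and 3, and "n/a" or anything unparseable becomes `None`. The `coerce_score` name and every rule in it are assumptions.

```python
from typing import Optional


def coerce_score(raw: object) -> Optional[float]:
    """Coerce a raw LLM score to a float in [0, 3], or None if not usable."""
    if raw is None:
        return None
    text = str(raw).strip().lower()
    if text in {"n/a", "na", ""}:
        return None  # assumed n/a path: heuristic does not apply to the page
    try:
        value = float(text)
    except ValueError:
        return None  # assumption: unparseable output is treated as missing
    return value if 0.0 <= value <= 3.0 else None
```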

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H10 Help & Docs Rationale ♡

LLM-written rationale for the Nielsen Heuristic 10 (H10 Help & Docs) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H10 Help & Docs Score (LLM) ♡

LLM-assigned Nielsen Heuristic 10 (H10 Help & Docs) score (0-3). Used as input to the calculated Nielsen H10 Help & Docs field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H2 Real-World Match ♡

Final per-heuristic score for Nielsen Heuristic 2 (H2 Real-World Match). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H2 Real-World Match Rationale ♡

LLM-written rationale for the Nielsen Heuristic 2 (H2 Real-World Match) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H2 Real-World Match Score (LLM) ♡

LLM-assigned Nielsen Heuristic 2 (H2 Real-World Match) score (0-3). Used as input to the calculated Nielsen H2 Real-World Match field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H3 User Control ♡

Final per-heuristic score for Nielsen Heuristic 3 (H3 User Control). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H3 User Control Rationale ♡

LLM-written rationale for the Nielsen Heuristic 3 (H3 User Control) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H3 User Control Score (LLM) ♡

LLM-assigned Nielsen Heuristic 3 (H3 User Control) score (0-3). Used as input to the calculated Nielsen H3 User Control field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H4 Consistency ♡

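Final score for H4 Consistency, blending the LLM score (60%) with the structural sub-score (40%). The 3:2 weighting is achieved by repeating the inputs inside avg(): the LLM score is passed three times and the structural sub twice. This repetition trick works around Chimera's strict left-to-right evaluation without explicit decimal multiplication.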
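
The arithmetic behind the repetition trick can be checked outside Chimera. The Python sketch below is not Chimera syntax; `avg` and `h4_consistency` are hypothetical stand-ins that only demonstrate that averaging three copies of one value with two copies of another gives a 60/40 blend.

```python
def avg(*values: float) -> float:
    """Plain arithmetic mean of the arguments."""
    return sum(values) / len(values)


def h4_consistency(llm_score: float, structural_sub: float) -> float:
    # (3 * llm + 2 * sub) / 5 equals 0.6 * llm + 0.4 * sub
    return avg(llm_score, llm_score, llm_score, structural_sub, structural_sub)


blended = h4_consistency(3.0, 1.0)
weighted = 0.6 * 3.0 + 0.4 * 1.0
```

Here `blended` and `weighted` agree (up to floating-point rounding), which is the equivalence the formula relies on.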

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H4 Consistency Rationale ♡

LLM-written rationale for the Nielsen Heuristic 4 (H4 Consistency) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H4 Consistency Score (LLM) ♡

LLM-assigned Nielsen Heuristic 4 (H4 Consistency) score (0-3). Used as input to the calculated Nielsen H4 Consistency field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H4 Structural Sub ♡

Intermediate helper that aggregates the three structural-signal pattern counts (images without alt, H1 count, landmark count) into a single 0-3 value, used by the H4 Consistency formula. Exists because Chimera expressions cannot multiply directly through case() returns — pre-materializing this helper as a float column sidesteps that limitation.
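
The exact aggregation rule is not given here. Purely as an illustration of folding three counts into a 0-3 value, the sketch below awards one point per structural check passed; the thresholds and the `structural_sub` name are assumptions, not the actual formula.

```python
def structural_sub(images_without_alt: int, h1_count: int, landmark_count: int) -> float:
    """Hypothetical 0-3 aggregate: one point per structural check passed."""
    points = 0
    if images_without_alt == 0:  # assumed check: every image has alt text
        points += 1
    if h1_count == 1:  # assumed check: exactly one top-level heading
        points += 1
    if landmark_count >= 1:  # assumed check: at least one landmark region
        points += 1
    return float(points)  # float, matching the pre-materialized float column
```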

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H5 Error Prevention ♡

Does the page prevent problems from happening in the first place? On pages with forms or user actions, look for clear labels, sensible defaults, required-field marking, and confirmation for destructive actions. Does not apply to purely informational pages.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H5 Error Prevention Rationale ♡

LLM-written rationale for the Nielsen Heuristic 5 (H5 Error Prevention) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H5 Error Prevention Score (LLM) ♡

LLM-assigned Nielsen Heuristic 5 (H5 Error Prevention) score (0-3 or n/a). Used as input to the calculated Nielsen H5 Error Prevention field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H6 Recognition ♡

Final per-heuristic score for Nielsen Heuristic 6 (H6 Recognition). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H6 Recognition Rationale ♡

LLM-written rationale for the Nielsen Heuristic 6 (H6 Recognition) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H6 Recognition Score (LLM) ♡

LLM-assigned Nielsen Heuristic 6 (H6 Recognition) score (0-3). Used as input to the calculated Nielsen H6 Recognition field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H7 Flexibility ♡

Final per-heuristic score for Nielsen Heuristic 7 (H7 Flexibility). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H7 Flexibility Rationale ♡

LLM-written rationale for the Nielsen Heuristic 7 (H7 Flexibility) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H7 Flexibility Score (LLM) ♡

LLM-assigned Nielsen Heuristic 7 (H7 Flexibility) score (0-3). Used as input to the calculated Nielsen H7 Flexibility field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H8 Aesthetic ♡

Final per-heuristic score for Nielsen Heuristic 8 (H8 Aesthetic). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H8 Aesthetic Rationale ♡

LLM-written rationale for the Nielsen Heuristic 8 (H8 Aesthetic) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H8 Aesthetic Score (LLM) ♡

LLM-assigned Nielsen Heuristic 8 (H8 Aesthetic) score (0-3). Used as input to the calculated Nielsen H8 Aesthetic field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H9 Error Recovery ♡

Final per-heuristic score for Nielsen Heuristic 9 (H9 Error Recovery). Derived from the LLM score with type coercion (and an n/a-aware path for H5/H9).

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H9 Error Recovery Rationale ♡

LLM-written rationale for the Nielsen Heuristic 9 (H9 Error Recovery) score.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen H9 Error Recovery Score (LLM) ♡

LLM-assigned Nielsen Heuristic 9 (H9 Error Recovery) score (0-3 or n/a). Used as input to the calculated Nielsen H9 Error Recovery field.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Nielsen Overall Score ♡

Average of the 10 Nielsen per-heuristic scores. avg() skips real NULLs, so n/a paths on H5 and H9 (which materialize as NULL on float columns) do not drag the average down.
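
To see why skipping NULLs matters, here is a small Python sketch in which `None` stands in for NULL. The `avg` function mimics the NULL-skipping behavior described above, and the scores are made-up sample values.

```python
from typing import Optional


def avg(*values: Optional[float]) -> Optional[float]:
    """Mean of the non-None values; None if nothing is present."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Made-up per-heuristic scores; H5 and H9 are n/a for this page.
scores = {
    "H1": 3.0, "H2": 2.0, "H3": 2.0, "H4": 2.0, "H5": None,
    "H6": 3.0, "H7": 1.0, "H8": 2.0, "H9": None, "H10": 3.0,
}

overall = avg(*scores.values())  # 18 / 8 = 2.25
naive = sum(v or 0.0 for v in scores.values()) / len(scores)  # 18 / 10 = 1.8
```

Treating n/a as zero would penalize informational pages for heuristics that do not apply to them, which is what the NULL-aware average avoids.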

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

PDF Page Count ♡

The count of pages in a PDF can help us understand whether there are primarily short PDFs (perhaps most easily converted to HTML) or very long PDFs (perhaps for specialist audiences).

General Usefulness:

Traditional Ease of Automation:

Compare with other Technical fields.

Page Views ♡

Page views are often the first thing to be added from an additional source, after the basic rows of content in the inventory/audit have been determined. Although page views are not always the most useful measure of content value, they are often the most immediately tangible proxy.

General Usefulness:

Traditional Ease of Automation:

Consider instead: [Success Event] Count

[Problem] Count ♡

How often does the problem happen on the page? This would be a specific issue, so something like "Left Nav Count".

General Usefulness:

Traditional Ease of Automation:

Compare with other Quality fields.

[Problem] Example ♡

An example of a problem (on a specific page) you are investigating. This field could be repeated in an analysis, with actual fields like "Table Example" or "Bad Character Encoding Example".

General Usefulness:

Traditional Ease of Automation:

Compare with other Quality fields.

Reading Level ♡

Reading Level represents the education level required to understand text. Since much content is overly complex, this can be useful to identify where there may be education requirement mismatches.
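
Several formulas exist for estimating reading level. The sketch below uses the Flesch-Kincaid grade level as one example, without implying it is the formula behind this field; the syllable counter is a crude vowel-group heuristic, and the function names are hypothetical.

```python
import re


def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Rough estimate: number of vowel groups, at least one per word."""
    return max(len(re.findall(r"[aeiouy]+", word.lower())), 1)


def reading_level(text: str) -> float:
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0  # assumption: no measurable text scores zero
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)
```

Short sentences of short words score low (even below zero), while long sentences of polysyllabic words score high, which is the mismatch this field is meant to surface.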

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

Redundant ♡

Is the information redundant with respect to other content on the site? This is one of the three fields in the widely used ROT (Redundant, Outdated, Trivial) analysis.

General Usefulness:

Traditional Ease of Automation:

Consider instead: Near Text Duplicate

Resourcing ♡

Who will do the actual transformation? For a large site this would be the team, and for a smaller site it may be an individual.

General Usefulness:

Traditional Ease of Automation:

Compare with other Decision fields.

Site ♡

Within which site (as experienced by the site visitor) does this content appear?

General Usefulness:

Traditional Ease of Automation:

Consider instead: Site Type

Site Section ♡

The section of a site (for instance the news section, or a section for a particular program).

General Usefulness:

Traditional Ease of Automation:

Compare with other Category fields.

Site Type ♡

For large-scale digital presences, grouping sites by type can be a highly effective way of managing and transforming them.

General Usefulness:

Traditional Ease of Automation:

Compare with other Category fields.

Source System ♡

Where is the primary source of content for this URL? For instance, what CMS, document management system, or product information system does this content primarily come from?

General Usefulness:

Traditional Ease of Automation:

Compare with other Category fields.

[Success Event] Count ♡

The count of events that were successful from this page, such as the count of purchases or the count of downloads.

General Usefulness:

Traditional Ease of Automation:

Compare with other User fields.

[Target] Field ♡

What the *desired* field value would be. For instance, there may be an existing content type but in some cases it should be another content type (so you would have a Content Type field as well as a Target Content Type field).

General Usefulness:

Traditional Ease of Automation:

Compare with other Decision fields.

Title ♡

The title is the field most useful to people when looking at individual "rows" of an inventory. That said, unlike URLs, titles are not guaranteed to be unique.

General Usefulness:

Traditional Ease of Automation:

Compare with other Basic fields.

Tone ♡

How the content is communicated with language. Tone may reasonably vary across a digital presence.

General Usefulness:

Traditional Ease of Automation:

Compare with other Brand fields.

Topic ♡

The topic/subject of the content.

General Usefulness:

Traditional Ease of Automation:

Compare with other Category fields.

URL ♡

This is a basic, foundational requirement of an inventory where each "row" is a URL.

General Usefulness:

Traditional Ease of Automation:

Compare with other Basic fields.

Unique Content ID ♡

An ID unique to the piece of content. This should be unique across the entire list.

General Usefulness:

Traditional Ease of Automation:

Compare with other Basic fields.

Voice ♡

The brand voice the content is written in.

General Usefulness:

Traditional Ease of Automation:

Compare with other Brand fields.
