
Technical Fields for Content Inventories, Audits, & other Analysis

Field Type: Technical

Technical fields can be essential to an analysis, and they are usually relatively easy to gather. That said, ease of gathering is no excuse for impressing each other with huge inventories full of technical fields that add little value to our actual goals. Some technical fields, such as the count of pages in each PDF, can prove very useful.


If we take the traditional view of a content inventory or audit, we have rows representing each page (so each row has a unique URL) and then we have columns for things like the meta description or crawl depth. These columns are the different fields we have available to us in our content analysis.
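
As a minimal sketch of that traditional view, the snippet below models an inventory as a list of rows keyed by URL and writes it out as CSV. The field names (`url`, `meta_description`, `crawl_depth`), the example URLs, and the `to_csv` helper are illustrative assumptions, not part of any particular tool.

```python
import csv
import io

# Hypothetical columns: each one is a field in the content analysis.
FIELDS = ["url", "meta_description", "crawl_depth"]

# Hypothetical rows: one per page, each with a unique URL.
rows = [
    {"url": "https://example.com/", "meta_description": "Home page", "crawl_depth": 0},
    {"url": "https://example.com/about", "meta_description": "", "crawl_depth": 1},
]

def to_csv(rows, fields):
    """Serialize inventory rows to CSV text, refusing duplicate URLs."""
    urls = [row["url"] for row in rows]
    if len(urls) != len(set(urls)):
        raise ValueError("each row must have a unique URL")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

inventory_csv = to_csv(rows, FIELDS)
print(inventory_csv)
```

Adding a field to the analysis then amounts to adding a column name and a value per row.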

①. Define what you are trying to accomplish.

Your content analysis needs to be grounded in your analysis goal.

②. Define your analysis approach.

Size and complexity of your digital presence

Size and complexity of your digital presence should drive your content analysis approach.

Your approach

Content analysis does not necessarily mean opening up a spreadsheet. Before diving in, you should define your basic approach to the analysis.

③. Select fields toward your goal, grounded in your prioritized list of questions you want answered.

Although you can use this database however you like, in general we recommend that you build up a list of fields that will be useful for your analysis. To do so, just click on the heart next to any field name. After you have hearted some fields, you can see an analysis of your list at My List ♥ (at which point you can move to ④. Start iterating on your analysis, starting with the basics).

Crawl Depth
How many links the crawler needed to follow to get to this item.
General Usefulness:
Ease of Automation:
Consider instead: [IA] Depth
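
One way to picture crawl depth is as a breadth-first search over the site's links, where the depth of a page is the fewest links followed from the starting page. The sketch below assumes a crawler that explores breadth-first; real crawlers may follow links in a different order and report different depths. The `site_links` map and the `crawl_depths` function are hypothetical.

```python
from collections import deque

def crawl_depths(links, start):
    """Return the fewest links followed from `start` to reach each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical link map: page -> pages it links to.
site_links = {
    "/": ["/about", "/products"],
    "/about": ["/team"],
    "/products": ["/products/widget", "/about"],
}

depths = crawl_depths(site_links, "/")
print(depths)
```
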
MIME Type
The technical content type reported by the web server, which the web browser uses to determine how to display it.
General Usefulness:
Ease of Automation:
Consider instead: File Format
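
The MIME type normally arrives in the HTTP `Content-Type` response header, often with extra parameters such as a character set. As a small sketch, the standard library's email header parser can separate the type from those parameters; the `mime_type` helper name is an assumption for illustration.

```python
from email.message import Message

def mime_type(content_type_header):
    """Extract the bare, lowercased MIME type from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = content_type_header
    return message.get_content_type()

html_type = mime_type("text/html; charset=UTF-8")
pdf_type = mime_type("Application/PDF")
print(html_type, pdf_type)
```
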
Meta Description
The content of the page's meta description tag. This is of limited use, aside from simply discovering which pages do not have a meta description (and therefore require one).
General Usefulness:
Ease of Automation:
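
Finding pages without a meta description can be sketched with the standard library's HTML parser. The `pages` mapping of URL to HTML below is made-up sample data, and `MetaDescriptionParser` and `meta_description` are hypothetical names; a real analysis would more likely read this value from a crawler export.

```python
from html.parser import HTMLParser

class MetaDescriptionParser(HTMLParser):
    """Collect the content of <meta name="description"> if present."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()

def meta_description(html):
    """Return the meta description text, or None if the tag is absent."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description

# Hypothetical sample pages.
pages = {
    "/": '<html><head><meta name="description" content="Welcome"></head></html>',
    "/about": "<html><head><title>About</title></head></html>",
    "/contact": '<html><head><meta name="description" content=""></head></html>',
}

# Treat both a missing tag and an empty one as "requires a description".
missing = [url for url, html in pages.items() if not meta_description(html)]
print(missing)
```
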
Meta Keywords
The content of the page's meta keywords tag. In most cases this has very limited value. More precise meta tags (for instance, topics) are usually far more useful if they exist.
General Usefulness:
Ease of Automation:
Consider instead: Topic
PDF Page Count
The count of pages in a PDF can help us understand whether there are primarily short PDFs (perhaps most easily converted to HTML) or very long PDFs (perhaps for specialist audiences).
General Usefulness:
Ease of Automation:
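
Once page counts have been gathered, a quick summary can show whether short or very long PDFs dominate. The sketch below assumes the counts are already available from some tool; the thresholds (5 pages or fewer is "short", 50 or more is "long"), the sample data, and the `length_bucket` helper are arbitrary assumptions to adjust for your own content.

```python
from collections import Counter

def length_bucket(page_count, short_max=5, long_min=50):
    """Classify a PDF by page count using assumed thresholds."""
    if page_count <= short_max:
        return "short"
    if page_count >= long_min:
        return "long"
    return "medium"

# Hypothetical page counts keyed by PDF URL.
pdf_page_counts = {
    "/docs/flyer.pdf": 2,
    "/docs/factsheet.pdf": 4,
    "/docs/guide.pdf": 18,
    "/docs/annual-report.pdf": 120,
}

summary = Counter(length_bucket(count) for count in pdf_page_counts.values())
print(summary)
```
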

Legend

General usefulness is a blend of the difficulty in getting the value and how useful it is once you have it. These stars roughly correspond to:

  • ★★★ Broadly useful. These would be worth including in most analyses.
  • ★★ Frequently useful for particular needs. These may not be quite as broadly useful, but they are frequently useful. Notably, if you have a general reason for a field that is rated two stars then you may wish to go to the category and look for others that may be slightly more useful.
  • ★ Rarely useful. These are listed since they still have a "following" or because they are easy to implement so are tempting to rely upon. Your mileage of course may vary, but in general these are less useful fields.

Ease of automation is how easy it is to get the value:

  • ⚙⚙⚙⚙ Easy to automate. Completely point-and-shoot automation (although, of course, there can be exceptions where some information is more difficult to extract than it should be).
  • ⚙⚙⚙ Relatively easy to automate. With the right tool, this can almost certainly be automated with very limited configuration (not requiring deep technical knowledge).
  • ⚙⚙ Can probably be automated. With the right tool and some technical ability and/or time to configure (for instance, providing XPath and regex information to select content out of a page) this can probably be automated. But it is probably not just a matter of clicking a single button to set up and run (unlike the three- and four-gear items).
  • ⚙ Very difficult to automate. This almost certainly requires manual intervention. Note: in many cases you can review a sample manually, derive rules from that sample, apply the rules more widely, and repeat.
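
To give a feel for the two-gear level of configuration, here is a minimal sketch of a regex rule that pulls a value out of page text. The "Last updated: YYYY-MM-DD" pattern, the sample text, and the `last_updated` helper are purely hypothetical; a real site would need its own pattern, and often an XPath expression to narrow down the part of the page first.

```python
import re

# Assumed page convention: a line such as "Last updated: 2021-03-04".
LAST_UPDATED = re.compile(r"Last updated:\s*(\d{4}-\d{2}-\d{2})")

def last_updated(page_text):
    """Return the date string captured by the pattern, or None if absent."""
    match = LAST_UPDATED.search(page_text)
    return match.group(1) if match else None

found = last_updated("Our policy. Last updated: 2021-03-04. Contact us.")
not_found = last_updated("Our policy. Contact us.")
print(found, not_found)
```

Pages where the rule returns nothing are the ones to inspect by hand before refining the pattern.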

All of the above ratings apply to the general case. You may have particular needs for fields that are generally not useful, and you may already have clean data that makes automation trivial for some elements that are generally more difficult.