Ultimately we want to make it easy for users to complete certain tasks. Sometimes we can directly capture this type of metric, but usually we need to confine ourselves to metrics that are more removed from the actual user experience.
If we take the traditional view of a content inventory or audit, we have rows
representing each page (so each row has a unique URL) and then we have columns
for things like the meta description or crawl depth. These columns are the different
fields we have available to us in our content analysis.
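As a minimal sketch of that shape (the URLs and field values below are hypothetical), the inventory can be represented as one row per page with a column per field, then written out in the familiar spreadsheet form:

```python
import csv
import io

# Hypothetical inventory: one row per page (unique URL),
# one column per field we want in the analysis.
inventory = [
    {"url": "https://example.com/", "meta_description": "Home page", "crawl_depth": 0},
    {"url": "https://example.com/about", "meta_description": "About us", "crawl_depth": 1},
    {"url": "https://example.com/blog/post-1", "meta_description": "", "crawl_depth": 2},
]

# Write it out in the familiar spreadsheet shape (CSV).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["url", "meta_description", "crawl_depth"])
writer.writeheader()
writer.writerows(inventory)
print(buffer.getvalue())
```

From here, adding a field to the analysis is just adding a key to each row (and a name to `fieldnames`), which is also why each extra field multiplies the work of filling in values.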
If you are analyzing a focused site, then unless you are
trying to set up an ongoing content analysis dashboard,
you can probably get away with using a spreadsheet.
That said, any extra fields you add here will magnify your
manual work, so you should still avoid a kitchen-sink approach
and be deliberate about the fields you add.
If you are analyzing a complex site, you need to be very
aware of how difficult it will be to get the values (or
figure out a way to sample and create rules to leverage manual
effort).
Some fields
have more value than others in content analysis (content analysis value is the
y axis), and some are more amenable to automation than others (ease of automation is the x axis).
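To make that two-axis trade-off concrete, here is a small sketch — the field names and scores are hypothetical, standing in for the star and gear ratings below — that ranks candidate fields by analysis value, breaking ties with ease of automation:

```python
# Hypothetical scores: "value" is usefulness for the analysis (1-3),
# "ease" is ease of automation (1-4).
fields = {
    "page_views": {"value": 3, "ease": 3},
    "reading_level": {"value": 2, "ease": 4},
    "meta_description": {"value": 2, "ease": 4},
    "editorial_quality": {"value": 3, "ease": 1},
}

# Rank by value first, then by ease of automation.
ranked = sorted(fields, key=lambda f: (fields[f]["value"], fields[f]["ease"]), reverse=True)
print(ranked)
```

High-value, hard-to-automate fields (like a hand-scored editorial quality rating) still rank high here; the ease score just tells you which fields are cheap wins.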
③. Select fields that serve your goal, grounded in your prioritized list of questions you want answered.
Although you can use this database however you like,
in general we recommend that you build up a list of
fields that will be useful for your analysis. To do so,
just click the heart next to any field name.
After you have hearted some fields, you
can see an analysis of your list under My List ♥
(at which point you can move to ④. Start iterating on your analysis, starting with the basics).
Page views are often the first field added from an additional source, once the basic rows of content in the inventory/audit have been determined. Although page views are not always the most useful measure of a piece of content's value, they are often the most immediately tangible proxy.
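A rough sketch of that step, assuming page views arrive as a separate URL-keyed export from your analytics tool (all URLs and numbers here are made up):

```python
# Inventory rows already determined; page views come from a separate source.
inventory = [
    {"url": "https://example.com/"},
    {"url": "https://example.com/about"},
    {"url": "https://example.com/old-page"},
]

# Hypothetical analytics export, keyed by URL.
page_views = {
    "https://example.com/": 12000,
    "https://example.com/about": 450,
}

# Join on URL; pages missing from analytics get 0 rather than failing.
for row in inventory:
    row["page_views"] = page_views.get(row["url"], 0)

print(inventory)
```

In practice the fiddly part is normalizing URLs (trailing slashes, query strings, protocol) so the two sources actually join.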
Reading Level represents the education level required to understand a text. Since much content is overly complex, this can be useful for identifying mismatches between the complexity of the content and the education level of its audience.
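One common way to compute a reading level automatically is the Flesch–Kincaid grade-level formula. Below is a deliberately naive sketch — the syllable counter just counts vowel groups, whereas real readability tools use dictionaries and better heuristics:

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels (minimum 1).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    # Flesch-Kincaid grade level:
    # 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

print(round(fk_grade("The cat sat on the mat. It was warm."), 2))
```

Even this crude version is enough to flag pages whose scores are far above the rest of the site for a closer manual look.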
General usefulness is a blend of how difficult the value is to get
and how useful it is once you have it. These stars roughly correspond to:
★★★ Broadly useful. These would be worth including in most analyses.
★★ Frequently useful for particular needs. These may not be quite as broadly
applicable, but they are frequently useful. Notably, if you only have a general reason
for including a field rated two stars, you may wish to look through its category for
others that may be slightly more useful.
★ Rarely useful. These are listed because they still have a "following",
or because they are easy to implement and therefore tempting to rely upon. Your mileage
may of course vary, but in general these are less useful fields.
Ease of automation is how easy it is to get the value:
⚙⚙⚙⚙ Easy to automate. Completely point-and-shoot automation (although, of course,
there are exceptions where some information is more difficult to extract than it should be).
⚙⚙⚙ Relatively easy to automate. With the right tool, this can
almost certainly be automated with very limited configuration (not requiring deep technical
knowledge).
⚙⚙ Can probably be automated. With the right tool and some
technical ability and/or time to configure (for instance, providing XPath and regex
information to select content out of a page), this can probably be automated. But it
is probably not just clicking a single button to set up and run (unlike the three- and
four-gear items).
⚙ Very difficult to automate. This almost certainly
requires manual intervention. Note: in many cases you can apply a technique of sampling
the content, creating rules from the sample, and repeating, to leverage your manual effort.
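For the two-gear case above, the configuration is usually a selector per field. As a rough standard-library-only sketch (the HTML snippet and rules are made up, and real crawlers typically let you supply XPath rather than raw regexes), a small set of regex rules can pull fields out of a page:

```python
import re

# Hypothetical page source; in practice this comes from your crawler.
html = """
<html>
  <head>
    <meta name="description" content="A short example description.">
  </head>
  <body><div class="byline">By Pat Example</div></body>
</html>
"""

# Regex "rules" configured once, then applied to every crawled page.
rules = {
    "meta_description": r'<meta name="description" content="([^"]*)"',
    "byline": r'<div class="byline">([^<]*)</div>',
}

extracted = {}
for field, pattern in rules.items():
    match = re.search(pattern, html)
    extracted[field] = match.group(1) if match else None

print(extracted)
```

Writing and debugging rules like these is exactly the "technical ability and/or time to configure" cost that separates two-gear fields from the point-and-shoot ones.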
Obviously, all of the above ratings apply in the general case. You may have particular
needs for fields that are generally not useful, and you may already have some
clean data that makes automation trivial for elements that are more generally
difficult.