Most content analysis is done with the goal of making some sort of content transformation. In particular, some of the fields you add exist to record the decision you make about what to do with the content.
If we take the traditional view of a content inventory or audit, we have rows
representing each page (so each row has a unique URL) and then we have columns
for things like the meta description or crawl depth. These columns are the different
fields we have available to us in our content analysis.
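As a rough illustration of that row-and-column shape, here is a minimal Python sketch. The URLs, field names, and values are made up for the example; the only ideas taken from the text are "one row per unique URL" and "one column per field".

```python
import csv
import io

# Hypothetical fields (columns) and pages (rows); all values are illustrative.
FIELDS = ["url", "meta_description", "crawl_depth"]
inventory = [
    {"url": "https://example.com/", "meta_description": "Home page", "crawl_depth": 0},
    {"url": "https://example.com/about", "meta_description": "", "crawl_depth": 1},
]


def to_csv(rows, fields):
    """Serialize the inventory as CSV text, one row per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Each row should be identified by a unique URL.
urls = [row["url"] for row in inventory]
assert len(urls) == len(set(urls)), "duplicate URL in inventory"

csv_text = to_csv(inventory, FIELDS)
print(csv_text)
```

The resulting CSV text can be opened in any spreadsheet, which is usually enough for a small inventory.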
If you are analyzing a focused site, and you are not
trying to set up an ongoing content analysis dashboard,
you can probably get away with using a spreadsheet.
That said, any extra fields you add here will magnify your
manual work. So do not take a kitchen-sink approach
to your analysis; be deliberate with the fields you add.
If you are analyzing a complex site, you need to be very
aware of how difficult it will be to get the values (or
figure out a way to sample and create rules to leverage manual
effort).
Some fields
have more value than others in content analysis (the content analysis value is the
y axis) and some are more amenable to automation (the x axis).
③. Select fields toward your goal, grounded in your prioritized list of questions you want answered.
Although you can use this database however you like,
in general we recommend that you build up a list of
fields that will be useful for your analysis. To do so,
just click on the heart next to any field name.
After you have hearted some fields, you
can see an analysis of your list at My List ♥
(at which point you can move to ④. Start iterating on your analysis, starting with the basics).
When planning a transformation, it can be useful to bucket similar content (often grouping content that will be treated similarly to explain the situation to stakeholders).
A target field records what the *desired* field value would be. For instance, a page may have an existing content type that in some cases should become another content type (so you would have a Content Type field as well as a Target Content Type field).
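One way to picture the paired current and target fields is the small Python sketch below. The URLs and content type values are hypothetical; the point is only that comparing the two columns shows which pages the transformation would change.

```python
# Hypothetical rows: each page carries its current and its desired content type.
rows = [
    {"url": "/news/launch", "content_type": "Article", "target_content_type": "Press Release"},
    {"url": "/about", "content_type": "Page", "target_content_type": "Page"},
    {"url": "/blog/tips", "content_type": "Article", "target_content_type": "Article"},
]


def needs_change(row):
    """A page needs work when its current value differs from the target value."""
    return row["content_type"] != row["target_content_type"]


# Pages whose content type would change in the transformation.
to_convert = [row["url"] for row in rows if needs_change(row)]
print(to_convert)
```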
General usefulness is a blend of the difficulty in getting the value
and how useful it is once you have it. These stars roughly correspond to:
★★★ Broadly useful. These would be worth including in most analyses.
★★ Frequently useful for particular needs. These may not be quite as broadly
useful, but they are frequently useful. Notably, if you have a general reason for a field
that is rated two stars then you may wish to go to the category and look for others that
may be slightly more useful.
★ Rarely useful. These are listed since they still have a "following"
or because they are easy to implement and so are tempting to rely upon. Your mileage
may vary, of course, but in general these are less useful fields.
Ease of automation is how easy it is to get the value:
⚙⚙⚙⚙ Easy to automate. Completely point-and-shoot automation (although, of course,
there can be exceptions where some information is more difficult to extract than it should be).
⚙⚙⚙ Relatively easy to automate. With the right tool, this can
almost certainly be automated with very limited configuration (not requiring deep technical
knowledge).
⚙⚙ Can probably be automated. With the right tool and some
technical ability and/or time to configure (for instance, providing XPath and regex
information to select content out of a page) this can probably be automated. But it
is probably not just a matter of clicking a single button to set up and run (unlike the
three- and four-gear items).
⚙ Very difficult to automate. This almost certainly
requires manual intervention. Note: there is a technique that can be applied in many cases:
sample the content, derive rules from the sample, apply the rules, and repeat.
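The sample, rule, and repeat technique mentioned for one-gear fields could look roughly like the Python sketch below. The URLs, the URL-pattern rules, and the content type values are all assumptions for illustration; in practice the rules would come from manually reviewing a sample of pages.

```python
import re

# Hypothetical pages to classify.
urls = [
    "/blog/2020/tips",
    "/blog/2021/news",
    "/products/widget",
    "/products/gadget",
    "/contact",
]

# Rules derived (by hand) from a reviewed sample: URL pattern -> field value.
rules = [
    (re.compile(r"^/blog/"), "Article"),
    (re.compile(r"^/products/"), "Product"),
]


def apply_rules(url, rules):
    """Return the value of the first matching rule, or None if nothing matches."""
    for pattern, value in rules:
        if pattern.search(url):
            return value
    return None


assigned = {url: apply_rules(url, rules) for url in urls}

# Pages no rule covers yet: sample these next, add rules, and run again.
unmatched = [url for url, value in assigned.items() if value is None]
print(unmatched)
```

Each pass shrinks the unmatched list, so manual effort goes into writing rules rather than labeling every page.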
Obviously, all of the above are ratings for the general case. You may have particular
needs for fields that are generally not useful, and you may already have some
clean data that makes automation trivial for some elements that are more generally
difficult.
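To make the two-gear case above more concrete, here is a minimal Python sketch of selecting values out of a page with an XPath-style query plus a regex. The page markup, class name, and date format are assumptions for the example. It uses only the standard library, whose `xml.etree.ElementTree` supports a limited XPath subset and needs well-formed markup; real-world HTML often requires a more forgiving parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page markup.
PAGE = """<html>
  <head>
    <title>Spring Sale</title>
    <meta name="description" content="Discounts on all items." />
  </head>
  <body>
    <p class="byline">Published 2021-03-15 by Staff</p>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# XPath-style selection: the meta element whose name attribute is "description".
meta = root.find(".//meta[@name='description']")
meta_description = meta.get("content") if meta is not None else ""

# Select the byline, then use a regex to pull an ISO-style date out of its text.
byline = root.find(".//p[@class='byline']")
byline_text = byline.text if byline is not None and byline.text else ""
match = re.search(r"\b\d{4}-\d{2}-\d{2}\b", byline_text)
published = match.group(0) if match else ""

print(meta_description, published)
```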