Opportunity for improvement: high

Change Format, and Restructure


Manually change the format, but also restructure (for instance, convert a single multi-page PDF into multiple HTML pages)

In this disposition we are doing two things:

  • Changing file format (HTML, PDF, etc)

  • Restructuring the content items (such as splitting a PDF into multiple HTML pages)

Note that this is within a constellation of related dispositions:

Same file format              Different file format
Move As Is (Manual)           Change Format, One for One
Rewrite, and Regroup Pages    Change Format, and Restructure (this disposition)

The type of reformatting we are talking about here is a change of file format (HTML, PDF, Excel, MP4, etc.), not restructuring of the HTML (new classes, more bullets, etc.) or other restructuring within a file format. Content is frequently stuck in formats that are not very useful to site visitors, most often PDF. This often stems from existing print-heavy processes. Some examples of format changes are:

  • Data locked in a PDF → a spreadsheet with multiple tabs of data (or, for maximum portability, multiple CSVs)

  • General content locked in a PDF → an interlocking set of HTML pages.
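As an illustration of the first example, here is a minimal Python sketch that writes one CSV per table. It assumes the tables have already been pulled out of the PDF (by hand or with a separate tool) into rows of text; the names `slugify`, `tables_to_csv`, and the sample data are hypothetical, not part of any existing tool.

```python
import csv
import io
import re


def slugify(name: str) -> str:
    """Turn a table title into a lowercase file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug or "table"


def tables_to_csv(tables: dict[str, list[list[str]]]) -> dict[str, str]:
    """Return {file name: CSV text}, one CSV per extracted table.

    Note: two titles that slugify to the same stem would overwrite
    each other; a real workflow would need to check for that.
    """
    output = {}
    for title, rows in tables.items():
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        output[f"{slugify(title)}.csv"] = buffer.getvalue()
    return output


# Hypothetical data, as if keyed in from a PDF report.
tables = {
    "Budget 2023": [["Department", "Amount"], ["Parks", "1200"]],
    "Staff Counts": [["Department", "Staff"], ["Parks", "14"]],
}
files = tables_to_csv(tables)
```

Keeping each table in its own CSV mirrors the "multiple tabs" idea while staying portable across tools.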

In most cases the newly structured content will replace the existing format, because we need to break the link to the print process. If we keep both the existing and the new format, we perpetuate the need to generate the print-ready version (instead of, for example, creating a strong style that automatically generates a PDF from the Excel file, in cases where that is relevant).

When to assign this disposition to content:
  • Only for high-priority content (since it is a high-cost disposition, with wide variability in effort)
  • Preferably when you have clearly defined instructions for the people doing this conversion (even though a large amount of discretion is involved in this disposition, you want to make it as streamlined as possible)
  • When the resulting format will be easier for site visitors to use (for example, in general you may move from PDF to HTML but not the other way around). Also, sometimes other content types are "locked" into less-than-ideal formats (like data stored as a PDF, which is much more natural as a spreadsheet)
Related dispositions:
  • Move As Is (Manual)
  • Change Format, One for One
  • Rewrite, and Regroup Pages
System-wide startup effort

Because this is a heavyweight disposition, you want to be prepared. In particular, you need to:

  • Have a plan for generating the original format on demand, as needed. If you are converting data from PDF to CSV, you probably don't need a plan to generate a PDF from the CSV. That said, you may still need a plan for automatically generating a print-ready version of HTML pages that have been converted from PDF.
  • Clearly define the general standards for how you will do the conversion, so that the result is consistent for the site visitor. For example, if you are converting many multi-page PDFs to multiple HTML pages, you may want a consistent approach to creating a left nav and tiering the information for different audiences.
  • Define the workflow, so you don't touch content more often than necessary (although this disposition is relatively high touch regardless)
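One way to make the "consistent left nav" standard concrete is to generate the nav from a single list of sections, so every converted page gets the same one. The Python sketch below assumes the PDF content has already been split by hand into (title, body) pairs; `build_pages`, the `page-N.html` naming, and the sample sections are hypothetical choices for illustration.

```python
from html import escape


def build_pages(sections: list[tuple[str, str]]) -> dict[str, str]:
    """Return {file name: HTML}, one page per section, sharing one left nav."""
    names = [f"page-{i}.html" for i in range(1, len(sections) + 1)]
    pages = {}
    for current, (title, body) in zip(names, sections):
        nav_items = []
        for name, (nav_title, _) in zip(names, sections):
            label = escape(nav_title)
            if name == current:
                # The current page is shown in the nav but not linked.
                nav_items.append(f"<li><strong>{label}</strong></li>")
            else:
                nav_items.append(f'<li><a href="{name}">{label}</a></li>')
        pages[current] = (
            "<!DOCTYPE html>\n"
            f"<html><head><title>{escape(title)}</title></head><body>\n"
            f"<nav><ul>{''.join(nav_items)}</ul></nav>\n"
            f"<main><h1>{escape(title)}</h1><p>{escape(body)}</p></main>\n"
            "</body></html>\n"
        )
    return pages


# Hypothetical sections taken from one multi-page PDF.
pages = build_pages([
    ("Overview", "Intro text"),
    ("Fees & Forms", "Details"),
])
```

Because the nav is derived from the section list rather than written per page, adding or renaming a section updates every page the same way.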
Average per-item manual effort
1 hour, 30 minutes
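To turn the per-item average into a rough total, multiply it by the number of items assigned this disposition. The Python sketch below shows that arithmetic; the item count of 40 is a hypothetical example, and because this disposition has wide variability, the result is only a planning figure.

```python
from datetime import timedelta

# Average per-item manual effort stated for this disposition.
PER_ITEM = timedelta(hours=1, minutes=30)


def total_effort(item_count: int, per_item: timedelta = PER_ITEM) -> timedelta:
    """Rough total manual effort for item_count content items."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    return per_item * item_count


def as_hours(effort: timedelta) -> float:
    """Express an effort estimate in hours."""
    return effort.total_seconds() / 3600


# Hypothetical: 40 content items assigned this disposition.
estimate_hours = as_hours(total_effort(40))  # 40 x 1.5 hours = 60.0
```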
Effort per (potential) handling step:
Step              Effort  What is this?
Sort              high    Decide what to do with content item
Place             high    Place in IA
Edit              high    Edit text/content (NOT technical)
Move / Transform  high    Physically move/enter/transform content (NOT the words)
Enhance / Tag     high    Prepare the metadata, especially tagging/retagging
QA                high    Review content for quality