Go beyond the URL list

A myriad of tools will generate a URL list. But a list of URLs only gets you so far. Content Chimera can either crawl your site to generate a list of URLs or it can import from many of these tools (like Screaming Frog, Sitebulb, or Xenu).

Lists of URLs might be OK for small sites, and if other tools already have all the canned reports you need, then perhaps that's enough.

But to do strong, content-strategy-oriented analysis, we need:

  • Content List. Many sites we analyze are misconfigured, resulting in URLs that really represent the same content. Content Chimera distills the list of URLs to a list of content.
  • Multi-source Data. Sometimes you need data from other sources. Content Chimera is agnostic about data sources, and you can intelligently merge data from whatever sources you wish.
  • Ad hoc Charts. Canned charts can be nice, and Content Chimera has canned charts. But not all sites are the same, so Content Chimera allows you to dynamically chart and explore your content.
  • Multi-site. Some of the most interesting opportunities for transformation are across an entire constellation of an organization's sites. Content Chimera allows analysis at the organization, group of sites, or specific site level.
  • Rules. As an industry we need to stop analyzing all content page by page. Content Chimera builds in a rules engine that allows you to define rules and see the implications of those rules.
  • Duplicates. Aside from the tiniest sites, it's impossible to manually tell if a piece of content is a duplicate of another. Content Chimera analyzes the text of content to find duplicates.

The Content List

A list of URLs is very technical. Web servers are often misconfigured so the same page may be represented multiple times in a URL list. Furthermore, even a well-configured web server and CMS will render pagination pages which conceptually are not separate pages from a content analysis perspective. Content Chimera distills the list of URLs down to a smaller list, attempting to get to the actual content.

Furthermore, we want to derive information from the URLs that is useful for content analysis. For instance, Content Chimera groups file types so you aren't reduced to juggling multiple mime types that really represent the same content.

For this you need:

  • By default collapse URLs that are differentiated only for technical reasons.
  • Deep customization on how to deduplicate URLs.
  • Automatically extract and derive information useful for content analysis (like the URL "folders").
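As a rough illustration of the first and third points, here is a minimal Python sketch of collapsing URLs that differ only for technical reasons and then deriving "folders" from the result. It is not Content Chimera's implementation: the function names, the list of technical query parameters, and the choice to treat http/https and "www." variants as the same content are all assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that exist only for technical reasons.
TECHNICAL_PARAMS = {"sessionid", "sid", "phpsessid", "utm_source", "utm_medium", "page"}


def canonical_url(url):
    """Collapse technical variations of a URL into one canonical form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: www and non-www serve the same content
    path = parts.path.rstrip("/") or "/"
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TECHNICAL_PARAMS
    ]
    # Assumption: http and https serve the same content, so force one scheme.
    return urlunsplit(("https", host, path, urlencode(sorted(query)), ""))


def folders(url):
    """Return the URL "folders": every path segment except the last one."""
    segments = [segment for segment in urlsplit(url).path.split("/") if segment]
    return segments[:-1]


def collapse(urls):
    """Group raw URLs under the canonical URL they appear to represent."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return groups


crawled = [
    "http://www.example.com/products/widgets/blue.html?sessionid=abc",
    "https://example.com/products/widgets/blue.html",
    "https://example.com/products/widgets/blue.html?page=2",
    "https://example.com/about/",
]
for content_url, raw_urls in collapse(crawled).items():
    print(content_url, len(raw_urls), folders(content_url))
```

Treating "page" as a technical parameter is what folds pagination pages into their parent; a real site would need that list tuned, which is where deep customization comes in.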

In Content Chimera, you don't spend time staring at the list of content. Most of your analysis time is spent in charts, where you can click to drill down to the actual content. That said, the underlying database of content has deduplicated URLs and added information to ease analysis. You can export the manifest at any time.

Multi-source Data

Content Chimera assumes it does not have all the answers. But it lets you bring data you already have about your content from any source in order to analyze your content. For instance, built-in data sources include Google Analytics and Matomo, but you can bring data from any source that can export CSV (like your CMS or another analytics tool).

Furthermore, Content Chimera allows you to scrape information off pages.

Note that regardless of the source of data (information Content Chimera collects by default, imported information from other sources, or scraped information), the information can be used for charting as well as defining rules.

For this you need:

  • Intelligent aggregation of data (for instance, when there are multiple rows in Google Analytics corresponding to one page on your site: summing page views and averaging bounce rates).
  • Store arbitrary data.
  • Define new data sources.
  • Deal with data sources that don't provide full URLs.

[Animated GIF: screencast of merging Matomo web analytics using Content Chimera]

In this example, we merge Matomo web analytics into the existing inventory. You can see there were several built-in data sources, but you can also define custom data sources. After merging, we go to change a bar chart so the heights are based on overall pageviews.
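To make the aggregation idea concrete, here is a small Python sketch of merging analytics rows that correspond to the same page, summing pageviews and averaging bounce rates as described above. The row layout (`page`, `pageviews`, `bounce_rate`) and the function names are hypothetical, and the simple unweighted mean is an assumption; it is not a description of how Content Chimera does this internally.

```python
from collections import defaultdict
from statistics import mean
from urllib.parse import urlsplit


def page_key(raw):
    """Reduce a full URL or a bare path to a comparable path key."""
    path = urlsplit(raw).path if "://" in raw else raw.split("?", 1)[0]
    return path.rstrip("/") or "/"


def aggregate(rows):
    """Merge rows for the same page: sum pageviews, average bounce rates."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[page_key(row["page"])].append(row)
    return {
        key: {
            "pageviews": sum(r["pageviews"] for r in group),
            "bounce_rate": mean(r["bounce_rate"] for r in group),
        }
        for key, group in grouped.items()
    }


# Hypothetical export: some sources give full URLs, others only paths.
analytics_rows = [
    {"page": "/pricing?ref=a", "pageviews": 100, "bounce_rate": 0.5},
    {"page": "/pricing/", "pageviews": 50, "bounce_rate": 0.25},
    {"page": "https://example.com/about", "pageviews": 10, "bounce_rate": 0.75},
]
for page, metrics in aggregate(analytics_rows).items():
    print(page, metrics)
```

Keying on the path alone is one way to cope with sources that do not provide full URLs; it only works when the rows being merged belong to a single site.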

Ad Hoc Charts

Some tools have canned reports, but Content Chimera allows you to truly explore your data. In particular, you can create charts that combine data from multiple sources. For instance, you can chart the distribution of pageviews (from Google Analytics) but color the chart by file group (from Content Chimera's native analysis). Or you could chart the distribution of publication dates (either from the CMS or from scraping off the pages) against content type (either from the CMS, from scraping off the pages, or by defining Content Chimera rules about folders). Read the documentation.

For this you need:

  • Ability to quickly chart any field, regardless of data source.
  • Automatically summarize long-tail data, allowing you to focus on large swaths of content.
  • Automatically sample any part of a chart, so you can see both the big picture and details.

[Animated GIF: screencast of Content Chimera changing the level of analysis from a group of sites to a specific site]

We can change the level of analysis in Content Chimera. Here, we are switching between charting a group of sites and a specific site.


Multi-site

Some of the most interesting analysis is across an entire digital presence that spans multiple sites. But these projects can be particularly difficult to get our hands around.

Content Chimera allows you to manage clients (organizations), groups of sites, and sites -- and you can do your analysis at any of these levels. Then you can chart against the entire digital presence, and also run rules across all sites (in addition, Content Chimera supports "tokens" for sites to aid in multi-lingual analysis). Read the documentation.

For this you need:

  • Manage clients, site groups, and sites.
  • Chart at any of these levels.
  • Run rules on any of these levels.
  • Rationalize data so that it is consistent across levels.


Rules

As an industry we need to:

  • Stop assuming that content analysis means manually reviewing all the rows in a content inventory spreadsheet.
  • Separate deciding from acting on the decision.

One of the reasons we continue to do line-by-line analysis is that using rules can be technically challenging (at David Hobbs Consulting we have in the past duct-taped this on top of Excel / Google Sheets, but hacking it together is ugly and error-prone). Content Chimera builds in a rules engine, so that you can make decisions at scale. Read the documentation.

For this you need:

  • A rules engine.
  • Define a rule based on any of the content data, regardless of source.
  • Assign to bucket (what the content is), disposition (what to do with it), and assignment (what team will execute upon this later).
  • Export the manifest of all assignments.

[Animated GIF: screencast of Content Chimera running a set of rules and showing the resulting charts]

In this example, we have a set of rules of how we want to transform content. We then run that ruleset against the existing inventory in order to see the implications of the rules.
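The idea of running a ruleset against an inventory can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Content Chimera's rules engine: the `Rule` class, the field names on each content item, and the "first matching rule wins for each field" behavior are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """A hypothetical rule: a test on a content item plus what to assign."""
    name: str
    matches: Callable[[dict], bool]
    bucket: Optional[str] = None       # what the content is
    disposition: Optional[str] = None  # what to do with it
    assignment: Optional[str] = None   # which team will execute later


def apply_rules(items, rules):
    """Return a manifest: each item plus bucket, disposition, and assignment.

    Assumption: rules run in order and the first rule to set a field wins.
    """
    manifest = []
    for item in items:
        row = dict(item, bucket=None, disposition=None, assignment=None)
        for rule in rules:
            if rule.matches(item):
                for name in ("bucket", "disposition", "assignment"):
                    value = getattr(rule, name)
                    if value is not None and row[name] is None:
                        row[name] = value
        manifest.append(row)
    return manifest


inventory = [
    {"url": "/news/2012/launch", "folder": "news", "pageviews": 3},
    {"url": "/products/widget", "folder": "products", "pageviews": 900},
]
ruleset = [
    Rule("low-traffic news",
         lambda c: c["folder"] == "news" and c["pageviews"] < 10,
         bucket="News", disposition="Delete", assignment="Web team"),
    Rule("news", lambda c: c["folder"] == "news",
         bucket="News", disposition="Keep"),
    Rule("products", lambda c: c["folder"] == "products",
         bucket="Product pages", disposition="Rewrite",
         assignment="Product marketing"),
]
for row in apply_rules(inventory, ruleset):
    print(row["url"], row["bucket"], row["disposition"], row["assignment"])
```

Because a rule is just a test on the item's data, it can use any field regardless of where that field came from, and the resulting manifest keeps the decision separate from acting on it.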


Duplicates

Duplicate content is a problem. Finding duplicate content is virtually impossible to do manually, except on the smallest sites.

Content Chimera analyzes the actual text on pages to discover near-duplicates (Content Chimera does NOT use the approach of fooling Google to tell Content Chimera what Google thinks is duplicate content). As with everything in Content Chimera, this is then another set of data that can be used in charting and rules.

For this you need:

  • Textual analysis to discover duplicate and near-duplicate content
  • Analysis to break out canonical copies
  • Charting to summarize duplicates and their distribution

Content Chimera can search for, and then chart, a duplicate content analysis. Red is the duplicate content.
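For readers curious what text-based near-duplicate detection can look like, here is a minimal Python sketch using word shingles and Jaccard similarity. This is a common textbook technique chosen for illustration; the source does not say which algorithm Content Chimera uses, and the function names, shingle size, and threshold are assumptions.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two pages have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(pages, threshold=0.8):
    """Compare every pair of pages and return those above the threshold."""
    sets = {url: shingles(text) for url, text in pages.items()}
    pairs = []
    for first, second in combinations(sorted(sets), 2):
        score = jaccard(sets[first], sets[second])
        if score >= threshold:
            pairs.append((first, second, round(score, 2)))
    return pairs


# Hypothetical page text keyed by URL.
page_text = {
    "/a": "Our widgets are built to last and ship worldwide within two days.",
    "/b": "Our widgets are built to last and ship worldwide within two days!",
    "/c": "Contact the support team for help with billing questions.",
}
print(near_duplicates(page_text))
```

Comparing every pair is fine for a sketch but grows quadratically with the number of pages, so a large inventory would likely need an approximate approach such as MinHash.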

Key Features

Content Chimera is optimized for making content decisions at scale. Below are some key features.


Rather than do a line-by-line analysis of URLs, Content Chimera has a rules engine that allows you to assign rules en masse.


Content Chimera is built for big sites and large suites of sites. Although it stores information on all your content, in charting you can randomly sample examples.


Getting a long list of URLs isn't that helpful if large swaths of the URLs are really for the same content (for instance with session IDs in URLs). Content Chimera deduplicates URLs. 


Using the rules that you define, Content Chimera assigns content to: bucket of content (similar content), disposition (how it will be treated), and resourcing (who will do it). 

Arbitrary metadata

Content Chimera makes no attempt to declare up front what metadata is important for any site. Instead, you can merge in data that you need for your analysis from any source. 

Manifest export

At any time you can export your content manifest, which is a line-by-line spreadsheet of the content, the assignments, and metadata. 


Not only is Content Chimera designed for large digital presences, it is designed for a single organization to have multiple clients (each with its own sites). 


Content Chimera is built for complex, global digital presences. Not only can you analyze a lot of sites individually, but you can analyze groups of sites or an entire organization's presence.


Content Chimera has a wide range of built-in patterns that it can scrape off its cache of your site, but you can also scrape arbitrary patterns.


If you are analyzing dates, Content Chimera will automatically "bin" into histogram bars if you have too much to usefully chart one by one.
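A minimal Python sketch of this kind of automatic binning: try one bar per day, then per month, then per year, and use the finest level that fits. The `max_bars` limit and the day/month/year fallback order are assumptions for illustration, not Content Chimera's actual behavior.

```python
from collections import Counter
from datetime import date


def bin_dates(dates, max_bars=12):
    """Count dates per day, month, or year, using the finest level that fits.

    Falls back to yearly bins even if those still exceed max_bars.
    """
    by_day = Counter(d.isoformat() for d in dates)
    if len(by_day) <= max_bars:
        return sorted(by_day.items())
    by_month = Counter(d.strftime("%Y-%m") for d in dates)
    if len(by_month) <= max_bars:
        return sorted(by_month.items())
    by_year = Counter(str(d.year) for d in dates)
    return sorted(by_year.items())


# Hypothetical publication dates: one page per month for 30 months.
published = [date(2015 + i // 12, i % 12 + 1, 1) for i in range(30)]
print(bin_dates(published))
```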

"Other" in charting

You probably want to start with the big picture, so by default charting shows only the biggest groupings of any data you chart and combines the rest into an "other" group.
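The long-tail summarization can be sketched in Python as keeping the largest groups and folding everything else into a single bucket. The function name, the `top_n` default, and the sample data are hypothetical.

```python
from collections import Counter


def summarize_with_other(values, top_n=5, other_label="Other"):
    """Keep the top_n largest groups and fold the long tail into one bucket."""
    counts = Counter(values)
    top = counts.most_common(top_n)
    rest = sum(counts.values()) - sum(count for _, count in top)
    if rest:
        top.append((other_label, rest))
    return top


# Hypothetical file groups for an inventory of 78 items.
file_groups = ["HTML"] * 50 + ["PDF"] * 20 + ["Image"] * 5 + ["Video"] * 2 + ["Word"]
print(summarize_with_other(file_groups, top_n=2))
```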


Content Chimera automatically derives metadata from your content to aid in your analysis, such as "folders" in URLs and grouping into file groups. 

Easy to crawl and watch progress

Many tools that people use to understand their content are basically crawlers that generate spreadsheets of a static set of fields. Although Content Chimera can consume content from any data source for your analysis, the most straightforward way is almost always its industrial-strength crawler. One thing that is very hard to see in crawlers is how well the crawl is actually proceeding. For that we've built in a variety of mechanisms for tracking progress (and the ability to start and restart crawls).

Work on large digital presences?

This is a tool for teams that deal with complex digital presences, especially teams that believe content is important and should be dealt with in a coherent and high-impact manner.

  • Integrators and marketing agencies. Use Content Chimera when proposing work, both to better understand the project and to communicate to potential clients that you take content seriously. Of course, also use it to work with your clients to implement strong content transformations.
  • Content strategy consultancies. If you are already helping large clients to develop content strategies, Content Chimera will help you provide even higher value, without having to get into a ton of technical details in Excel. You can also actually test for things like duplicates.
  • Large digital presence owners. As of now, Content Chimera is optimized for professional services firms that work on lots of client sites. Except for extremely large organizations, it may not make sense to use Content Chimera directly. Please contact us to help you decide.