A myriad of tools will generate a URL list. But a list of URLs only gets you so far. Content Chimera can either crawl your site to generate a list of URLs or import one from many of these tools (such as Screaming Frog, Sitebulb, or Xenu).
Lists of URLs might be OK for small sites, and if other tools already have all the canned reports you need then perhaps that's enough.
But to do strong, content-strategy-oriented analysis, we need:
A list of URLs is very technical. Web servers are often misconfigured so the same page may be represented multiple times in a URL list. Furthermore, even a well-configured web server and CMS will render pagination pages which conceptually are not separate pages from a content analysis perspective. Content Chimera distills the list of URLs down to a smaller list, attempting to get to the actual content.
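As a rough illustration of the kind of distillation described above, here is a minimal Python sketch that collapses URL variants into one canonical form. It is not Content Chimera's implementation; the parameter names in `SESSION_PARAMS` and `PAGINATION_PARAMS` and the normalization choices (lowercasing the host, dropping trailing slashes) are assumptions made for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names, assumed for illustration only.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}
PAGINATION_PARAMS = {"page", "p"}


def canonical_url(url: str) -> str:
    """Reduce a URL to a form shared by its trivial variants."""
    parts = urlsplit(url)
    ignored = SESSION_PARAMS | PAGINATION_PARAMS
    # Keep only query parameters that plausibly select different content.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    # Treat "/news/" and "/news" as the same page; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path,
         urlencode(sorted(query)), "")
    )


def deduplicate(urls):
    """Return one representative URL per canonical form, in first-seen order."""
    seen = {}
    for url in urls:
        seen.setdefault(canonical_url(url), url)
    return list(seen.values())
```

With these assumptions, `https://example.com/news/`, `https://example.com/news?page=2`, and `https://example.com/news?sid=abc` all reduce to a single entry. Which parameters are safe to ignore differs from site to site, so the two sets would need tuning for a real inventory.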
Furthermore, we want to derive information from the URLs that is useful for content analysis. For instance, Content Chimera groups file types so you aren't reduced to juggling multiple mime types that really represent the same content.
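To make the idea of file groups concrete, the following Python sketch maps several MIME types onto a single group name. The group names and the `FILE_GROUPS` table are assumptions for illustration and are not taken from Content Chimera.

```python
# Hypothetical grouping table: several MIME types map to one file group.
FILE_GROUPS = {
    "application/pdf": "PDF",
    "application/msword": "Word",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "Word",
    "application/vnd.ms-excel": "Excel",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "Excel",
    "text/html": "HTML",
    "application/xhtml+xml": "HTML",
}


def file_group(mime_type: str) -> str:
    """Map a Content-Type value to a coarse file group."""
    # Drop parameters such as "; charset=utf-8" before looking up the type.
    base = mime_type.split(";", 1)[0].strip().lower()
    if base in FILE_GROUPS:
        return FILE_GROUPS[base]
    if base.startswith("image/"):
        return "Image"
    return "Other"
```

Here both the legacy and the modern Word MIME types end up in one "Word" group, which is the sort of simplification that makes a chart by file type readable.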
For this you need:
In Content Chimera, you don't spend time staring at the list of content. Most of your analysis time is spent in charts, where you can click to drill down to the actual content. That said, the underlying database of content has deduplicated URLs and added information to ease analysis. You can export the manifest at any time.
Content Chimera assumes it does not have all the answers. But it lets you bring data you already have about your content from any source in order to analyze your content. For instance, built-in data sources include Google Analytics and Matomo, but you can bring data from any source that can export CSV (like your CMS or another analytics tool).
Furthermore, Content Chimera allows you to scrape information off pages.
Note that regardless of the source of data (information Content Chimera collects by default, imported information from other sources, or scraped information), the information can be used for charting as well as defining rules.
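Conceptually, bringing in data from a CSV export amounts to joining it onto the inventory by URL. The Python sketch below shows one way that join might look; the function name `merge_by_url`, the `url` column name, and the record layout are assumptions for the example rather than Content Chimera's actual format.

```python
import csv
import io


def merge_by_url(inventory, csv_text, url_column="url"):
    """Left-join rows from a CSV export onto inventory records, keyed by URL."""
    extra = {
        row[url_column]: row
        for row in csv.DictReader(io.StringIO(csv_text))
    }
    merged = []
    for record in inventory:
        row = extra.get(record["url"], {})
        combined = dict(record)
        # Add every CSV column except the join key; unmatched URLs stay unchanged.
        combined.update({k: v for k, v in row.items() if k != url_column})
        merged.append(combined)
    return merged


# Hypothetical inventory and analytics export.
inventory = [
    {"url": "https://example.com/a", "file_group": "HTML"},
    {"url": "https://example.com/b", "file_group": "PDF"},
]
analytics_csv = "url,pageviews\nhttps://example.com/a,120\n"
merged = merge_by_url(inventory, analytics_csv)
```

Once merged, a field such as `pageviews` sits alongside the derived fields, so it can drive a chart or a rule in the same way. Note that `csv.DictReader` returns strings, so numeric columns would need converting before any arithmetic.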
For this you need:
In this example, we merge Matomo web analytics into the existing inventory. You can see there were several built-in data sources, but you can also define custom data sources. After merging, we go to change a bar chart so the heights are based on overall pageviews.
Some tools have canned reports, but Content Chimera allows you to truly explore your data. In particular, you can create charts that combine data from multiple sources. For instance, you can chart the distribution of pageviews (from Google Analytics) but color the chart by file group (from Content Chimera's native analysis). Or you could chart the distribution of publication dates (either from the CMS or from scraping off the pages) against content type (either from the CMS, from scraping off the pages, or by defining Content Chimera rules about folders). Read the documentation.
For this you need:
We can change the level of analysis in Content Chimera. Here, we are switching between charting a group of sites and a specific site.
Some of the most interesting analysis is across an entire digital presence that spans multiple sites. But these projects can be particularly difficult to get our arms around.
Content Chimera allows you to manage clients (organizations), groups of sites, and sites -- and you can do your analysis at any of these levels. Then you can chart against the entire digital presence, and also run rules across all sites (in addition, Content Chimera supports "tokens" for sites to aid in multi-lingual analysis). Read the documentation.
For this you need:
As an industry we need to:
One of the reasons we continue to do line-by-line analysis is that using rules can be technically challenging (at David Hobbs Consulting in the past we have duct-taped this on top of Excel / Google Sheets, but hacking this is ugly and error-prone). Content Chimera builds in a rules engine, so that you can make decisions at scale. Read the documentation.
For this you need:
In this example, we have a set of rules of how we want to transform content. We then run that ruleset against the existing inventory in order to see the implications of the rules.
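The general idea of running a ruleset against an inventory can be sketched in a few lines of Python. This is an illustrative model only: the `Rule` class, the first-match-wins behavior, the default disposition, and the inventory fields are all assumptions, not a description of how Content Chimera's rules engine is built.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over one inventory record
    disposition: str                 # how matching content will be treated


def apply_rules(inventory, rules, default="keep"):
    """Assign a disposition to each record; the first matching rule wins."""
    results = []
    for record in inventory:
        disposition = default
        for rule in rules:
            if rule.matches(record):
                disposition = rule.disposition
                break
        results.append({**record, "disposition": disposition})
    return results


# Hypothetical rules and inventory fields, for illustration only.
rules = [
    Rule("stale press releases",
         lambda r: r["folder"] == "press" and r["year"] < 2015, "delete"),
    Rule("low-traffic PDFs",
         lambda r: r["file_group"] == "PDF" and r["pageviews"] < 10, "review"),
]

inventory = [
    {"url": "/press/2012/launch", "folder": "press", "year": 2012,
     "file_group": "HTML", "pageviews": 3},
    {"url": "/docs/guide.pdf", "folder": "docs", "year": 2020,
     "file_group": "PDF", "pageviews": 4},
    {"url": "/products/widget", "folder": "products", "year": 2021,
     "file_group": "HTML", "pageviews": 900},
]

assigned = apply_rules(inventory, rules)
# Summarize the implications of the ruleset before committing to it.
summary = Counter(record["disposition"] for record in assigned)
```

The summary count is the point of the exercise: a handful of rules produces a decision for every item, and changing one rule and re-running shows the implications immediately, which is impractical in a line-by-line review.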
Duplicate content is a problem. Finding duplicate content is virtually impossible to do manually, except on the smallest sites.
Content Chimera analyzes the actual text on pages to discover near-duplicates (Content Chimera does NOT use the approach of fooling Google to tell Content Chimera what Google thinks is duplicate content). As with everything in Content Chimera, this is then another set of data that can be used in charting and rules.
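The page does not say which algorithm Content Chimera uses to compare text. As one common way to find near-duplicates from page text, the Python sketch below compares sets of word shingles using Jaccard similarity. The function names, the shingle size, and the 0.7 threshold are assumptions chosen for the example.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def near_duplicates(pages, threshold=0.7):
    """Return URL pairs whose text similarity meets the threshold."""
    sets = {url: shingles(text) for url, text in pages.items()}
    return [
        (u, v)
        for u, v in combinations(sorted(sets), 2)
        if jaccard(sets[u], sets[v]) >= threshold
    ]


# Hypothetical page texts keyed by URL.
pages = {
    "/hours": "Our office is open Monday to Friday from nine to five",
    "/hours-old": "Our office is open Monday to Friday from nine to six",
    "/press": "Contact the press team for interviews and media requests",
}
pairs = near_duplicates(pages)
```

Comparing every pair of pages is fine for a small example but grows quadratically, so a large inventory would need an approximate technique such as MinHash instead of this direct comparison.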
For this you need:
Content Chimera can search for, and then chart, a duplicate content analysis. Red is the duplicate content.
Content Chimera is optimized for making content decisions at scale. Below are some key features.
Rather than do a line-by-line analysis of URLs, Content Chimera has a rules engine that allows you to assign rules en masse.
Content Chimera is built for big sites and large suites of sites. Although it stores information on all your content, in charting you can randomly sample examples.
Getting a long list of URLs isn't that helpful if large swaths of the URLs are really for the same content (for instance with session IDs in URLs). Content Chimera deduplicates URLs.
Using the rules that you define, Content Chimera assigns content to: bucket of content (similar content), disposition (how it will be treated), and resourcing (who will do it).
Content Chimera makes no attempt to declare up front what metadata is important for any site. Instead, you can merge in data that you need for your analysis from any source.
At any time you can export your content manifest, which is a line-by-line spreadsheet of the content, the assignments, and metadata.
Not only is Content Chimera designed for large digital presences, it is designed for a single organization to have multiple clients (each with its own sites).
Content Chimera is built for complex, global digital presences. Not only can you analyze a lot of sites individually, but you can analyze groups of sites or an entire organization's presence.
Content Chimera has a wide range of built-in patterns that it can scrape off its cache of your site, but you can also scrape off arbitrary patterns.
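As an illustration of what scraping a pattern off a cached page can mean, here is a small Python sketch that pulls the value of a named `<meta>` tag from HTML using the standard library. The class and function names are hypothetical, and this does not reflect how Content Chimera defines its own patterns.

```python
from html.parser import HTMLParser


class MetaScraper(HTMLParser):
    """Capture the content of the first <meta name="..."> tag with a given name."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "meta" and self.value is None
                and attributes.get("name") == self.name):
            self.value = attributes.get("content")


def scrape_meta(html, name):
    """Return the meta tag's content, or None if the page does not have it."""
    scraper = MetaScraper(name)
    scraper.feed(html)
    return scraper.value


# Hypothetical cached page.
cached_html = (
    '<html><head><meta name="author" content="Jane Doe"></head>'
    "<body><p>Hello</p></body></html>"
)
author = scrape_meta(cached_html, "author")
```

Running a scraper like this over every cached page yields one more column of data per page, which can then be charted or used in rules like any other field.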
If you are analyzing dates, Content Chimera will automatically "bin" into histogram bars if you have too much to usefully chart one by one.
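One simple way to picture this kind of binning is to count by day and fall back to coarser units when there would be too many bars. The Python sketch below does that; the `max_bars` limit and the day, month, year fallback order are assumptions for illustration, not Content Chimera's actual logic.

```python
from collections import Counter
from datetime import date


def bin_dates(dates, max_bars=12):
    """Count dates by day, falling back to month and then year if too many bars."""
    counts = Counter()
    for fmt in ("%Y-%m-%d", "%Y-%m", "%Y"):
        counts = Counter(d.strftime(fmt) for d in dates)
        if len(counts) <= max_bars:
            break
    # If even yearly bins exceed max_bars, the yearly counts are returned.
    return dict(sorted(counts.items()))


# Hypothetical publication dates.
published = [date(2020, 1, 5), date(2020, 1, 20), date(2020, 3, 1), date(2021, 7, 4)]
bars = bin_dates(published, max_bars=2)
```

In this example four distinct days and three distinct months are both too many for two bars, so the result is grouped by year.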
You probably want to start with the big picture, so charting by default only shows the biggest groupings of any data you chart, creating an "other" for the rest.
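The effect of keeping the biggest groups and rolling up the rest can be sketched as follows in Python. The function name, the default limit of five groups, and the "Other" label are assumptions made for the example.

```python
from collections import Counter


def top_groups(values, limit=5, other_label="Other"):
    """Keep the largest groups and roll everything else into one bucket."""
    counts = Counter(values)
    top = counts.most_common(limit)
    rest = sum(counts.values()) - sum(count for _, count in top)
    result = dict(top)
    if rest:
        # Add to any existing group that already uses the label.
        result[other_label] = result.get(other_label, 0) + rest
    return result


# Hypothetical file groups for eleven pieces of content.
groups = ["HTML"] * 5 + ["PDF"] * 3 + ["Image"] * 2 + ["Word"]
chart_data = top_groups(groups, limit=2)
```

With a limit of two, the chart shows HTML and PDF on their own and folds the remaining three items into a single bar.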
Content Chimera automatically derives metadata from your content to aid in your analysis, such as "folders" in URLs and grouping into file groups.
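For a sense of what deriving "folders" from a URL involves, this Python sketch splits the URL path into its directory segments. The rule that a path without a trailing slash ends in a page name is an assumption for the example, and real sites may need different handling.

```python
from urllib.parse import urlsplit


def url_folders(url):
    """Return the folder segments of a URL path, outermost first."""
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    if not path.endswith("/"):
        # Assume the last segment names the page itself, not a folder.
        segments = segments[:-1]
    return segments
```

For instance, `url_folders("https://example.com/news/2020/story.html")` gives `["news", "2020"]`, so the first folder can serve as a rough stand-in for a site section when charting.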
Many tools that people use to understand their content are basically crawlers that generate spreadsheets of a static set of fields. Although Content Chimera can consume content from any data source for your analysis, the most straightforward way is almost always its industrial-strength crawler. One thing that is very challenging to see in crawlers is how well the crawl is actually proceeding. For that we've built in a variety of mechanisms for tracking progress (and the ability to start and restart crawls).
This is a tool for teams that deal with complex digital presences, especially teams that believe content is important and should be dealt with in a coherent and high-impact manner.