
Crawler Content Analysis Fields

Source Type: Crawler

A crawler follows the links on a site to discover its URLs and pull basic data from each page. A crawler also captures information that no other tool can provide, such as crawl depth.

In Content Chimera

Chimera is an industrial-strength crawler, with features such as circuit breakers that stop it from persisting down unproductive paths. It also does some initial graph database analysis, such as finding the most common label for links to a page.
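To make "the most common label for links to a page" concrete, here is a minimal sketch in Python. It is not Chimera's implementation: the function name `most_common_link_labels` and the `(source, target, label)` tuple layout are assumptions made for illustration, and a plain in-memory counter stands in for the graph database.

```python
from collections import Counter, defaultdict


def most_common_link_labels(links):
    """Return the most frequent link label for each target URL.

    `links` is an iterable of (source_url, target_url, label) tuples.
    This tuple layout is a hypothetical input format, not Chimera's.
    """
    labels_by_target = defaultdict(Counter)
    for _source, target, label in links:
        # Collapse whitespace and ignore case so "About  Us" and "about us" match.
        cleaned = " ".join(label.split()).lower()
        if cleaned:
            labels_by_target[target][cleaned] += 1
    return {
        target: counts.most_common(1)[0][0]
        for target, counts in labels_by_target.items()
    }
```

Labels are normalized before counting so that trivial differences in case or spacing do not split the vote. Whether that normalization is desirable depends on the inventory.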

See Crawler fields below.
Crawl Depth
How many links the crawler needed to follow to get to this item.
General Usefulness:
Ease of Automation:
Consider instead: [IA] Depth
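As an illustration of how crawl depth can be derived, the sketch below runs a breadth-first search over an already-collected link graph. The function name `crawl_depths` and the dictionary-of-lists graph format are assumptions for this example, and the result matches the definition above only if depth means the smallest number of links from the starting page.

```python
from collections import deque


def crawl_depths(link_graph, start_url):
    """Return the minimum number of links needed to reach each URL.

    `link_graph` maps a URL to the list of URLs it links to
    (a hypothetical in-memory format used only for this sketch).
    """
    depths = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for target in link_graph.get(url, []):
            # Breadth-first order means the first visit is the shortest path.
            if target not in depths:
                depths[target] = depths[url] + 1
                queue.append(target)
    return depths
```

URLs that are never linked from a reachable page do not appear in the result at all, which is one reason a crawl alone may not produce a complete inventory.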
File Format
File format (as opposed to content type) is the actual format of the file as delivered by the web server (PDF, HTML, etc.). This is especially useful for sites with a large amount of non-HTML content.
General Usefulness:
Ease of Automation:
Consider instead: File Group
MIME Type
The technical content type reported by the web server in the Content-Type header, which the web browser uses to determine how to display the content.
General Usefulness:
Ease of Automation:
Consider instead: File Format
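The relationship between MIME Type and File Format can be sketched as a simple lookup. The mapping table `FORMAT_BY_MIME`, the function name `file_format`, and the "Other" fallback are assumptions for illustration; they are not the values Content Chimera uses.

```python
# Hypothetical mapping from MIME types to friendlier file format names.
FORMAT_BY_MIME = {
    "text/html": "HTML",
    "application/xhtml+xml": "HTML",
    "application/pdf": "PDF",
    "application/msword": "Word",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "Word",
    "image/jpeg": "JPEG",
    "image/png": "PNG",
}


def file_format(content_type_header):
    """Map a raw Content-Type header value to a file format name."""
    # Drop parameters such as "; charset=UTF-8" and ignore case.
    mime = content_type_header.split(";", 1)[0].strip().lower()
    return FORMAT_BY_MIME.get(mime, "Other")
```

Several MIME types can collapse into one format, which is part of why a format field is often easier to filter and chart than the raw MIME type.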
Meta Description
The page's meta description. It is of limited use beyond discovering which pages lack a meta description (and therefore need one).
General Usefulness:
Ease of Automation:
Compare with other Technical fields.
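Finding pages without a meta description can be sketched with Python's standard-library HTML parser. The class name `MetaDescriptionParser`, the function `pages_missing_description`, and the URL-to-HTML dictionary are assumptions for this example. An empty or whitespace-only description is treated as missing here, which is a choice rather than a rule.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Capture the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        # Attribute values may be None, so fall back to an empty string.
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def pages_missing_description(pages):
    """Return the URLs whose HTML has no non-empty meta description.

    `pages` maps URL -> HTML source (a hypothetical input format).
    """
    missing = []
    for url, html in pages.items():
        parser = MetaDescriptionParser()
        parser.feed(html)
        if not parser.description:
            missing.append(url)
    return missing
```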
Meta Keywords
The page's meta keywords. In most cases these have very limited value. More precise meta tags (for instance, topics) are usually far more useful if they exist.
General Usefulness:
Ease of Automation:
Consider instead: Topic
Title
The title of the content is the most useful field to people looking at individual "rows" of an inventory. That said, unlike URLs, titles are not guaranteed to be unique.
General Usefulness:
Ease of Automation:
Compare with other Basic fields.
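Because titles are not guaranteed to be unique, it can help to list the titles shared by more than one URL. The sketch below assumes inventory rows arrive as `(url, title)` pairs; that layout and the function name `duplicate_titles` are hypothetical.

```python
from collections import defaultdict


def duplicate_titles(rows):
    """Return {title: [urls]} for every title used by more than one URL.

    `rows` is an iterable of (url, title) pairs, a hypothetical layout.
    """
    urls_by_title = defaultdict(list)
    for url, title in rows:
        # Strip surrounding whitespace so padded titles still match.
        urls_by_title[title.strip()].append(url)
    return {
        title: urls
        for title, urls in urls_by_title.items()
        if len(urls) > 1
    }
```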
URL
This is a basic, foundational requirement of an inventory where each "row" is a URL.
General Usefulness:
Ease of Automation:
Compare with other Basic fields.