LinkScan Reference Manual

Section 12. LinkScan File Formats

This section describes the formats of the various LinkScan configuration files:

  1. linkscan.sys - LinkScan system configuration file
  2. linkscan.mas - List of configured Projects
  3. linkscan.cfg - Project configuration file
  4. mime.types - MIME Associations
  5. linkscan.rep - Command line report options
  6. linkscan.sum - Audit trail of scans by Project

12.1 linkscan.sys

Purpose: The primary LinkScan system configuration file
Location: The LinkScan directory only
Required: Always
Applies to: All configured Projects
Inheritance: Not applicable

linkscan.sys - Essential Parameters - Must be Configured

linkscan.sys - Optional Parameters - Defaults will Normally Suffice

Other required parameters and settings are included in linkscan.sys. We suggest you do not modify these unless you are completely familiar with LinkScan's features.

12.2 linkscan.mas

Purpose: To maintain a list of configured Projects
Location: The LinkScan directory only
Required: When multiple Projects are configured
Applies to: All configured Projects
Inheritance: Not applicable

This file contains a one line entry for each configured Project. The syntax is:

directory-name [*]

You may configure additional Projects manually or ask LinkScan to help you by executing:

perl linkscan.pl -newproject my-new-project

See How to define new Projects or remove old ones for more examples.
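
As a rough illustration of the one-line-per-Project syntax above, the following Python sketch reads the contents of such a file. The function name and the sample directory names are hypothetical, and directory names are assumed to contain no whitespace. This section does not describe what the optional * means, so the sketch only records whether it is present.

```python
def parse_mas(text):
    """Return a list of (directory_name, starred) tuples, one per Project line."""
    projects = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # The optional trailing "*" is only recorded, not interpreted.
        starred = len(parts) > 1 and parts[-1] == "*"
        projects.append((name, starred))
    return projects


# Hypothetical file contents, for illustration only.
sample_mas = "default *\nmy-new-project\n"
for name, starred in parse_mas(sample_mas):
    print(name, starred)
```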

12.3 linkscan.cfg

Purpose: The Project configuration file
Location: The LinkScan directory and the Project Directory
Required: Always
Applies to: The selected Project
Inheritance: The LinkScan directory is checked first. Those settings are overridden by the configuration file in the selected Project Directory.

linkscan.cfg - Essential Parameters - Must be Configured

To select a Home Page that is beneath the server root directory use:

Do not use:

Other linkscan.sys Overrides

For flexibility, the following linkscan.sys parameters may be overridden within the Project-specific linkscan.cfg file:

Timeout1
Timeout2
Dprocs
Nprocs
Masterport
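
The override behavior for the parameters listed above can be pictured as a simple dictionary merge. This is only a sketch: the on-disk syntax of these files is not shown in this section, so the settings are modeled as plain Python dictionaries, the numeric values are invented, and ignoring names outside the list is an assumption rather than documented LinkScan behavior.

```python
# Parameters this section lists as overridable in a Project linkscan.cfg file.
OVERRIDABLE = {"Timeout1", "Timeout2", "Dprocs", "Nprocs", "Masterport"}


def effective_settings(system, project):
    """Start from linkscan.sys values and apply the permitted Project overrides."""
    merged = dict(system)
    for name, value in project.items():
        if name in OVERRIDABLE:  # assumption: other names are left untouched
            merged[name] = value
    return merged


# Invented values, for illustration only.
system_settings = {"Timeout1": 30, "Timeout2": 60, "Nprocs": 4}
project_settings = {"Timeout1": 90}
print(effective_settings(system_settings, project_settings))
```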

Customization Commands

Multiple customization commands may be included in the Global linkscan.cfg file (applies to all Projects), in the Project linkscan.cfg file (applies to that specific Project), or in both. Please see Customizing LinkScan for details.

SiteMap Customization Commands

Multiple customization commands may be included in the Global linkscan.cfg file (applies to all Projects), in the Project linkscan.cfg file (applies to that specific Project), or in both. Please see How To Customize the SiteMap/TapMap for details.

Other Optional Settings

In a default configuration, the following parameters are set in the Global linkscan.cfg file. You may edit these settings to reconfigure all Projects, or insert one or more of these commands in an individual Project linkscan.cfg file to modify a specific Project.

12.4 mime.types

The mime.types file controls the MIME-type header that the LinkScan Server transmits for each request based on the file extension of the requested document/file. The version of the mime.types file installed with LinkScan includes most of the common/standard associations. For example:


# MIME type			Extension

text/html			shtml html htm

This entry causes the LinkScan Server to transmit the following HTTP response header with each request for a .htm, .html or .shtml file:


Content-Type: text/html
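
The extension lookup described above might be sketched in Python as follows. The function names are hypothetical, and two details are assumptions not stated in this section: extensions are compared case-insensitively, and unknown extensions fall back to application/octet-stream. This is not the LinkScan Server's actual code.

```python
def parse_mime_types(text):
    """Map each file extension (lowercase, without the dot) to its MIME type."""
    ext_to_type = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            ext_to_type[ext.lower()] = mime_type
    return ext_to_type


def content_type_header(filename, ext_to_type, default="application/octet-stream"):
    """Build the Content-Type header line for a requested file name."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return "Content-Type: " + ext_to_type.get(ext, default)


sample_mime = "# MIME type\t\t\tExtension\n\ntext/html\t\t\tshtml html htm\n"
mime_table = parse_mime_types(sample_mime)
print(content_type_header("index.shtml", mime_table))
```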

12.5 linkscan.rep

# LINKSCAN CUSTOMIZATION FILE - LINKSCAN.REP
#
# Lines beginning with "#" are comments
#
# Purpose:     Select options for command line reports
# Location:    The LinkScan Project Directory
# Required:    Yes (for command line reports)
# Applies to:  The selected Project
# Inheritance: Not applicable
#
# DO NOT EDIT SECTION HEADERS - lines within [square brackets]
#

[sr Summary/Detail Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Unclean   = 0      # 0 = List all documents; 1 = List only documents with errors
Sort      = 1      # 1 = Most errors first; 2 = Alphabetically; 3 = Newest first
                   # 4 = Least errors first; 5 = Reverse alphabetically; 6 = Oldest first
Incl      =        # relative-path-expression
Excl      =        # relative-path-expression


[xr Summary Statistics Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on


[dr Detailed Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Intext    = 3      # 1 = Internal only; 2 = External only; 3 = Internal and External
Sev0      = 0      # 1 = Display No Status
Sev1      = 1      # 1 = Display Errors
Sev2      = 1      # 1 = Display Possible Errors
Sev3      = 1      # 1 = Display Warnings
Sev4      = 0      # 1 = Display Advisories
Sev5      = 0      # 1 = Display Good Links
Sort      = 1      # 1 = By referer; 2 = By status code; 3 = By links alphabetically
Match     = 3      # 1 = Match on referer; 2 = Match on target; 3 = Match on either
Incl      =        # relative-path-expression
Excl      =        # relative-path-expression


[cr Selected Status Codes Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Intext    = 3      # 1 = Internal only; 2 = External only; 3 = Internal and External
Stat1     = 1      # Good HTML Files
Stat2     = 1      # Missing HTML Files
Stat3     = 1      # Good non-HTML Files
Stat4     = 1      # Missing non-HTML Files
Stat5     = 1      # Good Anchors
Stat6     = 1      # Missing Anchors
Stat7     = 1      # Unsafe Characters
Stat8     = 1      # Status Unknown
Stat9     = 1      # Good URL
Stat10    = 1      # Moved Permanently
Stat11    = 1      # Moved Temporarily
Stat12    = 1      # Trailing / Missing from URL
Stat13    = 1      # Server Not Found - No DNS Entry
Stat14    = 1      # URL Not Found
Stat15    = 1      # Timed Out
Stat16    = 1      # Other
Sort      = 1      # 1 = By referer; 2 = By status code; 3 = By links alphabetically
Match     = 3      # 1 = Match on referer; 2 = Match on target; 3 = Match on either
Incl      =        # relative-path-expression
Excl      =        # relative-path-expression


[mr SiteMap Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Custom    = 0      # 1 = Use alternate header/footer
Levels    = 10     # Maximum number of levels to display
Filenames = 0      # 1 = Display relative-path on report
Truncate  = 100    # Maximum line length (characters)
Decimal   = 1      # 1 = Display dot-decimal notation
Font      = -1     # Relative font size for titles
New       = 1      # 1 = Flag new files
Newdays   = 5      # Defines "New" (in days)
Anchors   = 1      # 1 = Display anchors (Link order format only) 
Indent    = 0      # Number of spaces to indent (default is to use tab)
Files     = 0      # 1 = Display file size and date on report
Linkmap   = 0      # 0 = Use directory structure format; 1 = Link order format


[hr Site History Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Incl      =        # absolute-url-expression


[or Orphaned Files Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Sort      = 2      # 2 = Alphabetically; 3 = Newest first
                   # 5 = Reverse alphabetically; 6 = Oldest first
Incl      =        # relative-path-expression
Excl      =        # relative-path-expression


[ar All Files Linking to ... Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Intext    = 3      # 1 = Internal only; 2 = External only; 3 = Internal and External
Match     = 5      # 4 = Exact match; 5 = Partial match
Incl      =        # relative-path-expression | absolute-url-expression


[rr Redirections Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on


[pr System Configuration Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on


[qr LinkScan/QuickCheck]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Graphics  = 1      # 0 = Graphics off; 1 = Graphics on
Sev0      = 0      # 1 = Display No Status
Sev1      = 1      # 1 = Display Errors
Sev2      = 1      # 1 = Display Possible Errors
Sev3      = 1      # 1 = Display Warnings
Sev4      = 0      # 1 = Display Advisories
Sev5      = 0      # 1 = Display Good Links
Source    = 1      # 1 = Display full source code
Linkscan  = 1      # 1 = Display link status
Weblint   = 1      # 1 = Display weblint errors
Combo     = 1      # 1 = Combined format
Http      = 2      # 0 = Read via file system; 2 = Read via HTTP; 3 = Automatic
Now       = 0      # 0 = Link status from database; 1 = Check link status now
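
For readers who want to inspect these options from a script, here is a minimal Python sketch that reads the layout shown above: section headers in [square brackets] beginning with a two-letter report code, followed by "Name = value" lines with optional "#" comments. The function name is hypothetical, values are kept as strings, and the sketch assumes that option values never contain a "#" character.

```python
import re

# A section header: "[" report-code, whitespace, descriptive title, "]".
SECTION = re.compile(r"^\[(\w+)\s+(.*)\]$")


def parse_rep(text):
    """Return {report_code: {option_name: value_string}}."""
    reports = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments and padding
        if not line:
            continue
        header = SECTION.match(line)
        if header:
            current = reports.setdefault(header.group(1), {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return reports


sample_rep = """\
[sr Summary/Detail Report]
Html      = 1      # 0 = TEXT format; 1 = HTML format
Sort      = 1      # 1 = Most errors first; 2 = Alphabetically; 3 = Newest first
                   # 4 = Least errors first; 5 = Reverse alphabetically; 6 = Oldest first
Incl      =        # relative-path-expression

[mr SiteMap Report]
Font      = -1     # Relative font size for titles
"""
options = parse_rep(sample_rep)
print(options["sr"]["Sort"], options["mr"]["Font"])
```

Keeping the values as strings avoids guessing at types: most options are small integers, but Incl and Excl hold expressions and may be empty.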

12.6 linkscan.sum

This tab-delimited file contains an audit trail of each scan on a per-Project basis. It may be imported into spreadsheets or other applications for management reports. The file is formatted with one record per scan. The data fields include:

Field  0    LinkScan Version Number
Field  1    Date and Time of Scan (Seconds since 00:00:00 UTC, January 1, 1970)
Field  2    Total HTML Documents Scanned
Field  3    Total HTML Documents Missing
Field  4    Total HTML Documents Containing Hard Errors
Field  5    Total non-HTML Files Scanned
Field  6    Total non-HTML Files Missing
Field  7    Total Anchors Found
Field  8    Total Anchors Broken
Field  9    External URLs - Total Checked
Field 10    External URLs - Errors
Field 11    External URLs - Possible Errors
Field 12    External URLs - Warnings
Field 13    Total Orphaned Files

These data items correspond to those displayed on the Summary Statistics Report.
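
As an illustration of importing this file into another application, the Python sketch below splits one record into the fields listed above and converts the scan time from epoch seconds to a UTC datetime. The field names, the function name, and the sample record (including its version string and counts) are invented for this example. Treating fields 2 through 13 as integers is an assumption based on their descriptions.

```python
from datetime import datetime, timezone

# Hypothetical names for Fields 0-13, in the order listed above.
FIELDS = [
    "version", "scan_time", "html_scanned", "html_missing", "html_hard_errors",
    "nonhtml_scanned", "nonhtml_missing", "anchors_found", "anchors_broken",
    "ext_checked", "ext_errors", "ext_possible_errors", "ext_warnings", "orphans",
]


def parse_sum_record(line):
    """Turn one tab-delimited linkscan.sum line into a dictionary."""
    values = line.rstrip("\n").split("\t")
    record = dict(zip(FIELDS, values))
    # Field 1 is seconds since 00:00:00 UTC, January 1, 1970.
    record["scan_time"] = datetime.fromtimestamp(int(record["scan_time"]), tz=timezone.utc)
    for name in FIELDS[2:]:
        record[name] = int(record[name])  # assumption: counts are whole numbers
    return record


# Invented sample record, for illustration only.
sample_sum = "\t".join(
    ["6.1", "943747200", "120", "2", "5", "340", "1", "75", "3", "210", "4", "6", "9", "12"]
)
record = parse_sum_record(sample_sum)
print(record["scan_time"].isoformat(), record["html_scanned"], record["orphans"])
```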


Electronic Software Publishing Corporation (Elsop)
© Copyright 1997-99 Electronic Software Publishing Corporation
Updated: November 28, 1999