ArTeX User Manual

N D Efford

Version 1.00b, 18/2/95

ArTeX is a Perl script which analyses a LaTeX2e document to determine whether it loads any non-standard files. Any such files are bundled with the document by means of filecontents or filecontents* environments, as appropriate. The result should be portable to any system with a standard installation of LaTeX.

Contents

  1. Introduction
  2. Configuration
  3. Running ArTeX
  4. How it Works
  5. Customisation
  6. Bugs and Limitations

1 Introduction

A number of situations can arise in which LaTeX document portability is important. Suppose, for instance, that you are writing a paper with a colleague at another institution, and that you have chosen to exchange draft versions back and forth via email. Clearly, either of you should be able to process the document successfully, regardless of changes or additions made by the other. When you come to submit the finished manuscript, you may have the option to do this electronically - but this can work only if the publisher, too, is able to process your document successfully.

Anyone wishing to process a LaTeX document must also have on their system all the files that are loaded by that document. Thus portability can only be guaranteed by bundling a document with the files on which it depends and then distributing this bundle or archive to others. Unfortunately, the business of identifying dependencies by hand is prone to error, and the archives that are created are often OS-specific and inherently non-portable: an archive created on a UNIX system with tar, gzip and uuencode cannot be unpacked on a normal PC system running MS-DOS.

Fortunately, LaTeX2e provides:

  1. a method for automatically identifying the files on which a document depends;
  2. a built-in, platform-independent mechanism for combining a document with those files.

ArTeX uses these two new features to automate the entire process of archive creation. Running ArTeX on a LaTeX document will create a new version suitable for distribution to anyone with a standard installation of LaTeX2e on their system.

Aside: the name `ArTeX' comes from a (rather loose) analogy with the UNIX ar tool. The latter typically archives files of related object code to make program linking easier; in the same way, ArTeX archives a collection of related LaTeX files to make document distribution easier...


2 Configuration

Before attempting to run ArTeX, you must configure it for your operating system by editing the script and uncommenting the appropriate set of variable definitions. Two sets are provided in the script: one for a typical UNIX system, the other for an MS-DOS system with emTeX. Edit these definitions as necessary to suit the particular system you are using.

It is assumed that the commands latex and bibtex invoke LaTeX2e and BibTeX, respectively, on your system. If this is not the case, you will need to change $latex_cmd and/or $bibtex_cmd.

The configuration file to be loaded at script startup is specified in the variable $config_file. On UNIX systems, the default name for this file is ~/.artexrc; on DOS systems, the default file is ARTEX.INI in the current working directory. In the latter case, you might wish to give an absolute pathname so that there is only one global configuration file.

The script defines an associative array, %star, which flags file types that are to be included using a filecontents* environment instead of the normal filecontents environment. The default settings will cause PostScript files to be included using the former, with all other files included by means of the latter. You can change this permanently by editing the array definition, or temporarily by redefining array elements in the configuration file.


3 Running ArTeX

3.1 Command Line Syntax

On UNIX systems, the command line syntax for running ArTeX is

    % artex [options] input_file [output_file]

Under DOS, which doesn't support the #! syntax for specifying a script interpreter, you can use

    C:\> perl artex [options] input_file [output_file]

The above can be placed in a batch file, ARTEX.BAT, to achieve the same effect as in UNIX; alternatively, you can put the following batch file wrapper around the script:

    @rem = '
    @echo off
    perl -S %0.bat %1 %2 %3 %4 %5 %6 %7 %8 %9
    goto endofperl
    ';
    # insert Perl code here
    __END__
    :endofperl

This avoids the need for two separate files.

Those who use 4DOS in place of COMMAND.COM have a further option. The script can be renamed ARTEX.PL and the following command can be added to the 4START.BAT file:

    set .pl=c:\path\to\perl.exe

where c:\path\to\perl.exe is the full pathname of the Perl interpreter.

3.2 Command Line Options

The following command line options can be specified before the input filename:

-f
    Ensures fast startup of the script, i.e., the configuration file is not accessed. Variables will have their default values and there will be no forced inclusion/exclusion of files.
-i file1[,file2,...]
    Specifies a list of files to be forcibly included in the archive. Overrides what was specified in the configuration file.
-i none
    Cancels forced inclusion that may have been requested via the configuration file.
-e file1[,file2,...]
    Specifies a list of files to be forcibly excluded from the archive. Overrides anything specified in the configuration file.
-e none
    Cancels forced exclusion that may have been requested via the configuration file.
-c
    Prompts the user to confirm the inclusion of each file.
-b
    Forces the script to use the .bib and .bst files if the document contains a \bibliography command. These files are candidates for inclusion regardless of whether any related citations occur within the document. The default behaviour is to include the .bbl file, generated by running BibTeX, instead - but this is done only if citations are found in the document.
-q
    Quiet mode. The screen output normally produced by LaTeX is suppressed.
-v
    Prints the program version and exits.
-h
    Prints help on command line options, then exits.


4 How it Works

After parsing command line options and, if necessary, reading the configuration file, the script checks the input document for LaTeX2e compliance. A temporary copy of the document is then created, to which \nonstopmode and \listfiles commands are added (if not already present). LaTeX is run on this file, thereby generating a list of dependencies for the document. This is stored in the logfile and (unless the -q option has been used) is echoed on the screen.
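
As a rough illustration of this preparation step, the following Perl sketch adds the two commands to the text of a document when they are missing. It is not taken from the ArTeX source: the helper name add_control_commands is hypothetical, placing the commands at the very top of the file is an assumption, and the sketch uses Perl 5 syntax whereas the script itself was developed for Perl 4.036.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the ArTeX source): return the document text
# with \nonstopmode and \listfiles prepended when they are not already there.
sub add_control_commands {
    my ($text) = @_;
    my $prefix = '';
    foreach my $cmd ('\\nonstopmode', '\\listfiles') {
        # \Q...\E quotes the backslash; the lookahead stops a longer
        # command name that merely starts with $cmd from matching.
        $prefix .= "$cmd\n" unless $text =~ /\Q$cmd\E(?![A-Za-z])/;
    }
    return $prefix . $text;
}

# Example: a document containing neither command gains both.
print add_control_commands("\\documentclass{article}\n");
```

Note that this sketch treats a command as present even when it appears only inside a comment; a more careful version would strip comments first.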

An initial list of candidates for inclusion in the archive is generated by extracting filenames from the list of dependencies in the logfile. For each file in this list, the associated description, if any, is examined. Files with descriptions containing the words `Standard LaTeX' are assumed to be part of the standard distribution of LaTeX2e and are ignored. For all remaining files in the dependency list, full pathnames are determined by searching the directories specified by the TEXINPUTS environment variable. The files and their full pathnames are stored in an associative array named %dependency.
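
The filtering described above can be sketched in Perl as follows. This is an illustration rather than the script's actual code: the helper names parse_file_list, nonstandard_files and find_in_path are hypothetical, the sample log lines are made up, and Perl 5 syntax is used although the script was developed for Perl 4.036. The sketch assumes the usual layout of the list written by \listfiles, which opens with a line reading *File List* and closes with a row of asterisks.

```perl
use strict;
use warnings;

# Hypothetical helper: given the lines of a LaTeX log, return a hash that
# maps each listed filename to its description (possibly empty).
sub parse_file_list {
    my (@lines) = @_;
    my (%desc, $in_list);
    foreach my $line (@lines) {
        chomp $line;
        if ($line =~ /^\s*\*File List\*\s*$/) { $in_list = 1; next; }
        next unless $in_list;
        last if $line =~ /^\s*\*{3,}\s*$/;     # row of asterisks ends the list
        # First word is the filename; the rest, if any, is the description.
        # Limitation: a description wrapped onto a second log line would be
        # misread as a filename.
        $desc{$1} = $2 if $line =~ /^\s*(\S+)\s*(.*)$/;
    }
    return %desc;
}

# Keep only files whose description lacks the words 'Standard LaTeX'.
sub nonstandard_files {
    my (%desc) = @_;
    my @files = grep { $desc{$_} !~ /Standard LaTeX/ } keys %desc;
    return sort @files;
}

# Return the first existing "$dir/$file" among the given directories,
# or nothing if the file is not found.
sub find_in_path {
    my ($file, @dirs) = @_;
    foreach my $dir (@dirs) {
        return "$dir/$file" if -f "$dir/$file";
    }
    return;
}

# Made-up sample log, for illustration only.
my @log = (
    " *File List*\n",
    " article.cls    1995/01/01 v1.0 Standard LaTeX document class\n",
    " mystuff.sty\n",
    " ***********\n",
);
my %desc = parse_file_list(@log);
print join("\n", nonstandard_files(%desc)), "\n";   # prints mystuff.sty
```

The directory list passed to find_in_path would come from splitting TEXINPUTS on the path separator, which is a colon on UNIX and a semicolon under emTeX.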

Bibliographies are dealt with separately, by examining the auxiliary file (extension .aux). If the -b option (requesting the inclusion of .bib and .bst files only) has been specified and a \bibdata command is found in the auxiliary file, then the bibliographies that are arguments to this command become candidates for inclusion. A list of directories obtained from the BIBINPUTS environment variable is searched to identify the full pathname of each bibliography file, and this information is stored in the %dependency array. The bibliography style becomes a candidate for inclusion unless it is flagged as a `standard style' in the array %stdbib (by default the standard styles are plain, unsrt, abbrv and alpha). A full pathname for the .bst file is found by searching the TEXINPUTS directories.

By default, bibliographic data are bundled with a document in the form of a .bbl file, generated by running BibTeX. However, this is done only if \citation commands are found in the auxiliary file.
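
The two bibliography modes can be summarised by the Perl sketch below. It is an illustration under stated assumptions, not the script's own code: the helper names scan_aux and bib_candidates are hypothetical, the set of standard styles is represented as a hash keyed by style name (the real layout of %stdbib may differ), directory searching is omitted, and Perl 5 syntax is used although the script was developed for Perl 4.036.

```perl
use strict;
use warnings;

# Hypothetical helper: scan the lines of an .aux file and report the
# bibliography databases, the style, and whether any citations occur.
sub scan_aux {
    my (@lines) = @_;
    my (@bibs, $style);
    my $cited = 0;
    foreach my $line (@lines) {
        if ($line =~ /^\\bibdata\{([^}]*)\}/) {
            push @bibs, split(/\s*,\s*/, $1);   # \bibdata{a,b} names a.bib and b.bib
        } elsif ($line =~ /^\\bibstyle\{([^}]*)\}/) {
            $style = $1;
        } elsif ($line =~ /^\\citation\{/) {
            $cited = 1;
        }
    }
    return (\@bibs, $style, $cited);
}

# Decide which bibliography files are candidates for inclusion.
#   $base    document name without extension
#   $use_bib true when -b was given
#   $stdbib  reference to a hash of standard style names (assumed layout)
sub bib_candidates {
    my ($base, $use_bib, $stdbib, @aux) = @_;
    my ($bibs, $style, $cited) = scan_aux(@aux);
    my @files;
    if ($use_bib) {                             # -b: .bib files and non-standard .bst
        push @files, map { "$_.bib" } @$bibs;
        push @files, "$style.bst" if defined $style && !$stdbib->{$style};
    } elsif ($cited) {                          # default: .bbl, only if citations exist
        push @files, "$base.bbl";
    }
    return @files;
}
```

For example, an auxiliary file containing \bibdata{refs}, \bibstyle{plain} and at least one \citation line would yield refs.bib with -b and the document's .bbl file without it.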

Next, dependencies are added or removed according to the contents of the @include and @exclude arrays. These are defined either in the runtime configuration file or via the command line (options -i and -e). Command line specifications override any in the configuration file. If the -c option has been selected, the user is prompted to confirm the inclusion of each candidate. If no confirmation is given, the candidate is removed from the list of dependencies.
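
The interaction between the configuration file and the -i and -e options might be sketched in Perl as shown below. The helper names effective_list and apply_lists are hypothetical, and the sketch is not taken from the script. In particular, letting exclusion win when a file appears in both lists is an assumption, since the behaviour in that case is not specified here. Perl 5 syntax is used although the script was developed for Perl 4.036.

```perl
use strict;
use warnings;

# Hypothetical helper: combine a list from the configuration file with the
# value of a command line option, which is undef when the option is absent,
# the word 'none', or a comma-separated list of filenames.
sub effective_list {
    my ($config, $option) = @_;
    return @$config unless defined $option;   # option absent: keep configuration
    return () if $option eq 'none';           # 'none': cancel the configuration
    return split(/,/, $option);               # otherwise the option overrides it
}

# Apply forced inclusion, then forced exclusion, to the candidate files.
# Assumption: a file named in both lists ends up excluded.
sub apply_lists {
    my ($candidates, $include, $exclude) = @_;
    my %keep = map { $_ => 1 } @$candidates, @$include;
    delete @keep{@$exclude};
    return sort keys %keep;
}

# Example: the configuration excludes known.sty, but "-e none" cancels that.
my @include = effective_list(['mystuff.sty'], undef);
my @exclude = effective_list(['known.sty', 'odd.sty'], 'none');
print join(' ', apply_lists(['known.sty'], \@include, \@exclude)), "\n";
```

With the -c option, each filename returned by apply_lists would then be offered to the user for confirmation.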

Finally, an expanded version of the original document, complete with filecontents and filecontents* environments, is written to a file. The name of this file can be specified on the command line. If no name is supplied, a backup copy of the original document is created (extension .te~ or .lt~, depending on whether the input file extension is .tex or .ltx) and the original is overwritten by the new version.
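
To make the final step concrete, the Perl sketch below wraps the text of a file in a filecontents or filecontents* environment and derives the backup name for the original document. The helper names wrap_file and backup_name are hypothetical and the code is not taken from the script; it uses Perl 5 syntax although the script was developed for Perl 4.036. Placing the environments ahead of the document text reflects the LaTeX2e requirement that they precede \documentclass.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap the text of one file in a filecontents
# environment, or in filecontents* when $star is true.
sub wrap_file {
    my ($name, $text, $star) = @_;
    my $env = $star ? 'filecontents*' : 'filecontents';
    $text .= "\n" unless $text =~ /\n\z/;     # \end must start on its own line
    return "\\begin{$env}{$name}\n$text\\end{$env}\n";
}

# Backup name for the original document: .tex becomes .te~, .ltx becomes .lt~.
# Names with any other extension are returned unchanged by this sketch.
sub backup_name {
    my ($file) = @_;
    (my $backup = $file) =~ s/\.(te|lt)x\z/.$1~/;
    return $backup;
}

# Example: bundle one (made-up) package file with a document.
my $bundle = wrap_file('mystuff.sty', "\\ProvidesPackage{mystuff}\n", 0)
           . "\\documentclass{article}\n";
print $bundle;
print backup_name('paper.tex'), "\n";
```

The starred form matters for PostScript files because filecontents* writes the file exactly as given, whereas the unstarred form adds comment lines at the top of the file it creates.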


5 Customisation

Runtime customisation of script behaviour is achieved through use of a configuration file. The main purpose of this file is to allow specification of files that will always be included in or excluded from the archive. Such files should be listed in arrays named @include and @exclude. A typical configuration file might contain the following:

    @include = (
      'mystuff.sty'  # I always use this in documents
    );

    @exclude = (
      'known.sty',   # I know that recipients have this package
      'odd.sty'      # standard package that doesn't announce itself as such
    );

You can override definitions of @include and @exclude using the -i and -e options on the command line. These options are normally followed by comma-separated lists of filenames, or by the word `none'. If `none' is specified, any definition of the corresponding array in the configuration file is ignored. Thus if the configuration file was as given above and the command

    artex -e none test.tex

was issued, the definition of @exclude would be ignored and the files known.sty and odd.sty would be included if the document depended on them.

The configuration file may also be used to change the definitions of the %star and %stdbib associative arrays.


6 Bugs and Limitations

ArTeX was hacked together fairly quickly, and hasn't been tested as thoroughly as I would like. I'd appreciate being notified of any bugs that you find. If you feel moved to fix any of them, please send me your patches. I developed the script for Perl 4.036 in two environments: a Silicon Graphics workstation running Irix 5.2 and a 486 PC with MS-DOS 6.2 and emTeX. If your system resembles either of these then you should not have any difficulties. I've assumed that TeX and LaTeX use the environment variable TEXINPUTS, and BibTeX the variable BIBINPUTS, to identify the directories which will be searched for files. If this is not the case on your system, the script will need some alteration.

One known problem concerns the -q option (quiet mode), which doesn't seem to work under MS-DOS (I've not tried it with any other DOS - feedback on this would be welcome).

ArTeX relies on the LaTeX2e \listfiles command for the detection of dependencies and hence is subject to the limitations of that command. In particular, it cannot cope with files that are loaded via the low-level TeX \input directive.

The other main limitation is the mechanism used to determine whether files are part of the standard LaTeX distribution. ArTeX assumes that a file is a standard one if the words `Standard LaTeX' appear in the description printed by \listfiles. A standard file which doesn't use this wording will be accidentally bundled with the document, and a non-standard file which happens to use this wording will be wrongly omitted. Mistakes that occur regularly can be rectified by adding the offending files to the @include or @exclude lists in the configuration file.

Syntax errors in the configuration file are trapped, but note that it is still possible for Perl code in this file to break the script - e.g., by incorrectly redefining $copy_cmd or @texinputs.


Nick Efford