An Overview of Printing on the Linux Platform

Introduction

For several years, many OEMs considered Linux to be a free "hackers'" server operating system. However, as the operating system has matured, many now see Linux as a viable, even popular, personal desktop operating system. This change has prompted the manufacturers of consumer printers to consider Linux-based systems as a candidate platform for consumer printer sales. One major problem facing these manufacturers is the integration of driver support for their printers into the Linux operating system. Traditionally, manufacturers have supplied the consumer with printer drivers for Windows and Macintosh-based systems. These operating systems have a well-defined printer interface and developer's kits to facilitate driver creation. Although groups within the Linux community are working toward a standard, Linux does not yet have a standard method of interfacing with printer hardware that is supported by a developer's kit. This paper examines several issues that are common to Linux printer driver development.

Overview

Printing functions in Linux, as in other operating systems, can be divided into three major functional groups: queuing, formatting, and error processing. Because Linux is a multiprocessing and multitasking operating system, the information sent to the printer must be organized. The queuing mechanism organizes and controls that information. The information must also be in a format that the printer can understand; converting it from the various native formats to a format the printer can use is the job of the formatting mechanism. Because errors can and do occur during printing, the error processing functions provide notification of and recovery from these errors.

Queuing Solutions

The easiest method of printing in Linux is to redirect the output of a command to the printing device:

ls > /dev/lp0

This command will cause the directory listing of the current working directory to be sent to the device on line printer port 0. Unfortunately, this method does not take advantage of the multitasking capabilities of the Linux system and will result in the current user waiting until the printer has finished accepting the data. Additionally, this method requires that the printer have the capability to process ASCII text data.

Spooling the data, collecting the print data in a file and subsequently starting a background task to send the data to the printer, is a more efficient method.

This latter method is how printing on the Linux platform is typically handled. Each printer available to the user has an area designated for spooling the print jobs to that printer. The print data is gathered in a file, one per print job, and placed into the spool area. A background task, the printer daemon, periodically checks the spool areas for a file to print. When a file is found, the data is sent to the printer associated with the spool area. If multiple files are waiting to be printed, they are processed on a first-in, first-out basis. The spool area serves as a queue for the print jobs for the associated printer.
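The first-in, first-out handling of a spool area can be sketched in Python. This is an illustration only and not how lpd is implemented: the function names, the use of file modification time as the arrival order, and the send callback standing in for the printer device are all assumptions made for the example.

```python
import os
import tempfile


def next_jobs(spool_dir):
    """Return the spooled job files, oldest first (first in, first out)."""
    paths = [os.path.join(spool_dir, name) for name in os.listdir(spool_dir)]
    files = [path for path in paths if os.path.isfile(path)]
    # Assumption: arrival order is approximated by modification time.
    return sorted(files, key=os.path.getmtime)


def drain(spool_dir, send):
    """Pass each waiting job to send(), remove it, and return the job names."""
    printed = []
    for path in next_jobs(spool_dir):
        with open(path, "rb") as job:
            send(job.read())
        os.remove(path)
        printed.append(os.path.basename(path))
    return printed
```

A real daemon would repeat this check periodically and would write to the device named in /etc/printcap rather than to a callback.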

The print daemon needs to know the spool area to check, the device to use, and any preprocessing required in performing its job. This information is contained in a file called /etc/printcap.

Five applications are used in basic Linux printing. They are:

  • lpr - submits a job to the printer (queues a print job)
  • lpq - shows the contents of the spool area for a particular printer
  • lpc - controls the queue
  • lprm - removes unprinted jobs from the queue
  • lpd - the printer daemon

Detailed information about each of these applications can be found using the Linux man pages.

The /etc/printcap File

The /etc/printcap file is a text-based file that looks somewhat cryptic at first but is easy to read once its structure is understood. The printcap man page supplies additional detail.

Each /etc/printcap entry will describe a single printer. The entry provides a logical name for the printer, and then describes the method for handling the data being sent to the printer. For example, a /etc/printcap entry will associate the physical device used, the spool area for this device, the data processing required, and where any errors from the print process should be logged.

Multiple entries can exist for a single physical device. This allows different types of processing to be performed on the data destined for the printing device. For example, a printer may accept both PostScript and PCL data. In that case, two logical printers should be defined, one for each format. Applications that output PostScript data can then print to the PostScript logical printer, and applications that produce PCL output can print to the PCL logical printer.

Parameters in the /etc/printcap File

There are many parameters used in the /etc/printcap file. Only the most frequently used parameters will be discussed in this paper. Those parameters that are not discussed here can be found in the Linux man pages for printcap.

All parameters are enclosed by colons and are designated by a two-character code. The code is followed by a value that depends on the parameter type. There are three types of values for the parameters: string, numeric, and Boolean. The following parameters are the most frequently used:

  • lp (string) - the device to receive the print data
  • sd (string) - the name of the spool directory
  • lf (string) - the name of the log file to receive errors
  • if (string) - the name of the input filter
  • rm (string) - the name of the remote host
  • rp (string) - the name of the remote printer
  • sh (Boolean) - suppress headers (banners)
  • sf (Boolean) - suppress end-of-job form feeds
  • mx (numeric) - the maximum print job size (blocks)

    The lp Parameter

    The lp parameter specifies the device to which the print daemon will send the printer output. The device field must not be empty unless the printer designated is a remote printer. The remote printer is set up using the rm and rp parameters.

    The sd Parameter

    The sd parameter specifies the spool directory. This directory is checked by the print daemon for files to print. Files are added to this directory by the lpr command. If a file is copied to this directory, the print daemon will attempt to print it.

    The lf Parameter

    The file specified by this parameter will receive a log of the errors during a print job. This file is not created by the print daemon and must exist in order for error logging to occur.

    The if Parameter

    The input filter is a script file that is used to determine how to process (reformat) the print data from its current form to the format used by the printer. If an input filter is specified, the print daemon will execute the input filter and supply the spooled data as the standard input and the print device as the standard output. More information about input filters appears later in this paper.

    The rm and rp Parameters

    These parameters are used to provide the capability to print on a printer connected to another machine. The remote machine, name or IP address, is specified in the rm parameter, and the remote printer (a logical name) is specified in the rp parameter.

    The sh and sf Parameters

    If there are many users of this printer, banner pages can help separate one job from the next. If separating jobs is not a concern, the sh parameter suppresses the banner pages and saves time and media.

    At the end of each job, either the printer ejects the last page itself or a form feed must be sent to eject it. The sf parameter suppresses the sending of a form feed after each job. If a blank page is printed after every job, the sf parameter should be added.

    The mx Parameter

    This parameter is used to limit the amount of spooled data. The number specified in this parameter is the number of blocks (1KB in Linux) available for each spooled job. However, the mx parameter does not limit the amount of data sent to the printer. If this parameter specifies zero, the limit is removed for this printer and the spooled files are limited by the amount of disk space available.
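The block arithmetic behind the mx limit can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and rounding a partial block up to a whole block is an assumption of the example, not a documented lpd behavior.

```python
BLOCK_SIZE = 1024  # bytes per block, using the 1KB figure given above


def fits_in_spool(job_bytes, mx_blocks):
    """Return True if a job of job_bytes may be spooled under an mx limit."""
    if mx_blocks == 0:
        return True  # mx#0 removes the limit for this printer
    # Assumption: a partially filled block counts as a whole block.
    blocks_needed = -(-job_bytes // BLOCK_SIZE)
    return blocks_needed <= mx_blocks
```

Under these assumptions, an entry with mx#1000 accepts a job of up to 1,024,000 bytes and rejects anything larger.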

    Syntax of the /etc/printcap File

    The /etc/printcap file has a rather simple syntax, although to the uninitiated its contents may look like a foreign language. The basics of the syntax are presented below.

    Lines that start with '#' are comments. Everything following the '#' is considered as a comment until the end of the line.

    Each printer accessible by the lpr command on the user system has one logical line in the /etc/printcap file. In order to make the file more readable, each logical line may be spread over several physical lines by placing a backslash '\' as the last character on all but the last physical line.
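The comment and continuation rules can be sketched as a small Python generator. This is a simplified illustration, not the parser lpd uses: the function name is hypothetical, leading and trailing spaces are stripped for convenience, and a line that ends in an escaped backslash is not treated specially.

```python
def logical_lines(text):
    """Yield printcap logical lines, joining backslash continuations."""
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.endswith("\\"):
            pending += line[:-1]  # drop the backslash and keep collecting
            continue
        yield pending + line
        pending = ""
    if pending:
        yield pending  # file ended while a continuation was still open
```

Each string produced this way corresponds to one printer definition.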

    Each logical line has the following format:

    name|name1|name2:string_parameter=string:\
      :numeric_parameter#number:Boolean_parameter:

    The leading spaces and the colon on the second line are not required. They are present here to aid in readability.

    The printer can have as many logical names as desired. The convention is for the last logical name to be a description of the printer. This description can be used in the same manner as the other logical names for this printer. One of the logical names for the default printer must be 'lp'.

    The list of parameters can be as long as necessary, and the order is not significant. Each parameter is enclosed by colons and is designated by a two-character code. The parameters requiring a string value have a '=' character delimiting the name and value. Numeric parameters are delimited by either the '#' or '=' character. Boolean parameters are considered TRUE if present and FALSE if absent.
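To make the field rules concrete, the following Python sketch splits one logical line into its names and parameters. It is a minimal illustration under stated assumptions: the function name is hypothetical, a value after '=' is always kept as a string (so a numeric value written with '=' is not converted), and an escaped colon inside a value is not handled.

```python
def parse_entry(logical_line):
    """Split one printcap logical line into (names, parameters)."""
    fields = logical_line.split(":")
    names = fields[0].split("|")
    params = {}
    for field in fields[1:]:
        field = field.strip()
        if not field:
            continue  # empty field between adjacent colons
        code, rest = field[:2], field[2:]
        if rest.startswith("="):
            params[code] = rest[1:]       # string parameter
        elif rest.startswith("#"):
            params[code] = int(rest[1:])  # numeric parameter
        else:
            params[code] = True           # Boolean: present means TRUE
    return names, params
```

A Boolean parameter that is absent from the line simply never appears in the dictionary, which matches the "FALSE if absent" rule.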

    A string can contain special characters expressed with backslash escapes, as in the C language. Additionally, the sequence '\E' indicates ESC. The '^' character can be used to indicate an ASCII control character; for example, the sequence '^a' represents the same character as \001. The characters '\' and '^' can be represented by '\\' and '\^', respectively. The character ':' can be represented using the sequence '\:'. However, to avoid confusion, the octal value \072 would be a wiser choice.
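The escape notations can be illustrated with a short Python decoder. This is a sketch rather than a complete implementation: the function name and the small table of C-style escapes are assumptions of the example, and octal escapes are limited to three digits.

```python
C_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f"}
OCTAL_DIGITS = "01234567"


def decode_value(value):
    """Expand the backslash and caret notations in a printcap string."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            if nxt in OCTAL_DIGITS:
                j = i + 1
                while j < len(value) and j < i + 4 and value[j] in OCTAL_DIGITS:
                    j += 1
                out.append(chr(int(value[i + 1:j], 8)))  # \072 -> ':'
                i = j
                continue
            if nxt == "E":
                out.append("\x1b")  # \E -> ESC
            else:
                # Known C escapes are translated; \\, \^ and \: yield
                # the escaped character itself.
                out.append(C_ESCAPES.get(nxt, nxt))
            i += 2
        elif ch == "^" and i + 1 < len(value):
            out.append(chr(ord(value[i + 1].upper()) & 0x1F))  # ^a -> \001
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```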

    Example:

    lp|MyPrinter|The printer in my office:\
      :lp=/dev/lp0:\
      :sd=/var/spool/lpd/lp:\
      :sh:\
      :mx#0:\
      :if=/usr/local/lib/filters/lp.if:

    This entry indicates the printer's logical name is lp (the default printer) but can also be referred to as MyPrinter. A description of the printer exists that can also be used to reference this printer, but such a description is not normally used.

    Communication with the physical printer is on the first parallel port, /dev/lp0. The spool area set up for this printer is /var/spool/lpd/lp. Banner headers will not be printed, and no limit to the file size exists. All files will be processed using the /usr/local/lib/filters/lp.if filter.

    Formatting Printer Files

    Although lpd handles network protocols, queuing, access control, and other aspects of printing, most of the real work happens in the filters.

    Filters

    Filters are programs that communicate with the printer and handle its device dependencies and special requirements. In a simple printer setup, a plain text filter is installed. This filter is an extremely simple one that should work with most printers. However, in order to take advantage of format conversion, printer accounting, specific printer quirks, and other features, one should understand how filters work. Ultimately, the filter is responsible for handling these aspects. The bad news - most of the time you have to provide the filters. The good news - many are generally available; when they are not, they are usually easy to write.

    How Filters Work

    When lpd wants to print a file in a job, it starts a filter program. lpd sets the filter's standard input to the file to print, its standard output to the printer, and its standard error to the error-logging file specified in the lf capability in /etc/printcap. Which filter lpd starts and what the filter's arguments are depend on what is listed in the /etc/printcap file and what arguments the user has specified for the job on the lpr command line. Three kinds of filters can be specified in /etc/printcap: text filters, conversion filters, and magic filters.
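The way the three standard streams are wired to a filter can be sketched in Python with the standard subprocess module. This only illustrates the plumbing described above and is not lpd's code: the function name and its arguments are hypothetical, and the arguments lpd passes to a real filter are not modeled.

```python
import subprocess
import sys


def run_filter(filter_argv, job_path, device_path, log_path):
    """Run one filter over one spooled file and return its exit status."""
    with open(job_path, "rb") as job, \
         open(device_path, "wb") as device, \
         open(log_path, "ab") as log:
        # stdin = spooled file, stdout = printer device, stderr = lf log file
        completed = subprocess.run(
            filter_argv, stdin=job, stdout=device, stderr=log
        )
    return completed.returncode
```

Because the filter only reads standard input and writes standard output, the same filter program can be tested from a shell by redirecting ordinary files in place of the spool file and the device.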

    Text (Input) Filters

    The text filter, called the input filter in lpd documentation, handles regular text printing. lpd expects every printer to be able to print plain text by default, and the text filter's job is to make sure backspaces, tabs, or other special characters do not confuse the printer. In an environment where one must account for printer usage, the text filter must also account for pages printed, usually by counting the number of lines printed and comparing that number to the number of lines per page that the printer supports.
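A very small text filter with page accounting might look like the following Python sketch. The details are assumptions chosen for illustration: the function name, the 66 lines per page, the 8-column tab stops, and the conversion of line endings to carriage return plus line feed are not taken from any particular printer, and backspace handling is left out.

```python
def text_filter(data, lines_per_page=66, tab_width=8):
    """Return (printer_bytes, pages) for a plain-text job given as bytes."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # a trailing newline does not start another line
    out = []
    for line in lines:
        # Expand tabs in case the printer does not interpret them.
        line = line.rstrip(b"\r").expandtabs(tab_width)
        # Assumption: the printer needs a carriage return with each line feed.
        out.append(line + b"\r\n")
    pages = -(-len(lines) // lines_per_page)  # round up to whole pages
    return b"".join(out), pages
```

The page count is the figure an accounting scheme would record against the user who submitted the job.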

    If no parameters are present on the lpr command line, the default text filter designated by the if entry in the /etc/printcap file will be used.

    Conversion Filters

    A conversion filter converts a specific file format into one that the printer can render onto paper. For example, troff typesetting data cannot be printed directly, but a conversion filter can be installed for troff files to convert the troff data into a form that the printer can digest and print. Conversion filters make printing various kinds of files easy. As an example, suppose you do a lot of work with the TeX typesetting system and have a PostScript printer. Each time you generate a DVI file from TeX, you must convert the DVI file into PostScript before it can be printed.

    For each of the conversion options the printer is to support, one must install a conversion filter and specify its pathname in /etc/printcap. A conversion filter is like the text filter for the simple printer setup except that instead of printing plain text, the filter converts the file into a format the printer can understand.

    When a conversion filter is not available, the only way to provide support for different file formats is to define several logical printers in the /etc/printcap file that all point to the same printer and that designate a different "if" entry for each entry. Unfortunately, this method can cause problems for users. The correct logical printer must be selected for each file format. Additionally, this approach provides no control over the order in which the files are printed from each queue.

    A simpler approach for the user would be to provide the proper formatting by using only the default text filter. This method would require no additional lpr command line parameters that must be remembered. This approach is made possible by the use of a magic filter.

    Magic Filters

    The magic filter derives its name from the way in which it determines the file format: by a magic number. A magic number is a distinctive pattern of bytes at a particular offset in the file. Magic filters are usually Perl scripts, shell scripts, or C programs that identify the file type and call the appropriate filter to handle that particular type of file. The use of a magic filter does not preclude the use of other conversion filters, but simply allows the default text filter to handle multiple file formats.
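The magic-number test can be sketched in Python. The byte signatures below are the well-known ones for each format and are all checked at offset 0; the function name and the pairing of each signature with one of the Red Hat script names listed later in this paper are assumptions made for illustration and do not describe how master-filter is actually written.

```python
# (signature at offset 0, script assumed to handle that format)
MAGIC = [
    (b"%!", "ps-to-printer.fpi"),                 # PostScript
    (b"GIF87a", "gif-to-pnm.fpi"),                # GIF
    (b"GIF89a", "gif-to-pnm.fpi"),
    (b"\xff\xd8\xff", "jpeg-to-pnm.fpi"),         # JPEG
    (b"II*\x00", "tiff-to-pnm.fpi"),              # TIFF, little-endian
    (b"MM\x00*", "tiff-to-pnm.fpi"),              # TIFF, big-endian
    (b"\x59\xa6\x6a\x95", "rast-to-pnm.fpi"),     # Sun rasterfile
    (b"\xf7\x02", "dvi-to-ps.fpi"),               # TeX DVI preamble
    (b"BM", "bmp-to-pnm.fpi"),                    # Windows bitmap
]
DEFAULT_SCRIPT = "asc-to-printer.fpi"             # fall back to plain text


def choose_filter(header):
    """Pick a conversion script from the first bytes of a file."""
    for signature, script in MAGIC:
        if header.startswith(signature):
            return script
    return DEFAULT_SCRIPT
```

Short signatures such as "BM" can also match ordinary text, so a production magic filter would normally apply further checks before committing to a format.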

    Scripts Used in Filters

    The scripts listed below are available in the Red Hat Linux distribution. Other Linux distributions would contain similar scripts for data conversion.

  • asc-to-printer.fpi - Send ASCII to printer
  • asc-to-ps.fpi - Convert ASCII to PostScript
  • bmp-to-pnm.fpi - Convert bitmap data to portable pixmap data
  • dvi-to-ps.fpi - Convert TeX DVI to PostScript
  • gif-to-pnm.fpi - Convert GIF to portable anymap
  • jpeg-to-pnm.fpi - Convert JPEG to portable anymap
  • master-filter - Magic filter script
  • pnm-to-ps.fpi - Convert portable anymap to PostScript
  • ps-to-printer.fpi - Send PostScript data to the printer
  • rast-to-pnm.fpi - Convert Sun rasterfile to portable anymap
  • rpm-to-asc.fpi - Convert Red Hat package management (RPM) file to ASCII
  • tiff-to-pnm.fpi - Convert Tag Image File Format (TIFF) to portable anymap
  • troff-to-ps.fpi - Convert troff to PostScript

    Applications Used in Filters

    These applications are also available in the Red Hat Linux distribution. As with the scripts, other Linux distributions would contain applications with similar conversion capabilities.

  • Ghostscript - A PostScript interpreter that also provides printer drivers for many popular printer models. Printers supported by Ghostscript generally use filters that convert any file format not handled by Ghostscript into PostScript; Ghostscript then reformats the file for use by the printer. Ghostscript can also output various bit-mapped images of PostScript files that can be processed by other printer formatting applications.
  • mpage - Convert input to PostScript
  • bmptoppm - Convert bitmap data to portable pixmap data
  • dvips - Convert TeX DVI to PostScript
  • giftopnm - Convert GIF to portable anymap
  • djpeg - Decompress a JPEG file to an image file
  • pnmtops - Convert portable anymap to PostScript
  • rasttopnm - Convert Sun rasterfile to portable anymap
  • tifftopnm - Convert TIFF to portable anymap
  • grog - Guess options for groff

    Output Filters

    An output filter is intended for printing plain text only, like the text filter, but with many simplifications. When using an output filter, but no text filter, lpd starts an output filter once for the entire job instead of once for each file in the job. Also, lpd does not make any provision to identify the start or the end of files within the job for the output filter. In addition, lpd does not pass the user's login or host to the filter, so it is not intended to do accounting.

    Wise developers, however, should not be fooled by an output filter's simplicity. An output filter will not allow each file in a job to start on a different page. Instead, one should use an input filter. Furthermore, despite its appearance, an output filter is actually more complex than a text filter since the output filter must examine the byte stream being sent to it for special flag characters and must send signals to itself on behalf of lpd.

    However, an output filter is necessary for header pages and for sending the escape sequences or other initialization strings that enable printing of the header page.

    If an output filter is present (but no text filter) and lpd is working on a plain text job, lpd uses the output filter to do the job. As stated before, the output filter will print each file of the job in sequence with no intervening form feeds or other paper advancement - likely not the preferred result. In almost all cases, a text filter should be used.

    Error Handling

    Errors encountered during the print process are logged in the file pointed to by the lf parameter in the /etc/printcap file. The status of the printer can be obtained by using the lpq command. Any other error processing requires additional software not generally provided with the Linux distribution.

    Summary

    Since no widely accepted standards exist for printing in the Linux environment, many different methods can be used. This overview covers only the basics of Linux printing. New aspects of Linux, such as the X Window System with the various display managers available, are being added regularly. Thus, new methods of printing are being developed that include the queuing and formatting of printer information. For more information on Linux printing, please contact Intelligraphics at techinfo@intelligraphics.com.