Sunday, 11 March 2012

Preprocessor

In computer science, a preprocessor is a program that processes its input data to produce output that is used as input to another program. The output is said to be a preprocessed form of the input data, which is often used by subsequent programs such as compilers. The amount and kind of processing done depends on the nature of the preprocessor; some preprocessors are only capable of performing relatively simple textual substitutions and macro expansions, while others have the power of full-fledged programming languages.

A common example from computer programming is the processing performed on source code before the next step of compilation. In some computer languages (e.g., C and PL/I) there is a phase of translation known as preprocessing.

Lexical preprocessors

Lexical preprocessors are the lowest-level preprocessors, insofar as they only require lexical analysis: they operate on the source text, prior to any parsing, by performing simple substitution of tokenized character sequences for other tokenized character sequences, according to user-defined rules. They typically perform macro substitution, textual inclusion of other files, and conditional compilation or inclusion.
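As a rough illustration of token-level substitution, the following Python sketch splits the source text into identifier tokens and everything else, then replaces any identifier that names a macro. The function name substitute and the macro table are hypothetical, chosen only for this example; real lexical preprocessors have richer tokenizers and rescanning rules.

```python
import re

# Split text into identifier tokens and runs of everything else (kept verbatim).
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[^A-Za-z_]+")

def substitute(text, macros):
    """Replace each identifier token that names a macro with its body.

    Hypothetical helper for illustration: no parsing, no rescanning of
    the replacement text, only token-for-token substitution.
    """
    out = []
    for tok in TOKEN.findall(text):
        out.append(macros.get(tok, tok))
    return "".join(out)

source = "area = PI * r * r; PIE = 1"
result = substitute(source, {"PI": "3.14159"})
print(result)
```

Because the substitution works on whole tokens rather than raw characters, the identifier PIE is left alone even though it begins with the macro name PI.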

C preprocessor

The most common example of this is the C preprocessor, which takes lines beginning with '#' as directives. Because it knows nothing about the underlying language, its use has been criticized and many of its features have been built directly into other languages. For example, macros have been replaced with aggressive inlining and templates, and includes with compile-time imports (this requires the preservation of type information in the object code, making this feature impossible to retrofit into a language); in some languages, conditional compilation is effectively accomplished with if-then-else and dead code elimination.
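To make the idea of '#' directives concrete, here is a minimal Python sketch of a line-oriented processor that handles a small subset loosely modeled on the C preprocessor: #define NAME VALUE, #ifdef, #else and #endif. It is an assumption-laden toy (the function name preprocess is hypothetical, and there are no function-like macros, no #include and no rescanning), not a description of how any real C preprocessor is implemented.

```python
import re

def preprocess(text, defined=None):
    """Toy directive processor (hypothetical): #define, #ifdef, #else, #endif."""
    macros = dict(defined or {})
    stack = []   # one boolean per open #ifdef: is that branch currently active?
    out = []
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        active = all(stack)          # a line is live only if every enclosing branch is
        head = parts[0] if parts else ""
        if head == "#ifdef":
            stack.append(parts[1] in macros)
        elif head == "#else":
            stack[-1] = not stack[-1]
        elif head == "#endif":
            stack.pop()
        elif head == "#define":
            if active:
                macros[parts[1]] = parts[2] if len(parts) > 2 else ""
        elif active:
            # Expand object-like macros token by token on ordinary lines.
            out.append(re.sub(r"[A-Za-z_]\w*",
                              lambda m: macros.get(m.group(0), m.group(0)),
                              line))
    return "\n".join(out)

src = """#define WIDTH 80
#ifdef DEBUG
log(WIDTH)
#else
draw(WIDTH)
#endif"""

release = preprocess(src)
debug = preprocess(src, {"DEBUG": ""})
print(release)
print(debug)
```

Note that the processor never inspects what log or draw mean; it only selects and rewrites lines of text, which is exactly the property the criticism above is aimed at.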

Other lexical preprocessors

Other lexical preprocessors include the general-purpose m4, most commonly used in cross-platform build systems such as autoconf, and GEMA, an open source macro processor which operates on patterns of context.

Syntactic preprocessors

Syntactic preprocessors were introduced with the Lisp family of languages. Their role is to transform syntax trees according to a number of user-defined rules. For some programming languages, the rules are written in the same language as the program (compile-time reflection). This is the case with Lisp and OCaml. Some other languages rely on a fully external language to define the transformations, such as the XSLT preprocessor for XML, or its statically typed counterpart CDuce.
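The contrast with lexical preprocessing can be sketched in Python, whose standard ast module exposes the syntax tree of a program. The example below applies one user-defined rule, written in the same language as the program it transforms: an expression of the form e ** 2, where e is a plain name, is rewritten to e * e. The class name SquareToProduct is hypothetical, ast.unparse requires Python 3.9 or later, and the rewrite is only an illustration (the two forms are not equivalent for every type).

```python
import ast

class SquareToProduct(ast.NodeTransformer):
    """Hypothetical rewrite rule: `e ** 2` becomes `e * e` for a simple name `e`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite sub-expressions first
        if (isinstance(node.op, ast.Pow)
                and isinstance(node.right, ast.Constant)
                and type(node.right.value) is int
                and node.right.value == 2
                and isinstance(node.left, ast.Name)):
            return ast.BinOp(left=node.left, op=ast.Mult(),
                             right=ast.Name(id=node.left.id, ctx=ast.Load()))
        return node

tree = ast.parse("y = x ** 2 + 1")
tree = ast.fix_missing_locations(SquareToProduct().visit(tree))
rewritten = ast.unparse(tree)
print(rewritten)

# The transformed tree is still a valid program and can be compiled and run.
namespace = {"x": 7}
exec(compile(tree, "<rewritten>", "exec"), namespace)
```

Unlike a textual substitution, the rule matches on the structure of the expression, so it cannot accidentally fire inside a string literal or a longer identifier.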

Syntactic preprocessors are typically used to customize the syntax of a language, extend a language by adding new primitives, or embed a Domain-Specific Programming Language inside a general purpose language.

Customizing syntax

A good example of syntax customization is the existence of two different syntaxes in the Objective Caml programming language.[1] Programs may be written interchangeably using the "normal syntax" or the "revised syntax", and may be pretty-printed with either syntax on demand.

Similarly, a number of programs written in OCaml customize the syntax of the language by the addition of new operators.

Extending a language

The best examples of language extension through macros are found in the Lisp family of languages. While the languages, by themselves, are simple dynamically typed functional cores, the standard distributions of Scheme or Common Lisp permit imperative or object-oriented programming, as well as static typing. Almost all of these features are implemented by syntactic preprocessing, although it bears noting that the "macro expansion" phase of compilation is handled by the compiler in Lisp. This can still be considered a form of preprocessing, since it takes place before other phases of compilation.

Similarly, statically checked, type-safe regular expressions or code generation may be added to the syntax and semantics of OCaml through macros, as well as micro-threads (also known as coroutines or fibers), monads or transparent XML manipulation.

Specializing a language

One of the unusual features of the Lisp family of languages is the possibility of using macros to create an internal Domain-Specific Programming Language. Typically, in a large Lisp-based project, a module may be written in a variety of such minilanguages, one perhaps using a SQL-based dialect of Lisp, another written in a dialect specialized for GUIs or pretty-printing, etc. Common Lisp's standard library contains an example of this level of syntactic abstraction in the form of the LOOP macro, which implements an Algol-like minilanguage to describe complex iteration, while still enabling the use of standard Lisp operators.

The MetaOCaml preprocessor/language provides similar features for external Domain-Specific Programming Languages. This preprocessor takes the description of the semantics of a language (i.e. an interpreter) and, by combining compile-time interpretation and code generation, turns that definition into a compiler to the OCaml programming language, and from that language, either to bytecode or to native code.
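The step from interpreter to compiler can be sketched by hand in Python. This is not MetaOCaml and does not use its staging annotations; it is only an assumed, simplified analogy in which the same traversal that evaluates an expression is rewritten to emit source code instead. The tuple-based expression language and the names interpret, stage and compile_expr are all hypothetical.

```python
def interpret(expr, env):
    """Reference interpreter: defines the semantics of the tiny language."""
    tag = expr[0]
    if tag == "num":
        return expr[1]
    if tag == "var":
        return env[expr[1]]
    left, right = interpret(expr[1], env), interpret(expr[2], env)
    return left + right if tag == "add" else left * right

def stage(expr):
    """Same traversal as interpret, but emits Python source text, not a value."""
    tag = expr[0]
    if tag == "num":
        return repr(expr[1])
    if tag == "var":
        return f"env[{expr[1]!r}]"
    op = "+" if tag == "add" else "*"
    return f"({stage(expr[1])} {op} {stage(expr[2])})"

def compile_expr(expr):
    """Turn the generated source into a callable that takes an environment."""
    return eval(f"lambda env: {stage(expr)}")

# Represents x * 3 + 1 in the hypothetical tuple encoding.
program = ("add", ("mul", ("var", "x"), ("num", 3)), ("num", 1))
generated = stage(program)
compiled = compile_expr(program)
print(generated)
print(compiled({"x": 4}), interpret(program, {"x": 4}))
```

The walk over the program structure happens once, at generation time; the resulting function then runs without any dispatch on tags, which is the benefit the staging approach is after.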

General purpose preprocessor

Most preprocessors are specific to a particular data processing task (e.g., compiling the C language). A preprocessor may be promoted as being general purpose, meaning that it is not aimed at a specific usage or programming language, and is intended to be used for a wide variety of text processing tasks.

M4 is probably the most well-known example of such a general purpose preprocessor, although the C preprocessor is sometimes used in a non-C specific role. Examples:

using the C preprocessor for JavaScript preprocessing.[2]

using M4 (see on-article example) or the C preprocessor [3] as a template engine, for HTML generation.

imake, a make interface using the C preprocessor, used in the X Window System but now deprecated in favour of automake.

grompp, a preprocessor for simulation input files for GROMACS (a fast, free, open-source code for some problems in computational chemistry) which calls the system C preprocessor (or another preprocessor as determined by the simulation input file) to parse the topology, using mostly the #define and #include mechanisms to determine the effective topology at grompp run time.
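The template-engine use mentioned in the list above can be sketched in Python as a tiny language-agnostic preprocessor that splices in other "files" and then expands upper-case macro names. Everything here is assumed for illustration: the include(NAME) notation only loosely resembles m4, the files are held in an in-memory dictionary rather than read from disk, and the names splice and render are hypothetical.

```python
import re

def splice(text, files, depth=0):
    """Textually replace include(NAME) with the contents of files[NAME]."""
    if depth > 10:  # crude guard against include cycles
        raise RecursionError("include nesting too deep")
    return re.sub(r"include\((\w+)\)",
                  lambda m: splice(files[m.group(1)], files, depth + 1),
                  text)

def render(text, macros, files):
    """Resolve includes first, then expand upper-case macro names once."""
    text = splice(text, files)
    return re.sub(r"\b[A-Z][A-Z_]*\b",
                  lambda m: macros.get(m.group(0), m.group(0)),
                  text)

files = {"header": "<h1>TITLE</h1>"}
page = "<html>include(header)<p>Written by AUTHOR</p></html>"
html = render(page, {"TITLE": "Preprocessors", "AUTHOR": "Ada"}, files)
print(html)
```

Nothing in the sketch knows about HTML; the same two functions would work on any text, which is what makes a preprocessor of this kind general purpose.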