Web20 University

How PHP Code Is Parsed

Before a PHP script produces any output, its source code passes through four stages: tokenization, parsing, compilation, and interpretation.

  1. Tokenization: In this initial stage, the PHP interpreter takes the raw PHP source code and breaks it down into ‘tokens’, the smallest meaningful chunks of the code. Tokens are the basic building blocks of a program: keywords, variable names, function names, literals, operators, and punctuation such as semicolons and parentheses.

  2. Parsing: The next stage involves parsing these tokens into a structure that represents the program. The parser organizes the tokens according to the rules of PHP’s grammar into a parse tree or abstract syntax tree (AST). This tree represents the syntactical structure of the program.

  3. Compilation: Once the PHP interpreter has an AST, it compiles the tree into opcodes (operation codes), a lower-level representation of the program. Opcodes are instructions for PHP’s own virtual machine, not native machine code for the CPU.

  4. Interpretation: Finally, the PHP interpreter executes these opcodes, effectively running the PHP program. The Zend Engine, the open-source scripting engine at the core of PHP, contains the tokenizer, parser, and compiler as well as the virtual machine that executes the opcodes.
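
The tokenization stage can be observed from PHP itself. The following is a minimal sketch, assuming the bundled tokenizer extension is enabled (it is by default); the sample source string and the describeTokens() helper are hypothetical names chosen for illustration. It uses the built-in token_get_all() and token_name() functions to list the tokens of a one-line script.

```php
<?php
// Hypothetical one-line script used only for illustration.
$source = '<?php $total = 1 + 2;';

// Return a human-readable description of each non-whitespace token.
// token_get_all() yields either [id, text, line] arrays or, for simple
// punctuation and operators, plain single-character strings.
function describeTokens(string $source): array
{
    $described = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            [$id, $text] = $token;
            if ($id === T_WHITESPACE) {
                continue; // whitespace carries no meaning for this listing
            }
            $described[] = token_name($id) . ' ' . $text;
        } else {
            $described[] = 'CHAR ' . $token;
        }
    }
    return $described;
}

foreach (describeTokens($source) as $line) {
    echo $line, PHP_EOL;
}
```

The listing shows named tokens such as T_VARIABLE and T_LNUMBER alongside single-character tokens such as = and ;. The later stages (AST and opcodes) are internal to the engine and are not exposed by a core function in the same way.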

In practice, PHP is usually described as an interpreted language because it doesn’t need a separate compilation step that produces an executable file before running, as languages like C or C++ do. It is not executed line-by-line, however. Each time a script is run, the PHP interpreter tokenizes, parses, and compiles the whole file into opcodes in memory, and only then executes them. As a consequence, a syntax error anywhere in a file prevents any of that file’s code from running.
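
One way to see that a file is compiled as a whole before any of it executes is to include a file whose first statement is valid but whose later line contains a syntax error. This is a small sketch assuming PHP 7 or later, where a syntax error in an included file is thrown as a ParseError; the file contents and variable names are made up for the demonstration.

```php
<?php
// Hypothetical script: a valid echo followed by a deliberate syntax error.
$code = "<?php\necho 'first line ran';\n\$x = ;\n";

// Write the script to a temporary file so it can be included.
$path = tempnam(sys_get_temp_dir(), 'parse_demo_');
file_put_contents($path, $code);

ob_start(); // capture anything the included file manages to print
try {
    include $path;
    $result = 'no error';
} catch (ParseError $e) {
    // Thrown during compilation of the included file, before execution.
    $result = 'ParseError: ' . $e->getMessage();
}
$output = ob_get_clean();
unlink($path);

var_dump($output);      // whatever the included file printed
echo $result, PHP_EOL;
```

If execution really were line-by-line, the echo on the first line would print before the error was reached. Because the whole file must compile first, the captured output stays empty.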

Note: PHP also ships with an extension called OPcache, which stores compiled opcodes in shared memory. When a cached script is requested again, PHP can skip tokenizing, parsing, and compiling it and go straight to execution, which can speed up PHP applications significantly.
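
Whether OPcache is active depends on the PHP build and its configuration, so it is worth checking at runtime. The sketch below uses the opcache_get_status() function provided by the extension; the opcacheSummary() helper is a hypothetical name. Note that OPcache is commonly disabled for command-line runs unless opcache.enable_cli is turned on.

```php
<?php
// Summarize the OPcache state for the current PHP process.
function opcacheSummary(): string
{
    if (!function_exists('opcache_get_status')) {
        return 'OPcache extension is not loaded';
    }

    // false: leave out the per-script details, only overall status is needed.
    $status = opcache_get_status(false);
    if ($status === false || empty($status['opcache_enabled'])) {
        return 'OPcache is loaded but not enabled for this process';
    }

    $stats = $status['opcache_statistics'];
    return sprintf(
        'OPcache enabled: %d cached scripts, %d hits, %d misses',
        $stats['num_cached_scripts'],
        $stats['hits'],
        $stats['misses']
    );
}

echo opcacheSummary(), PHP_EOL;
```

On a web server with OPcache enabled, a rising hit count relative to misses suggests that most requests are reusing cached opcodes rather than recompiling scripts.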

Keep in mind that this process describes PHP 7 and later versions, which build an abstract syntax tree. In PHP 5 and earlier, there was no AST: the parser emitted opcodes directly as it worked through the tokens.