PhpGrammar Architecture

The PhpGrammar uses the Lexer's declarative features to capture a small set of basic information, then routes to the appropriate PHP functions for additional handling. It recognizes two kinds of tokens: words and operators. A word is any string of alphanumeric characters, including underscore and backslash. An operator is any symbol (or series of symbols) that has a particular meaning in the language.

A Handlers trait takes this basic information and routes it to an appropriate method. Each operator symbol is mapped to a string name (for example, = maps to assign), and the operation is then routed to a method such as $this->op_assign(). Words are routed in a similar way, for example to $this->op_function when the word function is encountered.
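
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the actual Handlers trait: the names HandlersSketch, GrammarSketch, route_operator, route_word, and the 'terminate' entry are all assumptions made for the example. Only the = to assign mapping and the op_assign / op_function method names come from the notes.

```php
<?php
// Hypothetical sketch of symbol-to-method routing. Only '=' => 'assign',
// op_assign and op_function are taken from the notes; the rest is assumed.
trait HandlersSketch
{
    /** Hypothetical map from operator symbol to operation name. */
    protected array $operations = [
        '=' => 'assign',
        ';' => 'terminate', // assumed entry with no handler, to show the fallback
    ];

    /** Route an operator symbol to op_<name>() when such a method exists. */
    public function route_operator(string $symbol): ?string
    {
        $name = $this->operations[$symbol] ?? null;
        if ($name === null) {
            return null; // unknown symbol
        }
        $method = 'op_' . $name;
        return method_exists($this, $method) ? $this->$method($symbol) : null;
    }

    /** Route a word to op_<word>() when such a method exists. */
    public function route_word(string $word): ?string
    {
        $method = 'op_' . $word;
        return method_exists($this, $method) ? $this->$method($word) : null;
    }
}

final class GrammarSketch
{
    use HandlersSketch;

    public function op_assign(string $symbol): string
    {
        return "assign handled for {$symbol}";
    }

    public function op_function(string $word): string
    {
        return 'function keyword handled';
    }
}

$grammar = new GrammarSketch();
echo $grammar->route_operator('='), PHP_EOL;
echo $grammar->route_word('function'), PHP_EOL;
```

Symbols and words without a matching method simply return null here; the real grammar may well handle that case differently.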

An $xpn (expression) AST serves as a simple object that holds metadata and state information.

  • $xpn->declaration is appended to automatically by whitespace, words, and operators (possibly also by docblocks and comments; this is unconfirmed).
  • $xpn->words is appended to automatically each time a word is encountered.
  • $xpn->last_op is the last operation recorded. It is set automatically once an operation has finished being handled.
  • $xpn->waiting_for is set by specific handlers for specific words or operators, and is checked by subsequent handlers to confirm that the state is correct.
  • $xpn->head ... is sometimes set to an AST that is to be acted upon but is not added to the regular AST stack. I don't recall the use case.

Both declaration & words are reset to an empty array frequently by specific handlers.
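
As a rough illustration of how this state might behave, here is a hedged sketch. The property names follow the notes, but the class XpnSketch, its methods, and the 'value' flag are hypothetical; the real $xpn is an AST object managed by the Lexer, not a standalone class like this one.

```php
<?php
// Hypothetical stand-in for $xpn. Property names come from the notes;
// the class name, the methods and the 'value' flag are assumptions.
final class XpnSketch
{
    public array $declaration = [];
    public array $words = [];
    public ?string $last_op = null;
    public ?string $waiting_for = null;

    /** Words are appended to both words and declaration. */
    public function add_word(string $word): void
    {
        $this->words[] = $word;
        $this->declaration[] = $word;
    }

    /** Whitespace only contributes to declaration. */
    public function add_whitespace(string $whitespace): void
    {
        $this->declaration[] = $whitespace;
    }

    /** Run the handler first, then record the operation in last_op. */
    public function handle_operator(string $symbol, string $name, callable $handler): void
    {
        $this->declaration[] = $symbol;
        $handler($this);
        $this->last_op = $name;
    }

    /** Specific handlers reset both arrays when they are done with them. */
    public function reset(): void
    {
        $this->declaration = [];
        $this->words = [];
    }
}

$xpn = new XpnSketch();
$xpn->add_word('protected');
$xpn->add_whitespace(' ');
$xpn->add_word('$whatever');
$xpn->add_whitespace(' ');

$captured = [];
$xpn->handle_operator('=', 'assign', function (XpnSketch $x) use (&$captured): void {
    $captured = $x->words;      // the handler reads the left-hand words
    $x->waiting_for = 'value';  // hypothetical flag for later handlers to check
    $x->reset();                // clear words and declaration
});
```

Setting last_op only after the handler returns means a handler can still inspect the previous operation while it runs.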

Old notes from developing the idea of the current architecture

I'm thinking of a new paradigm where I catch EVERYTHING, and at 'stoppers' I process what has been captured. Essentially I will capture "words", where a word is ... an encapsulated sequence of characters. "I am a string" is one word because that's a single unit for parsing.

In public int $abc = "I am a thing";, the words are:

  • public
  • int
  • $abc
  • =
  • "I am a thing"

Then ; is a stopper. Maybe = is a stopper, too.
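
The capture-then-stop idea can be sketched with a small regex tokenizer. This is an assumption-laden illustration, not the Lexer's mechanism: WORD_PATTERN and capture_statements are hypothetical names, quoted strings are matched without escape handling, every other symbol is treated as a one-character word, and only ; acts as a stopper.

```php
<?php
// Hypothetical pattern: a quoted string (no escape handling), a run of
// $ / backslash / word characters, or any single other non-space symbol.
const WORD_PATTERN = <<<'REGEX'
/"[^"]*"|'[^']*'|[$\\\w]+|[^\s\w]/
REGEX;

/**
 * Split source into groups of words, closing a group at each ';' stopper.
 * Hypothetical helper written for this example only.
 */
function capture_statements(string $source): array
{
    preg_match_all(WORD_PATTERN, $source, $matches);

    $statements = [];
    $words = [];
    foreach ($matches[0] as $token) {
        if ($token === ';') {
            // Stopper: hand off what has been captured and start over.
            $statements[] = $words;
            $words = [];
            continue;
        }
        $words[] = $token;
    }
    if ($words !== []) {
        $statements[] = $words; // trailing words with no stopper
    }
    return $statements;
}

$result = capture_statements('public int $abc = "I am a thing";');
print_r($result);
```

For the example statement this yields one group containing the five words listed above, with the quoted string kept as a single word.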

The idea is that ... everything works out to be an expression made up of words and operators.

<?php  
protected $whatever = 'cat';  
/** abc */  
static public function abc(bool $b): string {  
    $abc = "cat";  
    $def = "dog";  
    return "zeep";  
}  

It's going to build the same kind of AST, but now I get to think about it differently.
Take protected $whatever =.
When I hit =, I know I have an assignment operation.
The left-hand words of that operation are ['protected', '$whatever'].
The last word is the property name. All words before it are modifiers (which may include a type).
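
That last rule is simple enough to sketch directly. The function name describe_assignment_target and the shape of the returned array are assumptions for illustration; the actual handler presumably writes into the AST instead.

```php
<?php
/**
 * Hypothetical helper: interpret the words captured before '='.
 * The last word is the property name; every earlier word is a modifier,
 * which may include a type.
 */
function describe_assignment_target(array $words): array
{
    if ($words === []) {
        throw new InvalidArgumentException('No words were captured before the assignment.');
    }
    $name = array_pop($words); // $words is a copy, so the caller's array is untouched
    return ['name' => $name, 'modifiers' => $words];
}

$target = describe_assignment_target(['protected', '$whatever']);
print_r($target);

$typed = describe_assignment_target(['public', 'int', '$abc']);
print_r($typed);
```

In this sketch a type such as int ends up among the modifiers; separating it from visibility keywords would be a further step that the notes do not describe.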