PhpGrammar Architecture
The PHP grammar uses the Lexer's declarative features to capture a small set of basic information, then routes to appropriate PHP functions for additional handling. There are `words` and `operators`. `words` are any string of alphanumeric characters, including underscore and backslash. `operators` are any symbol (or series of symbols) that has a particular meaning in the language.
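To make the word/operator split concrete, here is a minimal sketch that applies those two definitions with a single regex. The function name `sketch_tokenize()` and the regex are assumptions for illustration; the real grammar relies on the Lexer's declarative features rather than a helper like this. Note that a run of adjacent symbols (such as `);`) comes out as one operator here, which the real grammar may well split differently.

```php
<?php
// Hypothetical helper, not part of the actual grammar: splits a snippet into
// "word" and "operator" tokens using the definitions given in the text.
function sketch_tokenize(string $code): array
{
    // word:     letters, digits, underscore, backslash
    // operator: a run of one or more non-word, non-whitespace characters
    $pattern = '/(?<word>[A-Za-z0-9_\\\\]+)|(?<operator>[^A-Za-z0-9_\\\\\s]+)/';
    preg_match_all($pattern, $code, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $m) {
        // When the operator branch matches, the 'word' group is an empty string.
        if (isset($m['word']) && $m['word'] !== '') {
            $tokens[] = ['type' => 'word', 'text' => $m['word']];
        } else {
            $tokens[] = ['type' => 'operator', 'text' => $m['operator']];
        }
    }
    return $tokens;
}
```

Calling `sketch_tokenize('namespace Foo\Bar;')` yields two words (`namespace`, `Foo\Bar`) followed by one operator (`;`).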
There is a `Handlers` trait which takes the basic information and routes it to an appropriate method. Operations are mapped from symbol to a string (for example, `=` is `assign`). Operations are then routed to methods like `$this->op_assign()`. Words are routed similarly, like `$this->op_function` (when the word `function` is hit).
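The routing described here can be sketched as a symbol-to-name map plus a dynamic method call. Only the `op_` prefix, the `=` to `assign` mapping, and the idea of a handlers trait come from the notes; the names `SketchHandlers`, `SketchGrammar`, `route_operator()`, and the `$log` property are hypothetical, and the real trait may be organized quite differently.

```php
<?php
// Sketch only: an assumed shape for symbol -> name -> method routing.
trait SketchHandlers
{
    /** @var array<string,string> symbol => operation name */
    protected array $operations = [
        '=' => 'assign',
        ';' => 'terminate', // hypothetical entry with no handler defined below
    ];

    /** Names of the handlers that ran, kept only for demonstration. */
    public array $log = [];

    /** Returns true if a handler was found and called. */
    public function route_operator(string $symbol): bool
    {
        $name = $this->operations[$symbol] ?? null;
        if ($name === null) {
            return false; // symbol is not mapped to an operation
        }
        $method = 'op_' . $name;
        if (!method_exists($this, $method)) {
            return false; // mapped, but no handler method exists
        }
        $this->$method();
        return true;
    }

    public function op_assign(): void
    {
        $this->log[] = 'assign';
    }
}

class SketchGrammar
{
    use SketchHandlers;
}
```

With this shape, adding support for a new operator means adding one map entry and one `op_*` method, without touching the dispatch code.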
There is an `$xpn` (expression) AST used as a simple object to hold metadata and state information.
- `$xpn->declaration` is automatically appended to by whitespace, words, and operators (possibly docblocks and comments as well; this is unconfirmed).
- `$xpn->words` is automatically appended to each time a word is encountered.
- `$xpn->last_op` is the last operation recorded and is set automatically after an operation is done being handled.
- `$xpn->waiting_for` is set by specific handlers for specific words/operators and checked by subsequent handlers to see if the state is correct.
- `$xpn->head` is sometimes set as an AST for something to be acted upon, but is not added to the regular AST stack. I don't remember the use case.

Both `declaration` and `words` are reset to an empty array frequently by specific handlers.
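As a rough picture of how that state evolves, the sketch below models `$xpn` as a plain object and feeds it the pieces of `protected $whatever =`. The property names come from the list above; the class name `SketchXpn` and the `sketch_on_*` / `sketch_reset` functions are assumptions standing in for whatever the grammar does internally.

```php
<?php
// Sketch: a plain object standing in for $xpn.
final class SketchXpn
{
    public array $declaration = [];      // every captured piece, in order
    public array $words = [];            // words only
    public ?string $last_op = null;      // set after an operation is handled
    public ?string $waiting_for = null;  // set/checked by specific handlers
    public $head = null;                 // optional AST held aside
}

function sketch_on_word(SketchXpn $xpn, string $word): void
{
    $xpn->declaration[] = $word;
    $xpn->words[] = $word;
}

function sketch_on_whitespace(SketchXpn $xpn, string $ws): void
{
    $xpn->declaration[] = $ws;
}

function sketch_on_operator(SketchXpn $xpn, string $symbol, string $name): void
{
    $xpn->declaration[] = $symbol;
    // ... a real handler would do its work here ...
    $xpn->last_op = $name; // recorded once handling is done
}

function sketch_reset(SketchXpn $xpn): void
{
    // Handlers frequently clear both lists; last_op is left alone.
    $xpn->declaration = [];
    $xpn->words = [];
}

$xpn = new SketchXpn();
sketch_on_word($xpn, 'protected');
sketch_on_whitespace($xpn, ' ');
sketch_on_word($xpn, '$whatever');
sketch_on_whitespace($xpn, ' ');
sketch_on_operator($xpn, '=', 'assign');
```

At this point `$xpn->words` holds the two words, `$xpn->declaration` holds all five pieces including whitespace, and `$xpn->last_op` is `'assign'`.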
Old notes from developing the idea of the current architecture
I'm thinking of a new paradigm where I catch EVERYTHING, and at 'stoppers' I process what has been captured. Essentially I will capture "words", each of which is an encapsulated sequence of characters. `" i am a string"` is one word because that's a single unit for parsing.
In `public int $abc = "I am a thing";`, the words are:
- public
- int
- $abc
- =
- "I am a thing"
Then `;` is a stopper. Maybe `=` is a stopper, too.
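The capture-then-stop idea can be sketched as follows. This is a toy under stated assumptions: `sketch_capture()` and its regex are hypothetical, only `;` is treated as a stopper (the `=` question is left open, as in the notes), and only double-quoted strings are recognized as single words.

```php
<?php
// Sketch of "capture everything, process at stoppers".
function sketch_capture(string $code): array
{
    // Alternatives, in order: a double-quoted string (one word),
    // the stopper ';', or a run of characters that are neither
    // whitespace, ';', nor a double quote.
    preg_match_all('/"(?:[^"\\\\]|\\\\.)*"|;|[^\s;"]+/', $code, $m);

    $statements = [];
    $words = [];
    foreach ($m[0] as $token) {
        if ($token === ';') {
            $statements[] = $words; // stopper: hand off what was captured
            $words = [];
            continue;
        }
        $words[] = $token;
    }
    // Words after the last stopper are left unprocessed in this sketch.
    return $statements;
}

$statements = sketch_capture('public int $abc = "I am a thing";');
```

Here `$statements[0]` contains exactly the five words listed above, with the quoted string kept whole.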
The idea is that ... everything works out to be an expression made up of words and operators.
```php
<?php
protected $whatever = 'cat';
/** abc */
static public function abc(bool $b): string {
    $abc = "cat";
    $def = "dog";
    return "zeep";
}
```
It's gonna build the same kind of AST, but now I get to think about it differently.
So, `protected $whatever =`. When I hit `=`, I know I have an `assignment` operation. The left-hand words of that operation are `['protected', '$whatever']`. The last word is the property name. All words before it are modifiers (which may include type).
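That last rule is small enough to sketch directly. The function name `sketch_split_property()` and the shape of the returned array are assumptions; the real handler presumably writes into the AST instead of returning an array.

```php
<?php
// Sketch: what a handler for `=` might do with the words captured so far.
function sketch_split_property(array $words): array
{
    if ($words === []) {
        throw new InvalidArgumentException('no words captured before "="');
    }
    // The last word is the property name; everything before it is a
    // modifier (visibility, static, and possibly a type).
    $name = array_pop($words);
    return ['name' => $name, 'modifiers' => $words];
}

$property = sketch_split_property(['protected', '$whatever']);
```

For `public int $abc`, the same call would return `$abc` as the name and `['public', 'int']` as modifiers, leaving it to a later step to decide which modifier is the type.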