Trait lalrpop_util::state_machine::ParserDefinition
pub trait ParserDefinition: Sized {
    type Location: Clone + Debug;
    type Error;
    type Token: Clone + Debug;
    type TokenIndex: Copy + Clone + Debug;
    type Symbol;
    type Success;
    type StateIndex: Copy + Clone + Debug;
    type Action: ParserAction<Self>;
    type ReduceIndex: Copy + Clone + Debug;
    type NonterminalIndex: Copy + Clone + Debug;

    // Required methods
    fn start_location(&self) -> Self::Location;
    fn start_state(&self) -> Self::StateIndex;
    fn token_to_index(&self, token: &Self::Token) -> Option<Self::TokenIndex>;
    fn action(
        &self,
        state: Self::StateIndex,
        token_index: Self::TokenIndex
    ) -> Self::Action;
    fn error_action(&self, state: Self::StateIndex) -> Self::Action;
    fn eof_action(&self, state: Self::StateIndex) -> Self::Action;
    fn goto(
        &self,
        state: Self::StateIndex,
        nt: Self::NonterminalIndex
    ) -> Self::StateIndex;
    fn token_to_symbol(
        &self,
        token_index: Self::TokenIndex,
        token: Self::Token
    ) -> Self::Symbol;
    fn expected_tokens(&self, state: Self::StateIndex) -> Vec<String>;
    fn uses_error_recovery(&self) -> bool;
    fn error_recovery_symbol(
        &self,
        recovery: ErrorRecovery<Self>
    ) -> Self::Symbol;
    fn reduce(
        &mut self,
        reduce_index: Self::ReduceIndex,
        start_location: Option<&Self::Location>,
        states: &mut Vec<Self::StateIndex>,
        symbols: &mut Vec<SymbolTriple<Self>>
    ) -> Option<ParseResult<Self>>;
    fn simulate_reduce(
        &self,
        action: Self::ReduceIndex
    ) -> SimulatedReduce<Self>;

    // Provided method
    fn expected_tokens_from_states(
        &self,
        states: &[Self::StateIndex]
    ) -> Vec<String> { ... }
}
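To make the division of labor concrete, here is a heavily simplified sketch of the shift/reduce loop that an implementation of this trait drives. This is not LALRPOP's actual driver; the toy grammar, tables, and types are invented for illustration only.

```rust
// Toy sketch of the LR driver loop behind ParserDefinition.
// Everything here is invented for illustration; LALRPOP's generated
// code is considerably more involved.
#[derive(Clone, Copy)]
enum Action {
    Shift(usize),  // push this state
    Reduce(usize), // run this reduction
    Error,
}

// Single-production grammar: S -> 'a'.
// ACTION[state][terminal]; terminal 0 = 'a', terminal 1 = EOF ($).
const ACTION: [[Action; 2]; 2] = [
    [Action::Shift(1), Action::Error],  // state 0
    [Action::Error, Action::Reduce(0)], // state 1: S -> 'a' .
];

fn parse(input: &str) -> bool {
    let mut states = vec![0usize]; // state stack (cf. start_state)
    let mut tokens = input.chars().chain(std::iter::once('$'));
    let mut tok = tokens.next().unwrap();
    loop {
        // cf. token_to_index: map the pending token to a column.
        let t = if tok == '$' { 1 } else { 0 };
        match ACTION[*states.last().unwrap()][t] {
            // cf. action == SHIFT: push the state, advance the input.
            Action::Shift(s) => {
                states.push(s);
                tok = tokens.next().unwrap();
            }
            // cf. reduce(): pop the production's states; reducing the
            // start nonterminal at EOF means parsing is complete.
            Action::Reduce(_rule) => {
                states.pop(); // S -> 'a' pops one state
                if states == [0] && tok == '$' {
                    return true;
                }
            }
            Action::Error => return false,
        }
    }
}

fn main() {
    assert!(parse("a"));
    assert!(!parse(""));
    println!("ok");
}
```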
Required Associated Types
type Location: Clone + Debug
Represents a location in the input text. If you are using the default tokenizer, this will be a usize.
type Error
Represents a “user error” – this can get produced by reduce() if the grammar includes =>? actions.
type Token: Clone + Debug
The type emitted by the user’s tokenizer (excluding the location information).
type TokenIndex: Copy + Clone + Debug
We assign a unique index to each token in the grammar, which we call its index. When we pull in a new Token from the input, we then match against it to determine its index. Note that the actual Token is retained too, as it may carry additional information (e.g., an ID terminal often has a string value associated with it; this is not important to the parser, but the semantic analyzer will want it).
type Symbol
The type representing things on the LALRPOP stack. Represents the union of terminals and nonterminals.
type StateIndex: Copy + Clone + Debug
Identifies a state. Typically an i8, i16, or i32 (depending on how many states you have).
type Action: ParserAction<Self>
Identifies an action.
type ReduceIndex: Copy + Clone + Debug
Identifies a reduction.
type NonterminalIndex: Copy + Clone + Debug
Identifies a nonterminal.
Required Methods
fn start_location(&self) -> Self::Location
Returns a location representing the “start of the input”.
fn start_state(&self) -> Self::StateIndex
Returns the initial state.
fn token_to_index(&self, token: &Self::Token) -> Option<Self::TokenIndex>
Converts the user’s tokens into an internal index; this index is then used to index into actions and the like. When using an internal tokenizer, these indices are directly produced. When using an external tokenizer, however, this function matches against the patterns given by the user: it is therefore fallible, as these patterns may not be exhaustive. If a token value is found that doesn’t match any of the patterns the user supplied, then this function returns None, which is translated into a parse error by LALRPOP (“unrecognized token”).
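With an external tokenizer, the generated match is over the user's terminal patterns. The following is a hedged sketch using a hypothetical token enum, not LALRPOP's generated code:

```rust
// Hypothetical user token type, for illustration only.
#[derive(Clone, Debug)]
enum Token {
    Num(i64),
    Plus,
    Comment(String), // produced by the tokenizer but unused by the grammar
}

// Sketch of a token_to_index: match against the user's patterns in
// turn; a token that no pattern covers yields None, which LALRPOP
// reports as an "unrecognized token" parse error.
fn token_to_index(token: &Token) -> Option<usize> {
    match token {
        Token::Num(_) => Some(0),
        Token::Plus => Some(1),
        Token::Comment(_) => None,
    }
}

fn main() {
    assert_eq!(token_to_index(&Token::Num(3)), Some(0));
    assert_eq!(token_to_index(&Token::Plus), Some(1));
    assert_eq!(token_to_index(&Token::Comment("hm".into())), None);
    println!("ok");
}
```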
fn action(&self, state: Self::StateIndex, token_index: Self::TokenIndex) -> Self::Action
Given the top-most state and the pending terminal, returns an action. This can be either SHIFT(state), REDUCE(action), or ERROR.
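Generated parse tables commonly pack all three outcomes into a single signed integer. The decoding below is a plausible sketch of one such scheme; the exact encoding LALRPOP emits may differ.

```rust
#[derive(Debug, PartialEq)]
enum Decoded {
    Shift(usize),  // SHIFT(state)
    Reduce(usize), // REDUCE(action)
    Error,
}

// One common packing: a positive value n means shift to state n-1,
// a negative value n means reduce by rule -(n+1), and zero is error.
fn decode(packed: i16) -> Decoded {
    if packed > 0 {
        Decoded::Shift(packed as usize - 1)
    } else if packed < 0 {
        Decoded::Reduce(-(packed + 1) as usize)
    } else {
        Decoded::Error
    }
}

fn main() {
    assert_eq!(decode(4), Decoded::Shift(3));
    assert_eq!(decode(-1), Decoded::Reduce(0));
    assert_eq!(decode(0), Decoded::Error);
    println!("ok");
}
```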
fn error_action(&self, state: Self::StateIndex) -> Self::Action
Returns the action to take if an error occurs in the given state. This function is the same as the ordinary action, except that it applies not to the user’s terminals but to the “special terminal” !.
fn eof_action(&self, state: Self::StateIndex) -> Self::Action
Action to take if EOF occurs in the given state. This function is the same as the ordinary action, except that it applies not to the user’s terminals but to the “special terminal” $.
fn goto(&self, state: Self::StateIndex, nt: Self::NonterminalIndex) -> Self::StateIndex
If we reduce to a nonterminal in the given state, what state do we go to? This is infallible due to the nature of LR(1) grammars.
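A sketch of goto as a dense table lookup; the states and nonterminal indices are invented for illustration. Note there is no error entry, reflecting the infallibility described above.

```rust
// GOTO[state][nonterminal] -> next state. Every entry is a valid
// state: after a legal reduction the transition always exists.
const GOTO: [[usize; 2]; 3] = [
    [1, 2], // state 0: nonterminal 0 (say, Expr) -> 1, nonterminal 1 (Term) -> 2
    [1, 2], // state 1 (values invented)
    [2, 2], // state 2
];

fn goto(state: usize, nt: usize) -> usize {
    GOTO[state][nt]
}

fn main() {
    assert_eq!(goto(0, 0), 1);
    assert_eq!(goto(0, 1), 2);
    println!("ok");
}
```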
fn token_to_symbol(&self, token_index: Self::TokenIndex, token: Self::Token) -> Self::Symbol
“Upcast” a terminal into a symbol so we can push it onto the parser stack.
fn expected_tokens(&self, state: Self::StateIndex) -> Vec<String>
Returns the expected tokens in a given state. This is used for error reporting.
fn uses_error_recovery(&self) -> bool
True if this grammar supports error recovery.
fn error_recovery_symbol(&self, recovery: ErrorRecovery<Self>) -> Self::Symbol
Given error information, creates an error recovery symbol that we push onto the stack (and supply to user actions).
fn reduce(&mut self, reduce_index: Self::ReduceIndex, start_location: Option<&Self::Location>, states: &mut Vec<Self::StateIndex>, symbols: &mut Vec<SymbolTriple<Self>>) -> Option<ParseResult<Self>>
Execute a reduction in the given state: that is, execute user code. The start location indicates the “starting point” of the current lookahead that is triggering the reduction (it is None for EOF).
The states and symbols vectors represent the internal state machine vectors; they are given to reduce so that it can pop off states that no longer apply (and consume their symbols). At the end, it should also push the new state and symbol produced.
Returns Some if we reduced the start state and hence parsing is complete, or if we encountered an irrecoverable error.
FIXME. It would be nice to not have so much logic live in reduce. It should just be given an iterator of popped symbols and return the newly produced symbol (or error). We can use simulate_reduce and our own information to drive the rest, right? This would also allow us – I think – to extend error recovery to cover user-produced errors.
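The pop/push discipline described above can be sketched with plain usize states and String symbols; the join stands in for an arbitrary user action, and the goto state is passed in for simplicity.

```rust
// Sketch of reduce()'s stack discipline: pop the states and symbols
// the production consumed, run the user action, then push the goto
// state and the produced symbol. Types are simplified stand-ins.
fn reduce(
    states: &mut Vec<usize>,
    symbols: &mut Vec<String>,
    pop_count: usize,  // length of the production's right-hand side
    goto_state: usize, // what goto() returns for the exposed state
) {
    // Pop the consumed symbols (and their states)...
    let popped = symbols.split_off(symbols.len() - pop_count);
    states.truncate(states.len() - pop_count);
    // ...run the "user action" (here: just join the popped symbols)...
    let produced = format!("({})", popped.join(" "));
    // ...then push the new state and the symbol it produced.
    states.push(goto_state);
    symbols.push(produced);
}

fn main() {
    let mut states = vec![0, 3, 5, 7];
    let mut symbols = vec!["x".to_string(), "+".to_string(), "y".to_string()];
    // Reduce a three-symbol production, e.g. Expr -> Expr '+' Term.
    reduce(&mut states, &mut symbols, 3, 4);
    assert_eq!(states, vec![0, 4]);
    assert_eq!(symbols, vec!["(x + y)".to_string()]);
    println!("ok");
}
```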
fn simulate_reduce(&self, action: Self::ReduceIndex) -> SimulatedReduce<Self>
Returns information about how many states will be popped during a reduction, and what nonterminal would be produced as a result.
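A hedged sketch of the information a simulated reduction reports; the field and type names here are invented, and the real SimulatedReduce type differs in detail.

```rust
// What simulate_reduce answers, without running any user code:
// how deep the reduction pops, and which nonterminal results.
#[derive(Debug, PartialEq)]
struct Simulated {
    states_to_pop: usize,        // stack depth the reduction removes
    nonterminal_produced: usize, // index of the resulting nonterminal
}

// e.g. rule 0 might be "Expr -> Expr '+' Term": pop 3, produce Expr.
fn simulate_reduce(rule: usize) -> Simulated {
    match rule {
        0 => Simulated { states_to_pop: 3, nonterminal_produced: 0 },
        _ => Simulated { states_to_pop: 1, nonterminal_produced: 1 },
    }
}

fn main() {
    let r = simulate_reduce(0);
    assert_eq!(r, Simulated { states_to_pop: 3, nonterminal_produced: 0 });
    println!("ok");
}
```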
Provided Methods
fn expected_tokens_from_states(&self, states: &[Self::StateIndex]) -> Vec<String>
Returns the expected tokens for a given set of states. This is used in the same way as expected_tokens but allows more precise reporting of accepted tokens in some cases.