\documentclass[oneside]{book}
\usepackage[a4paper,margin=2cm]{geometry}

\newcommand*{\myversion}{2025B}
\newcommand*{\mydate}{Version \myversion\ (\the\year-\mylpad\month-\mylpad\day)}
\newcommand*{\mylpad}[1]{\ifnum#1<10 0\the#1\else\the#1\fi}

\setlength{\parindent}{0pt}
\setlength{\parskip}{4pt plus 1pt minus 1pt}

\usepackage{hyperref}
\hypersetup{
  colorlinks=true,
  urlcolor=blue3,
  linkcolor=green3,
}

\usepackage{tabularray}

\NewTblrEnviron{spectblr}
\SetTblrOuter[spectblr]{long}
\SetTblrInner[spectblr]{
  hlines = {gray3}, column{Z} = {co=1}, colsep = 5pt,
  row{2-Z} = {brown8},
  row{1} = {fg=white, bg=purple2, font=\bfseries\sffamily},
  rowhead = 1,
}

\usepackage{codehigh}
\colorlet{highback}{azure9}
\CodeHigh{language=latex/latex2,style/main=gray9,style/code=gray9}

\NewDocumentCommand\PP{m}{\texttt{\fakeverb{#1}}} % package
\NewDocumentCommand\CC{m}{\texttt{\fakeverb{#1}}} % command
\NewDocumentCommand\VV{m}{\texttt{\fakeverb{#1}}} % variable
\NewDocumentCommand\VT{m}{\texttt{\fakeverb{#1}}} % variable type
\NewDocumentCommand\TT{m}{\texttt{\fakeverb{#1}}} % text

\usepackage{pegmatch}

\begin{document}

\title{\textsf{\color{green3}PEGMATCH: Parsing Expression Grammars for TeX}}
\author{Jianrui Lyu (tolvjr@163.com)\\ \url{https://github.com/lvjr/pegmatch}}
\date{\mydate}
\maketitle

\tableofcontents

\chapter{Package Interfaces}

\section{Introduction}

The \PP{pegmatch} package ports PEG (Parsing Expression Grammars)%
\footnote{See Parsing Expression Grammars page: \url{https://bford.info/packrat/}.} to TeX.
Following the design in LPEG (Parsing Expression Grammars for Lua),%
\footnote{See Parsing Expression Grammars for Lua page: \url{https://www.inf.puc-rio.br/~roberto/lpeg/}.}
it defines patterns as LaTeX3 variables, and offers several operators to compose patterns.

In general, PEG matching is much more powerful than RE (Regular Expressions) matching.
At this time, \PP{pegmatch} package only supports TeX strings.%
\footnote{I started to write it for my \PP{codehigh} package to get rid of \PP{l3regex} dependency.}
Also it is still in experimental status, hence some interfaces may change in future releases.

\section{The first example}

The following is the first example:
\begin{demohigh}
\NewSpeg\lMyTestSpeg
\SetSpeg\lMyTestSpeg{\SpegP{abc}}
\IfSpegMatchTF\lMyTestSpeg{a}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{ab}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{abc}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{abcd}{T}{F}
\end{demohigh}
In this example, we use \CC{\NewSpeg} to create a new \VT{speg} variable \VV{\lMyTestSpeg},
and use \CC{\SetSpeg} to set the variable with a pattern expression,
then use \CC{\IfSpegMatchTF} to match it against different subject strings.
The pattern \TT{\SpegP{abc}} matches the string \TT{abc} literally.

\section{Basic commands}

This package provides the following commands for creating and matching \VT{speg} patterns:\nopagebreak
\begin{spectblr}[
  caption = Basic commands
]{}
  Command & Description \\
  \CC{\NewSpeg#1}     & create \VT{speg} variable \TT{#1}\\
  \CC{\SetSpeg#1{#2}} & set \VT{speg} variable \TT{#1} with \VT{speg} expresssion \VT{#2}\\
  \CC{\IfSpegMatchT#1{#2}{#3}}
      & match \TT{#1} against string \TT{#2}, then run code \TT{#3} if the match succeeds\\
  \CC{\IfSpegMatchF#1{#2}{#3}}
      & match \TT{#1} against string \TT{#2}, then run code \TT{#3} if the match fails\\
  \CC{\IfSpegMatchTF#1{#2}{#3}{#4}}
      & match \TT{#1} against string \TT{#2}, then run code \TT{#3} if the match succeeds,
        othewise run code \TT{#4}\\
  \CC{\IfSpegExtractT#1{#2}#3{#4}}
      & match \TT{#1} against string \TT{#2},
        then store captures in \TT{#3} and run code \TT{#4} if the match succeeds\\
  \CC{\IfSpegExtractF#1{#2}#3{#4}}
      & match \TT{#1} against string \TT{#2},
        then clear \TT{#3} and run code \TT{#4} if the match fails\\
  \CC{\IfSpegExtractTF#1{#2}#3{#4}{#5}}
      & match \TT{#1} against string \TT{#2},
        then store captures in \TT{#3} and run code \TT{#4} if the match succeeds,
        othewise clear \TT{#3} and run code \TT{#5}
\end{spectblr}

\section{Scratch variables}

There are two predefined scratch \VT{speg} variables for setting \VT{speg} patterns:
\VV{\lTmpaSpeg} and \VV{\lTmpbSpeg}.
Also there are two predefined scratch \VT{seq} variables for storing captures
(see Section~\ref{sect:capture}): \VV{\lSpegTmpaSeq} and \VV{\lSpegTmpbSeq}.%

\section{Primitive patterns}

This package provides the following commands for making primitive patterns:%\nopagebreak
\begin{spectblr}[
  caption = Primitive patterns
]{}
  Pattern & Description \\
  \TT{\SpegP{<string>}} & match \TT{<string>} literally \\
  \TT{\SpegQ{<n>}}      & match exactly \TT{<n>} characters \\
  \TT{\SpegR{<X><Y><x><y>}}
     & match any character between \TT{<X>} and \TT{<Y>} or between \TT{<x>} and \TT{<y>} \\
  \TT{\SpegS{<string>}} & match any character in \TT{<string>}
\end{spectblr}

The following examples demonstrate pattern matching with other primitive patterns:
\begin{demohigh}
\SetSpeg\lMyTestSpeg{\SpegQ{2}}
\IfSpegMatchTF\lMyTestSpeg{u}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{vw}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{xyz}{T}{F}
\end{demohigh}
\begin{demohigh}
\SetSpeg\lMyTestSpeg{\SpegR{AZ}}
\IfSpegMatchTF\lMyTestSpeg{Qq}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{q1}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{1Q}{T}{F}
\SetSpeg\lMyTestSpeg{\SpegR{AZaz}}
\IfSpegMatchTF\lMyTestSpeg{Qq}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{q1}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{1Q}{T}{F}
\end{demohigh}

\begin{demohigh}
\SetSpeg\lMyTestSpeg{\SpegS{world}}
\IfSpegMatchTF\lMyTestSpeg{one}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{two}{T}{F}
\end{demohigh}

By default, PEG always starts at the first character.
Since both \CC{\SpegR} and \CC{\SpegS} match only one letter,
both last commands in previous two examples give \TT{F}.

\section{Pattern operators}

This package provides the following pattern operators for composing patterns:
\begin{spectblr}[
  caption = Pattern operators
]{}
  Operator         & Precedence & Description \\
  \TT{patt1/patt2} & 1 (choice) & match \TT{patt1} or \TT{patt2} (ordered choice) \\
  \TT{patt1*patt2} & 2 (concat) & match \TT{patt1} followed by \TT{patt2} \\
  \TT{!patt}       & 3 (not predicate) & match only if \TT{patt} does not match, and consume no input \\
  \TT{&patt}       & 3 (and predicate) & match \TT{patt} but consume no input \\
  \TT{patt^{<n>}}  & 4 (repeat) & match at least \TT{<n>} ($n\ge0)$ repetitions of \TT{patt} \\
  \TT{patt^{-<n>}} & 4 (repeat) & match at most \TT{<n>} ($n>0$) repetitions of \TT{patt} \\
  \TT{{patt expr}} & 5 (group)  & match \TT{patt expr} (pattern expression)
\end{spectblr}

With \TT{!} and \TT{*} operators, we can create negative character sets:
\begin{demohigh}
\SetSpeg\lMyTestSpeg{!\SpegR{09} * \SpegQ{1}}
\IfSpegMatchTF\lMyTestSpeg{A}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{5}{T}{F}
\SetSpeg\lMyTestSpeg{!\SpegS{abc} * \SpegQ{1}}
\IfSpegMatchTF\lMyTestSpeg{B}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{b}{T}{F}
\end{demohigh}

With \TT{^} operator, we can match words:
\begin{demohigh}
\SetSpeg\lMyTestSpeg{\SpegR{AZaz} ^ {1}}
\IfSpegMatchTF\lMyTestSpeg{HELLO}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{world}{T}{F}
\IfSpegMatchTF\lMyTestSpeg{ text }{T}{F}
\IfSpegMatchTF\lMyTestSpeg{(text)}{T}{F}
\end{demohigh}

In fact, \TT{patt^{-1}} is similar to \TT{expr?}, \TT{patt^0} is similar to \TT{expr*},
and \TT{patt^1} is similar to \TT{expr+} in regular expression matching.

\section{Pattern variables}

In using \CC{\SetSpeg} command to set a \VT{speg} variable with a pattern expression,
you can use other \VT{speg} variables. For example:
\begin{demohigh}
\SetSpeg\lTmpaSpeg{\SpegR{AZ} / \SpegR{az}}
\SetSpeg\lTmpbSpeg{\SpegS{135} * \lTmpaSpeg}
\IfSpegMatchTF\lTmpbSpeg{2ab}{[T]}{[F]}
\IfSpegMatchTF\lTmpbSpeg{3ab}{[T]}{[F]}
\SetSpeg\lTmpbSpeg{\SpegS{135} * \lTmpaSpeg^{3}}
\IfSpegMatchTF\lTmpbSpeg{3ab}{[T]}{[F]}
\IfSpegMatchTF\lTmpbSpeg{3abcd}{[T]}{[F]}
\end{demohigh}

By using another recursive pattern, we can make \PP{speg} find a pattern anywhere in a string.
The following example demonstrates how to match a word with at least three letters
inside a string:\nopagebreak
\begin{demohigh}
\NewSpeg\lMyWordSpeg
\NewSpeg\lMyAnywhereSpeg
\SetSpeg\lMyWordSpeg{\SpegR{AZaz}^{3}}
\SetSpeg\lMyAnywhereSpeg{\lMyWordSpeg / \SpegQ{1} * \lMyAnywhereSpeg}
\IfSpegMatchTF\lMyAnywhereSpeg{foo bar}{[T]}{[F]}
\IfSpegMatchTF\lMyAnywhereSpeg{fo bar}{[T]}{[F]}
\IfSpegMatchTF\lMyAnywhereSpeg{123 ba}{[T]}{[F]}
\IfSpegMatchTF\lMyAnywhereSpeg{123 bar}{[T]}{[F]}
\end{demohigh}
In this example, \VV{\lMyAnywhereSpeg} tries to match \VV{\lMyWordSpeg},
skipping one letter and tries again if it fails.

\section{Capture patterns}%
\label{sect:capture}

This package provides the following commands for making capture patterns:
\begin{spectblr}[
  caption = Primitive patterns
]{}
  Pattern             & Name & Description \\
  \CC{\SpegC{<patt>}} & simple capture & capture the match for \TT{<patt>}\\
  \CC{\SpegCp}        & position capture & capture current position
\end{spectblr}

Position capture \CC{\SpegCp} must be concatenated with other patterns (by using \TT{*} operator):
\begin{demohigh}
\SetSpeg\lTmpaSpeg{\SpegCp * \SpegR{az}^{1} * \SpegCp * \SpegR{09}^{1} * \SpegCp}
\IfSpegExtractTF\lTmpaSpeg{12ab}\lSpegTmpaSeq{%
  \MapSpegSeqInline\lSpegTmpaSeq{[#1]}%
}{Failed}
\IfSpegExtractTF\lTmpaSpeg{ab12}\lSpegTmpaSeq{%
  \MapSpegSeqInline\lSpegTmpaSeq{[#1]}%
}{Failed}
\IfSpegExtractTF\lTmpaSpeg{abcd12345}\lSpegTmpaSeq{%
  \MapSpegSeqInline\lSpegTmpaSeq{[#1]}%
}{Failed}
\end{demohigh}
In this example, we use \CC{\IfSpegExtractTF} command to extract all captures,
which are stored in the \VT{seq} variable (\CC{\lSpegTmpaSeq}) specified by the third argument.
Then we use \CC{\MapSpegSeqInline} command to print each capture.

If you want to capture the substrings, you can modified the above example as follows:\nopagebreak
\begin{demohigh}
\SetSpeg\lTmpaSpeg{\SpegC{\SpegR{az}^{1}} * \SpegC{\SpegR{09}^{1}}}
\IfSpegExtractTF\lTmpaSpeg{12ab}\lSpegTmpaSeq{%
  \MapSpegSeqInline\lSpegTmpaSeq{[#1]}%
}{Failed}
\IfSpegExtractTF\lTmpaSpeg{ab12}\lSpegTmpaSeq{%
  \MapSpegSeqInline\lSpegTmpaSeq{[#1]}%
}{Failed}
\IfSpegExtractTF\lTmpaSpeg{abcd12345}\lSpegTmpaSeq{%
  \MapSpegSeqInline\lSpegTmpaSeq{[#1]}%
}{Failed}
\end{demohigh}

\chapter{The Source Code}

\dochighinput[language=latex/latex3]{pegmatch.sty}

\end{document}