Semantic graph parser based on Categorial grammars

Last update: Aug 19, 2022

Related tags

Overview

Lambekseq

"Everyone who failed Greek or Latin hates it."

This package is for proving theorems in Categorial grammars (CG) and constructing semantic graphs, i.e., semgraphs on top of that.

Three CG calculuses are supported here (see below). A "proof" is simply a set of atom links, abstracting away from derivaiton details.

Requirements

Add the path to the package to PYTHONPATH. None of the below packages is needed to use the theorem proving facility.

Semantic graphs derive from digraph:

networkx

For graph visualization we use

pydot (required by networkx)
Graphviz
python-graphviz
dot2tex

Background

This package is used for the author's PhD thesis in progress.

Categorial grammars:

Associative Lambek Calculus Allowing Empty Premises (Lambek 1958)
Proof Net for Lambek Calculus based on Cyclic Multiplicative Linear Logic (Pentus 2010)
Continuized Combinatory Categorial Grammar (Barker and Shan 2014)
Displacement Calculus/Continuized Lambek Calculus (Morrill et al 2010)

Semantic graphs:

Abstract Meaning Representation (Banarescu et al 2013;
see also https://github.com/amrisi/amr-guidelines/blob/master/amr.md)
Hybrid Logic Dependency Semantics (Baldridge and Kruijff 2002; White 2006)

Theorem Proving

To prove a theorem, use atomlink module. For example, using Lambek Calculus to prove np np\s -> s.

>>> import lambekseq.atomlink as al

>>> con, *pres = 's np np\\s'.split()
>>> con, pres, parser, _ = al.searchLinks(al.LambekProof, con, pres)
>>> al.printLinks(con, pres, parser)

This outputs

----------
s_0 <= np_1 np_2\s_3

(np_1, np_2), (s_0, s_3)

Total: 1

You can run atomlink in command line. The following finds proofs for the theorems in input, using abbreviation definitions in abbr.json and Contintuized CCG.

$ python atomlink.py -i input -a abbr.json -c ccg --earlyCollapse

Theorem s qp vp/s qp vp (the first item is the conclusion, the rest the premises) is thus proved as follows:

<class 'lambekseq.cntccg.Cntccg'>
----------
s_0 <= (s_1^np_2)!s_3 (np_4\s_5)/s_6 (s_7^np_8)!s_9 np_10\s_11

(np_10, np_8), (np_2, np_4), (s_0, s_3), (s_1, s_5), (s_11, s_7), (s_6, s_9)

Total: 1

When using Lambek/Displacement/CCG calculus, you can also inspect the proof tree that yields atom links:

>>> con, *pres = 's', 'np', '(np\\s)/np', 'np'
>>> con, pres, parser, _ = al.searchLinks(al.LambekProof, con, pres)
>>> parser.buildTree()
>>> parser.printTree()
(np_1, np_2), (np_4, np_5), (s_0, s_3)
........ s_3 -> s_0
........ np_1 -> np_2
.... np_1 np_2\s_3 -> s_0
.... np_5 -> np_4
 np_1 (np_2\s_3)/np_4 np_5 -> s_0

You can export the tree to Bussproofs code for Latex display:

>>> print(parser.bussproof)
...
\begin{prooftree}
\EnableBpAbbreviations
        \AXC{s$_{3}$ $\to$ s$_{0}$}
        \AXC{np$_{1}$ $\to$ np$_{2}$}
    \BIC{np$_{1}$\enskip{}np$_{2}$\textbackslash s$_{3}$ $\to$ s$_{0}$}
    \AXC{np$_{5}$ $\to$ np$_{4}$}
\BIC{np$_{1}$\enskip{}(np$_{2}$\textbackslash s$_{3}$)/np$_{4}$\enskip{}np$_{5}$ $\to$ s$_{0}$}
\end{prooftree}

Run python atomlink.py --help for details.

Semantic Parsing

Use semcomp module for semantic parsing. You need to define graph schemata for parts of speech as in schema.json.

>>> from lambekseq.semcomp import SemComp
>>> SemComp.load_lexicon(abbr_path='abbr.json',
                         vocab_path='schema.json')
>>> ex = 'a boy walked a dog'
>>> pos = 'ind n vt ind n'
>>> sc = SemComp(zip(ex.split(), pos.split()), calc='dsp')
>>> sc.unify('s')

Use graphviz's Source to display the semgraphs constructed from the input:

>>> from graphviz import Source
>>> Source(sc.semantics[0].dot_styled)

This outputs

You can inspect the syntax behind this parse:

>>> sc.syntax[0].insight.con, sc.syntax[0].insight.pres
('s_0', ['np_1/n_2', 'n_3', '(np_4\\s_5)/np_6', 'np_7/n_8', 'n_9'])

>>> sc.syntax[0].links
['(n_2, n_3)', '(n_8, n_9)', '(np_1, np_4)', '(np_6, np_7)', '(s_0, s_5)']

See demo/demo.ipynb for more examples.

You can export semgraphs to tikz code that can be visually edited by TikZit.

>>> print(sc.semantics[0].tikz)
\begin{tikzpicture}
\begin{pgfonlayer}{nodelayer}
        \node [style=node] (i1) at (-1.88,2.13) {};
        \node [style=none] (g2u0) at (-2.99,3.07) {};
        \node [style=node] (i0) at (0.99,-2.68) {};
        \node [style=none] (g5u0) at (1.09,-4.13) {};
        \node [style=node] (g3a0) at (0.74,0.43) {};
        \node [style=none] (g3u0) at (2.05,1.19) {};
        \node [style=none] (0) at (-3.04,2.89) {boy};
        \node [style=none] (1) at (0.61,-4.00) {dog};
        \node [style=none] (2) at (-0.66,0.72) {ag};
        \node [style=none] (3) at (0.63,-0.77) {th};
        \node [style=none] (4) at (2.42,1.09) {walked};
\end{pgfonlayer}
\begin{pgfonlayer}{edgelayer}
        \draw [style=arrow] (i1) to (g2u0.center);
        \draw [style=arrow] (i0) to (g5u0.center);
        \draw [style=arrow] (g3a0) to (i1);
        \draw [style=arrow] (g3a0) to (i0);
        \draw [style=arrow] (g3a0) to (g3u0.center);
\end{pgfonlayer}
\end{tikzpicture}

Semantic graph parser based on Categorial grammars

Related tags

Overview

Lambekseq

Requirements

Background

Categorial grammars:

Semantic graphs:

Theorem Proving

Semantic Parsing

Owner

A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.

Model serving at scale

Source code for Zalo AI 2021 submission

E2EDNA2 - An automated pipeline for simulation of DNA aptamers complexed with small molecules and short peptides

CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching（CVPR2021）

Lightweight library to build and train neural networks in Theano

Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

End-to-end beat and downbeat tracking in the time domain.

Ensembling Off-the-shelf Models for GAN Training

Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

Generalized Data Weighting via Class-level Gradient Manipulation

EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering

Keras implementation of "One pixel attack for fooling deep neural networks" using differential evolution on Cifar10 and ImageNet

Talk covering the features of skorch

[ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

Hso-groupie - A pwnable challenge in Real World CTF 4th

Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification