Norma-Simplex: an infrastructure for legal documents analysis, assessment, and re-factoring

  • Project Manager: Luís Barbosa


    Modern societies are governed by laws and regulations. This concept is captured in the phrase “rule of law” [Figgis22], i.e. a nation should be governed by law, which implies that every citizen is subject to the law [Tamanaha04]. These laws and regulations are captured in textual documents written by legislators. Shortcomings and other problems in these documents have a direct social and economic impact on people’s everyday lives, and introduce an administrative burden for governments and citizens [Coglianese12]. Moreover, modern societies show a clear concern to include citizens in deliberation and decision processes (e.g. participatory surveys). With the increasing demand for transparency and open data, norms and regulations should be easy to read and accessible to everyone, regardless of educational background or accessibility constraints [Zittel07, Gurría11].


    This project aims at modeling, analyzing, assessing, and suggesting improvements to legal documents and regulations. Such models should capture the semantics conveyed by natural language in a rigorous mathematical way, relying on relation algebra. The captured information is structured into knowledge through a tool-chain of analysis and inference, in the form of rich models which are then used to: (1) validate the semantic information captured by the model, namely checking for inconsistencies using state-of-the-art model-checking techniques; (2) generate refactored text from the improved model conveying the same semantics in a better, less ambiguous and more concise way; (3) derive the schema of the information system able to support the computerization of the corresponding legal process; (4) build semantic networks and information retrieval artifacts to search and analyze legal document contents. The combination and integration of all these activities will result in higher quality laws and regulations and more efficient IT support for the legal sector.


    To finish the summary, consider a sentence extracted from a local city hall participatory survey regulation: “Each citizen can submit one proposal only.” A human readily perceives the constraint; the challenge is instrumenting a machine to infer the same information. From a linguistic point of view, the concept or action being described is captured (in most cases) by the verb of the sentence. By combining part-of-speech tagging with dependency parsing information, we can infer a relation between the citizen and a proposal, and by looking at the verb actants we can infer the cardinality of the concepts in the relation. Given all this information, well defined and well structured, and a set of boilerplates, we can come up with a snippet of Alloy [Jackson02] that represents the same knowledge: “sig citizen { present: lone proposal }”. The snippet is written in a formal, unambiguous language, which makes it easier for machines to process systematically.
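
    The inference described above can be illustrated with a minimal Python sketch. It is not the project's tool-chain: the `Token` structure, the hand-written parse in `SENTENCE`, the dependency labels (`nsubj`, `dobj`, `aux`, `nummod`), the multiplicity rule and the single boilerplate are all assumptions made for illustration. A real pipeline would obtain the parse from a tagger and dependency parser. The field name is taken from the verb lemma of the English sentence, so the sketch produces `submit` where the snippet above uses `present`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    text: str
    lemma: str
    pos: str   # coarse part-of-speech tag
    dep: str   # dependency label relative to the head
    head: int  # index of the head token (the root points to itself)


# Hypothetical, hand-written parse of the example sentence.
SENTENCE = [
    Token("Each", "each", "DET", "det", 1),
    Token("citizen", "citizen", "NOUN", "nsubj", 3),
    Token("can", "can", "AUX", "aux", 3),
    Token("submit", "submit", "VERB", "ROOT", 3),
    Token("one", "one", "NUM", "nummod", 5),
    Token("proposal", "proposal", "NOUN", "dobj", 3),
    Token("only", "only", "ADV", "advmod", 5),
]


def find_child(tokens: List[Token], head: int, dep: str) -> Optional[int]:
    """Return the index of the first dependent of `head` labelled `dep`."""
    for i, tok in enumerate(tokens):
        if i != head and tok.head == head and tok.dep == dep:
            return i
    return None


def multiplicity(tokens: List[Token], verb: int, obj: int) -> str:
    """Map the numeral and modal around the verb to an Alloy multiplicity.

    Illustrative rule only: "one" with a permissive modal (can/may) means
    at most one (`lone`); "one" otherwise means exactly one (`one`);
    anything else falls back to `set`.
    """
    num = find_child(tokens, obj, "nummod")
    if num is None or tokens[num].lemma != "one":
        return "set"
    modal = find_child(tokens, verb, "aux")
    if modal is not None and tokens[modal].lemma in {"can", "may"}:
        return "lone"
    return "one"


def to_alloy(tokens: List[Token]) -> str:
    """Fill a subject-verb-object boilerplate with the parsed actants."""
    verb = next(i for i, t in enumerate(tokens) if t.dep == "ROOT")
    subj = find_child(tokens, verb, "nsubj")
    obj = find_child(tokens, verb, "dobj")
    if subj is None or obj is None:
        raise ValueError("sentence does not match the boilerplate")
    mult = multiplicity(tokens, verb, obj)
    return (
        f"sig {tokens[subj].lemma} "
        f"{{ {tokens[verb].lemma}: {mult} {tokens[obj].lemma} }}"
    )


print(to_alloy(SENTENCE))
```

    Replacing “can” with “must” in the parse would yield the multiplicity `one` under the same illustrative rule, which shows how the modal verb, and not only the numeral, drives the inferred cardinality.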