Linguistics Computing and Natural Language Understanding – Learning The History

I have today received a copy of “Using Computers in Linguistics: A Practical Guide”.  I got this book after seeing it referenced on Wikipedia (https://en.wikipedia.org/wiki/Natural_language_understanding) against the phrase:

“The interpretation capabilities of a language understanding system depend on the semantic theory it uses. Competing semantic theories of language have specific trade-offs in their suitability as the basis of computer-automated semantic interpretation.” [21]

These range from naive semantics or stochastic semantic analysis to the use of pragmatics to derive meaning from context. [22][23][24]

References:
21: Using Computers in Linguistics: A Practical Guide by John Lawler and Helen Aristar Dry, 1998, ISBN 0-415-16792-2, page 209
22: Naive Semantics for Natural Language Understanding by Kathleen Dahlgren, 1988, ISBN 0-89838-287-4
23: Stochastically-Based Semantic Analysis by Wolfgang Minker, Alex Waibel and Joseph Mariani, 1999, ISBN 0-7923-8571-3
24: Pragmatics and Natural Language Understanding by Georgia M. Green, 1996, ISBN 0-8058-2166-X

This looks to be an excellent introductory book on historical approaches to language understanding. I need to learn my history so as not to repeat the mistakes of the past if I am going to contribute to developing a computer system that understands natural language.

I am also waiting on a book on dependency grammar, which was used in an early but unsuccessful venture into the field of language understanding.  Interest in this particular field is, however, now growing: http://depling.org/depling2015/ (see also https://en.wikipedia.org/wiki/Dependency_grammar).

– OK, I am a bit of a geek, but this is my train set…

Understanding Syntax and Conceptual Text Modelling – A Journey

I am not certain that it will be possible to automatically create BPMN reliably from text, but it will be fun trying.  This task will require:

  - prior object knowledge, including knowledge of object properties, actions and responses;
  - an understanding of the meaning of words (a dictionary, or as it is known in this trade, a lexicon);
  - access to ontologies (for how things are related and for logically deriving further knowledge);
  - a model of how words relate to one another (linguistic theories);
  - a system (or systems) for word sense disambiguation;
  - a mechanism for classifying words and sentence types into parts of speech;
  - a mechanism for classifying, or better still viewing, objects in a real-world context in relation to other objects and the environment.

Thinking about the basics of word understanding, you need a visual spatial context and a sense of number to appreciate “this”, “that” and “those” before understanding the spatially abstract “the”.  As far as real-world object representation is concerned, I think you could integrate and dynamically build or load a view of an object in a virtual 3D space using web3d (see http://www.web3d.org/standards).
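As a concrete taste of what a lexicon with multiple word senses looks like, here is a minimal sketch using NLTK’s WordNet interface (this assumes NLTK is installed and the wordnet corpus has been downloaded; WordNet is just one possible lexicon):

    # Minimal sketch: listing the senses of one word in a lexicon.
    # Assumes NLTK is installed and nltk.download('wordnet') has been run.
    from nltk.corpus import wordnet as wn

    for synset in wn.synsets('yarn'):
        print(synset.name(), '-', synset.definition())
    # 'yarn' has several senses (spun thread vs. a tall tale), which is
    # exactly why word sense disambiguation is needed before mapping
    # text to a process model.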

I have started building a UML model of the software components necessary to achieve this task.  After studying some of the work of Senseval I can see that there is no one-size-fits-all solution to word sense disambiguation.  I think it therefore makes sense to implement multiple solutions and associate the most applicable with particular words (this could be done automatically against a marked-up corpus).  I feel a natural implementation for this will be to use a service locator to find the most relevant word sense disambiguation provider, implemented via a provider interface.
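To make that concrete, here is a minimal sketch of the service locator and provider interface I have in mind (all class and method names are hypothetical placeholders, not settled design):

    # Minimal sketch of a WSD provider interface plus a service locator.
    # All names here are hypothetical placeholders.
    from abc import ABC, abstractmethod

    class WsdProvider(ABC):
        """One disambiguation strategy, e.g. dictionary-based or corpus-trained."""

        @abstractmethod
        def supports(self, word):
            """Report whether this provider is applicable to the given word."""

        @abstractmethod
        def disambiguate(self, word, context):
            """Return a sense identifier for the word in its sentence context."""

    class WsdServiceLocator:
        """Finds the most relevant provider for a word; the word-to-provider
        association could be learned automatically from a marked-up corpus."""

        def __init__(self):
            self._providers = []

        def register(self, provider):
            self._providers.append(provider)

        def locate(self, word):
            for provider in self._providers:
                if provider.supports(word):
                    return provider
            raise LookupError('no WSD provider registered for %r' % word)

The appeal of this design is that each disambiguation strategy stays isolated behind the provider interface, so strategies can be added or swapped without touching the rest of the pipeline.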

As previously described, the dynamic creation of BPMN from text in part involves the classification of textual information into the following categories (a first sketch of these categories as code follows the list):

  1. Activity
    Activities will be associated with verbs and represent processes.  Processes can be associated with additional information such as set-up time, minimum or maximum batch size, or a processing rate, and pre- and post-process queues of a defined capacity.
  2. Entity
    Entities are the things or information that get transformed by processes and travel through a process model.  When an entity is transformed by a process it may be renamed, e.g. fleece to yarn in wool processing.
  3. Resource
    Resources are additional things that are needed to support the processing of entities.
  4. Event
    Events are things that happen and are created by a trigger. They may pass information and cause an action.  Events can elicit a response and be either synchronous or asynchronous.
  5. Actor
    Actors are the sources of system inputs and destinations for outputs or the source or destination of external events. Actors can be the source or destination of “entities”.
  6. Goal
    Goals are difficult to define but are likely to be identified by the fact that they involve systems that create added value.
  7. System
    A system is a group of things that has a definable boundary and probably has a goal.
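As promised above, here is a first sketch of these categories as types, with a small record tying a text fragment to its category (the names are hypothetical, for illustration only):

    # Hypothetical first sketch of the classification target types.
    from dataclasses import dataclass
    from enum import Enum, auto

    class Category(Enum):
        ACTIVITY = auto()  # verbs; processes with set-up times, batch sizes, queues
        ENTITY = auto()    # things/information transformed by processes
        RESOURCE = auto()  # additional things needed to support processing
        EVENT = auto()     # triggered happenings, synchronous or asynchronous
        ACTOR = auto()     # sources of inputs / destinations of outputs
        GOAL = auto()      # harder to define; tied to added value
        SYSTEM = auto()    # bounded group of things, probably with a goal

    @dataclass
    class ClassifiedFragment:
        text: str           # the fragment of source text
        category: Category  # one of the seven categories above

    # e.g. ClassifiedFragment('spin the fleece into yarn', Category.ACTIVITY)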

Looking at these categories, they are a subset of the data you can find described in schema.org.  I have been thinking that the schema.org XML schema might be a better initial target mapping than BPMN.
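A rough first-pass correspondence might look like the following; these pairings are my own guesses at an alignment, not anything defined by schema.org:

    # Hypothetical first-pass mapping from my categories to schema.org types.
    # The pairings are guesses, not an official alignment; Resource, Goal and
    # System have no obvious single counterpart, so they are left unmapped here.
    CATEGORY_TO_SCHEMA_ORG = {
        'Activity': 'https://schema.org/Action',
        'Entity': 'https://schema.org/Thing',
        'Event': 'https://schema.org/Event',
        'Actor': 'https://schema.org/Person',  # or https://schema.org/Organization
    }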

An obvious implementation for this problem would be a deep learning classification engine.  Before this can be considered I need a better understanding of word and sentence meaning (semantics, pragmatics and conceptual meaning).
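For a sense of scale, a minimal classifier of this sort could look like the sketch below. PyTorch and the bag-of-words encoding are my assumptions purely for illustration; real word and sentence meaning is exactly what such a naive encoding throws away:

    # Minimal sketch of a neural text classifier over the seven categories.
    # PyTorch and the bag-of-words encoding are assumptions for illustration.
    import torch
    import torch.nn as nn

    class CategoryClassifier(nn.Module):
        def __init__(self, vocab_size, embed_dim=64, n_categories=7):
            super().__init__()
            # EmbeddingBag averages word vectors: a bag-of-words sentence encoding
            self.encoder = nn.EmbeddingBag(vocab_size, embed_dim)
            self.head = nn.Linear(embed_dim, n_categories)

        def forward(self, token_ids, offsets):
            return self.head(self.encoder(token_ids, offsets))

    # Usage: token_ids is a flat tensor of word indices for a batch of
    # sentences; offsets marks where each sentence starts within it.
    model = CategoryClassifier(vocab_size=10000)
    logits = model(torch.tensor([1, 2, 3, 4]), torch.tensor([0, 2]))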

There are multiple theories of grammar available.  I started with generative grammar and am now reading about dependency grammar.  I have again hit the frustration of not being able to read references, as I am not a member of a university library.

I am often getting the basic story of a topic off Wikipedia and then trying to find peer-reviewed journal references.

I finally found some good references about deep learning.  Some people have been telling me I should give up my study of linguistics, forget these procedural approaches to solving the problem of language understanding, and focus on understanding deep learning.  From what I have read so far in academic papers (i.e. an explanation rather than hype), deep learning is about classifying and understanding things through a hierarchical chain.  Each neural layer currently tends to need training before it can be used to feed into the next layer.  Deep learning is not a means of stirring a pot of neuron soup and letting it settle out into a brain; from what I have read, it represents an advanced pattern-matching tool.  I have seen articles about how to build a brain which I have not yet read.  It may be that my understanding of what you can do with deep learning is out of date, but I have also read 2015 articles.  I have found http://deeplearning.net, which does appear to be an excellent source for finding out the state of the art.
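To illustrate that layer-by-layer training idea, here is a minimal sketch of greedy layer-wise pretraining, where each layer is trained as an autoencoder on the output of the previously trained layers before the next layer is stacked on top (PyTorch and all names are my assumptions for illustration):

    # Minimal sketch of greedy layer-wise pretraining: each layer learns to
    # reconstruct its own input before the next layer is stacked on top.
    # PyTorch and all names here are assumptions for illustration.
    import torch
    import torch.nn as nn

    def pretrain_layer(layer, data, epochs=10, lr=1e-3):
        # A throwaway decoder lets the layer be trained as an autoencoder.
        decoder = nn.Linear(layer.out_features, layer.in_features)
        params = list(layer.parameters()) + list(decoder.parameters())
        optimiser = torch.optim.Adam(params, lr=lr)
        for _ in range(epochs):
            reconstruction = decoder(torch.relu(layer(data)))
            loss = nn.functional.mse_loss(reconstruction, data)
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
        return layer

    data = torch.randn(256, 100)                # toy unlabelled input
    layer1 = pretrain_layer(nn.Linear(100, 50), data)
    hidden = torch.relu(layer1(data)).detach()  # frozen features feed the next layer
    layer2 = pretrain_layer(nn.Linear(50, 25), hidden)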