BGU NLP - EasyFirst Package for the Medical Domain

Raphael Cohen and Michael Elhadad

March 2012

This is an implementation of the EasyFirst parser with a precompiled model for the biomedical domain, trained on the Genia Treebank.


  1. License
  2. Download
  3. Usage

The package is based on Yoav Goldberg's implementation of EasyFirst Parser.

It relies on POS tagging as a preprocessing step, using GeniaTagger by Tsuruoka et al.
GeniaTagger must be downloaded and installed separately.

License

EasyFirst Package for the Medical Domain is distributed under a GPL license.

    EasyFirst is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

The precompiled model was trained on the Genia Treebank; see the Genia Treebank website for its licensing terms.

Download

EasyFirst + precompiled model for Genia.
Prerequisites: Java 1.6, Python (tested with Python 2.7), and GeniaTagger.

Usage

  • Generate the POS input:
    Usage:    geniatagger [textFile] > [posTextFile]
    Example:  geniatagger file.txt > fileWithPOS.txt
    - This should be run in the directory where GeniaTagger is installed.
    
    
  • Run the parser:
    
    % cat fileWithPOS.txt | python transformGeniaPOStoConllx.py | python sdparser.py -conll -noextra
    
    - This will print the parsed sentences in CoNLL format to stdout (the screen).
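
To work with the parser output programmatically, it can be redirected to a file and read back. The following is a minimal sketch, not part of the package. It assumes the output follows the standard 10-column CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) with one token per line and a blank line between sentences. The function names read_conllx and dependencies are hypothetical and chosen for illustration only.

```python
def read_conllx(lines):
    """Yield sentences, each a list of token dicts.

    Assumption: standard CoNLL-X columns, whitespace-separated,
    with a blank line marking the end of each sentence.
    """
    sentence = []
    for line in lines:
        cols = line.split()
        if not cols:
            # Blank line: the current sentence is complete.
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append({
            "id": int(cols[0]),     # ID: 1-based token index
            "form": cols[1],        # FORM: the word as written
            "pos": cols[4],         # POSTAG: fine-grained POS tag
            "head": int(cols[6]),   # HEAD: index of the governor, 0 = root
            "deprel": cols[7],      # DEPREL: dependency label
        })
    if sentence:
        # Last sentence when the input does not end with a blank line.
        yield sentence


def dependencies(sentence):
    """Return (governor, label, dependent) triples for one sentence."""
    forms = {tok["id"]: tok["form"] for tok in sentence}
    forms[0] = "ROOT"
    return [(forms[tok["head"]], tok["deprel"], tok["form"])
            for tok in sentence]
```

For example, after appending "> parsed.conll" to the parser command above, the file can be opened and passed to read_conllx, and dependencies can then be called on each sentence it yields. If the actual column layout differs from the assumption, the column indices in read_conllx need to be adjusted.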
    
    

    Last modified March 13, 2012
    Updated December 30, 2020