The definition of base NPs has to be adapted to Hebrew so that smixut (the construct state) is handled correctly.
This system was trained on a set of 5,000 sentences manually parsed at the Knowledge Center for Hebrew Processing (treebank in Hebrew, treebank in English). On a test set of about 15,000 NPs, it achieves about 83% precision and 88% recall.
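For readers unfamiliar with how chunkers are scored, the following is a minimal sketch of precision and recall over NP spans. It assumes exact-match scoring, where a predicted NP counts as correct only if both its boundaries agree with the gold annotation; the page does not state which scoring scheme was actually used. The function name `precision_recall` and the span representation (start, end) are illustrative choices.

```python
def precision_recall(gold_spans, predicted_spans):
    """Score predicted NP spans against gold spans.

    Assumption: exact-match scoring; a span is a (start, end) pair and
    is correct only if the same pair appears in the gold annotation.
    """
    gold = set(gold_spans)
    predicted = set(predicted_spans)
    correct = len(gold & predicted)
    # Precision: share of predicted NPs that are correct.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of gold NPs that were found.
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy data, not taken from the treebank.
gold = [(0, 2), (3, 6), (7, 9)]
predicted = [(0, 2), (3, 5), (7, 9), (10, 11)]
p, r = precision_recall(gold, predicted)
```

In this toy case two of the four predicted spans are correct, and two of the three gold spans are found.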
The method learns a set of part-of-speech patterns that covers the training set. The best results were achieved with about 1,800 learned patterns.
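The idea of chunking with learned part-of-speech patterns can be illustrated with a small sketch. This is not the system's actual learning algorithm, which is not described here: it simply collects the tag sequence of every annotated NP as a pattern and then applies the patterns greedily, longest match first. The tag names, the function names `learn_patterns` and `chunk`, and the data layout are all hypothetical.

```python
from collections import Counter


def learn_patterns(training_data):
    """Collect the POS-tag sequence of every annotated NP.

    training_data: list of (tags, np_spans) pairs, where tags is a list
    of POS tags for one sentence and np_spans is a list of (start, end)
    pairs with an exclusive end index.
    """
    patterns = Counter()
    for tags, np_spans in training_data:
        for start, end in np_spans:
            patterns[tuple(tags[start:end])] += 1
    return patterns


def chunk(tags, patterns):
    """Greedy left-to-right chunking with the longest matching pattern."""
    max_len = max((len(p) for p in patterns), default=0)
    spans = []
    i = 0
    while i < len(tags):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(tags) - i), 0, -1):
            if tuple(tags[i:i + n]) in patterns:
                spans.append((i, i + n))
                i += n
                break
        else:
            i += 1  # no pattern starts at this position
    return spans


# Toy example with made-up tags, not taken from the treebank.
train = [(["DT", "NN", "VB", "DT", "JJ", "NN"], [(0, 2), (3, 6)])]
patterns = learn_patterns(train)
spans = chunk(["DT", "JJ", "NN", "VB", "DT", "NN"], patterns)
```

A real system would also need a way to choose among competing patterns and to generalize beyond sequences seen in training, which this sketch does not attempt.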
This page contains: