11# NLPPlus
22
3- NLPPlus is the first 100% customizable NLP package for Python. NLPPlus
4- uses the [ open-source NLP Engine] ( https://github.com/VisualText/nlp-engine ) .
5- Unlike other NLP packages which are black boxes, NLPPlus analyzers are
6- 100% NLP++ code that can be modified. NLPPlus comes with five starter
7- analyzers: telephone numbers, links, emails, and a full English parser.
3+ ## <span style='color:red'>READ FIRST</span>
4+
5+ Current NLP Python packages are intended to be plug-and-play
6+ systems that perform natural language tasks without modification. The
7+ problem is that when these systems ultimately fail in critical situations,
8+ coders are left with no real way to fix them, and the systems are quickly
9+ abandoned.
10+
11+ The root cause is that almost all of these packages rely on statistical
12+ methods such as machine learning or neural networks or, in the simpler
13+ cases, on Regex. Statistical systems cannot logically be corrected, and Regex
14+ is extremely limited, unreadable, and impossible to maintain or extend.
15+ Plus, these systems offer little if any means to modify them, even though
16+ every NLP task is slightly different in important ways.
17+
18+ The NLPPlus Python Package is different from all other NLP Python packages.
19+ All its analyzers are 100% human readable and modifiable code that allows
20+ any non-NLP coder to become a computational linguist using the NLP++ VSCode
21+ Language Extension appropriately called "VisualText". The VisualText
22+ extension allows for the visualization of any NLP process. Coders can "see"
23+ the syntactic parse tree along each step of the process, see rule matches
24+ directly in the text, and print out the knowledge base at any point in the
25+ process. Plus, dictionaries and knowledge bases are human readable, unlike
26+ JSON files or databases.
27+
28+ NLPPlus comes with five starter analyzers: telephone numbers, links, emails,
29+ addresses, and a full English parser. And because NLP++ is a glass box, all
30+ analyzers can easily be modified by any coder.
831
932If, for example, the telephone number analyzer is not working properly for your
1033application, you can use the [NLP++ VSCode extension](http://vscode.visualtect.org)
@@ -23,7 +46,7 @@ around the world are starting to use NLP++ to write human digital readers for
2346
2447* Python 3.10 or newer
2548
26- ## Installation
49+ ## <span style='color:orange'>Installation</span>
2750
2851### Future Installation (waiting for approval)
2952
@@ -63,9 +86,6 @@ shown in the filename, for instance, for Python 3.10 on Windows you
6386will see a file with a name like
6487`nlpplus-0.1.dev1+g55d691d-cp310-cp310-win_amd64.whl` - the `cp310`
6588means Python 3.10. For Python 3.12 it would be `cp312`, and so forth.
66- You can install this file with ` pip ` :
67-
68- pip install nlpplus-0.1.2-cp310-cp310-win_amd64.whl
6989
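The filename convention described above can be checked from Python before installing. The following is a minimal sketch, not part of NLPPlus: the helper names `wheel_python_tag` and `current_cpython_tag` are hypothetical, and the second helper assumes you are running CPython.

```python
import sys


def wheel_python_tag(wheel_filename: str) -> str:
    """Return the Python tag (e.g. 'cp310') from a wheel filename.

    Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl,
    so the Python tag is the third field from the right.
    """
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {wheel_filename}")
    stem = wheel_filename[: -len(".whl")]
    parts = stem.split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel name: {wheel_filename}")
    return parts[-3]


def current_cpython_tag() -> str:
    """Build the tag for the running interpreter (assumes CPython)."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


# Filename taken from the text above.
name = "nlpplus-0.1.dev1+g55d691d-cp310-cp310-win_amd64.whl"
print("wheel is built for:", wheel_python_tag(name))
print("this interpreter is:", current_cpython_tag())
print("match:", wheel_python_tag(name) == current_cpython_tag())
```

If the two tags differ, look for the wheel whose tag matches your interpreter, or build from source.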
7090For specific instructions on setting up Python on your platform please
7191consult the Python documentation.
@@ -74,7 +94,49 @@ If your platform is not supported you can also compile it from source,
7494which will require a working C++ compiler. See the platform specific
7595instructions below for the requirements to build.
7696
77- ## Using the Library
97+ ## <span style='color:green'>Why Use NLP++?</span>
98+
99+ There are many reasons to consider using NLP++. Whether you want to
100+ write Regex-like rule patterns, to modify 100% of the NLP code, or
101+ to visualize the NLP analyzer in an intuitive way, NLP++ should be
102+ in every coder's and programmer's
103+ toolkit.
104+
105+ To put it simply, NLP++ turns any coder or programmer into an NLP
106+ engineer.
107+
108+ ### 1000 Times Better than Regex
109+
110+ For matching patterns in text, NLP++ is a Regex killer. The rule
111+ matching system in NLP++ is human readable and is performed by calling
112+ rules in a sequence, making creating and debugging rule-based patterns
113+ a breeze. Along with
114+
115+ ### 100% Modifiable
116+
117+ The main reason to use NLP++
118+ is to engineer an NLP system for a specific task. Almost all extraction
119+ or understanding tasks in NLP require specific processing that is never
120+ included in "generic" systems. NLP++ allows for the creation or
121+ modification of any NLP++ system.
122+
123+ It must be emphasized that what separates NLPPlus from all the other
124+ NLP packages in Python is the fact that all parsers are 100% modifiable
125+ using the VSCode NLP++ Language Extension. Other NLP packages use regex
126+ patterns which are impossible to modify or use trained machine learning
127+ or neural network systems which cannot be fixed when
128+
129+ ### VisualText Editor
130+
131+ Writing an NLP system from scratch is thought to be for only those in
132+ computational linguistics. But VisualText, NLP++, and the conceptual
133+ Grammar change all that.
134+
135+ Taking full advantage of the familiar VSCode environment, the NLP++
136+ language extension makes NLP a visual and logical process that
137+ is easy to understand.
138+
139+ ## <span style='color:yellow;'>Using the NLPPlus Python Package</span>
78140
79141Very basic usage, which runs the default parser for US English and
80142returns parsing results as XML:
@@ -99,7 +161,53 @@ or JSON output from them:
99161 parsed_address = results.output["email_address"][0]
100162 parse_tree = results.final_tree
101163
102- ## NLP++ Development
164+ ### NLPPlus Engine Functions
165+
166+ #### set_analyzer_folder(analyzer_folder_path: str)
167+ This is used to set the folder where your analyzers are located.
168+
169+ #### analyze(text: str, parser: str = "parse-en-us"): str
170+ This calls one of the analyzers in the analyzer folder on the text.
171+ If the analyzer folder was not set, it will use the library analyzers
172+ that come with NLPPlus. It is recommended that you use the function
173+ copy_library_analyzers to copy the analyzers to avoid having them
174+ overwritten when a new version of NLPPlus is installed.
175+
176+ The analyze function returns a results object that makes the analyzer
177+ output files easily accessible to Python (see the results below).
178+
179+ #### copy_library_analyzers(self, to_dir: str, overwrite: bool=True)
180+ This function copies the NLPPlus library analyzers into a safe
181+ folder away from where they can be overwritten by newer versions
182+ of the NLPPlus package. This allows coders to edit and modify the
183+ analyzers to their liking. Remember to call set_analyzer_folder
184+ if you want to call your versions of these library analyzers
185+ using the NLPPlus package.
186+
187+ #### input_text(analyzer_name: str, file_name: str)
188+ When developing or editing NLP++ analyzers and calling them from
189+ Python, it is convenient to test your Python code on text you
190+ have used to develop your analyzer. This function retrieves the
191+ text from a file in the analyzer's input directory for easy
192+ access while developing your Python code in conjunction with
193+ an NLP++ analyzer.
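
The four functions above suggest a natural order of calls: copy the library analyzers, point the engine at the copy, load a sample text, then analyze it. The sketch below illustrates that order only. It does not import NLPPlus: `EngineLike`, `analyze_with_local_copy`, and `RecordingEngine` are hypothetical names, the stand-in class exists purely so the sketch runs on its own, and how the real engine object is obtained is not shown in this section. Passing `overwrite=False` to keep local edits is also an assumption about what that flag does.

```python
from typing import Any, List, Protocol, Tuple


class EngineLike(Protocol):
    """Hypothetical interface mirroring the functions documented above."""

    def set_analyzer_folder(self, analyzer_folder_path: str) -> None: ...
    def analyze(self, text: str, parser: str = "parse-en-us") -> Any: ...
    def copy_library_analyzers(self, to_dir: str, overwrite: bool = True) -> None: ...
    def input_text(self, analyzer_name: str, file_name: str) -> str: ...


def analyze_with_local_copy(
    engine: EngineLike, analyzers_dir: str, analyzer_name: str, sample_file: str
) -> Any:
    """Run an analyzer from a private copy of the library analyzers."""
    # Copy the bundled analyzers somewhere a package upgrade cannot touch.
    # Assumption: overwrite=False leaves already-edited copies alone.
    engine.copy_library_analyzers(analyzers_dir, overwrite=False)
    # Tell the engine to use the copied (and possibly edited) analyzers.
    engine.set_analyzer_folder(analyzers_dir)
    # Load a sample text from the analyzer's input directory.
    text = engine.input_text(analyzer_name, sample_file)
    # Run the chosen analyzer on that text.
    return engine.analyze(text, analyzer_name)


class RecordingEngine:
    """Stand-in that only records calls; it is NOT the NLPPlus engine."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, tuple]] = []

    def set_analyzer_folder(self, analyzer_folder_path: str) -> None:
        self.calls.append(("set_analyzer_folder", (analyzer_folder_path,)))

    def analyze(self, text: str, parser: str = "parse-en-us") -> Any:
        self.calls.append(("analyze", (text, parser)))
        return {"parser": parser, "text": text}

    def copy_library_analyzers(self, to_dir: str, overwrite: bool = True) -> None:
        self.calls.append(("copy_library_analyzers", (to_dir, overwrite)))

    def input_text(self, analyzer_name: str, file_name: str) -> str:
        self.calls.append(("input_text", (analyzer_name, file_name)))
        return "placeholder text"


engine = RecordingEngine()
# "my_analyzers" and "sample.txt" are placeholder names for illustration.
result = analyze_with_local_copy(engine, "my_analyzers", "parse-en-us", "sample.txt")
print([name for name, _ in engine.calls])
```

With the real package, the stand-in would be replaced by the actual engine object; check the method names and return types against the installed version before relying on this order.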
194+
195+ ### NLPPlus Engine Results
196+
197+ #### output
198+ This returns a JSON object based on the parsed output.json file
199+ produced by the analyzer. The analyzer has to purposely construct
200+ the output.json file for this to work.
201+
202+ #### output.json
203+ The output file produced by the analyzer, as a string rather than
204+ a JSON object. This file must be produced explicitly by the analyzer.
205+
206+ #### final.tree
207+ All analyzers output a final tree of the text that is being processed.
208+ This file is in the NLP++ tree format.
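
The difference between the raw `output.json` string and the parsed `output` object can be illustrated with the standard `json` module. This is a sketch under assumptions: the sample contents are invented for illustration (only the `email_address` key appears in this README), and `first_value` is a hypothetical helper, not an NLPPlus function.

```python
import json
from typing import Any

# Hypothetical contents of an analyzer's output.json, as a string.
raw_output_json = '{"email_address": ["someone@example.com"]}'

# Parsing the string gives a dictionary similar in shape to `output`.
output = json.loads(raw_output_json)


def first_value(parsed: dict, key: str, default: Any = None) -> Any:
    """Return the first item stored under `key`, or `default` if absent or empty."""
    values = parsed.get(key)
    if isinstance(values, list) and values:
        return values[0]
    return default


print(first_value(output, "email_address"))
print(first_value(output, "phone_number", default="(none found)"))
```

Guarding the lookup this way avoids an exception when an analyzer did not write the expected key.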
209+
210+ ## <span style='color:orange'>NLP++ Development</span>
103211
104212By default the `NLPPlus` module will create a temporary working
105213directory with the default parser and the small set of analyzers