Design `nlp` submodule and skeleton of drift detection methods #152

Anmol-Srivastava · 2023-08-03T15:13:58Z

Task

NLP-based drift detection algorithms do not always fit the data-drift or concept-drift definitions, so they can live in a separate submodule. A basic skeleton for a language- or text-based algorithm can be built there.

Impact

This makes it easier to implement specific algorithms later on.

Anmol-Srivastava · 2023-08-03T15:19:05Z

Worth thinking about returning to the pipeline idea:

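class NLPMethod:
    def __init__(self, operators):
        self.operators = operators

    def run(self, data):
        # pipe(x, f, g) is assumed to compute g(f(x))
        return pipe(data, *self.operators)

n = NLPMethod(operators=[sklearn.some_preprocessor, transformers.some_transformer, some_encoder, some_evaluator])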
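A minimal runnable sketch of the pipeline idea, using only the standard library. The operators (`lowercase`, `tokenize`, `mean_token_count`) are hypothetical stand-ins for real preprocessors, transformers, encoders, and evaluators, and `pipe` is a small local helper defined here, not an existing project API.

```python
from functools import reduce
from typing import Any, Callable, Sequence


def pipe(data: Any, *operators: Callable[[Any], Any]) -> Any:
    """Feed data through each operator in order."""
    return reduce(lambda acc, op: op(acc), operators, data)


class NLPMethod:
    def __init__(self, operators: Sequence[Callable[[Any], Any]]):
        self.operators = list(operators)

    def run(self, data: Any) -> Any:
        return pipe(data, *self.operators)


# Hypothetical toy operators, for illustration only.
def lowercase(texts):
    return [t.lower() for t in texts]


def tokenize(texts):
    return [t.split() for t in texts]


def mean_token_count(token_lists):
    return sum(len(tokens) for tokens in token_lists) / len(token_lists)


method = NLPMethod(operators=[lowercase, tokenize, mean_token_count])
result = method.run(["Drift happens", "Text data changes over time"])
```

Keeping each operator a plain callable means any function or object with `__call__` can be dropped into the list without subclassing.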

Anmol-Srivastava · 2023-08-03T15:37:57Z

It is also worth exploring multi-threading, HPC, and GPU compatibility here. With a pipeline approach, several operators may be applied to the same data at a given stage, which is a good opportunity to demonstrate potential performance gains. We can use MD3 as a starting point.
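One possible shape for "several operators on the same data at one stage", sketched with the standard library's `concurrent.futures`. The helper `run_stage` and the toy operators are hypothetical. Threads mainly help when operators are I/O-bound or release the GIL (as many NumPy routines do), so a process pool or a GPU backend may suit CPU-bound work better.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def run_stage(
    data: Any,
    operators: Dict[str, Callable[[Any], Any]],
    max_workers: int = 4,
) -> Dict[str, Any]:
    """Apply every operator to the same data concurrently; results keyed by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(op, data) for name, op in operators.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stage: several summary statistics over the same token counts.
token_counts = [3, 5, 4, 8]
stage = {
    "mean": lambda xs: sum(xs) / len(xs),
    "max": max,
    "min": min,
}
stats = run_stage(token_counts, stage)
```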

anmol-srivastava-mitre · 2023-08-03T17:59:35Z

Also worth looking at iterators

anmol-srivastava-mitre · 2023-08-03T18:07:42Z

`step` below plays the role of `next(iter)`:

class FreeDetector:
    def step(self, inputs):
        data = pipe(inputs, *some_operators)
        state = ...  # pipeline of operators, e.g. divergence metrics
        self.state = state

    def run(self):
        for inputs in self.data:
            self.step(inputs)

anmol-srivastava-mitre · 2023-08-03T18:11:15Z

The above could simplify a joint interface for batch and stream data, and it could apply to NLP as well as other methods.
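A sketch of how the iterator idea might unify batch and stream inputs, assuming the detector wraps any iterable and exposes the Python iterator protocol. The constructor signature and the `running_max` update rule are hypothetical placeholders for a real state update, such as a divergence metric.

```python
from typing import Any, Callable, Iterable, Iterator


class FreeDetector:
    """Hypothetical detector whose step is driven by iteration."""

    def __init__(self, data: Iterable[Any], update: Callable[[Any, Any], Any], state: Any = None):
        self._data: Iterator[Any] = iter(data)  # a list (batch) or a generator (stream)
        self.update = update
        self.state = state

    def __iter__(self) -> "FreeDetector":
        return self

    def __next__(self) -> Any:
        item = next(self._data)  # raises StopIteration once the data is exhausted
        self.state = self.update(self.state, item)
        return self.state

    def run(self) -> Any:
        for _ in self:
            pass
        return self.state


def running_max(state, x):
    """Toy update rule: track the largest value seen so far."""
    return x if state is None else max(state, x)


def stream():
    yield from [0.1, 0.7, 0.3]


batch_result = FreeDetector([0.1, 0.7, 0.3], running_max).run()
stream_result = FreeDetector(stream(), running_max).run()
```

Because `run` only relies on iteration, the same detector code handles an in-memory batch and a lazily produced stream.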

Anmol-Srivastava added this to the Initial NLP Support milestone Aug 3, 2023

Anmol-Srivastava self-assigned this Aug 3, 2023

Anmol-Srivastava added the enhancement New feature or request label Aug 3, 2023

Anmol-Srivastava added the nlp Related to development of NLP capabilities label Aug 3, 2023

Anmol-Srivastava added this to To do in Ver 0.2.1 via automation Aug 8, 2023

Anmol-Srivastava moved this from To do to In progress in Ver 0.2.1 Aug 8, 2023

Anmol-Srivastava linked a pull request Oct 17, 2023 that will close this issue

Implement NLP transforms, KS alarm + detector #163

Open
