Utils package with commonly used functions for data science projects with Python

vichShir/datascience-utils-python

vichShir package

This repository contains common functions that I use in data science projects with Python.

Key Features

  • Utils
    • Download files
  • Machine Learning - Preprocessing
    • Normalize the inputs
    • Impute missing data
  • Text patterns
    • Name extractor
    • Email
    • Phone
    • Year
  • Webscraping
    • Extract the content (text) from websites
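To give a feel for what the "Text patterns" features cover, here is a minimal sketch of email and year matching using only the standard library `re` module. This is not the package's own API: the names `find_emails` and `find_years` and both regular expressions are hypothetical, chosen for illustration, and the patterns are deliberately simple rather than complete.

```python
import re
from typing import List

# Hypothetical patterns for illustration only; not taken from the vichShir package.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")  # four-digit years from 1900 to 2099


def find_emails(text: str) -> List[str]:
    """Return every substring that looks like an email address."""
    return EMAIL_RE.findall(text)


def find_years(text: str) -> List[str]:
    """Return every standalone four-digit year between 1900 and 2099."""
    return YEAR_RE.findall(text)


sample = (
    "Contact ana@example.com or joao.costa@example.org; "
    "founded in 1998, relaunched in 2021."
)
print(find_emails(sample))
print(find_years(sample))
```

The word boundaries in the year pattern keep it from matching digits inside longer numbers such as IDs or phone numbers.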

How to use

To install this package, you'll need pip and Git installed on your computer (pip uses Git to fetch the repository). From your command line:

# Update pip
pip install --upgrade pip

# Install the latest master of vichShir
pip install git+https://github.com/vichShir/datascience-utils-python.git
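After the commands above finish, one way to confirm that the package is importable is a quick check from Python. This is a small sketch using the standard library; the helper name `is_installed` is hypothetical, and the module name `vichshir` is taken from the import shown in the example in this README.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Prints True when the package is visible to the current interpreter.
print(is_installed("vichshir"))
```

If this prints False, the package was probably installed into a different Python environment than the one running the check.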

Examples

NER - Name Extractor

from vichshir.cleaning.text_matching.nlp import NameExtractor

# Sample text in Portuguese that mentions three people by name
txt_person = '''
Existem muitos sistemas de ERP. Thiago Fulano da Silva é CTO e desenvolvedor de um poderoso sistema de ERP, também coordena uma equipe, João Sicrano da Costa e Pedro Beltrano.
'''

# Create the extractor and pull the person names out of the text
extractor = NameExtractor()
persons = extractor.extract_names(txt_person)
print(persons)

Credits

This software uses the following open source packages:

  • pandas
  • NumPy
  • scikit-learn
  • Transformers
  • Beautiful Soup

License

Apache 2.0
