This repository contains AI projects, with screenshots/GIFs and source code.
Thesis - in progress...
Information Retrieval and Text Mining - Romanian Language Information Retrieval System
Java + Lucene
The project revolves around two classes:
1. Indexer - starts from a set of documents, passed as a parameter to the main function (args[0]), and creates an "inverted index" that it saves in the folder ".\index". It reads the documents using the DocumentReader class (which uses Tika) and saves them as txt documents (containing exactly the same information) in a temporary folder. The "inverted index" is built from that folder and saved to disk, and the temporary folder is then deleted.
2. Searcher - starts from the previously created "inverted index" and a search string, also called a query. It returns the documents that are relevant to the search string, ranked by a confidence score.
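The inverted-index idea behind the Indexer and the Searcher can be sketched in a few lines. This is a toy illustration in Python, not the project's Java + Lucene code: the function names (`tokenize`, `build_index`, `search`), the sample documents, and the plain term-count score are all assumptions made for the example. Lucene's analysis and scoring are considerably more sophisticated.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (hypothetical, simplistic analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: term -> {document id: number of occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Return (document id, score) pairs, best match first.

    The score here is just the summed term count, an assumption for this sketch.
    """
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return scores.most_common()


# Made-up sample documents, for illustration only.
docs = {
    "a.txt": "legea se poate modifica",
    "b.txt": "modifica modifica textul",
}
index = build_index(docs)
results = search(index, "modifica")
print(results)
```

The index is built once and then reused for every query, which mirrors the split between the indexing command and the search command.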
How to run the code:
- Open a terminal in the P1 folder and run the document indexing command:
java -Dfile.encoding=UTF-8 -classpath ".\out\production\P1;.\dependencies\lucene-core-8.6.3.jar;.\dependencies\tika-app-1.24.1.jar;.\dependencies\pdfbox-app-2.0.21.jar;.\dependencies\lucene-queryparser-8.6.3.jar;.\dependencies\lucene-analyzers-common-8.6.3.jar" com.main.Indexer ".\docs"
- After indexing, search the index with the following command (the project was run with Java 15):
java -Dfile.encoding=UTF-8 -classpath ".\out\production\P1;.\dependencies\lucene-core-8.6.3.jar;.\dependencies\tika-app-1.24.1.jar;.\dependencies\pdfbox-app-2.0.21.jar;.\dependencies\lucene-queryparser-8.6.3.jar;.\dependencies\lucene-analyzers-common-8.6.3.jar" com.main.Searcher "to modify"
Information Retrieval and Text Mining - Lyrics-Based Music Genre Classification
Python
Music genre classification, especially using lyrics alone, remains a challenging topic in Music Information Retrieval. In this project I apply several methods to classify a large dataset of intact song lyrics.
Deep Learning - Deep Hallucination Classification
Python
A deep hallucination classification challenge in which I train deep classification models on a dataset of images generated by deep generative models.
The analysis report and explanations can be found here
Natural language processing 2 - Fake News Detection
Python
Fake news is spreading rapidly in our society, leading people to believe in events that never happened. It is therefore essential to distinguish real news from fake news and to present that distinction to society.
The analysis report and explanations can be found here
Contributors:
- Zugravu Andrei
- Calinescu Valentin
Other Classes
- Computer Vision – Video analysis of snooker footage
- Computer Vision – Automatic grading of multiple choice tests
Practical Machine Learning - Classify gestures by reading muscle activity
Python
The dataset contains recordings of human hand muscle activity while performing four different hand gestures. The goal is to classify the gesture from the muscle readings.
The analysis report and explanations can be found here
Practical Machine Learning - Suicide Rates Overview 1985 to 2016
Python
Compares socio-economic information with suicide rates by year and country, from 1985 to 2016.
The analysis report and explanations can be found here
Programming efficient algorithms - On the Decision Tree Complexity of String Matching
A natural problem is to determine the number of characters that need to be queried (i.e. the decision tree complexity) in a string in order to decide whether this string contains a certain pattern. Rivest showed that for every pattern p, in the worst case any deterministic algorithm needs to query at least n − |p| + 1 characters, where n is the length of the string and |p| is the length of the pattern.
The analysis report and explanations can be found here
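For very short strings, the decision tree complexity can be computed by brute force, which makes the n − |p| + 1 bound easy to check by hand. The sketch below is only an illustration under stated assumptions, not the method from the report: the function name `decision_tree_complexity` and the binary alphabet are choices made for this example, and the exhaustive search is feasible only for small n.

```python
from functools import lru_cache
from itertools import product


def decision_tree_complexity(pattern, n, alphabet="01"):
    """Worst-case number of character queries an optimal deterministic
    strategy needs to decide whether a length-n string contains `pattern`.

    Brute force over all partial assignments; only usable for small n.
    """

    @lru_cache(maxsize=None)
    def cost(partial):
        # Positions that have not been queried yet.
        unknown = [i for i, c in enumerate(partial) if c is None]

        # Collect the answers over every completion of the unknown positions.
        answers = set()
        for fill in product(alphabet, repeat=len(unknown)):
            chars = list(partial)
            for i, c in zip(unknown, fill):
                chars[i] = c
            answers.add(pattern in "".join(chars))

        # If every completion agrees, no further query is needed.
        if len(answers) == 1:
            return 0

        # Otherwise pick the best position to query, assuming the worst reply.
        return min(
            1 + max(cost(partial[:i] + (c,) + partial[i + 1:]) for c in alphabet)
            for i in unknown
        )

    return cost((None,) * n)


print(decision_tree_complexity("01", 3))  # 2, equal to n - |p| + 1
print(decision_tree_complexity("11", 3))  # 3, above the bound of 2
```

The two printed cases show that the lower bound is tight for some patterns ("01") and not for others ("11").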
Syntactic Modeling of Biological Systems - Learning representations of microbe–metabolite
Microbe-metabolite relationships are essential for the study of the microbiome. A new method is introduced for analyzing these relationships. It is based on a technique not used so far in the study of microbe-metabolite interactions, namely machine learning. The method is validated through 5 experiments: two on cystic fibrosis in the lung, one on the wetting of the biocrust, one analyzing the impact of a high-fat diet in a murine model, which identifies a bacterium responsible for the excess production of a new bile acid, and one on inflammatory bowel disease and the colon, which identifies a bacterium that was not initially associated with this disease in the Human Microbiome Project. The method performs better at analyzing microbe-metabolite interactions than previous methods, which are purely statistical.
The analysis report and explanations can be found here