Create a pull request #12

Subeen-Kim · 2017-10-15T19:24:57Z

No description provided.

mightydeveloper · 2017-10-16T02:39:22Z

README.md

+# commonwords.py
+Required packages:
+
+$ pip install wikipedia


I would add more documentation to the README file.
Besides the required packages, you could include a description of the project, relevant links, and usage instructions.

mightydeveloper · 2017-10-16T02:46:55Z

commonwords.py

+    for word in text_1:
+        geometric_mean = math.sqrt(histogram_1.get(word, 0)*histogram_2.get(word, 0))
+        if geometric_mean != 0:
+            if word not in preposition and word not in pronoun and word not in article and word not in commonverb and word not in alphabet and word not in conjunction and word not in number:


stopwords = preposition+pronoun+number+article+commonverb+alphabet+conjunction
Writing "if word not in stopwords" would be more concise. I would combine them into one list before using it on line 63. Also, some libraries already ship lists of commonly ignored words, called "stopwords". If you use one of those libraries, you don't have to write out all the words yourself, which might save you time.
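
As a rough sketch of the combined-list idea: the word lists below are short placeholders rather than the ones in commonwords.py, and is_content_word is a hypothetical helper name. Using a set instead of a list is an assumption on my part, since membership tests on a set are faster.

```python
# Placeholder word lists; the real ones in commonwords.py are longer.
preposition = ['of', 'in', 'on']
pronoun = ['he', 'she', 'it']
article = ['a', 'an', 'the']

# Build the combined collection once, outside the loop.
stopwords = set(preposition + pronoun + article)


def is_content_word(word):
    """Return True if word is not in the combined stopword set."""
    return word not in stopwords


words = ['the', 'history', 'of', 'python']
content_words = [w for w in words if is_content_word(w)]
print(content_words)
```

The long chain of "not in" checks then collapses into a single membership test.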

mightydeveloper · 2017-10-16T02:51:56Z

commonwords.py

+    text = text.replace(')','')
+    text = text.replace("'",'')
+    text = text.replace(".",'')
+    text = text.replace(",",'')


This is just a minor suggestion, but if you want to make the code a bit more concise and easier to extend,
you might want to consider

for x, y in [('-',' '), ('(',''), (')',''), ("'",''), (".",''), (",",'')]: text = text.replace(x, y)
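
The same loop, wrapped in a function so it can be tried on its own. This is only a sketch: strip_punctuation is a hypothetical name, and the sample sentence is made up for illustration.

```python
def strip_punctuation(text):
    """Apply each (old, new) replacement to text, in order."""
    replacements = [('-', ' '), ('(', ''), (')', ''),
                    ("'", ''), ('.', ''), (',', '')]
    for old, new in replacements:
        text = text.replace(old, new)
    return text


print(strip_punctuation("well-known (e.g.) it's fine, really."))
# prints: well known eg its fine really
```

Adding another character to strip then only means adding one pair to the list.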

mightydeveloper · 2017-10-16T02:55:43Z

commonwords.py

+            count[word] = 1
+        else:
+            count[word] += 1
+    return count


(Also a minor suggestion)
The Python standard library has a special dictionary called Counter.
By default, a missing key has a count of 0, so you don't need to check whether the word is already in the dictionary.
Equivalent code using Counter would look like the following:

from collections import Counter
count = Counter()
for word in unsorted_words:
    count[word] += 1
return count
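
For reference, here is a self-contained version of the Counter approach that can be run directly. It is a sketch that assumes the function is named histogram and takes a list of words; the sample input is invented.

```python
from collections import Counter


def histogram(unsorted_words):
    """Count how many times each word occurs."""
    count = Counter()
    for word in unsorted_words:
        # Missing keys read as 0, so no membership check is needed.
        count[word] += 1
    return count


words = ['wiki', 'page', 'wiki']
result = histogram(words)
print(result['wiki'], result['page'], result['missing'])
```

Passing the list straight to the constructor, Counter(words), gives the same counts without an explicit loop.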

mightydeveloper · 2017-10-16T02:56:37Z

datafromwiki.py

+            count[word] += 1
+    return count
+
+#histogram(text_to_word('Subeen-is-(an)-idiot'))


You can remove unnecessary commented-out code like this before submitting.

mightydeveloper · 2017-10-16T03:11:37Z

datafromwiki.py

+        geometric_mean = math.sqrt(histogram_1.get(word, 0)*histogram_2.get(word, 0))
+        if geometric_mean != 0:
+            common_count[word] = geometric_mean
+        else:


The else branch is unnecessary here.
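
A sketch of the loop with the else branch dropped, assuming that branch did nothing. The function name common_words is hypothetical, and iterating over the first histogram's keys is an assumption, not necessarily how datafromwiki.py does it.

```python
import math


def common_words(histogram_1, histogram_2):
    """Map each shared word to the geometric mean of its two counts."""
    common_count = {}
    for word in histogram_1:
        geometric_mean = math.sqrt(
            histogram_1.get(word, 0) * histogram_2.get(word, 0))
        # Words missing from either histogram give 0 and are skipped;
        # no else branch is required for that.
        if geometric_mean != 0:
            common_count[word] = geometric_mean
    return common_count


print(common_words({'data': 4, 'only': 1}, {'data': 9}))
# prints: {'data': 6.0}
```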

mightydeveloper · 2017-10-16T03:18:48Z

Overall, I think you did a great job of documenting your functions with docstrings, but the README file could still be improved.

Subeen-Kim added 3 commits October 9, 2017 18:49

temporary submission

b7ef181

Turning in miniproject 3

a5aac27

revised version

1bdcf4a

mightydeveloper reviewed Oct 16, 2017
