Data analysis part 2 #173
## Looking at the data

The data will be split into different csv files, split by different data types.According to your setup, there will be up to 5 files:
You're missing a space after the full stop.
- homie_enum
- homie_color: contains rgb values for the smart lights
- homie_float: contains all metrics stored as floats (temperature)
- homie_integer: contains all metrics stored as integers (humidity %, battery level %)
`string` is also possible.
Here, we want to focus on the csvs containing floats and integers, as they contain the temperature/humdity data. Useful columns:
CSV files
- time: since epoch (unix epoch 1970). pandas handles this for us.
device_id
I guess this should also be a list entry.
- node_type: =="Mijia sensor" to select only the temperature/humidity sensor data
- node_name: nickname for the sensor (e.g., "living room")

There are between 4 and 10 data points per sensor per minute, depending on how often a sensor gets polled (~ 10K data points in a 24h period for a given sensor)
Depending on the `min_update_period_seconds` in mijia-homie.toml, really.
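For anyone following this thread, here is a minimal sketch of how the columns described in the doc might be used to load one of the CSVs and sanity-check the points-per-minute figure. The inline sample data, the `value` column name, the nanosecond timestamp unit, and the helper name `load_sensor_readings` are all assumptions for illustration and are not taken from the PR.

```python
import io

import pandas as pd

# Hypothetical stand-in for a homie_float export. The "value" column name
# and the nanosecond timestamps are assumptions, not taken from the PR.
SAMPLE_CSV = """time,device_id,node_type,node_name,value
1620000000000000000,dev1,Mijia sensor,living room,21.5
1620000015000000000,dev1,Mijia sensor,living room,21.6
1620000065000000000,dev1,Mijia sensor,living room,21.6
1620000000000000000,dev2,Smart light,hall lamp,0.0
"""


def load_sensor_readings(csv_source, time_unit="ns"):
    """Read a CSV and keep only the temperature/humidity sensor rows."""
    df = pd.read_csv(csv_source)
    # Convert the epoch timestamps into pandas datetimes.
    df["time"] = pd.to_datetime(df["time"], unit=time_unit)
    # Select the Mijia sensor rows, as the doc suggests.
    return df[df["node_type"] == "Mijia sensor"].reset_index(drop=True)


readings = load_sensor_readings(io.StringIO(SAMPLE_CSV))

# Count data points per sensor per minute, to compare against the rate
# expected from the configured polling period.
per_minute = readings.groupby(
    ["node_name", pd.Grouper(key="time", freq="1min")]
).size()
print(per_minute)
```

With real data, `csv_source` would be the path to the exported file, and `time_unit` may need adjusting if the export uses a different epoch resolution.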
@@ -7,7 +7,9 @@
 "outputs": [],
 "source": [
 "import pandas as pd \n",
-"import plotly.express as px\n"
+"import plotly.express as px\n",
+"from sklearn.preprocessing import StandardScaler\n",
hrm. I'm getting an error here. Trying to debug now.
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-2-186f7a1512d6> in <module>
1 import pandas as pd
2 import plotly.express as px
----> 3 from sklearn.preprocessing import StandardScaler
4 from sklearn.decomposition import PCA
ModuleNotFoundError: No module named 'sklearn'
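A quick way to confirm this kind of failure is an environment problem rather than a notebook problem is to ask the interpreter which top-level modules it can locate. The sketch below uses only the standard library; the helper name `missing_modules` is made up for illustration, and the list of names is simply the imports visible in the traceback above. Note that `sklearn` is the import name, while the distribution on PyPI is called `scikit-learn`.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be located for import.

    Only top-level names are supported here: find_spec() on a dotted name
    imports the parent package first and can raise if it is absent.
    """
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Top-level import names used by the notebook cell under review.
required = ["pandas", "plotly", "sklearn"]
missing = missing_modules(required)
if missing:
    print("Not importable in this environment:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Running this in the same kernel as the notebook shows whether the dependency made it into that virtualenv at all.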
@@ -10,6 +10,7 @@ ipykernel = "^5.5.3"
 pandas = "^1.2.4"
 plotly = "^4.14.3"
 nbstripout = "^0.3.9"
+sklearn = "^0.0"
https://pypi.org/project/sklearn/ says to use `scikit-learn` instead.
vscode also decided that it wanted to install `notebook` when I tried things out on a fresh virtualenv, but I can make a patch for that as a separate PR.
No description provided.