Week of 7/15/2024 #1309

Open
15 tasks done
rburghol opened this issue Jul 15, 2024 · 10 comments
@rburghol
Contributor

rburghol commented Jul 15, 2024

@mwdunlap2004
Collaborator

I added a couple of files to the master branch. The one called mon_lm_analysis.r uses the new mon_lm functions I made to write out a csv of the stats, which can then be used by the plot_save file I added. plot_save takes the stats, dataset name, label name, and the write location and makes a png. The updated mon_lm_stats and mon_lm_plot functions are both in the lm_analysis_plots_copy.R file I added as well, and they also work with daily data. Right now I have their source set as the original file, so they won't work as-is, but I figured it would be better for us to edit the original if we like the edits than to try to point everything at this new location.

@mwdunlap2004
Collaborator

I ran all of the methods for all three datasets for the 01665500 gage, which is the Rapidan River. There were a few issues with my methods (we didn't have the week column in nldas2), and I had to adjust my function calls because they weren't picking up the most up-to-date versions. But our methods work, and I was able to make plots and stat csvs for all three datasets relatively easily.

@mwdunlap2004
Collaborator

Screenshot 2024-07-17 at 11 29 39 AM
Screenshot 2024-07-17 at 11 30 01 AM
Screenshot 2024-07-17 at 11 31 51 AM
This is what the error looks like on my end when I try to convert data_lm into a JSON

@COBrogan
Collaborator

COBrogan commented Jul 17, 2024

Okay. I'm guessing that error results from trying to export the R6 class that we created as plotBin. Per the documentation for toJSON:

Description
Convert an R object into a corresponding JSON object
Lists with unnamed components are not currently supported
Usage
toJSON( x, indent=0, method="C" )
Arguments
x a vector or list to convert into a JSON object

So, we can probably only export the lists within the object. We could restructure plotBin at this point and just make it a list. It no longer needs the full R6 functionality because it is only storing data and not the plot. This would likely get around this issue for us.
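
A minimal sketch of that restructuring, assuming plotBin only has to carry data. The field names mirror the ones in plotBin (data, atts, r_col), the sample values are made up, and jsonlite is an assumption here rather than the package the thread has settled on. The export step only runs if jsonlite is installed.

```r
# A plain named list standing in for the R6 plotBin object.
# Field names mirror plotBin; the values are made-up sample data.
plot_bin <- list(
  data  = data.frame(month = 1:3, flow = c(10.5, 12.1, 9.8)),
  atts  = list(rsq = c(0.81, 0.64, 0.72)),
  r_col = "flow"
)

# Every element is a data frame, list, or vector, so a JSON encoder can walk it.
str(plot_bin, max.level = 1)

# Assumption: jsonlite is available. Skip the export if it is not installed.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_txt <- jsonlite::toJSON(plot_bin, pretty = TRUE)
  restored <- jsonlite::fromJSON(json_txt)
  print(restored$atts$rsq)
}
```

Since nothing in the list is an environment or a function, the same structure should also be easy to flatten into a csv later if JSON turns out not to be needed.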

@COBrogan
Collaborator

@mwdunlap2004 per our discussion, check out this example for lists. I was indexing my list incorrectly during the meeting. Note the use of [[i]] instead of [i] when trying to get a list element! Maybe this will help you, maybe not. As long as we get the residual plot at the end of the day, feel free to write out the data using any format you want. The below loop will generate unique data in each loop using rnorm. It will then store the full lm model, the full data, the rsq, and the stats all in testList!

testList <- list(lms = list(),stats=list(),rsq = numeric(),data=list())
for(i in 1:12){
  testDF <- data.frame(1:50,rnorm(50))
  testLM <- lm(testDF$rnorm.50.~testDF$X1.50)
  testStats <- summary(testLM)
  rsq <- testStats$adj.r.squared
  
  testList$lms[[i]] <- testLM
  testList$stats[[i]] <- testStats
  testList$rsq[i] <- rsq
  testList$data[[i]] <- testDF
}

testList$data[[1]]
testList$rsq
testList$stats
class(testList$lms[[1]])
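
The [[i]] versus [i] point is easy to check in isolation. This is a small base R sketch using the built-in cars dataset, not code from our scripts:

```r
x <- list(a = 1:3, b = "text")

# Single brackets return a sub-list: still a list, just of length 1.
class(x[1])    # "list"
# Double brackets return the element itself.
class(x[[1]])  # "integer"

# Assigning with [[i]] stores a whole object (here an lm) in slot i.
models <- list()
models[[1]] <- lm(dist ~ speed, data = cars)
class(models[[1]])  # "lm"
class(models[1])    # "list"
```

So anything that expects the model itself, such as summary() or residuals(), needs models[[1]], not models[1].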

HOWEVER, this STILL can't be written to json. Apparently I was at least partially incorrect before. The lm objects themselves are not plain lists of vectors. They are S3 objects of class "lm" that also hold things like the model call and terms, and the same goes for summary(lm). So, these can't be written to JSON directly... Instead you may need to just store the data we want and write it out. See below for how json may save some time compared to write.csv():

testList <- list(resid = list(),
                 fitted = list(),
                 coeff=list(),rsq = numeric(),data=list())
for(i in 1:12){
  testDF <- data.frame(1:50,rnorm(50))
  testLM <- lm(testDF$rnorm.50.~testDF$X1.50)
  testStats <- summary(testLM)
  rsq <- testStats$adj.r.squared
  
  testList$resid[[i]] <- testLM$residuals
  testList$fitted[[i]] <- testLM$fitted.values
  testList$coeff[[i]] <- testStats$coefficients
  testList$rsq[i] <- rsq
  testList$data[[i]] <- testDF
}

#testList has no "stats" element in this version, so convert the whole list
json <- toJSON(testList)

@mwdunlap2004
Collaborator

I figured out a way, with Connor's help, to adjust mon_lm to create a JSON. Right now the only issue on my end is getting the month and rsq lists out of it to make the rsq plots we use.
Screenshot 2024-07-18 at 3 40 11 PM
Screenshot 2024-07-18 at 3 39 48 PM
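
One possible way to line up month and rsq, sketched under the assumption that rsq is stored by position with one value per month, as in the for(i in 1:12) loops above. The names rsq_by_month, rsq_df, and plot_rsq are hypothetical and the data is dummy data:

```r
set.seed(1)  # reproducible dummy data

# Dummy stand-in for the monthly loop: one adjusted R-squared per month,
# stored by position (1 = January, ..., 12 = December).
rsq_by_month <- numeric(12)
for (i in 1:12) {
  df <- data.frame(x = 1:50, y = rnorm(50))
  rsq_by_month[i] <- summary(lm(y ~ x, data = df))$adj.r.squared
}

# Pair each value with its month so the plot has an explicit x variable.
rsq_df <- data.frame(
  mo    = 1:12,
  month = month.abb,
  rsq   = rsq_by_month
)

# Hypothetical helper: bar chart of R-squared by month on the current device.
plot_rsq <- function(rsq_df) {
  barplot(rsq_df$rsq, names.arg = rsq_df$month,
          xlab = "Month", ylab = "Adjusted R-squared")
}
```

Calling plot_rsq(rsq_df) draws the chart. Keeping the table separate from the plotting step means the same rsq_df could be written out with write.csv() as well.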

@rburghol
Contributor Author

@COBrogan @mwdunlap2004 It looks like the jsonlite package will serialize R6 objects. https://rdrr.io/cran/jsonlite/man/serializeJSON.html
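
A hedged sketch of a serializeJSON() round trip, assuming jsonlite is installed. It only round-trips a plain list of model outputs built from dummy data. Whether the full plotBin R6 object survives the same call is not checked here.

```r
# Plain list of model outputs (dummy data), similar in shape to testList.
fit <- lm(b ~ a, data = data.frame(a = 1:5, b = c(2.1, 3.9, 6.2, 7.8, 10.1)))
payload <- list(
  coeff = coef(fit),
  resid = residuals(fit),
  rsq   = summary(fit)$adj.r.squared
)

# Assumption: jsonlite is available. Nothing runs if it is not installed.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  # serializeJSON() stores types and attributes, not just values.
  json_txt <- jsonlite::serializeJSON(payload, digits = 10)
  restored <- jsonlite::unserializeJSON(json_txt)
  # Compare with a tolerance, since numbers are written with limited digits.
  print(all.equal(restored$coeff, payload$coeff, tolerance = 1e-6))
}
```

The output is much more verbose than toJSON(), because every attribute is written out, so it is better suited to storage than to reading by eye.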

@mwdunlap2004
Collaborator

That worked! I pushed the changes to harp archive. Our mon_lm_analysis now outputs both the full JSON and the csv of our stats. I'm not sure whether at a later date we would want to get the stats from the JSON, but right now it just outputs both since that seemed easier.

@COBrogan
Collaborator

COBrogan commented Jul 25, 2024

@mwdunlap2004 @rburghol @ilonah22 I know we said we should just move on from writing out the R6 plotBin object, but it was really irking me. So I found an approach via serialization. I will warn everyone, the file itself is pretty ugly. It's far from "pretty" JSON. It's just row after row of bytes ("raw" class in R). But it works! It lets us write out the entire R6 object and read it back in! And the serialization itself doesn't use any packages. Food for thought! Stolen mostly from here

#R6 provides R6Class()
library(R6)
#Dummy data
test <- data.frame(a=1:3,b=4:6)
#LM for the dummy data
testLM <- lm(b ~ a,data = test)
#Our R6 Class plotBin
plotBin <- R6Class(
     "plotBin", 
     public = list(
         plot = NULL, data=list(), atts=list(), r_col='',
         initialize = function(plot = NULL, data = list()){ 
             #R6 fields are set with self$, not self.
             self$plot <- plot; self$data <- data
           }
       )
   )
#Pulled from lm_analysis_plots: populate a new plotBin R6 with data. Add a list
#for lms and put a lm in there
sample_data <- test
plot_out <- plotBin$new(data = sample_data)
plot_out$atts$lms <- list()
#Store a few regressions using dummy data
plot_out$atts$lms[[1]] <- lm(b ~ a, data = test)
plot_out$atts$lms[[2]] <- lm(a ~ b, data = test)

#A file path to write out the file
fname <- "testser.txt"
#Opens a connection to the file path and writes the data directly. Simplifies
#the formatting for these "raw" bytes that we will create
outCon <- file(fname, "w")
#Serialize plot_out as ascii bytes and convert to Char (character).
mychars <- rawToChar(serialize(plot_out, NULL, ascii=TRUE))
# Write directly to the file using the connection outCon from above
cat(mychars, file=outCon)
#Close outCon, essentially saving the file in fname
close(outCon)

#Now, read in fname via readChar. Then convert to raw, which is expected by
#unserialize
testUnser <- charToRaw(readChar(fname, file.info(fname)$size))
#Voila, our R6 object is here!
unserializedData <- unserialize(testUnser)
unserializedData$atts$lms

Outputs:

> unserializedData
<plotBin>
  Public:
    atts: list
    clone: function (deep = FALSE) 
    data: data.frame
    initialize: function (plot = NULL, data = list()) 
    plot: NULL
    r_col: 
> unserializedData$atts$lms
[[1]]

Call:
lm(formula = b ~ a, data = test)

Coefficients:
(Intercept)            a  
          3            1  


[[2]]

Call:
lm(formula = a ~ b, data = test)

Coefficients:
(Intercept)            b  
         -3            1  
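
For comparison, base R also has saveRDS() and readRDS(), which write a single R object to a file and restore it. This is a different technique from the text serialization above: the file is binary and compressed by default, so it can't be read by eye. The sketch below uses a plain list and a temporary file rather than plotBin:

```r
# Dummy data and model, as in the example above.
test <- data.frame(a = 1:3, b = 4:6)
bin <- list(data = test, lms = list(lm(b ~ a, data = test)))

# saveRDS() serializes one R object to a file; readRDS() restores it.
rds_path <- tempfile(fileext = ".rds")
saveRDS(bin, rds_path)
bin_back <- readRDS(rds_path)

# The restored model still behaves like an lm object.
coef(bin_back$lms[[1]])
```

This trades the text file for fewer lines of code. Which one fits better probably depends on whether anything outside R ever needs to open the file.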

@rburghol rburghol changed the title Week of 7/15/2024 Week of 7/15/2024 and 7/22/2024 Jul 26, 2024
@rburghol rburghol changed the title Week of 7/15/2024 and 7/22/2024 Week of 7/15/2024 Jul 26, 2024
@mwdunlap2004
Collaborator

I changed our mon_lm_analysis function to include Connor's method for creating the JSON. It is super ugly, like he said it would be, but reading it into the residual plot function I made seems to work. This is what the code looks like. I think variable names could be improved, but the code works, and we can still output both the stats and the JSON right now.
Screenshot 2024-07-29 at 9 09 06 AM
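
Since the code above is only in a screenshot, here is a rough sketch of the same idea and not the actual mon_lm_analysis code: write a list holding an lm with the text-serialization approach, read it back, and build a residual plot from it. The helper name resid_table, the file paths, and the dummy data are all hypothetical.

```r
# Dummy model stored in a plain list and written as serialized text.
test <- data.frame(a = 1:20, b = 2 * (1:20) + sin(1:20))
bin <- list(lms = list(lm(b ~ a, data = test)))
fname <- tempfile(fileext = ".txt")
cat(rawToChar(serialize(bin, NULL, ascii = TRUE)), file = fname)

# Read the file back and rebuild the object.
bin_back <- unserialize(charToRaw(readChar(fname, file.info(fname)$size)))

# Hypothetical helper: fitted values and residuals for one stored model.
resid_table <- function(model) {
  data.frame(fitted = unname(fitted(model)), resid = unname(residuals(model)))
}
resid_df <- resid_table(bin_back$lms[[1]])

# Write the residual plot to a PNG when that device is available.
if (capabilities("png")) {
  png_path <- tempfile(fileext = ".png")
  png(png_path)
  plot(resid_df$fitted, resid_df$resid, xlab = "Fitted", ylab = "Residual")
  abline(h = 0, lty = 2)  # reference line at zero residual
  invisible(dev.off())
}
```

Because the whole lm comes back from the file, the residuals don't have to be stored separately. They can be recomputed from the restored model whenever a plot is needed.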

4 participants