There's a Kaggle calling out for wtte-rnn weibull #45
Comments
Thank you @chris-english for the very kind, supportive words and the very interesting read. Remember, if you're not using a temporal model (i.e. RNN/CNN) you don't need a time dimension or masking, and most of your pains go away: all you need is to work with standard 2-D data. My general recommendation is to stick to my posted examples; daynebatten's way of shaping data is fundamentally different. I'm still waiting for a fun Kaggle challenge to apply WTTE to, but it usually ends with me not wanting to focus on feature engineering that day lol. Anyway, I'm a huge fan of Kaggle, so please ping me if you see some other cool challenge. Good luck!
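To make the 2-D (non-temporal) setting concrete: with no time axis, each row is just a (time, event-indicator) pair plus features, and the Weibull parameters can be fit with an ordinary censored log-likelihood. A minimal numpy sketch of the continuous-case log-likelihood — an illustration only, not wtte's own loss functions (which also include a discrete-time variant):

```python
import numpy as np

def weibull_loglik(t, u, a, b, eps=1e-9):
    """Censored continuous Weibull log-likelihood.

    t: time to event (or censoring time)
    u: 1 if the event was observed, 0 if right-censored
    a: scale (alpha), b: shape (beta)

    Observed:  log f(t) = log(b/a) + (b-1)*log(t/a) - (t/a)**b
    Censored:  log S(t) = -(t/a)**b
    """
    t = np.asarray(t, dtype=float) + eps   # guard log(0)
    ya = t / a
    return u * (np.log(b) - np.log(a) + (b - 1.0) * np.log(ya)) - ya ** b
```

With `a = b = 1` this reduces to the unit exponential, so the log-likelihood at `t = 1` is `-1` for both the observed and censored cases, which is a handy sanity check.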
Happy to see it, and soon to clone wtte 1.1.2. I'm rebuilding TensorFlow yet again, or rather Bazel is busy banging all my cores. But since I brought him up, I'll send along the last lecture by Parzen, of Parzen window(s) fame, for your consideration of time series, nonparametric statistics, and empirical distributions going forward, plus an interesting interaction over at the TensorFlow issues on a seemingly related matter from page 5 of Parzen.

Regarding the Kaggle: the data is rumored to be a time series without any of the decoration of timestamps and the like, so just an ordered sequence. Like yourself, I'm not a feature engineer, and I think zeros matter in a sparse matrix. That's the tone and insight I get from your thesis: predict non-events. Taking the train data into pandas, we get a mix of float64 columns followed by int64 columns and a subsequent float64. Summing the int64 columns and dividing by the float64 target, we generally see about 0.0565 on our calculator; the mix of data also contains the inverses, at 0.9435. The first number generally makes sense to me as value in a bank setting rendered to basis points, where you'd want your value to be above your cost of capital. I look at the 0.9435 as the bonehead option: were one to respond to every offer, the bank would have 0.9435 of your money and you'd be left with 0.0565. Beware of banks bearing gifts. In the course of this month Santander mailed me an offer to pay a bounty of $250 were I to open a checking account with them. But this could just as well be the set of customers who are self-described "insurance poor", meaning they have every insurance product known and all free cash flow goes to premiums, less 0.0565.

The test distribution is left-truncated when laid over train, which to my mind is where the one-sample proportion diverges from the empirical and a 'model' is born. Certainly an argument is being made. But I conjecture. 5715 objects compiled by Bazel, so a couple of thousand to go, and off to play.
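Since Parzen windows come up: the classic Parzen-window density estimate just averages a kernel centered on each sample. A self-contained numpy sketch with a Gaussian kernel (the bandwidth `h` is a free choice here, not something from the thread):

```python
import numpy as np

def parzen_kde(x, samples, h):
    """Parzen-window (Gaussian-kernel) density estimate evaluated at x.

    x: points where the density is estimated
    samples: observed data points
    h: bandwidth (window width)
    """
    x = np.asarray(x, dtype=float)[:, None]          # (n_eval, 1)
    s = np.asarray(samples, dtype=float)[None, :]    # (1, n_samples)
    # Gaussian kernel evaluated at every (eval point, sample) pair
    k = np.exp(-0.5 * ((x - s) / h) ** 2) / np.sqrt(2.0 * np.pi)
    return k.mean(axis=1) / h                        # average over samples
```

For a single sample at 0 with `h = 1`, the estimate at 0 is exactly the standard normal peak, `1/sqrt(2*pi) ≈ 0.3989`.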
Perhaps everything will settle to a model in time to submit.
And after rebuilding, this is generally where I get to when trying to fit: InvalidArgumentError: slice index 1 of dimension 1 out of bounds. So I'm looking into it. At least with the new build it doesn't core-dump on illegal instructions to my CPU when I call Keras. Which is also to say, this has nothing to do with wtte, though I imagine you've seen this output before.
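For what it's worth, that particular error often just means the network's output is narrower than the loss expects: a WTTE-style loss reads two distribution parameters (alpha and beta) off the last axis of `y_pred`, so the final layer needs two units; with one unit, slicing index 1 is out of bounds. A pure-numpy illustration of the same failure mode (hypothetical shapes, not wtte's actual code):

```python
import numpy as np

y_pred_ok = np.zeros((4, 2))    # two output units: (alpha, beta) per sample
y_pred_bad = np.zeros((4, 1))   # one output unit: column 1 doesn't exist

alpha, beta = y_pred_ok[:, 0], y_pred_ok[:, 1]   # slicing both params works

failed = False
try:
    y_pred_bad[:, 1]            # numpy's analogue of TensorFlow's
except IndexError:              # "slice index 1 of dimension 1 out of bounds"
    failed = True
```

So the first thing to check is that the layer feeding the loss ends in 2 units (e.g. a `Dense(2)` in Keras).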
@ragulpr,
I spent the last very interesting month split between trying to implement this in R and magically arriving at a sufficient understanding of Python to apply it to this Kaggle challenge:
https://www.kaggle.com/c/santander-value-prediction-challenge
of which there are a few days left to submit. I looked at this data in R and it really did look Weibull.
Of the many places I stumbled: until just recently the R Keras interface didn't support Python-like slicing notation, though Keras 2.2.2 (R) and tf-nightly (1.10.0) probably now do. My ndims are always wrong, i.e. expected 3 got 2, expected 2 got 4. The usual culprits.
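On the recurring "expected ndim 3, got ndim 2": Keras recurrent layers want input shaped `(samples, timesteps, features)`, so a flat `(samples, timesteps)` matrix needs an explicit trailing feature axis. A quick numpy illustration (the shapes are made up):

```python
import numpy as np

# A flat matrix of sequences: one row per sample, one column per timestep.
x2d = np.random.rand(16, 100)        # (samples, timesteps) -> ndim 2

# Add a trailing feature axis so each timestep carries a 1-vector of features.
x3d = x2d[:, :, np.newaxis]          # (samples, timesteps, 1) -> ndim 3
```

The same trick in reverse (`x3d[:, :, 0]`) recovers the matrix, which helps when moving data between non-temporal and temporal models.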
The challenge above is a slight variation on time to next event, as the value(s) are also embedded in the timesteps. Under pandas or data.frame (R), it looks like a system of potential bookends (left: float64, right: int64, with the next potential series starting at the next float64, whether activity > 0 or not). I've been attempting a fit on a concatenated (4459, 4991, 2) tensor, with fits on (None, 4991, 1) (a 0/1 normalized by row) and (None, 4991, 2) (the values in the data, concatenated), but so far to no avail.
I'm not sure that makes much sense, or more probably it makes none. But attempting to treat timesteps as also being features, i.e. (4459, 4991, 4991), sorta blew through my RAM, my mother's RAM, and the NSA's RAM.
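The RAM blow-up is easy to quantify: a (4459, 4991, 4991) float32 tensor holds on the order of a hundred billion entries, i.e. hundreds of gigabytes, before any gradient or optimizer state is allocated:

```python
# Back-of-envelope size of a dense (4459, 4991, 4991) float32 tensor.
elements = 4459 * 4991 * 4991        # ~1.11e11 entries
gigabytes = elements * 4 / 1e9       # float32 = 4 bytes/entry -> ~444 GB
```

Which is why treating every timestep as a feature only works with a sparse representation or a much smaller window.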
In reviewing all the open and closed issues, I see I want to 999 my mask. And I probably should just try to get the examples to work; I finally managed today to get @daynebatten's example to run, as long as I used the deprecated `input_dim=` rather than trying to negotiate `input_shape`, where I inevitably fall into the 'expected ndim x, got ndim y' dance. Anyway, I think your work is very interesting and worthy, and should make something of a splash over in Kaggle land. Might even fund another year of study. If you haven't looked into it already, check out Emanuel Parzen's and Deep's quantile statistics.
Thanks for a very interesting month. I'll keep plugging away and hope to see your submission.