Implement URL retrieval for readtable #385

Closed
milktrader opened this issue Oct 28, 2013 · 9 comments

Comments

@milktrader

This feature has been stubbed out and currently throws the error "URL retrieval not yet implemented"

milktrader/Quandl.jl#8

milktrader/Quandl.jl#1

@milktrader
Author

Lines 562-564 of src/io.jl are where this feature is stubbed, btw.

    # (1) Path is an HTTP or FTP URL
    if ismatch(r"^(http://)|(ftp://)", pathname)
        error("URL retrieval not yet implemented")

@StefanKarpinski
Member

I would advocate for not going this way and instead having some kind of generic readurl function that returns an open stream that returns the contents of a URL. The stream interface is a nice "narrow waist" – various packages support handling arbitrary data streams, while base – or other packages – provide the ability to provide streams by various means, including reading from an internet URL.

@milktrader
Author

readurl(filename) = readlines(`curl -s $filename`) works for that purpose, but I'd really like to leverage the parsing code that very nicely places this stream into a data frame.

@StefanKarpinski
Member

Yes, the point is that the fundamental DataFrames interface should be reading data from a stream, rather than opening a file name. Of course, readtable(fname::String) = open(readtable,fname) is an obvious default behavior that immediately gets back the current interface of passing a file name.
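The stream-first design described above can be sketched as follows. This is a minimal illustration, not the DataFrames code: `parse_rows` is a hypothetical stand-in for a table parser, and the comma-splitting logic is an assumption made only to keep the example short.

```julia
# Hypothetical stand-in for a table parser; not the DataFrames implementation.
# The core method consumes any IO stream and splits each line on commas.
function parse_rows(io::IO)
    return [split(line, ',') for line in eachline(io)]
end

# Convenience method: a file name is just one way to obtain a stream.
parse_rows(path::AbstractString) = open(parse_rows, path)

# The same parser works on an in-memory stream...
rows_from_buffer = parse_rows(IOBuffer("a,b\n1,2\n"))

# ...and on a file, via the convenience method.
tmp_path, tmp_io = mktemp()
write(tmp_io, "a,b\n1,2\n")
close(tmp_io)
rows_from_file = parse_rows(tmp_path)
```

Because the file-name method only opens a stream and delegates, any other source of an `IO` object (a buffer, a pipe, a network response) can reuse the parser unchanged.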

@milktrader
Author

As my friend Max would say, I'm confuzzled about how to write this. So it comes down to having a readurl function that takes a parsing function and a URL?

@johnmyleswhite
Contributor

I believe the idea is that you will call readtable(readurl(URL)). Look into the existing I/O code to see the arguments for the version of readtable that accepts an IOStream as an input.
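A rough sketch of that composition, under stated assumptions: `readurl` here is a hypothetical helper built on the `Downloads` standard library (available in Julia 1.6 and later, so not what existed when this thread started), and `count_rows` is a hypothetical stand-in for a `readtable` method that accepts a stream. The URL in the final comment is a placeholder.

```julia
using Downloads

# Hypothetical helper named after the `readurl` idea in this thread.
# It fetches the whole response into memory and returns a readable stream.
function readurl(url::AbstractString)
    buffer = IOBuffer()
    Downloads.download(url, buffer)  # writes the response body into `buffer`
    seekstart(buffer)                # rewind so readers start at the first byte
    return buffer
end

# Hypothetical stand-in for a stream-accepting `readtable`: counts the lines.
count_rows(io::IO) = count(_ -> true, eachline(io))

# The composition mirrors readtable(readurl(URL)); the URL is a placeholder.
# n = count_rows(readurl("http://example.com/data.csv"))
```

Buffering the full response keeps the sketch simple; a streaming variant would avoid holding large files in memory.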

@prcastro
Contributor

Bump

@quinnj
Member

quinnj commented Sep 7, 2017

CSV.jl can read from any IO object, including the result of an HTTP.get(url) call, so this is supported:

CSV.read(HTTP.body(HTTP.get(csv_url)))

@quinnj quinnj closed this as completed Sep 7, 2017
@davidanthoff
Contributor

CSVFiles.jl also supports reading from a URL directly:

df = DataFrame(load("http://someurl.org/somefile.csv"))

This issue was closed.