
YQL Open Data Tables

zhouyaoji edited this page Apr 15, 2013 · 4 revisions

Introduction

YQL Open Data Tables (ODTs) are tables created by the community that define how the YQL Web Service can get data from some source such as a Web Service, RSS feed, or even an HTML page.

YQL Open Data Tables are XML files based on a schema that lets table developers define elements for binding YQL verbs such as SELECT, INSERT, UPDATE, and DELETE, as well as elements for URLs, key inputs, paging, and even executable JavaScript. Using the JavaScript API, your table can perform operations such as making REST calls, parsing responses, compressing/decompressing data, and more.

Using YQL Open Data Tables

Using YQL Open Data Tables is not much different from using YQL's predefined tables. You form the YQL statement to access data from the ODT in the same way. To make the table available, you either use the verb use to specify one or more tables, or use the env query parameter to load the environment store that gives you access to all of the community ODTs.

The use verb selects the ODT to use, which you can then query:

use "http://myserver.com/mytables.xml" as mytable;
select * from mytable where...

You can also select more than one ODT, as shown below:

use "http://myserver.com/mytables1.xml" as table1; 
use "http://myserver.com/mytables2.xml" as table2; 
select * from table1 where id in (select id from table2)
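
When several tables are involved, the use statements can be generated rather than typed by hand. The following is a minimal sketch, not part of YQL itself: the helper name build_statement is hypothetical, and the table URLs are the placeholder ones from the example above.

```python
def build_statement(tables, query):
    """Prefix `query` with one `use` statement per alias -> table URL pair.

    `tables` maps the alias used in the query to the URL of the ODT file.
    This helper is illustrative only; YQL just sees the resulting text.
    """
    uses = ['use "{}" as {};'.format(url, alias) for alias, url in tables.items()]
    return "\n".join(uses + [query])


statement = build_statement(
    {
        "table1": "http://myserver.com/mytables1.xml",
        "table2": "http://myserver.com/mytables2.xml",
    },
    "select * from table1 where id in (select id from table2)",
)
print(statement)
```

The resulting string is what you would pass as the q parameter or paste into the YQL Console.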

To access all of the ODTs, you use the env query string parameter:

env=http://datatables.org/alltables.env

For example, to use the community ODT for Craigslist, you would append the env parameter and the q parameter for the YQL statement:

http://query.yahooapis.com/v1/public/yql?q=select * from craigslist.search where location="sfbay" and type="sss" and query="schwinn mountain bike"&diagnostics=true&env=store://datatables.org/alltableswithkeys
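
The URL above is shown unencoded for readability. A real request needs the spaces, quotes, and other reserved characters in each parameter percent-encoded. Here is a minimal sketch of how the URL could be assembled with Python's standard library. The function name build_yql_url is hypothetical; the endpoint, statement, and env value are taken from the example above.

```python
from urllib.parse import urlencode

# Public endpoint as it appears in the example above.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(statement, env=None, diagnostics=False):
    """Return a YQL request URL with every parameter percent-encoded."""
    params = {"q": statement}
    if diagnostics:
        params["diagnostics"] = "true"
    if env is not None:
        params["env"] = env
    # urlencode joins the pairs with "&" and escapes spaces, quotes, ":" and "/".
    return YQL_ENDPOINT + "?" + urlencode(params)


statement = (
    'select * from craigslist.search where location="sfbay" '
    'and type="sss" and query="schwinn mountain bike"'
)
url = build_yql_url(
    statement,
    env="store://datatables.org/alltableswithkeys",
    diagnostics=True,
)
print(url)
```

Building the query string from a dictionary also avoids forgetting a separator between parameters.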

Creating YQL Open Data Tables

Intro

If you have data that you want to allow others to consume, or know of a data source that you want to make accessible with YQL, you can create an ODT that allows the YQL Web Service to access the data. If you want to keep your table private, you just need to host the table and access it through the use verb. If you would like your table to be public, you can add your table to this YQL Open Data Table GitHub repository. YQL users will then have access to your table when using the query parameter and value env=http://datatables.org/alltables.env.

Recommended Steps to Create Tables

  1. Decide whether your data source is suitable to be used by YQL. Sources such as the following are generally good candidates:
    • RSS feeds (great)
    • Web Services (good to great: some require API keys and impose rate limits)
    • Static HTML pages (okay: HTML scraping is reliable only as long as the page structure doesn't change)
  2. Try an existing ODT: Go to the YQL Console, click on the link Show Community Tables, and then click on one of the community tables to run a query.
  3. Review the Open Data Tables Reference.
  4. Look at the Open Data Table Examples and the example tables in this repo. You may find tables that you can model your own after.
  5. Draft your ODT and host it on a Web server.
  6. From the YQL Console, test your table with the use statement. Example: use "http://your_domain/your_table.xml" as your_table; select * from your_table where key1="value1"
  7. Go to the next section to learn how to contribute your ODT to the community tables.

Contributing

To add your ODT to the community tables, fork the yql / yql-tables repo, add your table, and open a pull request. The YQL team will review your table and either suggest changes or accept your pull request. Once your table is part of the set of community tables, you can test it from the YQL Console.