diff --git a/dev/api/index.html b/dev/api/index.html index 28784ea..d1f53b0 100644 --- a/dev/api/index.html +++ b/dev/api/index.html @@ -1,6 +1,6 @@ API Reference · XLSX.jl

API Reference

XLSX.XLSXFileType

XLSXFile represents a reference to an Excel file.

It is created by using XLSX.readxlsx or XLSX.openxlsx.

From a XLSXFile you can navigate to a XLSX.Worksheet reference as shown in the example below.

Example

xf = XLSX.readxlsx("myfile.xlsx")
-sh = xf["mysheet"] # get a reference to a Worksheet
source
XLSX.readxlsxFunction
readxlsx(source::Union{AbstractString, IO}) :: XLSXFile

Main function for reading an Excel file. This function will read the whole Excel file into memory and return a closed XLSXFile.

Consider using XLSX.openxlsx for lazy loading of Excel file contents.

source
XLSX.openxlsxFunction
openxlsx(f::F, source::Union{AbstractString, IO}; mode::AbstractString="r", enable_cache::Bool=true) where {F<:Function}

Open XLSX file for reading and/or writing. It returns an opened XLSXFile that will be automatically closed after applying f to the file.

Do syntax

This function should be used with do syntax, like in:

XLSX.openxlsx("myfile.xlsx") do xf
+sh = xf["mysheet"] # get a reference to a Worksheet
source
XLSX.readxlsxFunction
readxlsx(source::Union{AbstractString, IO}) :: XLSXFile

Main function for reading an Excel file. This function will read the whole Excel file into memory and return a closed XLSXFile.

Consider using XLSX.openxlsx for lazy loading of Excel file contents.

source
XLSX.openxlsxFunction
openxlsx(f::F, source::Union{AbstractString, IO}; mode::AbstractString="r", enable_cache::Bool=true) where {F<:Function}

Open XLSX file for reading and/or writing. It returns an opened XLSXFile that will be automatically closed after applying f to the file.

Do syntax

This function should be used with do syntax, like in:

XLSX.openxlsx("myfile.xlsx") do xf
     # read data from `xf`
 end

Filemodes

The mode argument controls how the file is opened. The following modes are allowed:

  • r : read mode. The existing data in source will be accessible for reading. This is the default mode.

  • w : write mode. Opens an empty file that will be written to source.

  • rw : edit mode. Opens source for editing. The file will be saved to disk when the function ends.

Warning

The rw mode is known to produce some data loss. See #159.

Simple data should work fine. Users are advised to use this feature with caution when working with formulas and charts.
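As a minimal sketch of the `w` mode described above, the snippet below creates a workbook from scratch and then reopens it in the default read mode. The file name and the temporary directory are hypothetical, and the sketch assumes the empty workbook starts with a single worksheet reachable as `xf[1]`.

```julia
import XLSX

path = joinpath(mktempdir(), "new.xlsx")  # hypothetical output location

# mode="w" starts from an empty workbook; it is written to `path` when the block ends
XLSX.openxlsx(path, mode="w") do xf
    sheet = xf[1]          # assumed: the empty workbook has one worksheet
    sheet["A1"] = "label"
    sheet["B1"] = 42
end

# reopen in the default read mode to check what was written
XLSX.openxlsx(path) do xf
    sheet = xf[1]
    println(sheet["A1"], " ", sheet["B1"])
end
```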

Arguments

  • source is IO or the complete path to the file.

  • mode is the file mode, as explained in the last section.

  • enable_cache:

If enable_cache=true, all read worksheet cells will be cached. If you read a worksheet cell twice, the cached value is used the second time instead of reading from disk again.

If enable_cache=false, worksheet cells will always be read from disk. This is useful when you want to read a spreadsheet that doesn't fit into memory.

The default value is enable_cache=true.

Examples

Read from file

The following example shows how you would read worksheet cells, one row at a time, where myfile.xlsx is a spreadsheet that doesn't fit into memory.

julia> XLSX.openxlsx("myfile.xlsx", enable_cache=false) do xf
           for r in XLSX.eachrow(xf["mysheet"])
@@ -12,13 +12,13 @@
 end

Edit an existing file

XLSX.openxlsx("edit.xlsx", mode="rw") do xf
     sheet = xf[1]
     sheet[2, :] = [2, Date(2019, 1, 1), "add new line"]
-end

See also XLSX.readxlsx.

source
openxlsx(source::Union{AbstractString, IO}; mode="r", enable_cache=true) :: XLSXFile

Supports opening a XLSX file without using do-syntax. In this case, the user is responsible for closing the XLSXFile using close or writing it to file using XLSX.writexlsx.

See also XLSX.writexlsx.

source
XLSX.writexlsxFunction
writexlsx(output_source, xlsx_file; [overwrite=false])

Writes an Excel file given by xlsx_file::XLSXFile to IO or filepath output_source.

If overwrite=true, output_source (when a filepath) will be overwritten if it exists.

source
XLSX.sheetnamesFunction
sheetnames(xl::XLSXFile)
-sheetnames(wb::Workbook)

Returns a vector with Worksheet names for this Workbook.

source
XLSX.WorksheetType

A Worksheet represents a reference to an Excel Worksheet.

From a Worksheet you can query for Cells, cell values and ranges.

Example

xf = XLSX.readxlsx("myfile.xlsx")
+end

See also XLSX.readxlsx.

source
openxlsx(source::Union{AbstractString, IO}; mode="r", enable_cache=true) :: XLSXFile

Supports opening a XLSX file without using do-syntax. In this case, the user is responsible for closing the XLSXFile using close or writing it to file using XLSX.writexlsx.

See also XLSX.writexlsx.
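A sketch of the non-do form, assuming a small file built on the spot so the example is self-contained (file and sheet names are hypothetical). Because nothing closes the file automatically here, the `close` call sits in a `finally` clause.

```julia
import XLSX

# build a small file to read back (hypothetical file and sheet names)
path = joinpath(mktempdir(), "myfile.xlsx")
XLSX.writetable(path, [[1, 2, 3]], ["n"]; sheetname="mysheet")

xf = XLSX.openxlsx(path)      # no do-block, so nothing closes `xf` for us
try
    sheet = xf["mysheet"]
    println(sheet["A2"])      # first data row under the header
finally
    close(xf)                 # the caller is responsible for closing the file
end
```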

source
XLSX.writexlsxFunction
writexlsx(output_source, xlsx_file; [overwrite=false])

Writes an Excel file given by xlsx_file::XLSXFile to IO or filepath output_source.

If overwrite=true, output_source (when a filepath) will be overwritten if it exists.
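The following sketch shows `overwrite=true` replacing a file that already exists. It assumes the `XLSXFile` passed to `writexlsx` was opened in a writable mode (`mode="w"` here) and that the empty workbook has one worksheet; the path is hypothetical.

```julia
import XLSX

target = joinpath(mktempdir(), "out.xlsx")   # hypothetical output path
XLSX.writetable(target, [[1, 2]], ["n"])     # make sure `target` already exists

xf = XLSX.openxlsx(target, mode="w")         # writable, empty workbook held in memory
xf[1]["A1"] = "hello"

# `target` exists, so replacing it requires overwrite=true
XLSX.writexlsx(target, xf, overwrite=true)
```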

source
XLSX.sheetnamesFunction
sheetnames(xl::XLSXFile)
+sheetnames(wb::Workbook)

Returns a vector with Worksheet names for this Workbook.

source
XLSX.WorksheetType

A Worksheet represents a reference to an Excel Worksheet.

From a Worksheet you can query for Cells, cell values and ranges.

Example

xf = XLSX.readxlsx("myfile.xlsx")
 sh = xf["mysheet"] # get a reference to a Worksheet
 println( sh[2, 2] ) # access element "B2" (2nd row, 2nd column)
 println( sh["B2"] ) # you can also use the cell name
 println( sh["A2:B4"] ) # or a cell range
-println( sh[:] ) # all data inside worksheet's dimension
source
XLSX.readdataFunction
readdata(source, sheet, ref)
+println( sh[:] ) # all data inside worksheet's dimension
source
XLSX.readdataFunction
readdata(source, sheet, ref)
 readdata(source, sheetref)

Returns a scalar or matrix with values from a spreadsheet.

See also XLSX.getdata.

Examples

These function calls are equivalent.

julia> XLSX.readdata("myfile.xlsx", "mysheet", "A2:B4")
 3×2 Array{Any,2}:
  1  "first"
@@ -35,21 +35,21 @@
 3×2 Array{Any,2}:
  1  "first"
  2  "second"
- 3  "third"
source
XLSX.getdataFunction
getdata(sheet, ref)
 getdata(sheet, row, column)

Returns a scalar or a matrix with values from a spreadsheet. ref can be a cell reference or a range.

Indexing in a Worksheet will dispatch to getdata method.

Example

julia> f = XLSX.readxlsx("myfile.xlsx")
 
 julia> sheet = f["mysheet"]
 
 julia> matrix = sheet["A1:B4"]
 
-julia> single_value = sheet[2, 2] # B2

See also XLSX.readdata.

source
getdata(ws::Worksheet, cell::Cell) :: CellValue

Returns a Julia representation of a given cell value. The result data type is chosen based on the value of the cell as well as its style.

For example, dates are stored as integers inside the spreadsheet, and the cell style is what determines that Date is chosen as the result type.

For numbers, if the style implies that the number is visualized with decimals, the method will return a float, even if the underlying number is stored as an integer inside the spreadsheet XML.

If cell has empty value or empty String, this function will return missing.

source
XLSX.getcellFunction
getcell(xlsxfile, cell_reference_name) :: AbstractCell
+julia> single_value = sheet[2, 2] # B2

See also XLSX.readdata.

source
getdata(ws::Worksheet, cell::Cell) :: CellValue

Returns a Julia representation of a given cell value. The result data type is chosen based on the value of the cell as well as its style.

For example, dates are stored as integers inside the spreadsheet, and the cell style is what determines that Date is chosen as the result type.

For numbers, if the style implies that the number is visualized with decimals, the method will return a float, even if the underlying number is stored as an integer inside the spreadsheet XML.

If cell has empty value or empty String, this function will return missing.
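To illustrate the style-based conversion, here is a sketch that writes a date to a temporary file, fetches the raw cell with `getcell`, and converts it with `getdata`. File, sheet, and column names are hypothetical; it assumes `writetable` stores `Date` values with a date style.

```julia
import XLSX
using Dates

path = joinpath(mktempdir(), "typed.xlsx")   # hypothetical file name
XLSX.writetable(path, [[Date(2019, 1, 1)], [10.5]], ["day", "amount"]; sheetname="mysheet")

xf = XLSX.readxlsx(path)
sheet = xf["mysheet"]

cell = XLSX.getcell(sheet, "A2")     # internal representation of the cell
value = XLSX.getdata(sheet, cell)    # Julia value, chosen from the raw value and the style
println(typeof(value))
```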

source
XLSX.getcellFunction
getcell(xlsxfile, cell_reference_name) :: AbstractCell
 getcell(worksheet, cell_reference_name) :: AbstractCell
 getcell(sheetrow, column_name) :: AbstractCell
-getcell(sheetrow, column_number) :: AbstractCell

Returns the internal representation of a worksheet cell.

Returns XLSX.EmptyCell if the cell has no data.

source
getcell(sheet, ref)

Returns an AbstractCell that represents a cell in the spreadsheet.

Example:

julia> xf = XLSX.readxlsx("myfile.xlsx")
+getcell(sheetrow, column_number) :: AbstractCell

Returns the internal representation of a worksheet cell.

Returns XLSX.EmptyCell if the cell has no data.

source
getcell(sheet, ref)

Returns an AbstractCell that represents a cell in the spreadsheet.

Example:

julia> xf = XLSX.readxlsx("myfile.xlsx")
 
 julia> sheet = xf["mysheet"]
 
-julia> cell = XLSX.getcell(sheet, "A1")
source
XLSX.getcellrangeFunction
getcellrange(sheet, rng)

Returns a matrix with cells as Array{AbstractCell, 2}. rng must be a valid cell range, as in "A1:B2".

source
XLSX.row_numberFunction
row_number(c::CellRef) :: Int

Returns the row number of a given cell reference.

source
XLSX.column_numberFunction
column_number(c::CellRef) :: Int

Returns the column number of a given cell reference.

source
XLSX.eachrowFunction
eachrow(sheet)

Creates a row iterator for a worksheet.

Example: Query all cells from columns 1 to 4.

left = 1  # 1st column
+julia> cell = XLSX.getcell(sheet, "A1")
source
XLSX.getcellrangeFunction
getcellrange(sheet, rng)

Returns a matrix with cells as Array{AbstractCell, 2}. rng must be a valid cell range, as in "A1:B2".

source
XLSX.row_numberFunction
row_number(c::CellRef) :: Int

Returns the row number of a given cell reference.

source
XLSX.column_numberFunction
column_number(c::CellRef) :: Int

Returns the column number of a given cell reference.
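A short sketch of both accessors on a cell reference built with `XLSX.CellRef`, the constructor that also appears in the `writetable!` signature:

```julia
import XLSX

ref = XLSX.CellRef("B3")

println(XLSX.row_number(ref))      # 3
println(XLSX.column_number(ref))   # 2 (column B)
```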

source
XLSX.eachrowFunction
eachrow(sheet)

Creates a row iterator for a worksheet.

Example: Query all cells from columns 1 to 4.

left = 1  # 1st column
 right = 4 # 4th column
 for sheetrow in XLSX.eachrow(sheet)
     for column in left:right
@@ -57,7 +57,7 @@
 
         # do something with cell
     end
-end
source
XLSX.readtableFunction
readtable(
     source,
     sheet,
     [columns];
@@ -73,7 +73,7 @@
     return !ismissing(v) && v == "unwanted value"
 end

keep_empty_rows determines whether rows where all column values are equal to missing are kept (true) or dropped (false) from the resulting table. keep_empty_rows never affects the bounds of the table; the number of rows read from a sheet is only affected by first_row, stop_in_empty_row and stop_in_row_function (if specified). keep_empty_rows is only checked once the first and last row of the table have been determined, to see whether to keep or drop empty rows between the first and the last row.

Example

julia> using DataFrames, XLSX
 
-julia> df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"))

See also: XLSX.gettable.

source
XLSX.gettableFunction
gettable(
     sheet,
     [columns];
     [first_row],
@@ -90,7 +90,7 @@
 
 julia> df = XLSX.openxlsx("myfile.xlsx") do xf
         DataFrame(XLSX.gettable(xf["mysheet"]))
-    end

See also: XLSX.readtable.

source
XLSX.eachtablerowFunction
eachtablerow(sheet, [columns]; [first_row], [column_labels], [header], [stop_in_empty_row], [stop_in_row_function], [keep_empty_rows])

Constructs an iterator of table rows. Each element of the iterator is of type TableRow.

header is a boolean indicating whether the first row of the table is a table header.

If header == false and no column_labels were supplied, column names will be generated following the column names found in the Excel file.

The columns argument is a column range, as in "B:E". If columns is not supplied, the column range will be inferred by the non-empty contiguous cells in the first row of the table.

The user can replace column names by assigning the optional column_labels input variable with a Vector{Symbol}.

stop_in_empty_row is a boolean indicating whether an empty row marks the end of the table. If stop_in_empty_row=false, the iterator will continue to fetch rows until there are no more rows in the Worksheet. The default behavior is stop_in_empty_row=true. Empty rows may be returned by the iterator when stop_in_empty_row=false.

stop_in_row_function is a Function that receives a TableRow and returns a Bool indicating if the end of the table was reached.

Example for stop_in_row_function:

function stop_function(r)
+    end

See also: XLSX.readtable.

source
XLSX.eachtablerowFunction
eachtablerow(sheet, [columns]; [first_row], [column_labels], [header], [stop_in_empty_row], [stop_in_row_function], [keep_empty_rows])

Constructs an iterator of table rows. Each element of the iterator is of type TableRow.

header is a boolean indicating whether the first row of the table is a table header.

If header == false and no column_labels were supplied, column names will be generated following the column names found in the Excel file.

The columns argument is a column range, as in "B:E". If columns is not supplied, the column range will be inferred by the non-empty contiguous cells in the first row of the table.

The user can replace column names by assigning the optional column_labels input variable with a Vector{Symbol}.

stop_in_empty_row is a boolean indicating whether an empty row marks the end of the table. If stop_in_empty_row=false, the iterator will continue to fetch rows until there are no more rows in the Worksheet. The default behavior is stop_in_empty_row=true. Empty rows may be returned by the iterator when stop_in_empty_row=false.

stop_in_row_function is a Function that receives a TableRow and returns a Bool indicating if the end of the table was reached.

Example for stop_in_row_function:

function stop_function(r)
     v = r[:col_label]
     return !ismissing(v) && v == "unwanted value"
 end

keep_empty_rows determines whether rows where all column values are equal to missing are kept (true) or skipped (false) by the row iterator. keep_empty_rows never affects the bounds of the iterator; the number of rows read from a sheet is only affected by first_row, stop_in_empty_row and stop_in_row_function (if specified). keep_empty_rows is only checked once the first and last row of the table have been determined, to see whether to keep or drop empty rows between the first and the last row.
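The options above can be combined as in the following sketch, which passes a `stop_in_row_function` to `eachtablerow`. The file, the sheet name, and the `id`/`flag` columns are hypothetical and are created on the spot so the example is self-contained.

```julia
import XLSX

path = joinpath(mktempdir(), "rows.xlsx")    # hypothetical file name
XLSX.writetable(path, [[1, 2, 3], ["keep", "keep", "stop"]], ["id", "flag"]; sheetname="mysheet")

# signal the end of the table at the first row whose `flag` column equals "stop"
stop_at_flag(r) = !ismissing(r[:flag]) && r[:flag] == "stop"

ids = Int[]
XLSX.openxlsx(path) do xf
    for r in XLSX.eachtablerow(xf["mysheet"]; stop_in_row_function=stop_at_flag)
        push!(ids, r[:id])   # values are read by column label
    end
end
println(ids)
```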

Example code:

for r in XLSX.eachtablerow(sheet)
@@ -98,15 +98,15 @@
     rn = XLSX.row_number(r) # `TableRow` row number.
     v1 = r[1] # will read value at table column 1.
     v2 = r[:COL_LABEL2] # will read value at column labeled `:COL_LABEL2`.
-end

See also XLSX.gettable.

source
XLSX.writetableFunction
writetable(filename, table; [overwrite], [sheetname])

Write Tables.jl table to the specified filename.

source
writetable(filename::Union{AbstractString, IO}, tables::Vector{Pair{String, T}}; overwrite::Bool=false)
-writetable(filename::Union{AbstractString, IO}, tables::Pair{String, Any}...; overwrite::Bool=false)
source
writetable(filename, data, columnnames; [overwrite], [sheetname])
  • data is a vector of columns.
  • columnnames is a vector of column labels.
  • overwrite is a Bool controlling whether filename is overwritten if it already exists.
  • sheetname is the name for the worksheet.

Example

import XLSX
+end

See also XLSX.gettable.

source
XLSX.writetableFunction
writetable(filename, table; [overwrite], [sheetname])

Write Tables.jl table to the specified filename.

source
writetable(filename::Union{AbstractString, IO}, tables::Vector{Pair{String, T}}; overwrite::Bool=false)
+writetable(filename::Union{AbstractString, IO}, tables::Pair{String, Any}...; overwrite::Bool=false)
source
writetable(filename, data, columnnames; [overwrite], [sheetname])
  • data is a vector of columns.
  • columnnames is a vector of column labels.
  • overwrite is a Bool controlling whether filename is overwritten if it already exists.
  • sheetname is the name for the worksheet.

Example

import XLSX
 columns = [ [1, 2, 3, 4], ["Hey", "You", "Out", "There"], [10.2, 20.3, 30.4, 40.5] ]
 colnames = [ "integers", "strings", "floats" ]
-XLSX.writetable("table.xlsx", columns, colnames)

See also: XLSX.writetable!.

source
writetable(filename::Union{AbstractString, IO}; overwrite::Bool=false, kw...)
+XLSX.writetable("table.xlsx", columns, colnames)

See also: XLSX.writetable!.

source
writetable(filename::Union{AbstractString, IO}; overwrite::Bool=false, kw...)
 writetable(filename::Union{AbstractString, IO}, tables::Vector{Tuple{String, Vector{Any}, Vector{String}}}; overwrite::Bool=false)

Write multiple tables.

kw is a variable keyword argument list. Each element should be in this format: sheetname=( data, column_names ), where data is a vector of columns and column_names is a vector of column labels.
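A sketch of the keyword form just described, where each keyword names a sheet and carries a `(data, column_names)` tuple. The sheet names, column labels, and values are made up for illustration.

```julia
import XLSX

path = joinpath(mktempdir(), "report.xlsx")  # hypothetical output path

# each keyword is a sheet name; each value is a (data, column_names) tuple
XLSX.writetable(path;
    overwrite=true,
    REPORT_A=([[10, 20, 30], ["First", "Sec", "Third"]], ["COL1", "COL2"]),
    REPORT_B=([["aa", "bb"], [10.1, 10.2]], ["AA", "AB"]),
)

println(XLSX.sheetnames(XLSX.readxlsx(path)))
```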

Example:

julia> import DataFrames, XLSX
 
 julia> df1 = DataFrames.DataFrame(COL1=[10,20,30], COL2=["First", "Sec", "Third"])
 
 julia> df2 = DataFrames.DataFrame(AA=["aa", "bb"], AB=[10.1, 10.2])
 
-julia> XLSX.writetable("report.xlsx", "REPORT_A" => df1, "REPORT_B" => df2)
source
XLSX.writetable!Function
writetable!(sheet::Worksheet, table; anchor_cell::CellRef=CellRef("A1"))

Write Tables.jl table to the specified sheet.

source
writetable!(sheet::Worksheet, data, columnnames; anchor_cell::CellRef=CellRef("A1"))

Writes tabular data data with labels given by columnnames to sheet, starting at anchor_cell.

data must be a vector of columns. columnnames must be a vector of column labels.

See also: XLSX.writetable.

source
XLSX.rename!Function
rename!(ws::Worksheet, name::AbstractString)

Renames a Worksheet.

source
XLSX.addsheet!Function
addsheet!(workbook, [name]) :: Worksheet

Creates a new worksheet named name. If name is not provided, a unique name is generated.

source
+julia> XLSX.writetable("report.xlsx", "REPORT_A" => df1, "REPORT_B" => df2)
source
XLSX.writetable!Function
writetable!(sheet::Worksheet, table; anchor_cell::CellRef=CellRef("A1"))

Write Tables.jl table to the specified sheet.

source
writetable!(sheet::Worksheet, data, columnnames; anchor_cell::CellRef=CellRef("A1"))

Writes tabular data data with labels given by columnnames to sheet, starting at anchor_cell.

data must be a vector of columns. columnnames must be a vector of column labels.

See also: XLSX.writetable.
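As a sketch of `anchor_cell`, the example below writes a two-column table whose header row starts at B2 instead of A1. The file name and data are hypothetical, and the workbook is created in `w` mode so the example does not depend on an existing file.

```julia
import XLSX

path = joinpath(mktempdir(), "anchored.xlsx")   # hypothetical output path
columns = [[1, 2, 3], ["x", "y", "z"]]
labels = ["num", "letter"]

XLSX.openxlsx(path, mode="w") do xf
    sheet = xf[1]
    # the header row lands at B2:C2, with the data rows below it
    XLSX.writetable!(sheet, columns, labels; anchor_cell=XLSX.CellRef("B2"))
end

println(XLSX.readxlsx(path)[1]["B2:C5"])
```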

source
XLSX.rename!Function
rename!(ws::Worksheet, name::AbstractString)

Renames a Worksheet.

source
XLSX.addsheet!Function
addsheet!(workbook, [name]) :: Worksheet

Creates a new worksheet named name. If name is not provided, a unique name is generated.
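A sketch combining `rename!` and `addsheet!` on a new workbook. The sheet names are hypothetical, and the sketch assumes `addsheet!` accepts the `XLSXFile` handle directly in place of the workbook.

```julia
import XLSX

path = joinpath(mktempdir(), "sheets.xlsx")   # hypothetical output path

XLSX.openxlsx(path, mode="w") do xf
    XLSX.rename!(xf[1], "first")              # rename the initial worksheet
    second = XLSX.addsheet!(xf, "second")     # assumed to accept the XLSXFile handle
    second["A1"] = "hello"
end

println(XLSX.sheetnames(XLSX.readxlsx(path)))
```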

source
diff --git a/dev/index.html b/dev/index.html index 27f3d35..657a77a 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,4 +1,4 @@ Home · XLSX.jl

XLSX.jl

Introduction

XLSX.jl is a Julia package to read and write Excel spreadsheet files.

Internally, an Excel XLSX file is just a Zip file with a set of XML files inside. The formats for these XML files are described in the Standard ECMA-376.

This package follows ECMA-376 to parse and generate XLSX files.

Requirements

  • Julia v1.3

  • Linux, macOS or Windows.

Installation

From a Julia session, run:

julia> using Pkg
 
-julia> Pkg.add("XLSX")

Source Code

The source code for this package is hosted at https://github.com/felipenoris/XLSX.jl.

License

The source code for the package XLSX.jl is licensed under the MIT License.

Getting Help

If you're having any trouble, have any questions about this package or want to ask for a new feature, just open a new issue.

Contributing

Contributions are always welcome!

To contribute, fork the project on GitHub and send a Pull Request.

References

Alternative Packages

+julia> Pkg.add("XLSX")

Source Code

The source code for this package is hosted at https://github.com/felipenoris/XLSX.jl.

License

The source code for the package XLSX.jl is licensed under the MIT License.

Getting Help

If you're having any trouble, have any questions about this package or want to ask for a new feature, just open a new issue.

Contributing

Contributions are always welcome!

To contribute, fork the project on GitHub and send a Pull Request.

References

Alternative Packages

diff --git a/dev/migration/index.html b/dev/migration/index.html index 8c22752..32b8176 100644 --- a/dev/migration/index.html +++ b/dev/migration/index.html @@ -1,3 +1,3 @@ Migration Guides · XLSX.jl

Migration Guides

Migrating Legacy Code to v0.8

Version v0.8 introduced a breaking change to the methods XLSX.gettable and XLSX.readtable.

These methods used to return a tuple data, column_labels. In XLSX v0.8, they return an XLSX.DataTable struct that implements the Tables.jl interface.

Basic code replacement

Before

data, col_names = XLSX.readtable(joinpath(data_directory, "general.xlsx"), "table4")

After

dtable = XLSX.readtable(joinpath(data_directory, "general.xlsx"), "table4")
-data, col_names = dtable.data, dtable.column_labels

Reading DataFrames

Since XLSX.DataTable implements Tables.jl interface, the result of XLSX.gettable or XLSX.readtable can be passed to a DataFrame constructor.

Before

df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet")...)

After

df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"))
+data, col_names = dtable.data, dtable.column_labels

Reading DataFrames

Since XLSX.DataTable implements Tables.jl interface, the result of XLSX.gettable or XLSX.readtable can be passed to a DataFrame constructor.

Before

df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet")...)

After

df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"))
diff --git a/dev/search/index.html b/dev/search/index.html index 8633f70..d0d87e8 100644 --- a/dev/search/index.html +++ b/dev/search/index.html @@ -1,2 +1,2 @@ -Search · XLSX.jl

Loading search...

    +Search · XLSX.jl

    Loading search...

      diff --git a/dev/tutorial/index.html b/dev/tutorial/index.html index 0ad5a2d..1278724 100644 --- a/dev/tutorial/index.html +++ b/dev/tutorial/index.html @@ -59,7 +59,7 @@ "HeaderA" "HeaderB" 1 "first" 2 "second" - 3 "third"

      To inspect the internal representation of each cell, use the getcell or getcellrange methods.

      The example above used xf = XLSX.readxlsx(filename) to open a file, so all file contents are fetched at once from disk.

      You can also use XLSX.openxlsx to read file contents as needed (see Reading Large Excel Files and Caching).

      Data Types

      This package uses the following concrete types when handling XLSX files.

      XLSX.CellValueTypeType
      CellValueType

      Concrete supported data-types.

      Union{String, Missing, Float64, Int, Bool, Dates.Date, Dates.Time, Dates.DateTime}
      source

      Read Tabular Data

      The XLSX.gettable method returns tabular data from a spreadsheet as a struct XLSX.DataTable that implements Tables.jl interface. You can use it to create a DataFrame from DataFrames.jl. Check the docstring for gettable method for more advanced options.

      There's also a helper method XLSX.readtable to read from file directly, as shown in the following example.

      julia> using DataFrames, XLSX
      + 3           "third"

      To inspect the internal representation of each cell, use the getcell or getcellrange methods.

      The example above used xf = XLSX.readxlsx(filename) to open a file, so all file contents are fetched at once from disk.

      You can also use XLSX.openxlsx to read file contents as needed (see Reading Large Excel Files and Caching).

      Data Types

      This package uses the following concrete types when handling XLSX files.

      XLSX.CellValueTypeType
      CellValueType

      Concrete supported data-types.

      Union{String, Missing, Float64, Int, Bool, Dates.Date, Dates.Time, Dates.DateTime}
      source

      Read Tabular Data

      The XLSX.gettable method returns tabular data from a spreadsheet as a struct XLSX.DataTable that implements Tables.jl interface. You can use it to create a DataFrame from DataFrames.jl. Check the docstring for gettable method for more advanced options.

      There's also a helper method XLSX.readtable to read from file directly, as shown in the following example.

      julia> using DataFrames, XLSX
       
       julia> df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"))
       3×2 DataFrames.DataFrame
      @@ -260,4 +260,4 @@
       │ 1   │ 1        │ Hey     │ 10.2    │ 2018-02-20 │ 19:10:00 │ 2018-05-20T19:10:00 │
       │ 2   │ 2        │ You     │ 20.3    │ 2018-02-21 │ 19:20:00 │ 2018-05-20T19:20:00 │
       │ 3   │ 3        │ Out     │ 30.4    │ 2018-02-22 │ 19:30:00 │ 2018-05-20T19:30:00 │
      -│ 4   │ 4        │ There   │ 40.5    │ 2018-02-23 │ 19:40:00 │ 2018-05-20T19:40:00 │
      +│ 4   │ 4        │ There   │ 40.5    │ 2018-02-23 │ 19:40:00 │ 2018-05-20T19:40:00 │