BIM Analytics via Parquet and DuckDB
BIM Open Schema is an open formal specification of BIM data that is designed for modern analytics tools and optimized for columnar data formats like Parquet and relational databases like DuckDB.
This repo provides:
- The official specification, written as valid C# code.
- A sample test file generated from the Snowdon sample.
- C# libraries and test files for reading and writing BIM Open Schema data, also distributed on NuGet.
- A WPF-based GUI sample.

The schema is optimized for columnar data formats, such as those used by relational databases, but it is not tied to any one particular serialization format, and can be easily converted to many different formats for fast inspection in your tool of choice.
This project includes C# code that acts as the official specification, but the schema itself is platform independent and language agnostic.
We welcome code contributions in any language.
- ETL pipeline: Revit/IFC -> Parquet -> database/BI/ML.
- Quick inspection: Open the Parquet files with DuckDB, Power BI, or pandas.
- Inter‑tool hand‑off: Share a small, self‑contained bundle instead of heavyweight RVT/IFC files when geometry is not required.
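For the quick-inspection case, a query along the following lines may be enough. This is a minimal sketch, assuming the `duckdb` Python package is installed. The helper names `preview_query` and `preview` are hypothetical, and no file names are taken from the specification.

```python
def preview_query(parquet_path, limit=10):
    """Build a SQL statement that previews the first rows of a Parquet file."""
    # DuckDB can read a Parquet file directly through read_parquet().
    escaped = parquet_path.replace("'", "''")  # escape single quotes in the path
    return f"SELECT * FROM read_parquet('{escaped}') LIMIT {int(limit)}"


def preview(parquet_path, limit=10):
    """Run the preview query and return the rows as a list of tuples."""
    import duckdb  # imported lazily so the query builder works without it

    con = duckdb.connect()  # in-memory database, nothing is persisted
    try:
        return con.execute(preview_query(parquet_path, limit)).fetchall()
    finally:
        con.close()
```

Calling `preview("Entity.parquet")` would return the first ten rows of that file, where `Entity.parquet` is a placeholder for whichever exported file you want to look at.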
BIM Open Schema is intended for data scientists, BI analysts, and application developers who need properties, relationships, and additional BIM data without geometry.
- Column‑oriented storage: Each list maps cleanly to a Parquet column chunk or a database table.
- String & point interning: Repeated values are stored once and referenced by a typed index, keeping files small.
- EAV‑flavoured parameters: A minimal core (Entity, Descriptor) plus type‑specific value tables yields flexibility while preserving strong types.
- Relation set: A single EntityRelation edge list expresses most graph‑like BIM relationships found in Revit or IFC.
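The design points above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the column names (`name`, `units`, `entity`, `descriptor`, `value`, `source`, `target`, `kind`) and the sample values are hypothetical and are not the official schema, which is defined by the C# specification.

```python
class StringTable:
    """Stores each distinct string once and hands out integer indices."""

    def __init__(self):
        self.values = []   # index -> string
        self._index = {}   # string -> index

    def intern(self, text):
        if text not in self._index:
            self._index[text] = len(self.values)
            self.values.append(text)
        return self._index[text]


def build_example():
    strings = StringTable()
    # Each table is a dict of parallel lists, one list per column.
    entity = {"name": []}
    descriptor = {"name": [], "units": []}
    # Type-specific value tables keep parameter values strongly typed.
    double_param = {"entity": [], "descriptor": [], "value": []}
    string_param = {"entity": [], "descriptor": [], "value": []}
    # A single edge list expresses relationships between entities.
    relation = {"source": [], "target": [], "kind": []}

    def add_entity(name):
        entity["name"].append(strings.intern(name))
        return len(entity["name"]) - 1

    def add_descriptor(name, units):
        descriptor["name"].append(strings.intern(name))
        descriptor["units"].append(strings.intern(units))
        return len(descriptor["name"]) - 1

    wall_a = add_entity("Basic Wall")
    wall_b = add_entity("Basic Wall")  # reuses the interned string
    level = add_entity("Level 1")
    height = add_descriptor("Height", "m")
    material = add_descriptor("Material", "")

    for wall, wall_height in ((wall_a, 2.7), (wall_b, 3.0)):
        double_param["entity"].append(wall)
        double_param["descriptor"].append(height)
        double_param["value"].append(wall_height)
        string_param["entity"].append(wall)
        string_param["descriptor"].append(material)
        string_param["value"].append(strings.intern("Concrete"))
        relation["source"].append(wall)
        relation["target"].append(level)
        relation["kind"].append(strings.intern("ContainedIn"))

    return {
        "strings": strings.values,
        "entity": entity,
        "descriptor": descriptor,
        "double_param": double_param,
        "string_param": string_param,
        "relation": relation,
    }
```

Because every column is a flat list of numbers or indices, each one could be written out as a Parquet column or a database column without further restructuring.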
ETL (Extract, Transform, and Load) is a three-phase computing process where data is extracted from an input source, transformed (including cleaning), and loaded into an output data container.
The goal of BIM Open Schema is to standardize the representation of BIM data for extraction and loading, using a schema that is efficient, compact, and cross-platform. Better ETL means better data analytics.
We provide tools and examples to convert BIM Open Schema to/from:
- Parquet - An efficient, open-source, column-oriented data file format with wide tooling support.
- DuckDB - A simple, fast, open-source database system optimized for in-process analytical work.
- JSON - A lightweight and ubiquitous human-readable format for exchanging data over the web.
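As a rough illustration of why conversion between these formats is straightforward, the sketch below serializes column-oriented tables to JSON and pivots them into rows. It uses only the Python standard library and is not the project's own converter. The function names and the example table layout are hypothetical.

```python
import json


def columns_to_rows(table):
    """Pivot a column-oriented table (dict of equal-length lists) into row dicts."""
    names = list(table)
    if len({len(table[name]) for name in names}) > 1:
        raise ValueError("all columns must have the same length")
    columns = (table[name] for name in names)
    return [dict(zip(names, values)) for values in zip(*columns)]


def rows_to_columns(rows, names):
    """Inverse of columns_to_rows for a known list of column names."""
    return {name: [row[name] for row in rows] for name in names}


def tables_to_json(tables):
    """Serialize a dict of column-oriented tables to a JSON string."""
    return json.dumps(tables, indent=2, sort_keys=True)


def tables_from_json(text):
    """Parse JSON text back into a dict of column-oriented tables."""
    return json.loads(text)
```

Keeping the columnar layout in JSON preserves the one-list-per-column structure, while `columns_to_rows` gives a row view that is easier to read by eye.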
To extract data from Revit, you can use our Revit Parquet Exporter, or Bowerbird with the command "Export BIM Open Schema".

Tomo Sugeta has developed BIM Open Reader, a browser-based tool for exploring BIM Open Schema data stored as Parquet files.
Some open-source projects which are related:
Supporting and contributing to this project is as simple as providing feedback.
Some of the people who have contributed are (in alphabetical order):
- Ahmad Saleem Z - AnkerDB
- Christopher Diggins - Ara 3D
- Daryl Irvine - DG Jones and Partners
- Karim Daw - Gensler
- Pablo Derendinger - e-verse
- Tom van Diggelen - BIMcollab
- Tomo Sugeta - Cundall
- Valentin Noves - e-verse
- Yskert Schindel - Vyssuals
We have an active Discord server and discussion forum that you can join if you are interested. Just send us a request by email.
If you are interested in professional help with leveraging the format and learning what you can do with it, reach out to us at [email protected].