This repository is for the preprint|paper "Warwick Electron Microscopy Datasets". It supplements datasets by providing scripts used to curate datasets and their variants, and to create both static and interactive visualizations.
There are three main datasets containing 19769 experimental STEM images, 17266 experimental TEM images and 98340 simulated TEM exit wavefunctions. Datasets are available here.
Scripts and data for variational autoencoders (VAEs) and modified t-distributed stochastic neighbor embedding (tSNE) are in the vaegan subdirectory. Pretrained VAEs are here.
Interactive visualizations can be created by running display_visualization_files.py. Change the values of the file location variables in the script to choose which visualization is displayed:
SAVE_DATA: Full save location of a NumPy file containing a dataset, for example, a file downloaded from the datasets main page.
SAVE_FILE: Full save location of a NumPy file containing tSNE map points. Files for each visualization are in this repository and have filenames of the form "tsne_*.npy" for PCA and "vae_tsne_*.npy" for VAE, where * is a wildcard.
An optional extra parameter, USE_FRAC, controls the portion of data points that are displayed. Use a value lower than 1 if a visualization is slow/unresponsive for a large dataset.
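As a rough illustration of how these three variables relate, the sketch below loads a dataset and its map points and keeps a random fraction of both. It is not the code in display_visualization_files.py: the function name load_for_display, the seed argument and the synthetic demo files are assumptions made for this example.

```python
import os
import tempfile

import numpy as np


def load_for_display(save_data, save_file, use_frac=1.0, seed=0):
    """Load a dataset and its tSNE map points, keeping a random fraction of both.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not 0.0 < use_frac <= 1.0:
        raise ValueError("use_frac must be in the interval (0, 1]")
    images = np.load(save_data)
    points = np.load(save_file)
    if len(images) != len(points):
        raise ValueError("dataset and map points must have the same length")
    # Pick the same random subset of examples from both arrays.
    n_keep = max(1, int(round(use_frac * len(points))))
    rng = np.random.default_rng(seed)
    idx = np.sort(rng.choice(len(points), size=n_keep, replace=False))
    return images[idx], points[idx]


# Demo with small synthetic files standing in for a real dataset and map.
with tempfile.TemporaryDirectory() as tmp:
    SAVE_DATA = os.path.join(tmp, "dataset.npy")
    SAVE_FILE = os.path.join(tmp, "tsne_example.npy")
    np.save(SAVE_DATA, np.zeros((200, 96, 96), dtype=np.float32))
    np.save(SAVE_FILE, np.random.default_rng(1).normal(size=(200, 2)))
    images, points = load_for_display(SAVE_DATA, SAVE_FILE, use_frac=0.25)

print(images.shape, points.shape)  # (50, 96, 96) (50, 2)
```

Sampling the same indices from both arrays keeps each displayed image paired with its map point, which is the property a reduced visualization needs.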
There are a few folders:
create_96x96: Scripts to downsample examples to 96x96.
cropping: Scripts to crop 512x512 regions from full images.
mining_scripts: An assortment of mining scripts used to curate micrographs.
stem_full_shapes: Scripts to investigate the distribution of full STEM image shapes.
vaegan: Source code and pretrained models for VAEs, and source code and precompiled binaries for modified tSNE implementations.
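The cropping and downsampling steps named above can be sketched with NumPy alone. This is a minimal sketch and not the repository's scripts: random_crop and area_downsample are hypothetical names, random crop placement is an assumption, and area averaging is only one possible way to resize (the actual scripts may use a different interpolation).

```python
import numpy as np


def random_crop(image, size=512, rng=None):
    """Return a size x size region at a random position (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return image[top:top + size, left:left + size]


def area_downsample(image, out_size=96):
    """Area-average a square 2D image to out_size x out_size.

    The image is repeated up to the least common multiple of the two sizes
    so that it divides evenly into blocks, e.g. 512 -> 1536 -> 96.
    """
    if image.ndim != 2 or image.shape[0] != image.shape[1]:
        raise ValueError("expected a square 2D image")
    n = image.shape[0]
    lcm = int(np.lcm(n, out_size))
    up, block = lcm // n, lcm // out_size
    big = np.repeat(np.repeat(image, up, axis=0), up, axis=1).astype(np.float64)
    return big.reshape(out_size, block, out_size, block).mean(axis=(1, 3))


rng = np.random.default_rng(0)
full = rng.random((2048, 2048))      # synthetic stand-in for a full micrograph
crop = random_crop(full, rng=rng)    # 512x512 region
small = area_downsample(crop)        # 96x96 example
print(crop.shape, small.shape)       # (512, 512) (96, 96)
```

Area averaging preserves the mean intensity of the crop, which makes it easy to sanity-check the result.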
In addition, there are a few notable files:
create_static_displays: Creates tSNE visualizations with map points and images.
create_table_images: Selects example TEM and STEM images using their positions in tSNE visualizations.
create_visualization_files: Outputs NumPy files containing dataset principal components and tSNE visualizations.
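For readers unfamiliar with the principal component step, the sketch below projects flattened images onto their leading principal components using a singular value decomposition. It is an assumption-laden illustration, not the contents of create_visualization_files: the function name, the number of components and the synthetic data are all hypothetical, and the subsequent tSNE step is omitted.

```python
import numpy as np


def principal_components(images, n_components=50):
    """Project flattened images onto their leading principal components.

    Hypothetical helper; n_components=50 is an arbitrary illustrative default.
    """
    flat = images.reshape(len(images), -1).astype(np.float64)
    flat -= flat.mean(axis=0)  # center each pixel across the dataset
    u, s, _ = np.linalg.svd(flat, full_matrices=False)
    # Rows of u scaled by singular values are the projections of each image.
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
images = rng.random((100, 16, 16))  # synthetic stand-in for a dataset
pcs = principal_components(images, n_components=10)
print(pcs.shape)  # (100, 10)
```

An array like pcs could then be written with np.save and passed to a tSNE implementation to obtain 2D map points.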
Scripts to simulate wavefunctions are here. Their filenames have the form "run_simulations*.py", where * is a wildcard.
Jeffrey Ede: [email protected]
An example tSNE visualization for 19769 96x96 crops from STEM images. It was created by training a VAE to encode images as 64-dimensional means and standard deviations of normal distributions. Standard deviations were then used to weight the clustering of means in 2 dimensions by tSNE. Images are shown at 500 randomly sampled points.