Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Code for paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
Deep Face Model Compression
EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
Learning Efficient Convolutional Networks through Network Slimming, in ICCV 2017.
"LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
List of papers related to neural network quantization in recent AI conferences and journals.
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)
Code for paper 'Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware'
(CVPR 2021, Oral) Dynamic Slimmable Network
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
[ECCV 2022] Efficient Long-Range Attention Network for Image Super-resolution
[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
Explorations into some recent techniques surrounding speculative decoding
Code for WF-IoT paper 'TinyML Benchmark: Executing Fully Connected Neural Networks on Commodity Microcontrollers'
Soft Threshold Weight Reparameterization for Learnable Sparsity