xlite-dev

Develop ML/AI toolkits and ML/AI/CUDA Learning resources.

Pinned

  1. LeetCUDA Public

    📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

    Cuda 6.8k 708

  2. lite.ai.toolkit Public

    🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉

    C++ 4.2k 757

  3. Awesome-LLM-Inference Public

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Python 4.5k 305

  4. Awesome-DiT-Inference Public

    📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

    Python 396 19

  5. torchlm Public

    💎An easy-to-use PyTorch library for face landmarks detection: training, evaluation, inference, and 100+ data augmentations.🎉

    Python 265 25

  6. ffpa-attn Public

    🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.

    Cuda 214 9

Repositories

Showing 10 of 38 repositories
  • diffusers Public Forked from huggingface/diffusers

    🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

    Python 0 Apache-2.0 6,362 0 0 Updated Sep 11, 2025
  • HunyuanImage-2.1 Public Forked from Tencent-Hunyuan/HunyuanImage-2.1

    HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation

    Python 1 25 0 0 Updated Sep 10, 2025
  • Wan2.2 Public Forked from Wan-Video/Wan2.2

    Wan: Open and Advanced Large-Scale Video Generative Models

    Python 0 Apache-2.0 395 0 0 Updated Sep 9, 2025
  • Qwen-Image-Lightning Public Forked from ModelTC/Qwen-Image-Lightning

    Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

    Python 0 Apache-2.0 27 0 0 Updated Sep 9, 2025
  • cache-dit Public Forked from vipshop/cache-dit

    A Unified Cache Acceleration Toolbox for 🤗Diffusers: FLUX.1, Qwen-Image-Edit, Qwen-Image, Wan2.1/2.2, etc.

    Python 4 9 0 0 Updated Sep 4, 2025
  • LeetCUDA Public

    📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

    Cuda 6,846 GPL-3.0 708 8 0 Updated Sep 3, 2025
  • Qwen-Image Public Forked from QwenLM/Qwen-Image

    Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

    Python 1 Apache-2.0 259 0 0 Updated Sep 3, 2025
  • comfyui-cache-dit Public

    cache-dit for comfyui

    Python 5 0 0 0 Updated Aug 26, 2025
  • Awesome-DiT-Inference Public

    📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

    Python 396 GPL-3.0 19 0 0 Updated Aug 19, 2025
  • Awesome-LLM-Inference Public

    📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

    Python 4,486 GPL-3.0 305 0 0 Updated Aug 19, 2025
