PYTHON LIBRARY

langvision
vision LLMs

Efficient LoRA fine-tuning for vision-language models — LLaVA, Qwen-VL, InternVL, PaliGemma, and any HF VLM.

Sister library to langtune. Same FastVisionModel API. Triton kernels applied automatically to the language decoder. Train locally or dispatch to the cloud — zero code changes.

$ pip install langvision

SUPPORTED MODELS

Any vision-language model on HuggingFace

LLaVA 1.5/1.6 (popular)

Decoder + CLIP

InstructBLIP

Q-Former + LLM

Qwen-VL

ViT + Qwen2

InternVL2

InternViT + LLM

PaliGemma (new)

SigLIP + Gemma2

BLIP-2

Q-Former + OPT/T5

mPLUG-Owl3

ViT + Qwen2

Any HF VLM (open)

AutoModelForVision2Seq

FASTVISIONMODEL API

Local or remote. Your choice.

local_vision_train.py
from langvision import FastVisionModel
from PIL import Image

# Load LLaVA with 4-bit + Langtrain Triton kernels
model, processor = FastVisionModel.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",
    load_in_4bit=True,
)

# Add LoRA on language decoder only (vision encoder frozen)
model = FastVisionModel.get_peft_model(
    model, r=16, method="qlora",
    train_vision_encoder=False,
)

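# Fine-tune on your image-text pairs.
# NOTE: `dataset` is not defined in this snippet; load your own
# image-text data before this call.
FastVisionModel.train(
    model, processor, dataset,
    method="qlora",
    output_dir="./vision-model",
)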

# Run inference
image = Image.open("cat.jpg")
response = FastVisionModel.generate(
    model, processor, image,
    prompt="Describe this image in detail."
)
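
To make "LoRA on language decoder only" more concrete, here is a minimal pure-Python sketch of a single LoRA-adapted linear layer. It is not langvision's implementation: the `LoRALinear` class, the `matvec` and `lora_param_count` helpers, and the 4096 x 4096 layer size are all illustrative assumptions. The idea is that the pretrained weight stays frozen and only two small matrices, A and B, are trained.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical layer: frozen weight W plus a low-rank update (alpha / r) * B @ A."""

    def __init__(self, weight, r=16, alpha=16, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen pretrained weight, shape (d_out, d_in)
        d_out, d_in = len(weight), len(weight[0])
        self.scale = alpha / r
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until training updates B.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def lora_param_count(d_out, d_in, r):
    """Adapter parameters for one (d_out x d_in) weight matrix."""
    return r * (d_in + d_out)


# Illustrative size only: one 4096 x 4096 projection with r=16.
full = 4096 * 4096
adapter = lora_param_count(4096, 4096, 16)
print(f"adapter params: {adapter} ({adapter / full:.2%} of the full matrix)")
```

With r=16 the adapter for such a layer holds 131,072 parameters, under 1% of the 16.8M in the frozen matrix. That ratio is why training only the adapters is cheap.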

12 TRAINING METHODS

From SFT to GRPO. All vision-aware.

SFT

Captioning, VQA, instruction following

QLoRA

4-bit NF4: run 70B VLMs on a 24 GB GPU

DoRA

Weight-decomposed LoRA on vision decoder

DPO

Preference optimization for image-text pairs

ORPO

Odds Ratio Preference Optimization

SimPO

Simple Preference Optimization

KTO

Kahneman-Tversky Optimization for vision preferences

RLHF

PPO with a vision reward model

GRPO

Group Relative Policy Optimization (RLVR)

IA³

Infused adapter on vision-language layers

Prefix

Prefix tuning for task steering

LoRA

Standard LoRA on language decoder
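
For the QLoRA entry above, a rough back-of-the-envelope estimate shows what 4-bit quantization saves. This sketch counts weights only and assumes exactly 4 bits per parameter. Real usage is higher, because quantization constants, activations, adapter weights, optimizer state, and the vision encoder all take memory too. The `weight_memory_gb` helper is hypothetical and not part of langvision.

```python
def weight_memory_gb(num_params, bits_per_param=4):
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Compare 4-bit against 16-bit storage for two model sizes.
for name, n in [("7B", 7e9), ("70B", 70e9)]:
    four_bit = weight_memory_gb(n)
    sixteen_bit = weight_memory_gb(n, bits_per_param=16)
    print(f"{name}: {four_bit:.1f} GB at 4-bit, {sixteen_bit:.1f} GB at 16-bit")
```

By this arithmetic a 7B model needs about 3.5 GB for its 4-bit weights and a 70B model about 35 GB. Fitting the larger model within 24 GB would presumably depend on additional techniques such as offloading, which this page does not detail.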

THE LANGTRAIN LIBRARY SUITE

langtune + langvision

FEATURE                    langtune  langvision
Unsloth-compatible API     ✓         ✓
Local GPU training         ✓         ✓
Remote cloud dispatch      ✓         ✓
Triton/CUDA kernels        ✓         ✓
12 training methods        ✓         —
12 vision methods          —         ✓
Vision encoder LoRA        —         ✓
Multimodal DPO/ORPO        —         ✓
Streaming remote metrics   ✓         ✓
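
As background for the multimodal DPO entry, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not langvision's code: the `dpo_loss` and `softplus` functions and their argument names are illustrative assumptions. In the vision setting, each log-probability would be that of a response given the image and prompt, and `beta=0.1` is simply a commonly used value.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of response log-probabilities."""
    # How far the policy has moved from the frozen reference on each response.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_shift - rejected_shift)
    # -log(sigmoid(z)) equals softplus(-z).
    return softplus(-logits)


# Policy identical to the reference: loss is log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy now favors the chosen response: loss drops.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model. No separate reward model is needed.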

Fine-tune vision models today

Install langvision and fine-tune any HuggingFace VLM in minutes.

© 2026 Langtrain AI Private Limited. All rights reserved.