Automated alternatives

Best SeFi-Image: A Text-to-Image Foundation Model with Semantic-First Diffusion alternatives.

Live, source-backed alternatives to SeFi-Image: A Text-to-Image Foundation Model with Semantic-First Diffusion for image generation. Alternatives are drawn from the same task category and refresh whenever the best-of index rebuilds.

Alternatives: 7 (same task category)
Sources: 14 (distinct URLs)
Modules: 6 (indexable)
Updated: Jun 29, 2026 (from radar data)
Reference option

SeFi-Image: A Text-to-Image Foundation Model with Semantic-First Diffusion

Training image generation foundation models consumes substantial resources. Previous methods have attempted to leverage semantic guidance to accelerate the training process, yet their experiments were only conducted on simple datasets such as ImageNet, at low resolutions, and with small-scale models. In this paper, we propose SeFi-Image, a text-to-image foundation model built upon semantic-first diffusion, a novel latent diffusion modeling paradigm. We instantiate SeFi-Image at three model scales (1B, 2B, and 5B parameters), enabling systematic study of scaling behavior and flexible deployment under varying compute budgets. Notably, our largest 5B model was trained with merely 125K A800 GPU hours, corresponding to roughly 10-20% of the training compute used by Z-Image. However, it achieves results comparable to or even superior to Qwen-Image and Z-Image. Despite this modest training compute, SeFi-Image achieves strong performance on a wide range of benchmarks, including GenEval, DPG, LongTextBench, OneIG, and CVTG-2K. Moreover, we provide DMD2-distilled few-step turbo variants for each model scale to accommodate diverse hardware constraints and latency requirements. We publicly release our code and weights, and hope this work offers the community useful insights into semantic-guided diffusion modeling for T2I generation, while also providing practical and readily deployable model options.

Research signal collected from arXiv metadata; Gemini enrichment can add a clearer summary.
cs.CV benchmark eval

RDR 75 · Research-only · arxiv-ai
Alternative

Replicate Official Models

Matched image generation, text-to-image, text to image; 3 source links; official model catalog signal; access model: Paid API

RDR 89 · Paid API
Alternative

fal Model APIs

Matched image generation, text-to-image, text to image; 3 source links; official model catalog signal; access model: Paid API

RDR 88 · Paid API
| # | Alternative | Kind | Access | Fit | Why it appears | Source |
|---|---|---|---|---|---|---|
| 01 | Replicate Official Models | service | Paid API | RDR 89 | Matched image generation, text-to-image, text to image; 3 source links; official model catalog signal; access model: Paid API | replicate.com |
| 02 | fal Model APIs | service | Paid API | RDR 88 | Matched image generation, text-to-image, text to image; 3 source links; official model catalog signal; access model: Paid API | fal.ai |
| 03 | Runware Model API | service | Paid API | RDR 87 | Matched image generation, text-to-image, text to image; 3 source links; official model catalog signal; access model: Paid API | runware.ai |
| 04 | DiffusionBench: On Holistic Evaluation of Diffusion Transformers | paper | Research-only | RDR 79 | Matched image generation, text-to-image, text to image; 1 source link; access model: Research-only; freshly updated | arxiv.org |
| 05 | artokun/comfyui-mcp | repo | Open source | RDR 78 | Matched image generation, text-to-image, text to image; 1 source link; access model: Open source; freshly updated | github.com |
| 06 | Intermediate Text Representation Guided Text-to-Image Generation for Enhancing One-and-Only Alignment | paper | Research-only | RDR 77 | Matched image generation, text-to-image, text to image; 1 source link; access model: Research-only; freshly updated | arxiv.org |
| 07 | aqm857886159/Nomi | repo | Open source | RDR 77 | Matched image generation, text-to-image, text to image; 1 source link; access model: Open source; freshly updated | github.com |