JoyPix

Alibaba Happy Horse 1.0 Reference-to-Video

Overview

Alibaba Happy Horse 1.0 Reference-to-Video generates new video scenes guided by one or more reference images, helping maintain consistent characters, styles, and visual identity across the output. It combines reference-image grounding with natural-language prompting to create cinematic videos in 720p or 1080p.

Need to generate directly from text? Try Alibaba Happy Horse 1.0 Text-to-Video.

Need to animate a single starting frame instead? Try Alibaba Happy Horse 1.0 Image-to-Video.

Why Choose Happy Horse 1.0?

Reference-guided consistency

Use up to nine reference images to preserve character identity, visual style, outfit details, and overall scene language.

Prompt + image control

Combine reference images with a text prompt to control the scene, action, mood, and camera behavior more precisely.

Cinematic motion

Generate smooth, expressive video motion while keeping important visual elements stable and recognizable.

Flexible output settings

Choose output resolution, aspect ratio, duration, and seed to match your creative and production needs.

Endpoint

POST https://openapi.joypix.ai/v1/alibaba/reference-to-video

Headers

Name            Value                       Required
Content-Type    application/json            Yes
Authorization   Bearer ${JoyPix_API_KEY}    Yes

Parameters

Parameter      Required   Description
model          Yes        The name of the model. Set to: happyhorse-1.0-r2v
images         Yes        Reference image URLs. Supports 1-9 images.
prompt         Yes        Text description of the desired scene, action, style, or motion.
resolution     No         Output resolution: "720p" (default) or "1080p".
aspect_ratio   No         Output aspect ratio. Default: "16:9".
duration       No         Video length in seconds. Range: 3-15, default 5.
seed           No         Random seed for reproducibility. Range: 0-2147483647.
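
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the JoyPix API: the function name validate_payload is hypothetical, and accepting fractional durations is an assumption, since the table does not say whether duration must be an integer.

```python
from typing import Any

MODEL_ID = "happyhorse-1.0-r2v"
RESOLUTIONS = {"720p", "1080p"}
MAX_SEED = 2147483647


def validate_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems: list[str] = []

    # Required fields.
    if payload.get("model") != MODEL_ID:
        problems.append(f"model must be {MODEL_ID!r}")

    images = payload.get("images")
    if not isinstance(images, list) or not 1 <= len(images) <= 9:
        problems.append("images must be a list of 1-9 URLs")

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    # Optional fields are checked only when present; omitted ones use service defaults.
    if "resolution" in payload and payload["resolution"] not in RESOLUTIONS:
        problems.append('resolution must be "720p" or "1080p"')

    if "duration" in payload:
        duration = payload["duration"]
        is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
        if not is_number or not 3 <= duration <= 15:
            problems.append("duration must be between 3 and 15 seconds")

    if "seed" in payload:
        seed = payload["seed"]
        is_int = isinstance(seed, int) and not isinstance(seed, bool)
        if not is_int or not 0 <= seed <= MAX_SEED:
            problems.append(f"seed must be an integer in 0-{MAX_SEED}")

    return problems
```

The check does not cover aspect_ratio, because only its default ("16:9") is documented here, not the full list of accepted values.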

How to Use

  1. Provide your reference images: supply 1-9 publicly accessible image URLs that define the character, style, or visual identity you want to preserve.
  2. Write your prompt: describe the target scene, action, camera behavior, lighting, and mood.
  3. Choose resolution: use 720p for lower-cost iteration or 1080p for higher-quality final output.
  4. Set aspect ratio: choose the format that best fits your target platform or composition needs.
  5. Set duration: choose a clip length between 3 and 15 seconds.
  6. Set a seed (optional): use a fixed seed for more reproducible generations.
  7. Submit: send the request, then check the task status and download your video once the task is completed.

Example Request

{
  "model": "happyhorse-1.0-r2v",
  "images": [
    "https://example.com/reference-1.jpg",
    "https://example.com/reference-2.jpg"
  ],
  "prompt": "A cinematic fashion scene with the same character walking through a softly lit modern city street at night, gentle camera tracking, subtle wind in the hair and clothing, elegant movement, realistic lighting, premium commercial style",
  "aspect_ratio": "16:9",
  "resolution": "1080p",
  "duration": 5,
  "seed": 123456
}

Response

{
  "code": 200,
  "message": "success",
  "data": {
    "task_id": "task_123456789"
  }
}
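
As one way to send the example request, the sketch below uses only the Python standard library. The endpoint, headers, and response fields come from this page; the helper names build_request and submit are hypothetical, and treating any "code" other than 200 as a failure is an assumption, since error responses are not documented here.

```python
import json
import urllib.request

ENDPOINT = "https://openapi.joypix.ai/v1/alibaba/reference-to-video"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request with the two documented headers."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit(payload: dict, api_key: str, timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the response body."""
    with urllib.request.urlopen(build_request(payload, api_key), timeout=timeout) as resp:
        result = json.load(resp)
    # Assumption: a non-200 "code" in the body signals a failed submission.
    if result.get("code") != 200:
        raise RuntimeError(f"Submission failed: {result.get('message')}")
    return result["data"]["task_id"]
```

Calling submit(payload, api_key) with the example request body returns a task_id such as "task_123456789", which is then used with the task status endpoint. Reading the key from an environment variable rather than hard-coding it keeps it out of source control.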

Get Task Status

Endpoint

GET https://openapi.joypix.ai/v1/tasks/${task_id}

Headers

Name            Value                       Required
Authorization   Bearer ${JoyPix_API_KEY}    Yes

Response

{
  "code": 200,
  "message": "success",
  "data": {
    "task_id": "task_123456789",
    "inputs": "...",
    "model": "happyhorse-1.0-r2v",
    "status": "completed",
    "error": "",
    "video_url": "https://joypix-output.s3.amazonaws.com/..."
  }
}

Note: The generated video is stored for only 24 hours. Download it as soon as possible.
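
Because generation is asynchronous, a client typically polls this endpoint until the task finishes. The sketch below is one possible polling loop under stated assumptions: only the "completed" status appears on this page, so any other status is treated as "still running", and a non-empty "error" field is treated as a failure. The names fetch_task and wait_for_video, the polling interval, and the time limit are all hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable

TASK_URL = "https://openapi.joypix.ai/v1/tasks/{task_id}"


def fetch_task(task_id: str, api_key: str) -> dict:
    """GET the task record and return its "data" object."""
    req = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["data"]


def wait_for_video(
    task_id: str,
    api_key: str,
    fetch: Callable[[str, str], dict] = fetch_task,
    interval: float = 5.0,
    max_wait: float = 600.0,
) -> str:
    """Poll until the task is completed and return its video_url."""
    deadline = time.monotonic() + max_wait
    while True:
        data = fetch(task_id, api_key)
        # Assumption: a non-empty "error" string means the task failed.
        if data.get("error"):
            raise RuntimeError(f"Task {task_id} failed: {data['error']}")
        if data.get("status") == "completed":
            return data["video_url"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} not completed after {max_wait} s")
        time.sleep(interval)
```

Since the output is kept for only 24 hours, it is sensible to save the file as soon as the URL is returned, for example with urllib.request.urlretrieve(url, "output.mp4").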

Pricing

Resolution   Cost
720p         14 credits/s ($0.14/s)
1080p        24 credits/s ($0.24/s)
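
The per-second rates make the cost of a clip simple to estimate before submitting it. The helper below is an illustrative sketch: the name estimate_cost is hypothetical, and the rate of 100 credits per dollar is inferred from the table (14 credits = $0.14). How fractional seconds are billed is not stated on this page, so the sketch takes whole seconds.

```python
CREDITS_PER_SECOND = {"720p": 14, "1080p": 24}
CREDITS_PER_USD = 100  # inferred from the table: 14 credits = $0.14


def estimate_cost(resolution: str, duration_s: int) -> tuple[int, float]:
    """Return (credits, US dollars) for a clip of the given resolution and length."""
    credits = CREDITS_PER_SECOND[resolution] * duration_s
    return credits, credits / CREDITS_PER_USD
```

For example, the 5-second 1080p request shown earlier would cost 24 x 5 = 120 credits ($1.20), while a 15-second 720p clip would cost 14 x 15 = 210 credits ($2.10).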

Pro Tips

  • Use clear, high-quality reference images that strongly represent the character, outfit, or style you want to preserve.
  • Include multiple reference images when consistency across facial features, costume details, or design elements is important.
  • Be specific in your prompt about scene, action, camera motion, lighting, and mood.
  • Use 720p for rapid testing, then switch to 1080p for final-quality renders.
  • Reuse the same seed when you want more reproducible outputs.
  • Start with shorter durations to validate identity consistency and motion before generating longer clips.
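
The iteration tips above (test at 720p with a short duration, then rerender at 1080p with the same seed) can be sketched as a pair of request bodies derived from one base payload. The helper name draft_and_final is hypothetical, and this page does not state that a seed yields the same result across resolutions or durations, so the draft should be read as a preview rather than an exact match.

```python
def draft_and_final(base: dict, seed: int, draft_duration: int = 3) -> tuple[dict, dict]:
    """Derive a low-cost draft payload and a final payload from the same base."""
    # Draft: 720p and a short clip to validate identity consistency and motion.
    draft = {**base, "resolution": "720p", "duration": draft_duration, "seed": seed}
    # Final: 1080p with the same seed; duration stays whatever the base specifies.
    final = {**base, "resolution": "1080p", "seed": seed}
    return draft, final
```

Both payloads are new dictionaries, so the base payload can be reused unchanged for further variations.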

Notes

  • Both images and prompt are required.
  • images supports 1-9 reference image URLs.
  • Ensure all image URLs are publicly accessible.
  • Supported video duration is 3-15 seconds.
  • Supported resolutions are 720p and 1080p.
  • Pricing scales linearly with duration.
  • Please ensure your content complies with applicable usage policies.