Alibaba Happy Horse 1.0 Reference-to-Video
Overview
Alibaba Happy Horse 1.0 Reference-to-Video generates new video scenes guided by one or more reference images, helping maintain consistent characters, styles, and visual identity across the output. It combines reference-image grounding with natural-language prompting to create cinematic videos in 720p or 1080p.
Need to generate directly from text? Try Alibaba Happy Horse 1.0 Text-to-Video.
Need to animate a single starting frame instead? Try Alibaba Happy Horse 1.0 Image-to-Video.
Why Choose Happy Horse 1.0?
Reference-guided consistency
Use up to nine reference images to preserve character identity, visual style, outfit details, and overall scene language.
Prompt + image control
Combine reference images with a text prompt to control the scene, action, mood, and camera behavior more precisely.
Cinematic motion
Generate smooth, expressive video motion while keeping important visual elements stable and recognizable.
Flexible output settings
Choose output resolution, aspect ratio, duration, and seed to match your creative and production needs.
Endpoint
https://openapi.joypix.ai/v1/alibaba/reference-to-video
Headers
| Name | Value | Required |
|---|---|---|
| Content-Type | application/json | Yes |
| Authorization | Bearer ${JoyPix_API_KEY} | Yes |
Parameters
| Parameter | Required | Description |
|---|---|---|
| model | Yes | The name of the model. Set to: happyhorse-1.0-r2v |
| images | Yes | Reference image URLs. Supports 1-9 images. |
| prompt | Yes | Text description of the desired scene, action, style, or motion. |
| resolution | No | Output resolution: "720p" (default) or "1080p". |
| aspect_ratio | No | Output aspect ratio. Default: "16:9". |
| duration | No | Video length in seconds. Range: 3-15, default 5. |
| seed | No | Random seed for reproducibility. Range: 0-2147483647. |
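The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: `build_payload` is a hypothetical helper name, and it assumes `duration` and `seed` are integers and that `seed` can be left out when unset (the table lists no default for it).

```python
from typing import Optional

MODEL_NAME = "happyhorse-1.0-r2v"
RESOLUTIONS = ("720p", "1080p")
MAX_SEED = 2147483647


def build_payload(
    images: list[str],
    prompt: str,
    resolution: str = "720p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    seed: Optional[int] = None,
) -> dict:
    """Validate inputs against the documented ranges and return a request body."""
    if not 1 <= len(images) <= 9:
        raise ValueError("images must contain 1-9 URLs")
    if not prompt.strip():
        raise ValueError("prompt is required")
    if resolution not in RESOLUTIONS:
        raise ValueError("resolution must be '720p' or '1080p'")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        raise ValueError("seed must be between 0 and 2147483647")

    payload = {
        "model": MODEL_NAME,
        "images": list(images),
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if seed is not None:
        # Assumption: an unset seed is simply omitted from the body.
        payload["seed"] = seed
    return payload
```

The supported `aspect_ratio` values are not listed here, so the sketch passes that field through unchecked.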
How to Use
- Provide your reference images: supply 1-9 image URLs that define the character, style, or visual identity you want to preserve.
- Write your prompt: describe the target scene, action, camera behavior, lighting, and mood.
- Choose resolution: use 720p for lower-cost iteration or 1080p for higher-quality final output.
- Set aspect ratio: choose the format that best fits your target platform or composition needs.
- Set duration: choose a clip length between 3 and 15 seconds.
- Set a seed (optional): use a fixed seed for more reproducible generations.
- Submit: send the request, check the task status, and download your video once the task completes.
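The submission step can be sketched with Python's standard library as follows. The HTTP method is not stated on this page, so POST is an assumption, and `build_submit_request` and `submit` are hypothetical helper names. The response handling relies on the `code`, `message`, and `data.task_id` fields shown in the sample response.

```python
import json
import urllib.request

SUBMIT_URL = "https://openapi.joypix.ai/v1/alibaba/reference-to-video"


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the generation request with the two documented headers."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",  # assumption: the page does not name the method
    )


def submit(payload: dict, api_key: str) -> str:
    """Send the request and return the task_id from the response body."""
    with urllib.request.urlopen(build_submit_request(payload, api_key)) as resp:
        body = json.load(resp)
    if body.get("code") != 200:
        raise RuntimeError(f"submission failed: {body.get('message')}")
    return body["data"]["task_id"]
```

Calling `submit(payload, api_key)` with a request body like the example request returns the task ID to use with the task status endpoint.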
Example Request
{
  "model": "happyhorse-1.0-r2v",
  "images": [
    "https://example.com/reference-1.jpg",
    "https://example.com/reference-2.jpg"
  ],
  "prompt": "A cinematic fashion scene with the same character walking through a softly lit modern city street at night, gentle camera tracking, subtle wind in the hair and clothing, elegant movement, realistic lighting, premium commercial style",
  "aspect_ratio": "16:9",
  "resolution": "1080p",
  "duration": 5,
  "seed": 123456
}
Response
{
  "code": 200,
  "message": "success",
  "data": {
    "task_id": "task_123456789"
  }
}
Get Task Status
Endpoint
https://openapi.joypix.ai/v1/tasks/${task_id}
Headers
| Name | Value | Required |
|---|---|---|
| Authorization | Bearer ${JoyPix_API_KEY} | Yes |
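Checking a task with this endpoint and header might look like the sketch below. It assumes a GET request and reads the `status`, `error`, and `video_url` fields from the response's `data` object. Only the "completed" status appears on this page, so the loop treats any other status as still pending and stops on a non-empty `error` or a timeout. `fetch_task`, `wait_for_video`, and the interval and timeout defaults are hypothetical choices.

```python
import json
import time
import urllib.request
from typing import Callable

TASK_URL = "https://openapi.joypix.ai/v1/tasks/{task_id}"


def fetch_task(task_id: str, api_key: str) -> dict:
    """Fetch the task record (GET is assumed) and return its "data" object."""
    req = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]


def wait_for_video(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> str:
    """Poll until the task completes, reports an error, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch()
        if task.get("error"):
            raise RuntimeError(f"task failed: {task['error']}")
        if task.get("status") == "completed":
            return task["video_url"]
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not complete in time")
        time.sleep(interval_s)
```

A typical call would be `wait_for_video(lambda: fetch_task(task_id, api_key))`. Passing the fetch step as a callable keeps the polling loop easy to test without network access.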
Response
{
  "code": 200,
  "message": "success",
  "data": {
    "task_id": "task_123456789",
    "inputs": "...",
    "model": "happyhorse-1.0-r2v",
    "status": "completed",
    "error": "",
    "video_url": "https://joypix-output.s3.amazonaws.com/..."
  }
}
Note: Output data is stored for only 24 hours, so download the video as soon as possible.
Pricing
| Resolution | Cost |
|---|---|
| 720p | 14 credits/s ($0.14/s) |
| 1080p | 24 credits/s ($0.24/s) |
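Because both rates are per second, the cost of a clip is the rate multiplied by its duration. For example, a 5-second 1080p clip costs 5 × 24 = 120 credits ($1.20). A small sketch of that arithmetic follows; `estimate_cost` is a hypothetical helper, and the 100 credits per dollar ratio is inferred from the table (14 credits/s = $0.14/s).

```python
CREDITS_PER_SECOND = {"720p": 14, "1080p": 24}
CREDITS_PER_USD = 100  # inferred from the table: 14 credits/s = $0.14/s


def estimate_cost(resolution: str, duration_s: int) -> tuple[int, float]:
    """Return (credits, usd) for a clip of the given resolution and length."""
    if resolution not in CREDITS_PER_SECOND:
        raise ValueError("resolution must be '720p' or '1080p'")
    if not 3 <= duration_s <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    credits = CREDITS_PER_SECOND[resolution] * duration_s
    return credits, credits / CREDITS_PER_USD
```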
Pro Tips
- Use clear, high-quality reference images that strongly represent the character, outfit, or style you want to preserve.
- Include multiple reference images when consistency across facial features, costume details, or design elements is important.
- Be specific in your prompt about scene, action, camera motion, lighting, and mood.
- Use 720p for rapid testing, then switch to 1080p for final-quality renders.
- Reuse the same seed when you want more reproducible outputs.
- Start with shorter durations to validate identity consistency and motion before generating longer clips.
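The draft-then-final workflow suggested by these tips can be sketched as a pair of request bodies that share the same references, prompt, and seed. `draft_and_final` is a hypothetical helper. Whether a fixed seed yields closely matching results across resolutions or durations is not stated here, so treat the draft as a rough preview rather than a guarantee.

```python
import copy


def draft_and_final(base: dict, final_duration: int = 5) -> tuple[dict, dict]:
    """Return a short 720p draft body and a 1080p final body with the same seed."""
    if "seed" not in base:
        raise ValueError("set a seed so both renders reuse it")
    if not 3 <= final_duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")

    draft = copy.deepcopy(base)
    draft.update(resolution="720p", duration=3)  # cheapest documented settings

    final = copy.deepcopy(base)
    final.update(resolution="1080p", duration=final_duration)
    return draft, final
```

Submit the draft first, and send the final body only after the identity and motion in the draft look right.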
Notes
- Both images and prompt are required.
- images supports 1-9 reference image URLs.
- Ensure all image URLs are publicly accessible.
- Supported video duration is 3-15 seconds.
- Supported resolutions are 720p and 1080p.
- Pricing scales linearly with duration.
- Please ensure your content complies with applicable usage policies.
