Huggingface Chat Endpoint

Overview

This endpoint provides access to Huggingface's AI models for chat completion with streaming responses.

Request Details

HTTP Method

POST

Route

/api/ai/huggingface/[model_publisher]/[model_name]/chat

Route Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model_publisher | string | Yes | The publisher of the AI model |
| model_name | string | Yes | The name of the AI model |

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| chat_id | string | Yes | Unique identifier of the chat |
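
For example, a request for the `meta/llama` model in chat `123` (the values used in the cURL example below) targets:

```
/api/ai/huggingface/meta/llama/chat?chat_id=123
```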

Headers

| Header | Value | Required | Description |
| --- | --- | --- | --- |
| Content-Type | application/json | Yes | Indicates JSON request body |
| Authorization | Bearer [token] | Yes | JWT authentication token |

Request Body

The request body should contain the chat messages array.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| messages | Message[] | Yes | Array of chat messages |
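
For example, the body for a single user message looks like this:

```json
{
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}
```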

TypeScript Interface

```typescript
interface Message {
  role: 'user' | 'assistant'
  content: string
}

interface ChatRequest {
  messages: Message[]
}
```

Python Model

```python
from pydantic import BaseModel
from typing import List, Literal

class Message(BaseModel):
    role: Literal['user', 'assistant']
    content: str

class ChatRequest(BaseModel):
    messages: List[Message]
```
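
A quick serialization check, using Pydantic v2's `model_dump_json` (the message content is a placeholder):

```python
req = ChatRequest(
    messages=[Message(role="user", content="Hello, how are you?")]
)
print(req.model_dump_json())
# {"messages":[{"role":"user","content":"Hello, how are you?"}]}
```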

Response Format

Success Response

The endpoint returns a streaming text response containing the AI model's reply.

Response Headers

| Header | Value | Description |
| --- | --- | --- |
| Content-Type | text/event-stream | Indicates streaming response |
| Transfer-Encoding | chunked | Indicates chunked transfer |
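
Because the body arrives in chunks, a client can either render each chunk as it arrives (as the code examples below do) or accumulate the chunks into the full reply first. A minimal TypeScript sketch of the accumulating variant, assuming a successful fetch Response:

```typescript
// Collect the streamed body into a single string.
async function readFullReply(response: Response): Promise<string> {
  const reader = response.body!.getReader()
  const decoder = new TextDecoder()
  let reply = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    // stream: true keeps multi-byte characters intact across chunk boundaries
    reply += decoder.decode(value, { stream: true })
  }
  return reply + decoder.decode() // flush any buffered trailing bytes
}
```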

Error Responses

Invalid Model (400 Bad Request)

```json
{
  "statusCode": 400,
  "statusMessage": "Invalid model name or publisher"
}
```

Service Unavailable (400 Bad Request)

```json
{
  "statusCode": 400,
  "statusMessage": "Service not available anymore."
}
```

Internal Server Error (500)

```json
{
  "statusCode": 500,
  "statusMessage": "Internal Server Error"
}
```
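
All error bodies share the same `statusCode`/`statusMessage` shape, so clients can handle them uniformly. A minimal TypeScript sketch (the `ApiError` interface and `throwOnApiError` helper are illustrative, not part of the API):

```typescript
interface ApiError {
  statusCode: number
  statusMessage: string
}

async function throwOnApiError(response: Response): Promise<void> {
  if (response.ok) return
  // Unlike the streamed success response, error responses are JSON.
  const error: ApiError = await response.json()
  throw new Error(`${error.statusCode}: ${error.statusMessage}`)
}
```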

Code Examples

Python Example (using httpx)

```python
import asyncio
from typing import List, Literal

import httpx
from pydantic import BaseModel

class Message(BaseModel):
    role: Literal['user', 'assistant']
    content: str

class ChatRequest(BaseModel):
    messages: List[Message]

async def stream_chat_completion(
    token: str,
    chat_id: str,
    model_publisher: str,
    model_name: str,
    messages: List[Message]
) -> None:
    async with httpx.AsyncClient() as client:
        # client.stream() consumes the body incrementally;
        # client.post() would buffer the whole reply in memory first.
        async with client.stream(
            "POST",
            f"https://neptun-webui.vercel.app/api/ai/huggingface/{model_publisher}/{model_name}/chat",
            params={"chat_id": chat_id},
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json"
            },
            json={"messages": [msg.model_dump() for msg in messages]},  # .dict() in Pydantic v1
            timeout=None
        ) as response:
            response.raise_for_status()

            async for chunk in response.aiter_bytes():
                print(chunk.decode(), end="", flush=True)
```
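
A minimal invocation sketch, reusing the placeholder values from the cURL example below:

```python
async def main() -> None:
    await stream_chat_completion(
        token="your-token-here",  # placeholder JWT
        chat_id="123",            # placeholder chat id
        model_publisher="meta",
        model_name="llama",
        messages=[Message(role="user", content="Hello, how are you?")],
    )

asyncio.run(main())
```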

cURL Example

```bash
curl -X POST "https://neptun-webui.vercel.app/api/ai/huggingface/meta/llama/chat?chat_id=123" \
  -H "Authorization: Bearer your-token-here" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```

TypeScript/JavaScript Example (using fetch)

```typescript
async function streamChatCompletion(
  token: string,
  chatId: string,
  modelPublisher: string,
  modelName: string,
  messages: Message[]
): Promise<void> {
  const response = await fetch(
    `https://neptun-webui.vercel.app/api/ai/huggingface/${modelPublisher}/${modelName}/chat?chat_id=${chatId}`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ messages }),
    }
  )

  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`)
  }

  const reader = response.body?.getReader()
  if (!reader) {
    return
  }

  // Reuse one decoder so multi-byte characters split across
  // chunk boundaries are decoded correctly.
  const decoder = new TextDecoder()

  while (true) {
    const { done, value } = await reader.read()
    if (done) {
      break
    }

    // Process the streaming response chunks
    console.log(decoder.decode(value, { stream: true }))
  }
}
```
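
For example, with the same placeholder values as the cURL example (assumes top-level await in an ES module):

```typescript
await streamChatCompletion(
  'your-token-here', // placeholder JWT
  '123',             // placeholder chat id
  'meta',
  'llama',
  [{ role: 'user', content: 'Hello, how are you?' }]
)
```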