Parsing LLM Embedding API Responses with Portkey

While working with Portkey’s embedding API to compute cosine similarity between text embeddings, I ran into two issues that led to a TypeError. Reflecting on these mistakes and how I fixed them gave me a solid strategy for reading and interpreting API responses. Here’s a quick, personal recap for future reference.

My Initial Mistakes

1. Not Specifying the Encoding Format

My original API call:

import numpy as np

# Get embeddings for both texts using OpenAI's embedding model via Portkey.
response = portkey.embeddings.create(input=text, model="text-embedding-ada-002")

# Incorrect parsing using subscript notation
embedding = np.array(response["data"][0]["embedding"])

What went wrong:

  • Issue: Without specifying the encoding format, the API returned the embedding as a base64-encoded string rather than a list of floats (a sketch of decoding such a string by hand follows below).
  • Error: This resulted in a TypeError when I tried to convert the string into a numeric array.
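
For completeness, the base64 payload is itself usable. Here is a minimal sketch of decoding it by hand, assuming the vector is serialized as little-endian float32 values (which is how OpenAI’s API encodes base64 embeddings); b64_embedding is a hypothetical variable holding the string from the response:

import base64
import numpy as np

# b64_embedding: the base64 string returned when encoding_format is unset
# (hypothetical name for this sketch).
raw_bytes = base64.b64decode(b64_embedding)
embedding = np.frombuffer(raw_bytes, dtype=np.float32)  # reinterpret bytes as float32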

2. Incorrect Parsing Method

The error message I encountered:

TypeError: 'Embedding' object is not subscriptable

Why:

  • I mistakenly treated the response as a dictionary (using response["data"]) even though the Portkey SDK returns custom objects with attributes. The correct approach is attribute access (e.g., response.data); see the sketch below for one way to recover a plain dict if you need one.
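
As an aside, and purely an assumption on my part: if Portkey’s response types are pydantic models like the OpenAI SDK objects they mirror, you can convert a response into a plain dict when you genuinely want subscript access:

# Assumption: the response object exposes model_dump() (pydantic v2),
# as the OpenAI SDK's response types do. If it does, dict-style access
# works on the converted copy.
as_dict = response.model_dump()
embedding = np.array(as_dict["data"][0]["embedding"], dtype=float)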

The Fix: Correcting My Approach

Specify the Encoding Format

I updated the API call to include encoding_format="float", ensuring the API returns a list of floats ready for numerical operations:

response = portkey.embeddings.create(
    input=text,
    model="text-embedding-ada-002-2",
    encoding_format="float"
)

Parse the Response Using Attribute Access

I then adjusted my parsing code to reflect that the response is a custom object with attributes:

embedding = np.array(response.data[0].embedding, dtype=float)

With these changes, I successfully extracted numeric embeddings for further computations like cosine similarity.
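
To close the loop on the original goal, here is a minimal cosine similarity helper over two embeddings parsed this way (cosine_similarity is my own function, not part of any SDK):

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product of the two vectors divided by the product of their
    # Euclidean norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# e.g., score = cosine_similarity(embedding_a, embedding_b)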

Learning to Read the API Response Structure

  1. Consult the Official Documentation:
    The docs provided a JSON sample response:

    {
      "data": [
        {
          "index": 123,
          "embedding": [123],
          "object": "embedding"
        }
      ],
      "model": "<string>",
      "object": "list",
      "usage": { "prompt_tokens": 123, "total_tokens": 123 }
    }
    

    Key takeaways:

    • The response is a JSON object with keys like "data", "model", and "usage".
    • The "data" key holds a list (array) of embedding objects.
    • Each embedding object contains an "embedding" key that holds the numeric vector when using encoding_format="float".
  2. Print and Inspect the Response:
    I printed out the response:

    print(response)
    

    The output looked like:

    CreateEmbeddingResponse(..., data=[Embedding(embedding=[...], index=0, object='embedding')], ...)
    

    This confirmed:

    • The response is a custom object.
    • The data attribute is a list containing a single element here (its index field is 0); with a batch of inputs it would hold one entry per text, as sketched after this list.
    • That element is an Embedding object whose embedding attribute holds the numeric vector the model generated for the input text.
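
The same structure extends to batches. A sketch, assuming the same portkey client as above and that Portkey forwards the list-input form the OpenAI embeddings endpoint supports; the texts are placeholders:

texts = ["first sentence", "second sentence"]
response = portkey.embeddings.create(
    input=texts,
    model="text-embedding-ada-002",
    encoding_format="float",
)

# One Embedding object per input text; sort by index so the vectors
# line up with the input order.
vectors = [
    np.array(item.embedding, dtype=float)
    for item in sorted(response.data, key=lambda item: item.index)
]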

By combining these two approaches—consulting the documentation and inspecting the printed response—I was able to deduce the correct way to parse the API response.

Final Thoughts

This blog captures my general approach to correctly using a Portkey LLM model API and parsing its response. The lessons learned here—specifying the correct encoding format and mapping the documented JSON structure to code by inspecting the actual response—will serve as a handy reference for future projects.

This post is licensed under CC BY 4.0 by the author.