Parsing LLM Embedding API Responses with Portkey
While working with Portkey’s embedding API to compute cosine similarity between text embeddings, I encountered two issues that led to a `TypeError`. Reflecting on these mistakes and how I fixed them has given me a solid strategy for reading and interpreting API responses. Here’s a quick, personal recap for future reference.
My Initial Mistakes
1. Not Specifying the Encoding Format
My original API call:

```python
# Get embeddings for both texts using OpenAI's embedding model via Portkey.
response = portkey.embeddings.create(input=text, model="text-embedding-ada-002")

# Incorrect parsing using subscript notation
embedding = np.array(response["data"][0]["embedding"])
```
What went wrong:
- Issue: Without specifying the encoding format, the API returned the embedding as a base64-encoded string rather than a list of floats.
- Error: This resulted in a `TypeError` when I tried to convert the string into a numeric array.
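As a side note, a base64-encoded embedding isn't lost data: in OpenAI-style embedding APIs the string decodes to a packed array of little-endian float32 values. Here's a minimal sketch of recovering the vector; the packing format is an assumption carried over from OpenAI's documented behavior, not something I verified in Portkey's docs:

```python
import base64

import numpy as np


def decode_base64_embedding(b64_string: str) -> np.ndarray:
    """Decode a base64 embedding payload into a float32 vector."""
    raw = base64.b64decode(b64_string)
    # Assumption: values are packed as little-endian float32 ("<f4").
    return np.frombuffer(raw, dtype="<f4")


# Round-trip demo with a tiny hand-packed vector:
packed = base64.b64encode(np.array([0.1, 0.2], dtype="<f4").tobytes()).decode()
vec = decode_base64_embedding(packed)
```

In practice, though, it's simpler to just request floats up front, as shown in the fix below.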
2. Incorrect Parsing Method
The error message I encountered:

```
TypeError: 'Embedding' object is not subscriptable
```
Why:
- I mistakenly treated the response as a dictionary (using `response["data"]`) even though the Portkey SDK returns custom objects with attributes. The correct approach is to use attribute access (e.g., `response.data`).
The Fix: Correcting My Approach
Specify the Encoding Format
I updated the API call to include `encoding_format="float"`, ensuring the API returns a list of floats ready for numerical operations:
```python
response = portkey.embeddings.create(
    input=text,
    model="text-embedding-ada-002",
    encoding_format="float"
)
```
Parse the Response Using Attribute Access
I then adjusted my parsing code to reflect that the response is a custom object with attributes:
```python
embedding = np.array(response.data[0].embedding, dtype=float)
```
With these changes, I successfully extracted numeric embeddings for further computations like cosine similarity.
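For completeness, here's the kind of cosine similarity computation these embeddings feed into. This is a generic sketch over plain NumPy arrays, not code from the project itself:

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product divided by the product of norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors standing in for two text embeddings:
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, 1.0, 0.0])
print(cosine_similarity(v1, v2))  # ≈ 0.5 for these example vectors
```

With real embeddings, `v1` and `v2` would each be the `response.data[0].embedding` vector for one input text.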
Learning to Read the API Response Structure
Consult the Official Documentation:
The docs provided a JSON sample response:

```json
{
  "data": [
    {
      "index": 123,
      "embedding": [123],
      "object": "embedding"
    }
  ],
  "model": "<string>",
  "object": "list",
  "usage": {
    "prompt_tokens": 123,
    "total_tokens": 123
  }
}
```
Key takeaways:
- The response is a JSON object with keys like `"data"`, `"model"`, and `"usage"`.
- The `"data"` key holds a list (array) of embedding objects.
- Each embedding object contains an `"embedding"` key that holds the numeric vector when using `encoding_format="float"`.
Print and Inspect the Response:
I printed out the response:

```python
print(response)
```
The output looked like:

```
CreateEmbeddingResponse(..., data=[Embedding(embedding=[...], index=0, object='embedding')], ...)
```
This confirmed:
- The response is a custom object.
- The `data` attribute is a list containing one element (at index 0: `index=0`).
- That element is an `Embedding` object, whose `embedding` attribute holds the actual numeric vector generated by the model for the input text.
By combining these two approaches—consulting the documentation and inspecting the printed response—I was able to deduce the correct way to parse the API response.
Final Thoughts
This post captures my general approach to calling an LLM embedding API through Portkey and parsing its response. The lessons learned here, specifying the correct encoding format and mapping the documented JSON structure to code by inspecting the actual response, will serve as a handy reference for future projects.