Comment by cvhc
I'm frustrated that some fundamental aspects of agent development lack clear guidelines.
One example is the input/output types of function tools. Frameworks offer some flexibility, and seemingly I can use fundamental types and simple data structures (list, dict/map). But on the other hand, I know all data types are eventually stringified, and this has implications.
I recently observed two issues when my agent calls a function that simply takes some int64 numeric IDs: (1) when the IDs appear as hexadecimal in the context, the LLM tries to convert them to decimal itself but messes it up because it doesn't actually calculate; (2) some large IDs are not passed precisely in the Google ADK framework [1], presumably because its JSON serialization fails to preserve the precision. I ended up changing the function to take string args instead. I also wasn't sure whether the tool should return the data as close to the original as possible, in a moderately deeply nested dict, or go a step further and organize the output into a more human-readable text format for model ingestion.
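To make the string-args workaround concrete, here is a minimal Python sketch. The names (parse_id, survives_float64, lookup_item) are hypothetical and not from ADK or any other framework. The float64 round trip is only one plausible way an int64 can lose precision in a JSON pipeline; I haven't confirmed that this is what happens in [1].

```python
# Hypothetical helper names; nothing here is a framework API.

def parse_id(raw: str) -> int:
    """Parse an ID passed as text: decimal, or hex with a 0x prefix."""
    s = raw.strip().lower()
    # The conversion is done by code, so the model never has to do
    # hex-to-decimal arithmetic itself.
    return int(s, 16) if s.startswith("0x") else int(s, 10)


def survives_float64(n: int) -> bool:
    """True if n is unchanged after a round trip through a float64."""
    return int(float(n)) == n


def lookup_item(item_id: str) -> dict:
    """Example tool that accepts the ID as a string instead of an int."""
    n = parse_id(item_id)
    if not 0 <= n < 2**63:
        raise ValueError(f"ID out of int64 range: {item_id!r}")
    # Return the ID as a string too, so it stays exact on the way back.
    return {"item_id": str(n)}


big = parse_id("0x7fffffffffffffff")  # largest int64
print(big, survives_float64(big))     # the float64 round trip changes it
print(lookup_item(" 0xFF "))
```

The tool accepts whatever form of the ID is in the context, and the ID never passes through a numeric JSON value in either direction.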
OpenAI's doc [2] says: "A result must be a string, but the format is up to you (JSON, error codes, plain text, etc.). The model will interpret that string as needed." But that clearly contradicts what the frameworks allow, as well as some official examples where dicts or numbers are returned.
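One way to sidestep the ambiguity is to do the stringification explicitly instead of leaving it to the framework. The sketch below is just an illustration using Python's standard json module. The helper to_tool_output and the 2**53 threshold are my own assumptions, not anything the OpenAI or ADK docs prescribe.

```python
import json

FLOAT64_EXACT_LIMIT = 2**53  # beyond this, float64 cannot hold every integer


def _protect_big_ints(value):
    """Recursively turn integers that a float64 can't hold exactly into strings."""
    if isinstance(value, bool):  # bool is a subclass of int; leave it alone
        return value
    if isinstance(value, int) and abs(value) > FLOAT64_EXACT_LIMIT:
        return str(value)
    if isinstance(value, dict):
        return {k: _protect_big_ints(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [_protect_big_ints(v) for v in value]
    return value


def to_tool_output(result) -> str:
    """Hypothetical helper: serialize a tool result to the string the model sees."""
    return json.dumps(_protect_big_ints(result), sort_keys=True)


print(to_tool_output({"id": 2**63 - 1, "tags": ["a", "b"], "count": 3}))
# → {"count": 3, "id": "9223372036854775807", "tags": ["a", "b"]}
```

This way the tool author decides exactly which string the model receives, whatever the framework does with dicts and numbers.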
[1] https://github.com/google/adk-python/issues/3592
[2] https://platform.openai.com/docs/guides/function-calling