Assistants
The main idea behind assistants is to leverage the reasoning capabilities of LLMs, allowing the model to choose sequences of actions to take, rather than having these actions pre-coded.
With sufficient prompts and access to knowledge, LLM agents can operate semi-autonomously to assist humans across a variety of applications, from conversational chatbots to workflow automation and goal-based task execution.
Cognitive Solutions assistants can be configured dynamically and quickly, choosing from a wide range of options to model the different problems and tasks they are meant to solve.
Key Features of Assistant Configuration
Interaction flow
Flows allow the modeling of a wide range of complex interactions, enabling customization of all stages of user interaction as well as the internal process of generating responses or executing actions.
We use tools to perform more complex tasks. These tools can be fully configurable Python scripts.
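As an illustration, a tool can be a plain Python function. The function name and signature below are assumptions for the sketch, not the product's actual tool interface:

```python
from datetime import datetime, timezone

def get_current_date() -> str:
    """Example tool that needs no model call: today's UTC date in ISO 8601."""
    return datetime.now(timezone.utc).date().isoformat()
```

A tool like this can run directly inside a flow, since it gathers no information from the model or the user.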
Within flows, we can define:
- Prompt instructions for the task.
- Tools that do not require a model to execute, such as retrieving the current date.
- Tools that gather information from user interactions and perform actions using that information.
- Different stages of model access where tools are made available.
- Optimized generation configurations for each stage of interaction with LLMs, with the ability to use different providers and models within the same interaction.
- Use of frequently asked questions to speed up responses.
- Use of fixed responses based on the topic of the question to ensure controlled responses, especially recommended for sensitive topics.
- The number of recursive calls to LLMs allowed.
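The options above could be collected into a flow definition along these lines. The key names and values are illustrative assumptions, not the product's actual configuration schema:

```python
# Hypothetical flow definition; every key name here is an assumption
# used only to illustrate the configurable options.
FLOW = {
    "prompt": "Answer the user's question using the retrieved context.",
    "model_free_tools": ["get_current_date"],      # run without an LLM call
    "interactive_tools": ["lookup_order_status"],  # use info gathered from the user
    "use_faq_cache": True,                         # answer frequent questions quickly
    "fixed_responses": {                           # controlled answers for sensitive topics
        "medical_advice": "Please consult a qualified professional.",
    },
    "max_llm_recursion": 3,                        # cap on recursive LLM calls
}
```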
Generation
It is possible to select different generation configurations optimized for various tasks such as question answering, complex reasoning, image-based responses, goal-oriented guidance, and more.
The service provider, model, temperature, and other parameters can be configured. Different settings can be applied depending on the stage of the flow.
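A per-stage selection of generation settings might look like the following sketch. The provider and model names are placeholders, and only the parameters named above (provider, model, temperature) are taken from the text:

```python
# Illustrative per-stage generation settings; provider and model
# identifiers are placeholders, not real configuration values.
GENERATION = {
    "question_answering": {"provider": "provider_a", "model": "model_x", "temperature": 0.2},
    "complex_reasoning":  {"provider": "provider_b", "model": "model_y", "temperature": 0.7},
}

def generation_config(stage: str) -> dict:
    """Pick the generation settings configured for a flow stage."""
    return GENERATION.get(stage, GENERATION["question_answering"])
```

This is how different providers and models can coexist within the same interaction: each stage simply resolves its own settings.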
Content and information retrieval techniques
Information sources, search repositories, and retrieval techniques are fully configurable.
Content upload can be performed easily using standard parameters or can be fully customized to provide complete control over the uploaded information and its format. More information in the information section.
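As a minimal sketch of a retrieval step, uploaded content chunks can be ranked against a query. Real repositories and retrieval techniques are configurable and typically embedding-based; this keyword-overlap ranking is only an illustration:

```python
# Illustrative retrieval: rank content chunks by keyword overlap with
# the query. A configured repository would normally use richer
# techniques (e.g. embeddings); this is a stand-in sketch.
def rank_chunks(query: str, chunks: list[str]) -> list[str]:
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
```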
Evaluation
A component used during implementation to measure the assistant’s performance at each stage of the process. Using LLM-based metrics, embedding-based metrics, and traditional NLP metrics, we generate reports for the client with the assistant’s performance at a particular point in time. After the implementation stage, it is periodically run as regression tests.
When an assistant is created, it is necessary to define metrics, acceptance criteria, and visualizations for reports. More information in the evaluation section.
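One of the traditional NLP metrics mentioned above can be sketched as token-level F1 between a generated answer and a reference answer. This is a generic illustration of the metric, not the evaluation component's actual implementation:

```python
def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    ref_counts: dict[str, int] = {}
    for t in ref:
        ref_counts[t] = ref_counts.get(t, 0) + 1
    common = 0
    for t in pred:
        if ref_counts.get(t, 0) > 0:  # count each reference token at most once
            common += 1
            ref_counts[t] -= 1
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Scores like this can be compared against the defined acceptance criteria on each regression run.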
Accessing Assistants via API
Retrieving Assistant Information for Chat-UI Integration
When integrating assistants into a chat-ui component (such as ShowQuestion), you can use the GET /assistants/info/ endpoint to retrieve lightweight assistant data containing only the essential fields required to configure and display the chat interface.
This endpoint returns each assistant's:
- Display name (info) — customizable from the implementation portal
- Description — purpose or scope of the assistant
- Initial message — welcome message to greet the user
- Message length limit (max_msg_length) — maximum character limit per message
- Feature flags — streaming_available, matrix_mode_available, realtime_available
- Visual personalization — logoURL and customcolors (name color, border color, primary color, secondary color)
This lightweight response is optimized for chat-ui initialization, avoiding the overhead of full assistant configuration data. Personalization fields (info, logo, colors) can be null; in such cases, the chat-ui should apply appropriate defaults.
For complete assistant configuration details, use the GET /assistants/ endpoint instead.
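A chat-ui initialization step along these lines could fetch the lightweight data and fill null personalization fields with defaults. The base URL, the default values, and the individual color key names are assumptions for the sketch; the endpoint path and field names follow the text above:

```python
import json
from urllib import request

# Assumed base URL for the sketch; replace with the real deployment URL.
BASE_URL = "https://example.com/api"

# Illustrative chat-ui defaults applied when personalization fields are
# null. The color key names inside customcolors are assumptions.
CHAT_UI_DEFAULTS = {
    "info": "Assistant",
    "logoURL": None,
    "customcolors": {
        "name_color": "#1a1a1a",
        "border_color": "#e0e0e0",
        "primary_color": "#0055aa",
        "secondary_color": "#f5f5f5",
    },
}

def apply_chat_ui_defaults(assistant: dict) -> dict:
    """Replace null personalization fields with the chat-ui defaults."""
    merged = dict(assistant)
    for key, default in CHAT_UI_DEFAULTS.items():
        if merged.get(key) is None:
            merged[key] = default
    return merged

def fetch_assistants_info() -> list[dict]:
    """GET /assistants/info/ and normalize each entry for the chat-ui."""
    with request.urlopen(f"{BASE_URL}/assistants/info/") as resp:
        assistants = json.load(resp)
    return [apply_chat_ui_defaults(a) for a in assistants]
```

Applying defaults client-side keeps the endpoint response lightweight while ensuring the chat-ui always has usable personalization values.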