…all of the models evaluated “demonstrate near-zero confidentiality awareness.”
Any agent that is accessible from outside the company (e.g. a customer support chatbot) is going to have to deal with malicious actors. If it has access to sensitive information, and no confidentiality awareness…seems like a problem.
This is fun too:
Any agent that is accessible from outside the company (e.g. a customer support chatbot) is going to have to deal with malicious actors. If it has access to sensitive information, and no confidentiality awareness…seems like a problem.