Comment by simonw
Paper: https://arxiv.org/abs/2505.18878
Code: https://github.com/SalesforceAIResearch/CRMArena
Data: https://huggingface.co/datasets/Salesforce/CRMArenaPro (8,614 rows)
Here's one of those JSON files loaded in Datasette Lite (15MB page load): https://lite.datasette.io/?json=https://huggingface.co/datas...
I had Gemini 2.5 Pro extract the prompts they used from the code:
llm install llm-gemini
llm install llm-fragments-github
llm -m gemini/gemini-2.5-pro-preview-06-05 \
-f github:SalesforceAIResearch/CRMArena \
-s 'Markdown with a comprehensive list of all prompts used and how they are used'
Result here: https://gist.github.com/simonw/33d51edc574dbbd9c7e3fa9c9f79e...
I recommend folks check out the linked paper -- it covers more than just confidentiality tests as a benchmark for whether these systems are ready for B2B AI usage.
But when it comes to confidentiality, fine-grained authorization securing your RAG layer is the only valid solution I've seen used in industry. Injecting data into the context window and relying on prompting will never be secure.
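To make that concrete, here's a minimal sketch of what "authorization at the RAG layer" means (all names are hypothetical, not from the paper or repo): the ACL check runs before retrieval results are assembled, so unauthorized records never reach the context window in the first place, and no prompt wording can leak them.

```python
# Hypothetical sketch: permission-filtered retrieval. The authorization
# check happens BEFORE any document reaches the model's context window.

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str
    allowed_roles: frozenset  # ACL stored alongside the document


def retrieve(query: str, user_roles: set, index: list) -> list:
    """Return only documents the requesting user is authorized to read."""
    # Authorization filter first -- unauthorized docs are dropped here,
    # so nothing downstream (ranking, prompting) can expose them.
    visible = [d for d in index if d.allowed_roles & user_roles]
    # Toy relevance step (real systems would use vector search etc.).
    return [d for d in visible if query.lower() in d.text.lower()]


index = [
    Document("a1", "Q3 revenue forecast for Acme", frozenset({"finance"})),
    Document("b2", "Public product FAQ", frozenset({"finance", "support"})),
]

# A support agent gets nothing for this query, no matter how the
# eventual prompt is phrased; a finance user gets the record.
support_hits = retrieve("forecast", {"support"}, index)
finance_hits = retrieve("forecast", {"finance"}, index)
```

The contrast with the prompting approach is that here the model is never asked to withhold anything -- the confidential record simply isn't in its input.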