Show HN: Neural-hash-collider – Find target hash collisions for NeuralHash
dang 2021-08-19 06:27:35 +0000 UTC [ - ]
Apple defends anti-child abuse imagery tech after claims of ‘hash collisions’ - https://news.ycombinator.com/item?id=28225706 - Aug 2021 (401 comments)
Hash collision in Apple NeuralHash model - https://news.ycombinator.com/item?id=28219068 - Aug 2021 (662 comments)
Convert Apple NeuralHash model for CSAM Detection to ONNX - https://news.ycombinator.com/item?id=28218391 - Aug 2021 (177 comments)
(I just mean related to this particular project. To list the threads related to the larger topic would be...too much.)
vermilingua 2021-08-19 02:34:09 +0000 UTC [ - ]
All it would take now is for one CSAM hash to become known to the public, and then for collided iPhone wallpapers to be uploaded to wallpaper download sites. That many false positives would overload whatever administrative capacity there is to review reports in a matter of days.
Matheus28 2021-08-19 02:38:48 +0000 UTC [ - ]
copperx 2021-08-19 03:06:16 +0000 UTC [ - ]
gnulinux 2021-08-19 05:56:02 +0000 UTC [ - ]
ec109685 2021-08-19 04:35:34 +0000 UTC [ - ]
That one can’t be figured out through this technique.
vermilingua 2021-08-19 05:28:47 +0000 UTC [ - ]
So either they have no intention to actually protect user data, or the system is trivially broken; either way a pretty damning look for Apple.
comex 2021-08-19 05:52:59 +0000 UTC [ - ]
vermilingua 2021-08-19 09:14:02 +0000 UTC [ - ]
ec109685 2021-08-19 15:38:20 +0000 UTC [ - ]
Syonyk 2021-08-19 04:44:49 +0000 UTC [ - ]
I mean, hopefully not, but at this point, it's reasonable to call just about everything into question on the topic.
ec109685 2021-08-19 04:54:02 +0000 UTC [ - ]
M4v3R 2021-08-19 05:42:11 +0000 UTC [ - ]
jdlshore 2021-08-19 05:51:32 +0000 UTC [ - ]
simondotau 2021-08-19 04:34:01 +0000 UTC [ - ]
Fnoord 2021-08-19 09:07:45 +0000 UTC [ - ]
For example, in the case of a wallpaper, let's say it's the Windows XP wallpaper. There's no human skin color in it at all, so you can easily be reasonably sure it isn't CP. You would not need advanced ML for that.
And they can have multiple checksums, just like a tarball or package or whatever can have a CRC32, MD5, and SHA512. Just because one of these matches doesn't mean the others do. The only problem is keeping these DBs of hashes secret. But that could very well be a reason the scanning isn't done locally.
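A rough sketch of that multiple-checksum analogy (the file name is made up; this only illustrates the analogy, not Apple's pipeline):

    import hashlib
    import zlib

    def digests(path):
        # Compute several independent digests of the same file; a forged
        # match against one of them says nothing about the others.
        with open(path, "rb") as f:
            data = f.read()
        return {
            "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
            "md5": hashlib.md5(data).hexdigest(),
            "sha512": hashlib.sha512(data).hexdigest(),
        }

    print(digests("example.tar.gz"))  # hypothetical file name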
vermilingua 2021-08-19 09:22:34 +0000 UTC [ - ]
Fnoord 2021-08-19 14:20:36 +0000 UTC [ - ]
user-the-name 2021-08-19 11:02:44 +0000 UTC [ - ]
romeovs 2021-08-19 08:07:07 +0000 UTC [ - ]
But could this not also be used to circumvent the CSAM scanning by converting images that are in the CSAM database to visually similar images that won't match the hash anymore? That would effectively defeat the CSAM scanning Apple and others are trying to put into place completely and render the system moot.
One could argue that these spoofed images could also be added to the CSAM database, but what if you spoof them to have hashes of extremely common images (like common memes)? Adding memes to the database would render the whole scheme unmanageable, no?
Or am I missing something here?
So we'd end up with a system that:
1. Can't be reliably used to track actual criminal offenders (they'd just be able to hide) without rendering the whole database useless.
2. Can be used to attack anyone by making it look like they have criminal content on their iPhones.
chongli 2021-08-19 08:47:34 +0000 UTC [ - ]
Wouldn't it be easier for offenders to avoid Apple products? That requires no special computer expertise and involves no risk on their part.
romeovs 2021-08-19 10:23:56 +0000 UTC [ - ]
They are trying to prevent CSAM images from being stored and distributed using Apple products. If that goal is easily circumvented, the whole motivation for this (anti-)feature becomes invalid.
In a way, this architecture could potentially even make Apple products more attractive to CSAM distributors, since they now have a known way to fly under the radar (something that is arguably harder/riskier on other image sharing platforms, where the matching happens server-side).
One reasonable strategy Apple could use against that is constantly fine-tuning the NeuralHash algorithm to hopefully catch more and more offenders. If that works reasonably well, it might deter criminals from the platform, because an image that flies under the radar now might not fly under the radar in the future.
NB. I'm not trying to say Apple is doing the right thing here, especially since the above arguments put the efficacy of this architecture under scrutiny.
nonbirithm 2021-08-19 11:03:06 +0000 UTC [ - ]
If what Apple is aiming for is a more complete version of E2EE on their servers, maybe that's just an unintended consequence of the implementation, and the very reason why they're surprised that this received so much pushback. If Apple wanted to offer encryption for all user files in iCloud and leave no capability to decrypt the files themselves, they'd still need to be able to detect CSAM to protect themselves from liability. In that case, scanning on the device would be the only way to make it work.
If that were the case, I still wouldn't believe that moving the scan to the device fundamentally changes anything. Apple has to conduct a scan regardless, or they'll become a viable option for criminals to store CSAM. But in Apple's view, their implementation would mean they'd likely be the first cloud company that could claim to have zero knowledge of the data on their servers while still satisfying the demands of the law.
Supposing that's the case, maybe what it would demonstrate is that no matter how you slice it, trying to offer a fully encrypted, no-knowledge solution for storing user data is fundamentally incompatible with societal demands.
But since Apple didn't provide such an explanation, we can only guess what their strategy is. They could have done a much better job of describing their motivations, instead of hoping that the forces of public sentiment would allow it to pass like all the other scanning mechanisms have in the past.
the_other 2021-08-19 10:25:17 +0000 UTC [ - ]
xucheng 2021-08-19 08:45:47 +0000 UTC [ - ]
It works like this. First, find your target images, which are either widely available (like internet memes, for a DoS attack) or images you want to censor. Then compute their NeuralHashes. Next, use the hash-collision tool to make real CSAM images have the same NeuralHash as the target images. Finally, report these adversarial CSAM images to the government. The result is that the attacker would successfully add the targeted NeuralHashes into the CSAM database, and people who store the legitimate images will then be flagged.
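For the "same NeuralHash" step, here is a minimal sketch of how such targeted collisions are typically found by gradient descent. The tiny random network and random tensors below are only stand-ins for the real perceptual hash model and real images, so this illustrates the technique rather than reproducing the linked tool:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Stand-in for the real 96-bit perceptual hash network (not included here).
    hash_net = nn.Sequential(
        nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        nn.Linear(8 * 16, 96),
    )
    for p in hash_net.parameters():
        p.requires_grad_(False)

    def bits(x):
        # The hash bits are just the signs of the network output.
        return (hash_net(x) > 0).int()

    target = torch.rand(1, 3, 64, 64)            # image whose hash we want to copy
    source = torch.rand(1, 3, 64, 64)            # image we perturb
    target_signs = bits(target).float() * 2 - 1  # +/-1 per bit

    delta = torch.zeros_like(source, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=1e-2)

    for step in range(500):
        out = hash_net((source + delta).clamp(0, 1))
        # Push every output logit past the target bit's sign, with a small
        # penalty to keep the perturbation visually minor.
        loss = torch.relu(0.1 - out * target_signs).sum() + 1e-3 * delta.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

    matched = int((bits((source + delta).clamp(0, 1)) == bits(target)).sum())
    print(f"matching bits: {matched} / 96")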
iaw 2021-08-19 03:06:59 +0000 UTC [ - ]
Again, really naive but it seems like if you have two distinct multi-dimensional hashes it would be much harder to solve the gradient descent problem.
63 2021-08-19 03:24:43 +0000 UTC [ - ]
There has been a lot of hyperbole going around and the original premise that this is a breach of privacy is still true, but in my opinion the actual repercussions of attacks and collisions are being grossly exaggerated. One would have to create a collision with known CSAM for both algorithms (one of which is secret) which also overlaps with a legal porn image that could be misconstrued as CSAM by a human reviewer, or at the very least create and distribute hundreds of double collisions to DOS the reviewers.
silvestrov 2021-08-19 06:41:09 +0000 UTC [ - ]
But then they still need to upload the original image to the server, so what was the reason for doing the scanning client-side when they upload it anyway?
Thorrez 2021-08-19 06:50:46 +0000 UTC [ - ]
Probably so that China cannot force Apple to hand over arbitrary images in iCloud (or all images in iCloud). With Apple's design the only images China can get from you are malicious images that people send you. If Apple scanned every image serverside without any clientside scanning, then theoretically China could get all newly-uploaded iCloud images.
blintz 2021-08-19 06:46:07 +0000 UTC [ - ]
MrMoenty 2021-08-19 07:25:51 +0000 UTC [ - ]
user-the-name 2021-08-19 11:06:00 +0000 UTC [ - ]
They are exactly the same case.
saithound 2021-08-19 12:37:44 +0000 UTC [ - ]
A cryptographic key is a piece of information which, as long as it remains secret, should be sufficient to protect the confidentiality and integrity of your system. This means that your system should remain secure even if your adversary knows everything else apart from the key, including the details of the algorithm you use, the hardware you have, and even all your previous plaintexts and ciphertexts (inputs and outputs). If the key fails to have this property, your cryptosystem is broken.
The trained model (or the weights of a NN) does not have this property at all. Keeping the model secret does not ensure the confidentiality or integrity of the system. E.g. just knowing some inputs and outputs of the secret model allows you to train your own classifier which behaves similarly enough to let you find perceptual hash collisions. If you treat your model as a cryptosystem, this would be a known-plaintext attack: any system vulnerable to these is considered completely and utterly broken.
You'd have to keep all of the following secret: the model, all its inputs, all its outputs. If you manage to do that, this might be secure. Might. But probably not. See also Part 2 of my FAQ, which happens to cover this question. [1]
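A toy illustration of that known-plaintext point (both networks here are small stand-ins; extracting a usable surrogate of the real NeuralHash would need far more queries and a better architecture):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def make_net():
        return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
                             nn.Linear(256, 96))

    secret = make_net()      # the "secret" model the attacker can only query
    surrogate = make_net()   # the model the attacker trains from query results

    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    for step in range(2000):
        x = torch.rand(64, 3, 32, 32)        # queries to the black box
        with torch.no_grad():
            y = (secret(x) > 0).float()      # only the 96-bit outputs are observed
        loss = loss_fn(surrogate(x), y)      # fit the surrogate to those bits
        opt.zero_grad()
        loss.backward()
        opt.step()

    x = torch.rand(256, 3, 32, 32)
    agree = ((surrogate(x) > 0) == (secret(x) > 0)).float().mean().item()
    print(f"surrogate agrees with the secret model on {agree:.1%} of bits")

Once the surrogate agrees well enough, you run the collision search against it instead of the secret model, which is exactly why "keep the model secret" is not a security boundary.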
user-the-name 2021-08-19 12:53:50 +0000 UTC [ - ]
This seems highly unlikely. You could train a model to find those exact known hashes, but I highly doubt you could get it to accurately find any other unknown hash.
> You'd have to keep all of the following secret: the model, all its inputs, all its outputs.
These are all, in fact, secret.
saithound 2021-08-19 13:27:45 +0000 UTC [ - ]
Your "highly doubt" is baseless. Black box attacks (where you create adversarial examples only using some inputs and outputs, but not the model) on machine learning models are not new. They have been demonstrated countless times [1]. You don't need to know the network at all.
> These are all, in fact, secret.
This is not the case, since regular, unprivileged Apple employees can and will look at the inputs and outputs of the model (the visual derivatives and their hashes). It's also irrelevant.
You insist that there is some kind of analogy between "keeping the model secret" and keeping a "cryptographic key" secret. There is no such analogy. It makes no sense. It is simply not there. Keeping a cryptographic key secret keeps confidentiality and integrity. Keeping your model secret accomplishes neither of these.
[1] https://towardsdatascience.com/adversarial-attacks-in-machin...
londons_explore 2021-08-19 07:12:59 +0000 UTC [ - ]
mikeyla85 2021-08-19 08:00:36 +0000 UTC [ - ]
ec109685 2021-08-19 03:39:06 +0000 UTC [ - ]
They also keep the second hash function private.
zepto 2021-08-19 04:18:13 +0000 UTC [ - ]
heavyset_go 2021-08-19 03:35:29 +0000 UTC [ - ]
yoz-y 2021-08-19 06:06:56 +0000 UTC [ - ]
1. People actually do use generally publicly available services to store and distribute CP (as suggested by the number of reports filed by Facebook).
2. A lot of people evidently use iCloud Photo Library to store images of things other than pictures they took themselves. This is not really surprising; I've learned that the answer to "does anybody ever?" questions is always "yes". It is a bit weird though, since the photo library is terrible for this use case.
signal11 2021-08-19 07:50:13 +0000 UTC [ - ]
Not in the context of CSAM, but this is iOS’s “appliance” user interface coming back to bite it. The iOS photos app doesn’t appear to have a way to show the user only the photos they took.
Apps like Twitter, browsers, chat apps, and screenshots all get to save their images in the photo library. I believe iOS 15 has a way to filter photos by originating app, but for most users currently it's hard not to use iCloud Photo Library for photos they didn't take themselves.
Interestingly, users save chat history including images, from apps like iMessage and WhatsApp, on iCloud too. I’m not sure what happens to e2e encryption for backed up data.
yoz-y 2021-08-19 10:47:27 +0000 UTC [ - ]
signal11 2021-08-19 11:38:39 +0000 UTC [ - ]
corgihamlet 2021-08-19 07:48:48 +0000 UTC [ - ]
That is just the way it works on iOS, and it is really annoying to have random cute dog pictures saved from Reddit crop up in "Your Furry Friends" compilations.
yoz-y 2021-08-19 10:45:35 +0000 UTC [ - ]
The first is now a bit easier since the share sheet also has "save to files" (only some of the time though, for no apparent reason). The second is a bit easier as there at least is a Screenshot automatic album.
But yes, I see why people do this, but I wish apple provided a better way to not pollute the iCloud library.
ThinBold 2021-08-19 04:16:50 +0000 UTC [ - ]
Surely you can easily fabricate innocent images whose NeuralHash matches the database. But how are you going to send them to victims and convince them to save them to their photo library? The moment you send one via WhatsApp, FB will stop you because (they think) it is a problematic image. And even if the image does land, it has to look like some cats and dogs or the receiver will just ignore it. (Even worse, the receiver may report you.) And even if your image does look like cats and dogs, it has to pass another automatic test on the server side that uses another obfuscated, constantly updating algorithm. After that, even more tests if Apple really wants to.
That means your image needs to collide three or more times: once against the open hash, once against the obfuscated one, and once against a human (the Turing test).
Gmail scans your attachments and most people are cool with it. I highly doubt that Apple has any reason to withdraw this.
undecisive 2021-08-19 10:31:12 +0000 UTC [ - ]
It boils down to this: If you can prevent [some organisation] from potentially destroying civilisation, how much effort would be too much effort, and how much uncertainty is too much uncertainty?
For most, there's a trade off. If someone believes that the technology is sufficient for any country to implement a brutal civil rights destruction campaign, and that this is 50% likely, and all you need to do is upload some harmless images to your icloud to thwart it, why wouldn't you? For example, maybe a certain political party could regain power in a massive way and start doing away with gay rights, locking up anyone with pictures of men kissing each other on their phones. Of course, other countries have other types of extremists that look like they might take over government, or have already taken over governments. In those countries, tools like this are already in place, and this new one could be very powerful.
So if you could upload some innocuous images and save hundreds of lives in 20 or more countries in the world, even if you don't fear for your own country, would you?
Assuming you said yes, the question is: could everyone who agrees give Apple enough false positives, and force enough human moderators to inspect the images, that the whole scheme becomes financially unviable on Apple's end?
Of course, the only thing we could do here is slow this "progress" down. Maybe we can use that extra time to share the message that "this kind of technology is not ok" and that being naive about what this tech will be used for is almost guaranteed to kill more people than it saves.
But while so few people are thinking of it in these terms, or if everyone believes the attempt to be futile, it can't work. It's like anything - by the time people realise there's a problem, it's often too late to fix the problem.
ThinBold 2021-08-19 14:09:55 +0000 UTC [ - ]
> save hundreds of lives, would you?
Sure, why not. But let me ask some questions, because at this point I am not sure whether people want Apple's system to be robust or jammable. If our fear is that Apple will tune the system to detect pictures of two men kissing, wouldn't an easily jammable system work in our favor, because we can DDoS it or threaten to do so anytime we want?
asxd 2021-08-19 08:01:56 +0000 UTC [ - ]
zo1 2021-08-19 08:47:48 +0000 UTC [ - ]
copperx 2021-08-19 05:37:00 +0000 UTC [ - ]
ThinBold 2021-08-19 05:50:09 +0000 UTC [ - ]
(In case you have a problem with me saying "so many", that I can fix.)
sennight 2021-08-19 09:18:18 +0000 UTC [ - ]
Because such architectural flaws become absolute train wrecks when scaled. Remember the Clipper Chip? This is like that: cryptographers pointing out fundamental flaws that may seem like minor issues to most of the users who were going to be compelled to use it - but at scale those flaws result in the direct opposite of the stated objectives.
It feels weird having to explain scalability on HN... everyone here should know that if your little scheme is struggling pre-rollout then trying to power through will only magnify your troubles. So it is hard to account for that blind spot that defenders of this thing seem to have.
ThinBold 2021-08-19 13:06:20 +0000 UTC [ - ]
The scalability issue seems to work in our favor because, perhaps, the normal usage will overwhelm the human reviewers Apple prepared and we don't even need to send troll images.
And for the entire time, our data remains untouched.
sennight 2021-08-19 14:57:20 +0000 UTC [ - ]
t-writescode 2021-08-19 07:25:23 +0000 UTC [ - ]
istingray 2021-08-19 04:38:52 +0000 UTC [ - ]
ThinBold 2021-08-19 05:59:35 +0000 UTC [ - ]
If I missed anything then surely I am willing to be corrected. But so far I don't see comments that show us how to penetrate the four-layer system (local hash check, semantic check by the user, on-server hash check, and human reviewer).
istingray 2021-08-19 06:26:05 +0000 UTC [ - ]
ThinBold 2021-08-19 16:46:50 +0000 UTC [ - ]
Alupis 2021-08-19 02:32:17 +0000 UTC [ - ]
I just don't understand what Apple's motivation would have been here. Surely this fallout could have been anticipated?
JohnJamesRambo 2021-08-19 03:06:12 +0000 UTC [ - ]
Alupis 2021-08-19 05:52:30 +0000 UTC [ - ]
rebuilder 2021-08-19 08:57:29 +0000 UTC [ - ]
Apple have decided their position of not being able to provide access to law enforcement is becoming a liability. They're probably under intense pressure from several governments on that front.
This is a way to intentionally let their hand be forced into scanning for arbitrary hashes on devices at the behest of governments, taking pressure off Apple and easing their relations with governments. They take a PR hit now, but it's not too bad since it's ostensibly about fighting child abuse, and Apple's heart is clearly in the right place. When later, inevitably, the hashes start to include other material, Apple can say their hands are tied on the matter - they can no longer use the "can't do it" defense and are forced to comply. This is much simpler than having to fight about it all the time.
gtm1260 2021-08-19 05:19:44 +0000 UTC [ - ]
cirrus3 2021-08-19 03:52:53 +0000 UTC [ - ]
They will have to do it either way, and the fact that they are even telling us how they plan to do it is more than we can say for other cloud services.
This is better than all the alternatives at this point, like it or not. If you don't like it, you might need to get up to speed on what other services you may already be using are doing.
Spivak 2021-08-19 02:41:58 +0000 UTC [ - ]
Alupis 2021-08-19 02:43:33 +0000 UTC [ - ]
As far as we know (and I'm sure lots of eyeballs are looking now) Android doesn't do this.
And frankly, why would Apple care that the FBI isn't cozy with them? Their entire brand is "security and privacy", which kind of goes against most three-letter agencies anyway.
nonbirithm 2021-08-19 07:18:46 +0000 UTC [ - ]
"Last year, for instance, Apple reported 265 cases to the National Center for Missing & Exploited Children, while Facebook reported 20.3 million, according to the center’s statistics. That enormous gap is due in part to Apple’s decision not to scan for such material, citing the privacy of its users."
If you were a law enforcement agency and noticed this discrepancy, would you believe that you'd be letting some number of child abusers get away because of that difference in 20 million reports? iCloud probably doesn't have the same level of adoption as Facebook, but the gap is still very large.
[1] https://www.nytimes.com/2021/08/05/technology/apple-iphones-...
simondotau 2021-08-19 04:44:19 +0000 UTC [ - ]
Android does not do on-device scanning, but Google does scan photos after they are uploaded to their cloud photo service. The effect is functionally identical: photos that are being uploaded to the cloud are scanned for CSAM. The only real distinction is who owns the CPU that computes the hash.
I doubt it's the FBI pressuring Apple. My suspicion is it's fear of the US Congress passing worse, even more privacy-invading laws under the guise of combating CSAM. If Apple's lobbyists can show that iPhones are already searching for CSAM, arguments for such laws get weaker.
Alupis 2021-08-19 04:46:22 +0000 UTC [ - ]
So did Apple, and pretty much all cloud hosting providers.
This on-device scanning is what's new, and very out of character for Apple.
> If Apple's lobbyists can show that iPhones are already searching for CSAM, arguments for such laws get weaker.
I'm not aware of any big anti-CSAM push being made by Congress. CSAM just isn't really a big issue in the US; the existing laws and culture are pretty effective already.
nonbirithm 2021-08-19 08:14:30 +0000 UTC [ - ]
As a result, people will focus their arguments instead on the technological flaws in the current implementation of on-device scanning or slippery slope arguments that are unlikely to become reality, the feature will be added anyway with no political opposition, and in the end Apple and/or the government will get what they want, for what they consider the greater good.
I think that absolute privacy in society as a whole isn't attainable with those values in place, and it raises many questions regarding to what extent the Internet should remain free from moderation. Are there really no kinds of information that are so fundamentally damaging that they should not be allowed to exist on someone's hard drive? If not, who will be in control of moderating that information? Maybe we will have to accept that some tradeoffs between privacy and stability need to be made for the collective good, in limited circumstances.
eivarv 2021-08-19 08:37:57 +0000 UTC [ - ]
simondotau 2021-08-19 04:55:29 +0000 UTC [ - ]
Right now. The best time for Apple to do this is when it cannot be painted as a defensive move against any specific legislation. The CSAM argument has been used many times in the past and it's certain to be used many more times in the future.
sgent 2021-08-19 07:19:08 +0000 UTC [ - ]
https://www.nytimes.com/2020/02/07/us/online-child-sexual-ab...
simondotau 2021-08-19 04:51:42 +0000 UTC [ - ]
Ethics aside, on-device scanning has the benefit of Constitutional protection, at least in the USA. Because the searching is being performed on private property, any attempt by the Government to try to expand the scope of searches would be a clear-cut 4th Amendment violation.
(Whereas if the scanning is done in the cloud, Government can compel searches and that would fall under the "third party doctrine" which is an end-run around the 4th Amendment.)
Alupis 2021-08-19 04:58:17 +0000 UTC [ - ]
Is it though? The device someone bought two years ago suddenly starts reporting them to the FBI's anti-CSAM unit without the owner's realistic consent, which does seem like a run-around to unpermissioned government searches. It's not reasonable to say "throw away your $1200 device if you don't consent", is it? Nor can a person reasonably avoid iOS updates that force this feature to be active.
> any attempt by the Government to try to expand the scope of searches
We've seen private companies willfully censor individuals at the government's behest under the current administration - will Apple begin expanding the search and reporting mechanisms just to stay in whatever administration's good grace?
Like I said, this is extremely out of character, and very off-brand for Apple. Why would someone trust Apple going forward? Even Google's Android doesn't snitch on its owners to law enforcement... setting aside all the ways for nefarious actors to abuse this system and sic LE on innocent individuals.
simondotau 2021-08-19 05:02:32 +0000 UTC [ - ]
Yes. Your phone is your private property, just like your house or your car. Searching your private property requires a warrant or reasonable suspicion, otherwise it's a 4th Amendment violation.
This twitter thread is worth a read.
https://twitter.com/pwnallthethings/status/14248736290037022...
copperx 2021-08-19 05:29:34 +0000 UTC [ - ]
Alupis 2021-08-19 05:06:27 +0000 UTC [ - ]
You can't realistically avoid the iOS update. Apple has effectively given consent on your behalf... How will that fly?
simondotau 2021-08-19 05:11:46 +0000 UTC [ - ]
If you're concerned about other forms of scanning compelled by the Government, you never consented to the search. So even if Apple complied, the search is invalid and cannot be used to prosecute you.
torstenvl 2021-08-19 05:19:03 +0000 UTC [ - ]
This is a dangerously false understanding of the law. Stop giving legal advice. You are not a lawyer.
simondotau 2021-08-19 05:25:47 +0000 UTC [ - ]
I'm curious, do you think that the Third Party Doctrine applies here?
Alupis 2021-08-19 05:48:25 +0000 UTC [ - ]
simondotau 2021-08-19 06:06:02 +0000 UTC [ - ]
Alupis 2021-08-19 06:25:31 +0000 UTC [ - ]
The current social media companies all seem willing to censor at the suggestion of the administration.
simondotau 2021-08-19 06:31:15 +0000 UTC [ - ]
I don't think you're being serious.
Alupis 2021-08-19 06:43:59 +0000 UTC [ - ]
The only realistic alternative to Apple is Android... And Google is pretty darn transparent in their spying on users. Apple just did a 180 degree about-face on all the branding they've built over the last decade. Why should anyone trust Apple again?
Look, this whole NeuralHash thing took, what, two weeks for people to fabricate collisions? That just illustrates how poorly conceived and ill-thought-out the entire plan was from Apple. It's not beyond reason to assume any of these things given the evidence we currently have.
simondotau 2021-08-19 07:04:39 +0000 UTC [ - ]
You think Facebook and Twitter have dobbed users into the US Government for spreading unflattering memes? You are delusional, or more likely, not being serious. This is the last reply you'll be seeing from me. I'm collapsing this thread and won't be replying any more.
Alupis 2021-08-19 05:17:13 +0000 UTC [ - ]
As previously mentioned, Android doesn't scan all photos on your device... Google scans content uploaded to their servers. Which is reasonable... It's their servers, they can host what they want. Your iPhone is your iPhone.
simondotau 2021-08-19 05:26:40 +0000 UTC [ - ]
Citation?
Alupis 2021-08-19 05:34:39 +0000 UTC [ - ]
simondotau 2021-08-19 05:49:01 +0000 UTC [ - ]
From everything I've read, from Apple and other sources, if the photo is about to be uploaded to iCloud Photo Library then it is scanned for CSAM. If it's not, it isn't.
Alupis 2021-08-19 05:53:41 +0000 UTC [ - ]
simondotau 2021-08-19 06:07:16 +0000 UTC [ - ]
Still waiting on that citation.
Alupis 2021-08-19 06:18:15 +0000 UTC [ - ]
simondotau 2021-08-19 06:34:06 +0000 UTC [ - ]
Still waiting on that citation.
Alupis 2021-08-19 06:42:31 +0000 UTC [ - ]
simondotau 2021-08-19 07:02:49 +0000 UTC [ - ]
copperx 2021-08-19 05:27:00 +0000 UTC [ - ]
Where does this assumption come from? Because of iOS lower market share? Are you implying they are more prevalent in Android devices? In desktop computers? I don't understand the logic.
simondotau 2021-08-19 05:57:13 +0000 UTC [ - ]
You don't have to be particularly tech savvy to know it's a bad idea to co-mingle your deepest darkest secrets alongside photos of your mum and last night's dinner. Especially when discovering those secrets would lead to estrangement, or prison.
As for the few who might be doing it currently, that's likely to plummet quickly. If you think Apple's move caused waves in the Hacker News crowd, just imagine how much it has blown up in the CSAM community right now. I dare say it's probably all they've been talking about for the past two weeks.
sgerenser 2021-08-19 12:26:55 +0000 UTC [ - ]
norov 2021-08-19 03:42:35 +0000 UTC [ - ]
spullara 2021-08-19 03:42:00 +0000 UTC [ - ]
https://transparencyreport.google.com/child-sexual-abuse-mat...
jjcon 2021-08-19 06:03:56 +0000 UTC [ - ]
To be clear, Apple does not utilize E2E in iCloud. They can (and already do) scan iCloud contents
ncw96 2021-08-19 02:01:10 +0000 UTC [ - ]
only_as_i_fall 2021-08-19 02:21:55 +0000 UTC [ - ]
If Apple wants to defend this they should try to explain how the system will work even if generating adversarial images is trivial.
ncw96 2021-08-19 02:50:06 +0000 UTC [ - ]
1. You have to reach a threshold of matches before your account is flagged.
2. Once the threshold is reached, the matched images are checked against a different perceptual hash algorithm on Apple servers. This means an adversarial image would have to trigger a collision on two distinct hashing algorithms.
3. If both hash algorithms show a match, then “visual derivative” (low-res versions) of the images are inspected by Apple to confirm they are CSAM.
Only after these three criteria are met is your account disabled and referred to NCMEC. NCMEC will then do their own review of the flagged images and refer to law enforcement if necessary.
[1]: https://www.apple.com/child-safety/pdf/Security_Threat_Model...
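A simplified model of that layered flow (the data structure, function names, and the reuse of the same 30-image threshold at every stage are my assumptions for illustration; Apple's exact parameters and checks aren't public):

    from dataclasses import dataclass

    THRESHOLD = 30  # match threshold cited in Apple's threat-model document

    @dataclass
    class Photo:
        neuralhash: str
        server_hash: str        # result of the second, private hash algorithm
        looks_like_csam: bool   # outcome of human review of the visual derivative

    def account_flagged(photos, db_neuralhashes, db_server_hashes):
        # 1. Count client-side NeuralHash matches.
        matches = [p for p in photos if p.neuralhash in db_neuralhashes]
        if len(matches) < THRESHOLD:
            return False
        # 2. Re-check the matches against the independent server-side hash.
        matches = [p for p in matches if p.server_hash in db_server_hashes]
        if len(matches) < THRESHOLD:
            return False
        # 3. Human review of the low-resolution visual derivatives.
        return any(p.looks_like_csam for p in matches)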
only_as_i_fall 2021-08-19 11:42:30 +0000 UTC [ - ]
I mean assuming the purpose is to catch child abusers and not merely to use this particular boogeyman to introduce a back door for later use.
Dylan16807 2021-08-19 03:13:07 +0000 UTC [ - ]
copperx 2021-08-19 03:13:21 +0000 UTC [ - ]
ncw96 2021-08-19 03:25:40 +0000 UTC [ - ]
I don’t believe Apple has said whether or not they send them in their initial referral to NCMEC, but law enforcement could easily get a warrant for them. iCloud Photos are encrypted at rest, but Apple has the keys.
(Many have speculated that this CSAM local scanning feature is a precursor to Apple introducing full end-to-end encryption for all of iCloud. We’ll see.)
Scaevolus 2021-08-19 02:06:28 +0000 UTC [ - ]
There are other ways to guess what the hashes are, but I can't think of legal ones.
> Matching-Database Setup. The system begins by setting up the matching database using the known CSAM image hashes provided by NCMEC and other child-safety organizations. First, Apple receives the NeuralHashes corresponding to known CSAM from the above child-safety organizations. Next, these NeuralHashes go through a series of transformations that includes a final blinding step, powered by elliptic curve cryptography. The blinding is done using a server-side blinding secret, known only to Apple. The blinded CSAM hashes are placed in a hash table, where the position in the hash table is purely a function of the NeuralHash of the CSAM image. This blinded database is securely stored on users’ devices. The properties of elliptic curve cryptography ensure that no device can infer anything about the underlying CSAM image hashes from the blinded database.
https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...
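For anyone wondering what the "blinding" buys, here is a heavily simplified sketch of the idea using plain modular exponentiation instead of elliptic curves, and skipping the private set intersection entirely. The hash strings are placeholders and the parameters are not secure, so treat it as a concept demo rather than Apple's actual construction:

    import hashlib
    import secrets

    P = 2**127 - 1  # toy prime modulus; the real system uses elliptic curves

    def hash_to_group(h: bytes) -> int:
        # Map a (NeuralHash-like) byte string to a group element.
        return int.from_bytes(hashlib.sha256(h).digest(), "big") % P

    server_secret = secrets.randbelow(P - 2) + 2  # known only to the server

    def blind(element: int) -> int:
        # Server-side blinding: raise the element to the secret exponent.
        return pow(element, server_secret, P)

    # Database setup: the server blinds the known hashes and ships only the
    # blinded values to devices. The byte strings below are placeholders.
    known_hashes = [b"neuralhash-of-known-image-1", b"neuralhash-of-known-image-2"]
    blinded_db = {blind(hash_to_group(h)) for h in known_hashes}

    # A device holding blinded_db cannot recover the original hashes without
    # server_secret, and in the real scheme even match/no-match is hidden from
    # the device by the private set intersection layer on top of this.
    print(len(blinded_db), "blinded entries")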
dannyw 2021-08-19 02:12:32 +0000 UTC [ - ]
It's also possible for someone (Attacker A) to go on the darknet and get a list of 96-bit neural hashes, and then publish or sell this list somewhere to another party, Attacker B. The second party would never have to interact with CSAM.
Imagine Ransomware v2: We have inserted 29 photos of CSAM-matching material into your photo library. Pay X monero to this address in 30 minutes, or we will insert 2 additional photos, which will cross the threshold and may result in serious and life-changing consequences to you[1].
The difference here (versus the status quo) is that an easily broken perceptual hash enables the attacker to never send or possess any CSAM images[2]. From my experience of being a victim of various hackers, I know a lot of them won't touch CSAM because they know it's wrong, but they'll salivate at an opportunity to weaponise automated CSAM scanning.
[1]: If you think Apple's human review will mitigate this attack, you can permute legal pornography to match CSAM signatures. If Apple's reviewers see 30 CSAM matches and the visual derivatives look like porn, they will be legally required to report it to NCMEC (a statutory quasi-government agency staffed by the FBI), even if all the photos are actually of consenting adults.
[2]: If you never possess nor touch CSAM, it might be harder for you to get charged with CP charges. You might be looking at CFAA, blackmail or extortion charges; while your victim faces child pornography charges. This is basically an "amplification attack" on the real world judicial system.
Scaevolus 2021-08-19 02:30:09 +0000 UTC [ - ]
It's certainly possible, but I posit that the exploit chain necessary to get the capability to inject photos onto an arbitrary user's iPhone is valuable enough that it's more likely to be used for spying by repressive regimes than straight up blackmail-- and if you had such a capability, why bother with hash-colliding permutations of legal pornography? Why not plant CSAM directly onto the user's device?
Nearly all cloud storage services implement a scanner like this, and permit the same level of blackmail with a simpler attack chain, such as phishing Dropbox credentials to inject illegal material.
I think the more interesting attacks are governments colluding to add politically motivated non-CSAM material to the lists and then requiring Apple allow them to perform the human review to discover dissidents.
dannyw 2021-08-19 02:41:02 +0000 UTC [ - ]
If you plant material that matches CSAM hashes, you do none of that. The median ransomware actor might find this to be the fastest way to collect a thousand monero.
Also, you can distribute 30 media items per message via WhatsApp. There is a configurable setting for WhatsApp to save all received photos to your iCloud photo library. No exploits needed; you could probably weaponise this via a WhatsApp bot.
d0100 2021-08-19 02:27:56 +0000 UTC [ - ]
Even worse, just get a "teen" porn screengrab, pass it through the collider and you have pretty much a smoking gun
XorNot 2021-08-19 02:46:41 +0000 UTC [ - ]
So I suspect it would be easier than that (particularly since this whole hashing scheme has been surrounded by a lot of clear garbage - "1 in a trillion" -> on-demand collisions in a couple of weeks?).
ComputerGuru 2021-08-19 03:25:58 +0000 UTC [ - ]
XorNot 2021-08-19 04:04:11 +0000 UTC [ - ]
ComputerGuru 2021-08-19 16:33:38 +0000 UTC [ - ]
user-the-name 2021-08-19 11:34:24 +0000 UTC [ - ]
bArray 2021-08-19 02:41:51 +0000 UTC [ - ]
Next, embed your images in sites of interest, like:
* A meme in some group
* A document or 'leak'
* An email to a journalist
Wait for somebody to save it to their Apple device. Wait for it to be flagged and then use that as 'reasonable means to conduct a search'. When asking for a warrant, the agency would say something like "we detected possible CSAM on a device, the likelihood of a false match is extremely low" - a judge will hardly press further.
You now essentially have a weapon where you can search any Apple device in the name of preventing the distribution of CSAM.
Failing that, you could just have `document_leak.pdf` and download a file that is both a valid PDF and a child porn image, depending on which program you open it with.
godelski 2021-08-19 03:22:26 +0000 UTC [ - ]
Honestly this even adds to the danger of hash collisions because now you can get someone on a terrorist watch list as well as the kiddy porn list.
ec109685 2021-08-19 04:05:09 +0000 UTC [ - ]
ec109685 2021-08-19 04:04:28 +0000 UTC [ - ]
jazzyjackson 2021-08-19 05:22:23 +0000 UTC [ - ]
ipiz0618 2021-08-19 02:53:21 +0000 UTC [ - ]
jchw 2021-08-19 02:02:29 +0000 UTC [ - ]
PBnFlash 2021-08-19 02:17:01 +0000 UTC [ - ]
jchw 2021-08-19 02:25:47 +0000 UTC [ - ]
only_as_i_fall 2021-08-19 02:23:51 +0000 UTC [ - ]
They're hashing on feature space (so trivial cropping and such doesn't defeat this) but they have two totally separate methods of matching those hashes? Doesn't sound right to me...
jchw 2021-08-19 02:29:15 +0000 UTC [ - ]
> In a call with reporters regarding the new findings, Apple said its CSAM-scanning system had been built with collisions in mind, given the known limitations of perceptual hashing algorithms. In particular, the company emphasized a secondary server-side hashing algorithm, separate from NeuralHash, the specifics of which are not public. If an image that produced a NeuralHash collision were flagged by the system, it would be checked against the secondary system and identified as an error before reaching human moderators.
https://www.theverge.com/2021/8/18/22630439/apple-csam-neura...
For one reason or another Apple really wants to create this precedent, so it’s only natural they’re doing every last thing to make the feature hard to defeat.
zepto 2021-08-19 04:20:28 +0000 UTC [ - ]
onepunchedman 2021-08-19 02:22:45 +0000 UTC [ - ]
Until that day, just send known CSAM to any person you'd like to get in trouble (make sure they have icloud sync enabled), be it your neighbour or a political figure, and start a PR campaign accusing the person of being investigated for it. The whole concept is so inherently flawed it's crazy they haven't been sued yet.
dannyw 2021-08-19 02:27:38 +0000 UTC [ - ]
With the previous status quo:
1. The attacker faces charges of possessing and distributing child pornography
2. The victim may be investigated and charged with child pornography if LEO is somehow alerted (which requires work, and can be traced to the attacker).
Poor risk/reward payoff, specifically the risk outweighs the reward. So it doesn't happen (often).
---
With the new status quo of lossy, on-device CSAM scanning and automated LEO alerting:
1. The attacker never sends CSAM, only material that collides with CSAM hashes. They will be looking at charges of CFAA, extortion, and blackmail.
2. The victim will be automatically investigated by law enforcement, due to Apple's "Safety Voucher" system. The victim will be investigated for possessing child pornography, particularly if the attacker collides legal pornography that may fool a reviewer inspecting a 'visual derivative'.
Great risk/reward payoff. The reward dramatically outweighs the risk, as you can get someone in trouble for CSAM without ever touching CSAM yourself.
If you think ransomware is bad, just imagine CSAM-collision ransomware. Your files will be replaced* with legal pornography that is designed specifically to collide with CSAM hashes and result in automated alerting to law enforcement. Pay X monero within the next 30 minutes, or quite literally, you may go to jail, and be charged with possessing child pornography, until you spend $XXX,XXX on lawyers and expert testimony that demonstrates your innocence.
* Another delivery mechanism for this is simply sending collided photos over WhatsApp, as WhatsApp allows for up to 30 media images in one message, and has settings that will automatically add these images to your iCloud photo library.
fay59 2021-08-19 03:42:22 +0000 UTC [ - ]
sigmar 2021-08-19 04:11:37 +0000 UTC [ - ]
fay59 2021-08-19 04:47:23 +0000 UTC [ - ]
sigmar 2021-08-19 15:26:29 +0000 UTC [ - ]
fay59 2021-08-19 15:59:04 +0000 UTC [ - ]
ehsankia 2021-08-19 05:11:13 +0000 UTC [ - ]
Also, the whole point is that it's fairly easy to create a fake image that collides with one hash, but doing it for 2 is exponentially harder. It's hard to see how you could have an image that collides with both hashes (of the same image mind you).
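Rough numbers on the "exponentially harder" point, under the hypothetical assumption that the second hash is an independent, roughly uniform 96-bit perceptual hash (real perceptual hashes are correlated for visually similar images, so this is only an intuition):

    # Hypothetical: an image crafted to collide with the public 96-bit hash
    # only matches an independent second 96-bit hash by chance.
    bits = 96
    p_second_match = 2.0 ** -bits
    print(f"chance of also matching an independent {bits}-bit hash: ~{p_second_match:.2e}")
    # ~1.26e-29 per image, under the independence/uniformity assumption above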
ummonk 2021-08-19 06:30:32 +0000 UTC [ - ]
Of course, it won't be public (and if it ever became public they'd replace it with a different secret hash).
snovv_crash 2021-08-19 05:47:21 +0000 UTC [ - ]
iflp 2021-08-19 04:22:03 +0000 UTC [ - ]
lamontcg 2021-08-19 04:11:34 +0000 UTC [ - ]
Don't know why this hasn't already been used on other cloud services, but maybe it will be now that it's been more widely publicized.
salawat 2021-08-19 05:42:56 +0000 UTC [ - ]
Or are we going to say secret evidence is just fine nowadays? Bloody mathwashing.
sethgecko 2021-08-19 07:46:05 +0000 UTC [ - ]
rnjesus 2021-08-19 06:27:20 +0000 UTC [ - ]
salawat 2021-08-19 17:03:45 +0000 UTC [ - ]
However, the line of thinking was if Apple has a secondary classifier to run against visual derivatives, the intent is it can say "CSAM/Not CSAM". Since the NeuralHash can collide, that means they'd need something to take in the visual derivatives, and match it vs an NN trained on actual CSAM. Not hashes. Actual.
Evidence, as far as I'm aware, is admitted to the public record, and a link needs to exist, and be documented in a public and auditable way. That to me implies any results of a NN would necessarily require that the initial training set be included for replicability if we were really out to maintain the full integrity of the chain of evidence that is used as justification for locking someone away. That means a snapshot of the actual training source material, which means large CSAM dump snapshots being stored for each case using Apple's classifier as evidence. Even if you handwave the government being blessed to hold onto all that CSAM as fitting comfortably in the law enforcement action exclusions, it's still littering digital storage somewhere with a lotta CSAM. Also Apple would have to update their model over time, which would require retraining, which would require sending that CSAM source material to somewhere other than NCMEC or the FBI (unless both those agencies now rent out ML training infrastructure for you to do your training on leveraging their legal carve out, and I've seen or come across no mention of that.)
So I feel that, logistically speaking, someone is committing an illegal act somewhere, but no one wants to rock the boat enough to figure it out, because it's more important to catch pedophiles than muck about with blast craters created by legislation.
I need to go read the legislation more carefully, so just take my post as a grunt of frustration at how it seems like everyone just wants an excuse/means to punish pedophiles, but no one seems to be making a fuss over the devil in the details, which should really be the core issue in this type of thing, because it's always the parts nobody reads or bothers articulating that come back to haunt you in the end.
onepunchedman 2021-08-19 02:31:11 +0000 UTC [ - ]
copperx 2021-08-19 02:55:41 +0000 UTC [ - ]
Sure, there is hyperbole in OP's comment (CSAM ransomware and automated law enforcement aren't a thing yet), but we're a few steps from that reality.
Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al implement the same features? Or worse, required by law to do so?
This sounds like the start of an exodus from the cloud, at least in the non-developer consumer space.
spullara 2021-08-19 03:39:26 +0000 UTC [ - ]
https://transparencyreport.google.com/child-sexual-abuse-mat...
onepunchedman 2021-08-19 03:13:50 +0000 UTC [ - ]
* With regard to the legislative branch, they can even mandate changes to this system that Apple isn't allowed to disclose. Once this system is in place, what is stopping governments from forcing in other sets of hashes for matching?
copperx 2021-08-19 03:21:10 +0000 UTC [ - ]
Now, to be fair, there would be a secondary private hash algorithm running on Apple's servers to minimize the impact of hash collisions, but what's important is that once a file matches a hash locally, the file isn't yours anymore -- it will be uploaded unencrypted to Apple's servers and examined. How easy would it be to shift focus from CSAM into piracy to "protect intellectual property"? Or some other matter?
onepunchedman 2021-08-19 04:03:38 +0000 UTC [ - ]
It's a shame as I really love some of their privacy-minded features (e.g. precision of access to the phone's sensors and/or media).
djrogers 2021-08-19 04:51:54 +0000 UTC [ - ]
They already do this. Google and Facebook have even issued reports detailing their various success rates…
cryptonector 2021-08-19 05:46:21 +0000 UTC [ - ]
robertoandred 2021-08-19 02:35:35 +0000 UTC [ - ]
silisili 2021-08-19 05:27:30 +0000 UTC [ - ]
I did work in automating abuse detection years back, and the US govt clearly tells you that you are not to open/confirm suspected, reported, or happened-upon CP. There are a lot of other seemingly weird laws and rules around it.
robertoandred 2021-08-19 14:21:36 +0000 UTC [ - ]
silisili 2021-08-19 15:56:55 +0000 UTC [ - ]
It is of course possible that companies may get some special sign off from LE/NCMEC to do this kind of work - I won't argue with you on that as I truly don't know. I can just tell you my company did not, and was very harshly told how to proceed despite knowing the nature of what we were trying to accomplish. But, we weren't anywhere near Apple big.
I remember chatting with our legal team, who made it explicit that the laws didn't cover carve-outs - basically, 'seeing' was illegal. But as you can imagine, police didn't come busting down our doors for happening upon it and reporting it. If you have links to law where this is not the case, I'll gladly eat crow. I've never looked myself and relied on what the lawyers said.
hypothesis 2021-08-19 04:08:09 +0000 UTC [ - ]
Why would a person doing manual review risk his job if he's unsure? Naturally he will just play it safe and report the images.
ec109685 2021-08-19 04:24:39 +0000 UTC [ - ]
onepunchedman 2021-08-19 05:24:02 +0000 UTC [ - ]
ec109685 2021-08-19 06:34:48 +0000 UTC [ - ]
EGreg 2021-08-19 02:51:46 +0000 UTC [ - ]
croutonwagon 2021-08-19 03:08:31 +0000 UTC [ - ]
https://old.reddit.com/r/MachineLearning/comments/p6hsoh/p_a...
If Apple hasn't been honest about WHEN it was built into and added to their code base, why would anyone take their word for HOW it's being used, or for any of the other statements they are putting in their documents, at least until they are verified?
heavyset_go 2021-08-19 03:10:29 +0000 UTC [ - ]
> This program is ambitious, and protecting children is an important responsibility. These efforts will evolve and expand over time.
copperx 2021-08-19 05:07:59 +0000 UTC [ - ]
"Think of the children" is the most recognizable trope in TV and film. They couldn't have phrased that to be more Orwellian.
copperx 2021-08-19 05:09:32 +0000 UTC [ - ]
shuckles 2021-08-19 03:01:28 +0000 UTC [ - ]
SCLeo 2021-08-19 03:32:08 +0000 UTC [ - ]
shuckles 2021-08-19 03:35:14 +0000 UTC [ - ]
zepto 2021-08-19 04:12:39 +0000 UTC [ - ]
SCLeo 2021-08-19 03:42:40 +0000 UTC [ - ]
shuckles 2021-08-19 03:46:51 +0000 UTC [ - ]
heavyset_go 2021-08-19 03:56:33 +0000 UTC [ - ]
You can't know this without independent audits.
shuckles 2021-08-19 04:29:11 +0000 UTC [ - ]
noduerme 2021-08-19 06:17:56 +0000 UTC [ - ]
shuckles 2021-08-19 08:32:48 +0000 UTC [ - ]
onepunchedman 2021-08-19 03:04:23 +0000 UTC [ - ]
norov 2021-08-19 03:32:48 +0000 UTC [ - ]
onepunchedman 2021-08-19 03:57:21 +0000 UTC [ - ]
cookiengineer 2021-08-19 04:00:34 +0000 UTC [ - ]
I don't know about you, but my parents certainly have lots of embarassing pictures of me in their photo album.
There will be so many false positives in that system, it's ridiculous. It doesn't necessarily have to be a false colliding hash, but legitimate use cases that - by definition - are impossible to train neural nets on unless the data is being used illegally by Apple.
ec109685 2021-08-19 04:32:51 +0000 UTC [ - ]
cookiengineer 2021-08-19 04:36:25 +0000 UTC [ - ]
If I share that picture of my child with my friends and loved ones on Facebook - at what "scale" is it considered to be added to that database as child porn?
1k shares? 10k? Who's eligible to decide that? The judiciary? I think this scenario is a constitutional crisis because there's no good solution to it in terms of law and order.
jazzyjackson 2021-08-19 04:51:14 +0000 UTC [ - ]
foepys 2021-08-19 05:39:08 +0000 UTC [ - ]
China will demand it to include pictures of the Tiananmen massacre.
noduerme 2021-08-19 06:15:06 +0000 UTC [ - ]
Even if what's in the database is 100% violently criminal as you suggest, and even if it remains limited to that material, we already have a process in place that denies the accused of even seeing the evidence against them if a hash matches. What a horrific, orwellian situation if someone sent you hash matches, the police raid your house and now you can't even see what they think they have or prove your own innocence.
ec109685 2021-08-19 06:32:50 +0000 UTC [ - ]
For you to get caught up in this dragnet, 30+ images have to match the NeuralHashes of known illegal images, thumbnails of those images also have to produce a hit when run through a private hash function that only Apple has, and two levels of reviewers have to confirm the match as well.
onepunchedman 2021-08-19 04:06:26 +0000 UTC [ - ]
zepto 2021-08-19 04:09:15 +0000 UTC [ - ]
ummonk 2021-08-19 06:34:06 +0000 UTC [ - ]
robertoandred 2021-08-19 02:34:03 +0000 UTC [ - ]
dannyw 2021-08-19 02:36:25 +0000 UTC [ - ]
Specifically, many criminal actors don't touch CSAM because it's wrong. But some of these criminal actors will happily abuse legal systems, e.g. SWATTing.
seph-reed 2021-08-19 03:21:58 +0000 UTC [ - ]
nseggs 2021-08-19 03:44:25 +0000 UTC [ - ]
zepto 2021-08-19 04:14:13 +0000 UTC [ - ]
This is not correct. Hash collisions won’t match the visual derivative.
saithound 2021-08-19 04:43:15 +0000 UTC [ - ]
The visual derivative is just a resized, very-low-resolution version of the uploaded image. "Matching the visual derivative" is completely meaningless. The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.
If enough signatures match, Apple employees can decrypt the visual derivatives, and see if these extremely low resolution images look to the naked eye like they could come from CSAM. If so, they alert the authorities.. Given a way to obtain hash collisions, generating non-CSAM images that pass the visual derivative inspection is completely trivial.
noduerme 2021-08-19 06:23:49 +0000 UTC [ - ]
saithound 2021-08-19 07:14:39 +0000 UTC [ - ]
The exact details of the algorithm are not public, but based on the technical summary that Apple provided, it almost certainly goes something like this.
Your device generates a secret number X. This secret is split into multiple fragments using a sharing scheme. Your device uses this secret number every time you upload a photo to iCloud, as follows:
1. Your device hashes the photo using a (many-to-one, hence irreversible) perceptual hash.
2. Your device also generates a fixed-size low resolution version of your image (the "visual derivative"). The visual derivative is encrypted using the secret X.
3. Your device encrypts some of your personally identifying information (device ids, Apple account, phone number, etc.) using X.
4. The hash, the encrypted visual derivative, and the encrypted personally identifying information are combined into what Apple calls the "safety voucher". A fragment of your key is attached to the safety voucher, and the voucher is sent to Apple over the internet. The safety vouchers are sent in a "blinded" way (with another encryption key derived using a Private Set Intersection scheme detailed in the technical summary), so that Apple cannot link them to specific files, devices or user accounts unless there's a match.
5. Apple receives the safety voucher. If the hash in the received safety voucher matches that of known CSAM content in the government-provided hash database (as determined by the private set intersection scheme), the voucher is saved and stored by Apple, and the fragment of your secret key X is revealed and saved. (You'd assume that they filter out / discard your voucher if there's no match; but the technical summary doesn't explicitly confirm this; this means that they may store and use it in the future to run further scans).
6. If your account uploads a large number of matching vouchers, then Apple will gather enough fragments to reassemble your entire secret key X. Now that they know your secret key, they can use it to decrypt the "visual derivatives" stored in all your saved vouchers.
7. An Apple employee will then inspect the "visual derivatives", and if your photos look like CSAM (more precisely, this employee can't rule out by visual inspection that your photos are CSAM-related), they will proceed to use your secret key X (which they now know) to decrypt the personally revealing information contained in your safety voucher, and report you to the authorities.
Keep in mind that the employee looking at the visual derivative does not, and cannot, know what the original image is supposed to look like. The only judgment they get to make is whether the low-resolution visual derivative of your photo looks like it can plausibly be CSAM-related or not. Plainly speaking, they will check if a small, say 48x48 pixel, thumbnail of your photo looks vaguely like naked people or not.
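If it helps, steps 4 through 6 are essentially threshold secret sharing: no single voucher reveals anything, but enough of them let the server rebuild X. A generic toy version (plain Shamir sharing over a prime field, using the 30-match threshold mentioned elsewhere in the thread; not Apple's exact construction) looks like this:

    import random

    P = 2**127 - 1  # a prime modulus for the toy field

    def make_shares(secret, threshold, n):
        # Random polynomial of degree threshold-1 with the secret as constant term.
        coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
        def f(x):
            return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
        return [(x, f(x)) for x in range(1, n + 1)]

    def reconstruct(shares):
        # Lagrange interpolation at x = 0 recovers the constant term.
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            num, den = 1, 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * (-xj) % P
                    den = den * (xi - xj) % P
            secret = (secret + yi * num * pow(den, P - 2, P)) % P
        return secret

    X = random.randrange(P)                       # the device's secret key
    shares = make_shares(X, threshold=30, n=100)  # one fragment per voucher
    assert reconstruct(shares[:30]) == X          # 30 matches: key recovered
    assert reconstruct(shares[:29]) != X          # 29 matches: fails (with overwhelming probability)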
foerbert 2021-08-19 04:28:23 +0000 UTC [ - ]
That said, I think your statement is a bit too strong, but generally true. A hash collision is not going to inherently be visually confusing. However, you claim that it is impossible for an image to be both visually confusing and a hash collision, which seems unlikely. The real question is going to be how much more effort it takes to do both.
zepto 2021-08-19 04:35:26 +0000 UTC [ - ]
Also, the information needed to create a full match simply is not available.
foerbert 2021-08-19 13:39:07 +0000 UTC [ - ]
Unless you're relying on it being computationally infeasible, but I'm not sure we know enough to consider that true at this point. Usually when we make statements on those grounds we do so with substantial proof. I don't think we know enough to do so here. I'm not even sure how feasible it is when you throw DL into the mix.
onepunchedman 2021-08-19 02:37:14 +0000 UTC [ - ]
onepunchedman 2021-08-19 02:38:46 +0000 UTC [ - ]
robertoandred 2021-08-19 02:50:46 +0000 UTC [ - ]
onepunchedman 2021-08-19 03:02:00 +0000 UTC [ - ]
robertoandred 2021-08-19 03:20:07 +0000 UTC [ - ]
onepunchedman 2021-08-19 03:28:44 +0000 UTC [ - ]
robertoandred 2021-08-19 02:40:48 +0000 UTC [ - ]
onepunchedman 2021-08-19 02:44:36 +0000 UTC [ - ]
robertoandred 2021-08-19 02:48:39 +0000 UTC [ - ]
copperx 2021-08-19 03:02:47 +0000 UTC [ - ]
simondotau 2021-08-19 04:03:02 +0000 UTC [ - ]
heavyset_go 2021-08-19 03:16:15 +0000 UTC [ - ]
robertoandred 2021-08-19 03:27:00 +0000 UTC [ - ]
A lot of people may not know how to avoid malware. But I don’t think very many of them would be so inept as to accidentally long press on child porn and tap “Add to Photos”.
philipswood 2021-08-19 03:46:56 +0000 UTC [ - ]
Fixed it for you.
The image to be saved doesn't have to be disturbing at all to trigger a hash collision.
The linked repo has code to modify an image to generate a hash collision with another unrelated image.
That's the whole point.
Dylan16807 2021-08-19 03:05:57 +0000 UTC [ - ]
Is that enough to cause an investigation? Maybe, maybe not, but I wouldn't want it to be a risk.
simondotau 2021-08-19 04:10:30 +0000 UTC [ - ]
While I've no doubt that there's a lot of "before and after" images (which are still technically CSAM even if they're not strictly child porn) and possibly many innocuous images, they would not have been flagged as "A1".
I'm sure there's probably still a few images flagged as A1 which shouldn't be in the database at all, but that number is going to be small. How many of these incorrectly flagged images are going to make their way into your photo library? One? Two?
You need 30 in order for your account to be flagged.
Dylan16807 2021-08-19 09:53:13 +0000 UTC [ - ]
simondotau 2021-08-19 14:25:51 +0000 UTC [ - ]
And what’s the likelihood that a human reviewer will see these 30 odd images and press the “yep it’s CSAM” button?
More likely, as soon as Apple's human reviewers see these oddball images, they're going to investigate, mark those hashes as invalid, and then contact their upstream data supplier, who will fix their data, and those implausible images become useless.
onepunchedman 2021-08-19 02:56:17 +0000 UTC [ - ]
systemvoltage 2021-08-19 02:37:37 +0000 UTC [ - ]
arsome 2021-08-19 02:51:18 +0000 UTC [ - ]
A sufficiently advanced catfishing attack could probably take advantage of this to get someone raided and have all their electronics confiscated.
Just send someone a zip of photos and let them extract it...
onepunchedman 2021-08-19 02:40:42 +0000 UTC [ - ]
robertoandred 2021-08-19 02:46:44 +0000 UTC [ - ]
But OK, say someone sends you a sunset that fools the hasher. Then what? Of course one match won’t do anything, so you’d need to download however many matching sunsets. Then what? The Apple reviewer would see they’re sunsets and you’d challenge the flag saying they’re sunsets. And if somehow NCMEC got involved, they’d see they’re just sunsets. And if law enforcement got involved, they’d see they’re just sunsets.
These proofs of concept might seem interesting from an ML point of view, but all they do is highlight why Apple put so many layers of checking into this.
heavyset_go 2021-08-19 03:22:04 +0000 UTC [ - ]
A real attack would be to take legal porn images and make them collide with illegal images, so that when a human goes to review the scaled-down derivative images, those images could very well look like CSAM. Since there are many of them, they'd get sent to law enforcement. Then law enforcement would raid the victim's home and take all of their electronic devices in order to determine whether they can be charged with a crime.
abraae 2021-08-19 03:31:28 +0000 UTC [ - ]
simondotau 2021-08-19 04:22:29 +0000 UTC [ - ]
And that's assuming someone develops a hash collision which doesn't substantially mangle the photograph like the example offered on Github.
Specifically, only images categorised as "A1" are being included in the hash set on iOS. The category definitions are:
A = prepubescent minor
B = pubescent minor
1 = sex act
2 = "lascivious exhibition"
The categories are described in further detail (ugh) in this PDF, page 22: https://www.prosecutingattorneys.org/wp-content/uploads/Pres...
Syonyk 2021-08-19 04:43:39 +0000 UTC [ - ]
Do we know that for sure?
Apple has changed their mind enough times in the last week and a half that I'm convinced they're in full-on defensive "wing it and say whatever will get people off our backs!" mode.
You can't read the threat modeling PDF and conclude that it was run through the normal Apple document review process. It reads nothing like a standard Apple document - it reads like a bunch of sleep deprived people were told to whip it up and publish it.
simondotau 2021-08-19 05:09:21 +0000 UTC [ - ]
abraae 2021-08-19 05:01:23 +0000 UTC [ - ]
But by fog of war I was thinking more like: the victim already has some sleazy (though marginally legal) stuff on their computer, or a search turned up pot in their house, or they lied to try and get out of the rap, or perhaps the FBI offered them a deal and they took it because they saw no way out, or perhaps they were simply an unlikable individual who the jury took a dislike to.
Basically, things are not always clear-cut, and someone could come out on the wrong side of them, in a situation created by Apple's surveillance.
simondotau 2021-08-19 05:08:47 +0000 UTC [ - ]
Surveillance is surveillance. It's a bit more obnoxious that a CPU I paid money for is being used to compute the hashes instead of some CPU in a server farm somewhere (which I indirectly paid for), but the outcome is the same. The risk of being SWAT-ed is the same.
kuratkull 2021-08-19 05:35:57 +0000 UTC [ - ]
netr0ute 2021-08-19 03:00:48 +0000 UTC [ - ]
robertoandred 2021-08-19 03:29:38 +0000 UTC [ - ]
riffraff 2021-08-19 03:12:35 +0000 UTC [ - ]
We all want privacy but it seems odd to try to DoS this, with high risk for yourself and very little to gain.
Might be useful when the system turns into mass political surveillance tho.
mukesh610 2021-08-19 05:16:26 +0000 UTC [ - ]
systemvoltage 2021-08-19 02:48:43 +0000 UTC [ - ]
simondotau 2021-08-19 04:31:25 +0000 UTC [ - ]
If it was anything like the image used to demonstrate this technique on Github, it's unlikely that anyone would describe that sunset as "beautiful". They'd be more likely to describe it as "bugger, this JPEG file is corrupted."
Syonyk 2021-08-19 04:41:24 +0000 UTC [ - ]
It was quite literally less than 24h from "Oh, hey, I can collide this grey blob with a dog!" to "Hey, this thing that looks like cat hashes to the same thing as this dog!"
You really think this is going to end at this proof of concept stage?
simondotau 2021-08-19 04:57:55 +0000 UTC [ - ]
Regardless, this whole thing is moot because there are two classifiers, only one of which has been made public. Before any matches can make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself.
copperx 2021-08-19 05:13:57 +0000 UTC [ - ]
simondotau 2021-08-19 05:40:24 +0000 UTC [ - ]
jdlshore 2021-08-19 05:36:46 +0000 UTC [ - ]
zepto 2021-08-19 04:15:30 +0000 UTC [ - ]
throwaway384950 2021-08-19 02:53:29 +0000 UTC [ - ]
Here is the image (NSFW!): https://i.ibb.co/Ct64Cnt/nsfw.png
Hash: 59a34eabe31910abfb06f308
robertoandred 2021-08-19 02:58:54 +0000 UTC [ - ]
throwaway384950 2021-08-19 03:05:53 +0000 UTC [ - ]
Probably not the weird image I posted, which looks obviously suspicious. But maybe someone will make a program to find "cleaner" hash collisions that don't look suspicious.
pseudalopex 2021-08-19 05:06:45 +0000 UTC [ - ]
ec109685 2021-08-19 04:09:09 +0000 UTC [ - ]
cirrus3 2021-08-19 03:55:59 +0000 UTC [ - ]
marcan_42 2021-08-19 04:31:35 +0000 UTC [ - ]
least 2021-08-19 04:26:14 +0000 UTC [ - ]
It seems unlikely that, if there were anything they verified as CSAM, they wouldn't pass it on just because they also found a false positive among those thumbnails.
jobigoud 2021-08-19 08:32:46 +0000 UTC [ - ]
And they think there are enough of these people to create this very complicated system and risk a PR disaster?
Erwin 2021-08-19 08:50:01 +0000 UTC [ - ]
In 2018, the police indicted 1000 of them (tracking them down with Facebook's help). Legal results were a child-porn law judgement for 334 of them, and simpler penalties for 400.
The child-porn judgement was mostly suspended sentences, but it precludes working with children (as a teacher or even sports trainer if children under 15yo are involved) for between 10 and 20 years.
If there had been a system that caught it sooner, prior to sharing, the spread would have been minimized. The police took 3 years to form a plan to indict the 1000+ people.
azinman2 2021-08-19 08:36:39 +0000 UTC [ - ]
Banditoz 2021-08-19 01:46:02 +0000 UTC [ - ]
dannyw 2021-08-19 01:50:09 +0000 UTC [ - ]
Just wait and watch - I guarantee you that Apple will be talking about CSAM in at least one anti-trust legal battle about why they shouldn't be broken up. Because a walled garden means they can oppress citizens on behalf of governments better.
zepto 2021-08-19 04:22:22 +0000 UTC [ - ]
endisneigh 2021-08-19 02:36:41 +0000 UTC [ - ]
If someone looks at the two images, wouldn't they see they're not the same, and therefore that the original image was mistakenly linked with the target?
dannyw 2021-08-19 02:46:38 +0000 UTC [ - ]
So Apple will be looking at a low-res grayscale image of whatever the collided image is, which could be legal adult pornography (let's say: a screengrab of legal "teen" 18+ porn), but the CSAM filter tells it that it's abuse material!
What would you do as the Apple reviewer?
(Hint: You only have one option, as you are legally mandated to report).
ummonk 2021-08-19 06:41:15 +0000 UTC [ - ]
dragonwriter 2021-08-19 02:51:42 +0000 UTC [ - ]
This is false.
> No one except NCMEC is allowed to possess the target (CSAM material).
False. No one is allowed to knowingly possess it without taking certain actions forthwith once they become aware of it. Obviously, prior to it being reviewed, as it is here, neither the reviewer nor Apple knows that it is actual, or even particularly likely, CSAM.
Dylan16807 2021-08-19 03:18:05 +0000 UTC [ - ]
Yeah, Apple might be able to look at the uploaded image. But the reviewers don't have a copy of the original image added to the database, which is the "target".
endisneigh 2021-08-19 04:01:28 +0000 UTC [ - ]
If it was, then would it matter if it wasn’t the original?
HeyImAlex 2021-08-19 05:05:13 +0000 UTC [ - ]
Alupis 2021-08-19 02:46:54 +0000 UTC [ - ]
SCUSKU 2021-08-19 03:26:53 +0000 UTC [ - ]
shuckles 2021-08-19 03:29:28 +0000 UTC [ - ]
jobigoud 2021-08-19 08:21:49 +0000 UTC [ - ]
m3kw9 2021-08-19 13:49:59 +0000 UTC [ - ]
mam3 2021-08-19 09:48:25 +0000 UTC [ - ]
ryanmarsh 2021-08-19 01:54:28 +0000 UTC [ - ]
Lawfare
ec109685 2021-08-19 04:12:03 +0000 UTC [ - ]
kuratkull 2021-08-19 05:58:26 +0000 UTC [ - ]
ec109685 2021-08-19 06:37:37 +0000 UTC [ - ]
robertoandred 2021-08-19 02:14:00 +0000 UTC [ - ]
dannyw 2021-08-19 02:48:22 +0000 UTC [ - ]
Dylan16807 2021-08-19 03:21:47 +0000 UTC [ - ]
arsome 2021-08-19 02:59:05 +0000 UTC [ - ]
Or you could probably just wait a bit and pay an... enterprising individual to sell you such a list on a darknet market, or perhaps even find one posted on the clearnet soon enough.
robertoandred 2021-08-19 03:17:43 +0000 UTC [ - ]
belltaco 2021-08-19 04:18:13 +0000 UTC [ - ]
https://old.reddit.com/r/MachineLearning/comments/p6hsoh/p_a...
arsome 2021-08-19 04:19:37 +0000 UTC [ - ]
animanoir 2021-08-19 03:05:36 +0000 UTC [ - ]
K5EiS 2021-08-19 07:43:42 +0000 UTC [ - ]
You can read more about it here: https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...
saithound 2021-08-19 11:40:21 +0000 UTC [ - ]
Part 1/2
Q: I heard that Apple employees inspect a "visual derivative" of your photos before reporting you to the authorities. Doesn't this mean that, even if you modify images so their hash matches CSAM, the visual derivative won’t match?
A: No. "Matching the visual derivative" is completely meaningless. The visual derivative of your photo cannot be matched against anything, and there is no such thing as an "original" visual derivative to match against. Let me elaborate.
The visual derivative is nothing more than a low-resolution thumbnail of the photo that you uploaded. In this context, a "derivative" simply refers to a transformed, modified or adapted version of your photo. So a "visual derivative" of your photo means simply a transformed version of your photo that still identifiably looks like the photo you uploaded to iCloud.
This thumbnail is never matched against known CSAM thumbnails. The thumbnail cannot be matched against known CSAM thumbnails, most importantly because Apple doesn't possess a database of such thumbnails. Indeed, the whole point of this exercise is that Apple really doesn't want to store CSAM on their servers!
Instead, an Apple employee looks at the thumbnails derived from your photos. The only judgment call this employee gets to make is whether it can be ruled out (based on the way the thumbnail looks) that your uploaded photo is CSAM-related. As long as the thumbnail contains a person, or something that looks like the depiction of a person (especially in a vaguely violent or vaguely sexual context, e.g. with nude skin or with injuries) they will not be able to rule out this possibility based on the thumbnail alone. You can try it yourself: consider three perfectly legal and work-safe thumbnails of a famous singer [1]. The singer is underage in precisely one of the three photos. Can you tell which one?
All in all, there is no "matching" of the visual derivatives. There is a visual inspection, which means that if you reach a certain threshold, a person will look at thumbnails of your photos. Given the ability to produce hash collisions, an adversary can easily generate photos that fail visual inspection. This can be accomplished straightforwardly by using perfectly legal violent or sexual material to produce the collision (e.g. most people would not suspect foul play if they got a photo of genitals from their Tinder date). But more sophisticated attacks [2] are also possible, especially since the computation of the visual derivative happens on the client, so it can and will be reverse engineered.
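For concreteness, here is a minimal sketch of the generic targeted-collision attack, roughly the shape of what the collider tools implement. Everything in it is illustrative rather than anyone's actual code: `model` stands for some differentiable perceptual-hash network whose output sign pattern gives the hash bits (e.g. the extracted NeuralHash model loaded into PyTorch), and the margin, loss weights and step count are arbitrary placeholders.

    import torch

    def collide(model, source, target_bits, steps=1000, lr=1e-2, l2_weight=1e-3):
        """Nudge `source` (a 1xCxHxW tensor with values in [0,1]) until the sign
        pattern of model(x) matches `target_bits` (a +/-1 tensor), while an L2
        penalty on the perturbation keeps the result visually close."""
        delta = torch.zeros_like(source, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            x = (source + delta).clamp(0.0, 1.0)
            logits = model(x).flatten()                 # pre-threshold hash outputs
            # Hinge loss: push every output to the target side of zero, with a margin.
            hash_loss = torch.relu(0.1 - target_bits * logits).sum()
            loss = hash_loss + l2_weight * delta.pow(2).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
            if hash_loss.item() == 0.0:                 # every bit matches the target
                break
        return (source + delta).clamp(0.0, 1.0).detach()

The specific optimizer and loss don't matter much; the point is that the hash is differentiable and unguarded, so "find an innocuous-looking image with this exact hash" is an ordinary optimization problem.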
Q: I heard that there is a second hash function that Apple keeps secret. Isn't it unlikely that an adversarial image would trigger a collision on two distinct hashing algorithms?
A: No, it's not unlikely at all.
The term "hash function" is a bit of a misnomer. When people hear "hash", they tend to think about cryptographic hash functions, such as SHA256 or BLAKE3. When two messages have the same hash value, we say that they collide. Fortunately, cryptographic hash functions have several good properties associated with them: for example, there is no known way to generate a message that yields a given predetermined hash value, no known way to find two different messages with the same hash value, and no known way to make a small change to a message without changing the corresponding hash value. These properties make cryptographic hash functions secure, trustworthy and collision-resistant even in the face of powerful adversaries. Generally, when you decide to use two unrelated cryptographic hash algorithms instead of one, you make finding a collision at least twice as difficult for the adversary.
However, the hash functions that Apple uses for identifying CSAM images are not "cryptographic hash functions" at all. They are "perceptual hash functions". The purpose of a perceptual hash is the exact opposite of a cryptographic hash: two images that humans see/hear/perceive (hence the term perceptual) to be the same or similar should have the same perceptual hash. There is no known perceptual hash function that remains secure and trustworthy in any sense in the face of (even unsophisticated) adversaries. Most importantly, it is not guaranteed that using two unrelated perceptual hash functions makes finding collisions more difficult. In fact, in many contexts, these adversarial attacks tend to transfer: if they work against one model, they often work against other models as well [3].
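If you want to see the difference in a few lines of Python, here is a toy comparison; the 8x8 average hash below is a generic stand-in for a perceptual hash, not NeuralHash. Flipping a single bit of the input scrambles a SHA-256 digest completely, while a uniform brightness shift, which changes every single pixel, leaves the perceptual hash untouched by design.

    import hashlib
    import numpy as np

    rng = np.random.default_rng(0)
    img = rng.integers(0, 200, size=(64, 64), dtype=np.uint8)   # toy grayscale image

    def ahash(image):
        """Toy 8x8 average hash: one bit per cell, set if the cell mean exceeds the global mean."""
        cells = image.reshape(8, 8, 8, 8).mean(axis=(1, 3))     # 64x64 -> 8x8 block means
        return (cells > cells.mean()).tobytes()

    # Cryptographic hash: a one-bit change produces an unrelated digest.
    tweaked = img.copy()
    tweaked[0, 0] ^= 1
    print(hashlib.sha256(img.tobytes()).hexdigest())
    print(hashlib.sha256(tweaked.tobytes()).hexdigest())        # completely different

    # Perceptual hash: every pixel changes, yet the hash stays identical.
    brighter = img + 10                                         # stays below 255, no clipping
    print(ahash(img) == ahash(brighter))                        # True: a collision, by design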
To make matters worse, a second, secret hash function can be used only after the collision threshold has been passed (otherwise, it would have to be done on the device, but then it cannot be kept secret). Since the safety voucher is not linked directly to a full resolution photo, the second hashing has to be performed on the tiny "visual derivative", which makes collisions all the more likely.
Apple's second hash algorithm is kept secret (so much so that the whitepapers released by Apple do not claim and do not confirm its existence!). This means that we don't know how well it works. We can't even rule out the second hash algorithm being a trivial variation (or completely identical) to the first hash algorithm. Moreover, it's unlikely that the second algorithm was trained on a completely different dataset than the first one (e.g. because there are not many such hash algorithms that work well; moreover, the database of known CSAM content is really quite small compared to the large datasets that good machine learning algorithms require, so testing is necessarily limited). This suggests that transfer attacks are likely to work.
saithound 2021-08-19 11:40:33 +0000 UTC [ - ]
Q: If the second, secret hash algorithm is based on a neural network, can we think of its weights (coefficients) as some kind of secret key in the cryptographical sense?
A: Absolutely not. If (as many suspect) the second hash algorithm is also based on some feature-identifying neural network, then we can't think of the weights as a key that (when kept secret) protects the confidentiality and integrity of the system.
Due to the way perceptual hashing algorithms work, having access to the outputs of the algorithm is sufficient to train a high-fidelity "clone" that allows you to generate perfect adversarial examples, even if the weights of the clone are completely different from the secret weights of the original network.
If you have access to both the inputs and the outputs, you can do much more: by choosing them carefully [4], you can eventually leak the actual secret weights of the network. Any of these attacks can be executed by an Apple employee, even one who has no privileged access to the actual secret weights.
Even if you have proof positive that nobody could have accessed the secret weights directly, the entire key might have been leaked anyway! Thus, keeping the weights secret from unauthorized parties does not suffice to protect the confidentiality and integrity of the system, which means that we cannot think of the weights as a kind of secret key in the cryptographical sense.
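As a rough sketch of why the weights don't behave like a key, assume nothing more than black-box query access to some `secret_hash_fn` that returns the +/-1 hash bits: you can distill a surrogate network that imitates it, then run any white-box attack (such as the collision search sketched in Part 1) against the surrogate and rely on transfer [3]. The architecture, loss and query budget below are placeholders, not a description of Apple's system.

    import torch
    import torch.nn as nn

    def train_surrogate(secret_hash_fn, query_images, hash_bits=96, epochs=50):
        """Model extraction by distillation: fit a clone of a black-box perceptual
        hash by matching its outputs on images we are free to query.
        `query_images` is a float tensor of shape (N, 3, H, W);
        `secret_hash_fn(batch)` is assumed to return an (N, hash_bits) +/-1 tensor."""
        surrogate = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, hash_bits),
        )
        opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
        targets = secret_hash_fn(query_images)          # the only access we need
        for _ in range(epochs):
            logits = surrogate(query_images)
            # Push the surrogate's pre-threshold outputs toward the observed bit signs.
            loss = torch.relu(0.1 - targets * logits).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return surrogate    # adversarial examples crafted against the clone often transfer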
Q: I heard that it's impossible to determine Apple's CSAM image hashes from the database on the device. Doesn't this make a hash attack impossible?
A: No. The scheme used by Apple (sketched in the technical summary [6]) ensures that the device doesn't _learn_ the result of the match purely from the interaction with the server, and that the server doesn't learn information about images whose hash the server doesn't know. The claim that it's "impossible to determine Apple's CSAM image hashes from the database on the device" is a very misleading rephrasing of this, and not true.
Q: Doesn't Apple claim that there is only a one in one trillion chance per year of incorrectly flagging a given account?
A: Apple does claim this, but experts on photo analysis technologies have been calling bullshit [8] on their claim since day one.
Moreover, even if the claimed rate was reasonable (which it isn't), it was derived without adversarial assumptions, and using it is incredibly misleading in an adversarial context.
Let me explain through an example. Imagine that you play a game of craps against an online casino. The casino will throw a virtual six-sided die, secretly generated using Microsoft Excel's random number generator. Your job is to predict the result. If you manage to predict the result 100 times in a row, you win and the casino will pay you $1,000,000,000,000 (one trillion dollars). If you fail to predict the result of a throw, you lose and pay the casino $1 (one dollar).
In an ordinary, non-adversarial context, the probability that you win the game is much less than one in one trillion, so this game is very safe for the casino. But this number, one in one trillion, is based on naive assumptions that are completely meaningless in an adversarial context. If your adversary has a decent knowledge of high-school mathematics, the serial correlation in Excel's generator comes into play, and the relevant probability is no longer one in one trillion. It's 1 in 216 instead! When faced with a class of sophomore math majors, the casino will promptly go bankrupt.
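For anyone who wants to sanity-check the arithmetic of the example (taking the 1-in-216 adversarial figure claimed above at face value):

    # Naive odds of calling 100 independent fair d6 rolls in a row:
    p_naive = (1.0 / 6.0) ** 100      # about 1.5e-78, far beyond "one in a trillion"
    # Odds once the adversary can exploit the generator, per the claim above:
    p_adversarial = 1.0 / 216.0
    print(p_naive, p_adversarial)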
Q: Aren't these attacks ultimately detectable? Wouldn't I be exonerated by the exculpatory evidence?
A: Maybe. IANAL. I wouldn't want to take that risk. Matching hashes are probably not sufficient to convict you, and possibly not sufficient to take you into custody, but they are more than sufficient to make you a suspect. Reasonable suspicion is enough to get a warrant, which means that your property may be searched, your computer equipment may be hauled away and subjected to forensic analysis, etc. It may be sufficient cause to separate you from your children. If you work with children, you'll be fired for sure. It'll take years to clear your name.
And if they do charge you, it will be in Apple's best interest not to admit to any faults in their algorithm, and to make it as opaque to the court as possible. The same goes for NCMEC.
Q: Why should I trust you? Where can I find out more?
A: You should not trust me. You definitely shouldn't trust the people defending Apple using the claims above. Read the EFF article [7] to learn more about the social dangers of this technology. Consult Apple's Threat Model Summary [5], and the CSAM Detection Technical Summary [6]: these are biased sources, but they provide sketches of the algorithms and the key factors that influenced the current implementation. Read HackerFactor [8] for an independent expert perspective about the credibility of Apple's claims. Judge for yourself.
[1] https://imgur.com/a/j40fMex
[2] https://graphicdesign.stackexchange.com/questions/106260/ima...
[3] https://arxiv.org/abs/1809.02861
[4] https://en.wikipedia.org/wiki/Chosen-plaintext_attack
[5] https://www.apple.com/child-safety/pdf/Security_Threat_Model...
[6] https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...
[7] https://www.eff.org/deeplinks/2021/08/apples-plan-think-diff...
[8] https://www.hackerfactor.com/blog/index.php?/archives/929-On...
seph-reed 2021-08-19 01:50:46 +0000 UTC [ - ]
I think every meme should get pumped through this, just for lulz.
guerrilla 2021-08-19 07:52:28 +0000 UTC [ - ]
fnord77 2021-08-19 03:30:44 +0000 UTC [ - ]
why stop at CSAM? Pirated material like movies next?
heavyset_go 2021-08-19 03:50:01 +0000 UTC [ - ]
Some smart TVs do automated content recognition so the manufacturers can spy on what you're watching and sell the data to the highest bidders.
Syonyk 2021-08-19 04:50:27 +0000 UTC [ - ]
It's basically, "If we've come up with a way to grab it, we do. And send it to our servers. And do what we want with it."
It literally includes:
> We may receive information about the browser and devices you use to access the Internet, including our services, such as device types and models, unique identifiers (including, for Roku Devices, the Advertising Identifier associated with that device), IP address, operating system type and version, browser type and language, Wi-Fi network name and connection data, and *information about other devices connected to the same network*.
Emphasis mine. They literally have given themselves permission to nmap your LAN and upload the results!
cirrus3 2021-08-19 04:04:45 +0000 UTC [ - ]
You have made a huge leap from scanning for pre-existing CSAM while in transit to a cloud service to scanning frame buffers on device in real-time. You should get some type of Olympic medal for such a leap.
This tech is to catch the lowest-hanging fruit, the dumbest of all CSAM-sharing/saving folks, as required by law.
robertoandred 2021-08-19 02:04:59 +0000 UTC [ - ]
tibbar 2021-08-19 02:08:43 +0000 UTC [ - ]
Dylan16807 2021-08-19 03:30:13 +0000 UTC [ - ]
m2com 2021-08-19 01:53:57 +0000 UTC [ - ]
sodality2 2021-08-19 01:57:28 +0000 UTC [ - ]
neom 2021-08-19 02:00:56 +0000 UTC [ - ]
m2com 2021-08-19 02:03:20 +0000 UTC [ - ]
neom 2021-08-19 02:05:23 +0000 UTC [ - ]
Alupis 2021-08-19 02:19:37 +0000 UTC [ - ]
They're scanning anything you upload to iCloud (and have been for some time), but now they also scan everything on your device.
JimDabell 2021-08-19 02:24:44 +0000 UTC [ - ]
prawn 2021-08-19 05:00:13 +0000 UTC [ - ]
The more likely path to trouble is legal NSFW material that's been engineered.
kuratkull 2021-08-19 06:03:58 +0000 UTC [ - ]
esyir 2021-08-19 10:53:46 +0000 UTC [ - ]
Use porn as the base images. The more petite, flat and young looking, the better. The moderators are already going to be tuned in to csam, so all you need to do is to give them a slight push.
m2com 2021-08-19 02:00:22 +0000 UTC [ - ]
blintz 2021-08-19 03:23:29 +0000 UTC [ - ]
Several people have suggested simply layering several different perceptual hash systems, with the assumption that it's difficult to find a colliding image in all of them. This is pretty suspect - there's a reason we hold decades-long competitions to select secure hash functions. Basically, a function can't generally achieve cryptographic properties (like collision resistance, or difficulty of preimage computation) without being specifically designed for it. By its nature, any perceptual hash function is trivially not collision resistant, and any set of neural models is highly unlikely to be preimage-resistant.
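As a toy illustration of that last point, here are two simple perceptual hashes (an average hash and a difference hash, generic stand-ins, not NeuralHash). Because both are built to ignore small changes, one and the same perturbation yields a distinct image that collides with the original under both hashes at once:

    import numpy as np

    rng = np.random.default_rng(1)
    img = rng.integers(0, 200, size=(64, 64))           # toy grayscale image

    def ahash(x):
        """8x8 average hash: each cell mean compared to the global mean."""
        cells = x.reshape(8, 8, 8, 8).mean(axis=(1, 3))
        return (cells > cells.mean()).tobytes()

    def dhash(x):
        """Toy difference hash: compare horizontally adjacent cell means."""
        cells = x.reshape(8, 8, 8, 8).mean(axis=(1, 3))
        return (cells[:, 1:] > cells[:, :-1]).tobytes()

    perturbed = img + 3                                  # a different image: every pixel changed
    print(ahash(img) == ahash(perturbed))                # True
    print(dhash(img) == dhash(perturbed))                # True: collides under both at once

Stacking more hashes of this kind multiplies the bookkeeping, not the attacker's work, because the same class of perturbation slips past all of them.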
The really tough thing to swallow for me is the "It was never supposed to be a cryptographic hash function! It was always going to be easy to make a collision!" line. If this was such an obvious attack, why wasn't it mentioned in any of the 6+ security analyses? Why wasn't it mentioned as a risk in the threat model?
shuckles 2021-08-19 03:27:26 +0000 UTC [ - ]
https://www.apple.com/child-safety/pdf/Security_Threat_Model...
ec109685 2021-08-19 03:38:20 +0000 UTC [ - ]
shuckles 2021-08-19 03:40:57 +0000 UTC [ - ]
blintz 2021-08-19 06:41:14 +0000 UTC [ - ]
Is that security-through-obscurity? If the model and weights for the second hash function became public, then we could still construct a collision on both functions, right?
shuckles 2021-08-19 08:30:19 +0000 UTC [ - ]
cirrus3 2021-08-19 03:47:20 +0000 UTC [ - ]
Yeah, collisions are technically possible. Apple has accounted for that. What is your point?
Hashes are at the core of a lot of tech, and collisions are way easier and more likely in many of those cases, but suddenly this is an issue for y'all?
anishathalye 2021-08-19 01:29:42 +0000 UTC [ - ]
IncRnd 2021-08-19 02:19:57 +0000 UTC [ - ]
I'd like to share the following paper for anyone else who may be interested. It is about watermarking rather than a preimage attack.
"Adversarial Embedding: A robust and elusive Steganography and Watermarking technique" https://arxiv.org/abs/1912.01487
Unfortunately, the existence of invisible watermarking demonstrates a separate attack on the hash. Instead of a preimage attack, this could be used to change the hash of an image that is suspected of already being a match, turning a true positive into a false negative.
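A minimal sketch of that evasion direction (purely illustrative; `model` here is an assumed differentiable stand-in for the perceptual hash, not Apple's actual pipeline): instead of steering toward a target hash, the perturbation pushes the image's own hash bits back across their thresholds while an L-infinity budget keeps the change visually minor, turning a would-be match into a miss.

    import torch

    def evade(model, image, steps=300, lr=1e-2, eps=0.03):
        """Perturb `image` (1xCxHxW, values in [0,1]) so its hash bits flip away
        from their original signs, under an L-infinity budget `eps`."""
        with torch.no_grad():
            original_bits = torch.sign(model(image).flatten())
        delta = torch.zeros_like(image, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            x = (image + delta.clamp(-eps, eps)).clamp(0.0, 1.0)
            logits = model(x).flatten()
            # Minimizing this rewards outputs that cross to the opposite side of
            # their original sign, i.e. flipped hash bits.
            loss = (original_bits * logits).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return (image + delta.clamp(-eps, eps)).clamp(0.0, 1.0).detach()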
GistNoesis 2021-08-19 08:41:53 +0000 UTC [ - ]