Evidence of fraud in an influential field experiment about dishonesty
function_seven 2021-08-17 15:53:11 +0000 UTC [ - ]
Ariely's response[1] to this puts the fraud on the insurance company. He rightly notes that better data-anomaly testing would have caught this, but I also wonder how the company "knew" which hypothesis to put their thumb on. And where did the impetus come from to double the number of records rather than leave N=6,744?
[1] http://datacolada.org/storage_strong/DanBlogComment_Aug_16_2...
xondono 2021-08-17 17:46:47 +0000 UTC [ - ]
If I had been in charge of the study (or anyone with enough experience, I think), that "test" would have been kept secret from the company.
This, to me, implies at least blurry boundaries between the researcher (in this case Ariely) and someone at the company.
I will also admit I'm not very impartial here, being to this day very unconvinced by Ariely's research, and especially by his conclusions and the platform he has built around them.
function_seven 2021-08-17 16:11:22 +0000 UTC [ - ]
I'm going through the responses from the other three study authors, and I'm seeing a pattern in their replies. They're all, as gently and politely as possible, laying it at Dan Ariely's feet. (He does so as well in his own response, but there's a tiny whiff of skepticism, at least in my reading between the lines, that he was blameless.) Mr. Bazerman's response seems the strongest in this regard.
From Francesca Gino[1]:
> I start all my research collaborations from a place of trust and assume that all of my co-authors provide data collected with proper care and due diligence, and that they are presented with accuracy. In the case of Study 3, I was not involved in conversations with the insurance company that conducted the field experiment, nor in any of the steps of running the study or analyzing the data.
From Max H. Bazerman[2]:
> The first time I saw the combined three-study paper was on February 23, 2011. On this initial reading, I thought I saw a problem with implausible data in Study 3. I raised the issue with a coauthor and was assured the data was accurate. I continued to ask questions because I was not convinced by the initial responses. When I eventually met another coauthor responsible for this portion of the work at a conference, I was provided more plausible explanations and felt more confidence in the underlying data. I would note that this coauthor quickly showed me the data file on a laptop; I did not nor did I have others examine the data more carefully.
From Nina Mazar[3]:
> I want to make clear that I was not involved in conducting the field study, had no interactions with the insurance company, and don’t know when, how, or by whom exactly the data was collected and entered. I have no knowledge of who fabricated the data.
and
> This whole situation has reinforced the importance of having an explicit team contract, that clearly establishes roles, responsibilities, and processes
[1] http://datacolada.org/storage_strong/Gino-memo-data-colada-A...
[2] http://datacolada.org/storage_strong/fraud.resonse.max_.8.13...
[3] http://datacolada.org/storage_strong/20210816_NM-Response2Da...
namelessoracle 2021-08-17 17:02:49 +0000 UTC [ - ]
While the insurance company as a whole didn't have a motive, there may have been someone inside it who wanted to justify spending funds on this research and wanted a feather in their cap about how they improved the accuracy of self-reporting by X or Y.
duxup 2021-08-17 16:15:56 +0000 UTC [ - ]
You see this in engineering failures when a bunch of companies or groups are involved to limited extents and everyone does their part, but nobody does something important because it wasn't defined who would do that.
rz2k 2021-08-17 16:27:43 +0000 UTC [ - ]
The use of a uniform distribution makes it seem like the fabricator wasn't one of the researchers, or anyone at the insurance company who had studied actuarial science.
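For what it's worth, that tell is easy to check for. Here is a minimal sketch in Python using made-up numbers, not the actual study data; the lognormal parameters and the 0 to 50,000 range are assumptions purely for illustration:

    import numpy as np
    from scipy import stats

    # Synthetic stand-ins, NOT the study data: realistic annual mileage tends to
    # be right-skewed (roughly lognormal), while naive fabrication often draws
    # uniformly between two bounds.
    rng = np.random.default_rng(0)
    realistic = rng.lognormal(mean=9.4, sigma=0.5, size=6744)
    fabricated = rng.uniform(0, 50_000, size=6744)

    def uniformity_pvalue(x):
        """Kolmogorov-Smirnov test of the sample (rescaled to [0, 1]) against a
        uniform distribution: near zero for realistic data, large for uniformly
        fabricated data."""
        lo, hi = x.min(), x.max()
        return stats.kstest((x - lo) / (hi - lo), "uniform").pvalue

    print(uniformity_pvalue(realistic))   # ~0: nothing like uniform
    print(uniformity_pvalue(fabricated))  # large: consistent with uniform

Even a plain histogram of the reported values would make the difference obvious at a glance.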
jjk166 2021-08-17 16:50:39 +0000 UTC [ - ]
Dishonesty in a study on dishonesty is contrary to what I would expect and, at least in my opinion, amusing as a result.
civilized 2021-08-18 01:56:46 +0000 UTC [ - ]
> The work was conducted over ten years ago by an insurance company with whom I partnered on this study. The data were collected, entered, merged and anonymized by the company and then sent to me. This was the data file that was used for the analysis and then shared publicly.
On a plain reading, Ariely clearly states here that the data file prepared by the insurance company was the same as the one used for the analysis and shared publicly. But we already know that he modified the file:
- He is listed as the creator and last modifier of the published Excel file.
- In an email to his collaborator Nina, he admitted that he switched the outcome labels in an attempt to make the data easier to understand. So clearly the published file was not identical to the one provided by the insurance company.
So I worry that Ariely is taking us for a ride here. He had the most incentive to get the desired results, far more than the nameless insurance company.
The original insurance company dataset, if it exists, is probably lying around in an email somewhere. They were, after all, able to provide the final version, which was prepared not too long after the original.
woofie11 2021-08-18 02:35:36 +0000 UTC [ - ]
This stuff is common, especially in elite schools, and especially among researchers with a lot of popular uptake. Stuff like this brings academic jobs.
Universities have no incentive to fire high-impact professors, and internal channels are populated by friends.
Competition for faculty positions is 1000:1 at elite schools; jobs aren't available for those who don't cheat.
woofie11 2021-08-18 14:01:12 +0000 UTC [ - ]
1) Results don't replicate.
2) Look up her article in Nature, which claims to be preregistered, then look at the actual preregistration. See the data mismatch.
3) Read her book, and look at the claims in the introduction. Compare those to effect sizes. Are the claims supported or contradicted even by her (already baked) data? By replication studies?
Read the Boaler books and studies. You'll find a lot of fake information too. Not just studies, but also historical stuff (e.g. passages about Einstein and other famous folks).
This is kind of an open secret in the ed research community. Everyone knows about it. Few dare talk in public. You can see outcomes for those who did. And for the general public, it's too technical. That a preregistration comes before data collection isn't something one can explain to a popular audience.
This is the culture at MIT, Brown, Stanford, and I believe many other elite schools. Those three are just where I have first-hand information.
frankster 2021-08-18 19:19:28 +0000 UTC [ - ]
This doesn't prove much, but it's an embarrassing coincidence for him.
frostburg 2021-08-17 15:54:09 +0000 UTC [ - ]
I don't see how it would be plausible for the insurance company collecting the data to independently tamper with it in that specific way (and getting the typeface wrong) before passing it to the unsuspecting researchers.
Oh, and of course they immediately taught the result to MBAs and executives. I wonder how long it'll take to filter out of the system.
civilized 2021-08-17 23:58:25 +0000 UTC [ - ]
If Ariely were involved in the fraud, wouldn't he have resisted the move to publish the data? It is very easy for researchers to make up excuses to not share data, and very difficult to force them to do so.
In the absence of an answer to this question, I find it easier to believe in shenanigans at the company. I agree it's hard to imagine what their motivation was, but it's even harder to understand why the researchers published their data so readily if one of them fabricated it.
The incompetence of the fraud doesn't really push me one way or the other. Industry and academia are definitely both extremely capable of doing incompetent, dishonest nonsense with data.
function_seven 2021-08-18 00:57:47 +0000 UTC [ - ]
Okay, so that last paragraph is pure speculation. But people—even smart people!—do nonsensical things all the time. “If he was guilty, why would he do X?” is rarely a good defense.
All that being said, I’m still more likely to believe the fabrication was done by someone at the insurance company. But it would be better if we got more detail on who Dan was working with. Or the exact method of data delivery. (The file in question shows Dan as the original creator in metadata, and has the Cambria/Calibri issue. How exactly did that happen?)
Not holding my breath on that, though.
duxup 2021-08-17 16:11:04 +0000 UTC [ - ]
Does the insurance company have any involvement / motivation?
Outside of someone at the insurance company who wanted the paper's outcome to fit some goal of their own, it's hard to imagine the company would "care" about the results enough to mess with the data. Although I'm open to the possibility that someone was just lazy: they wanted another dataset, so someone fabricated one based on an existing dataset just to get it done with.
milliondollar 2021-08-17 19:52:32 +0000 UTC [ - ]
Hanlon's (and Occam's) Razor all the way here. Laziness / stupidity wins.
bsder 2021-08-17 22:20:45 +0000 UTC [ - ]
I have fabricated data to shut up my political chain more than once in my life. Why? Because they kept pestering me after being told that the data didn't exist yet but would exist naturally at some point in the future.
So, I can fight with my management chain because some VP has "collect data about X" on his quarterly goals and simply won't take "No" for an answer. Or I can feed him crap data that he will most likely forget about. And if the data is actually important, the data will fix itself in <n> months when I collect it.
Most probably, the data never gets looked at and I never waste the time collecting it. All good. I'm a wonderful team player that gets his job done. Probability: 95%
Or, possibly, some intern comes to me in 18 months asking why my data seems to be ... off. Cool. Unbelievably, someone is really using that data. I give a "Hrm, I'll go look at that," prioritize the poor intern, collect the data, and give them an attaboy for being so diligent. The intern is happy and his boss thinks he's extra diligent. Probability: 4%
Or, if the data was actually important, I collected it and resubmitted it myself at the first point we could realistically collect it because I wanted it for myself, too. Probability: 1%
However, if that fabricated data somehow escaped the company and people depended upon it, yeah, egg on faces all around, and I might get fired. Probability: 0% to a three-digit engineering approximation.
bsder 2021-08-18 02:07:13 +0000 UTC [ - ]
It wasn't just my group being harassed. It was probably 15+ design groups. Sure, that VP eventually got nuked, but fighting with a shitty VP generally results in you losing your job before he does.
Politics is a thing. You pick your battles--you only get so many bullets. Too many people here on HN think that fighting every single slight makes you honorable. No, it actually makes you a jerk--shitty things happen even at the best places and you need to deal with them without pissing everybody off--it's called being an adult. Sure, at some point enough shitty things happen that you should leave. Prior to that, you need to learn how to deal with things so that your team is protected.
Feeding that VP fabricated data meant that he thought I was a "good guy" team player. My chain's VP got less political heat. My team got an extra positive evaluation for generating data early and going "above and beyond". Everybody on our side got back to doing their job instead of something stupid that would never help us.
All this at the possible cost that I might have to personally say "Whoops, I screwed that up. My bad." 18 months down the road for a single VP who may not even be there that long. I'm gonna take that tradeoff 99 times out of 100.
Now, is that the case here? I don't know, and it doesn't look like it. However, don't rule out the possibility that someone got "tasked" with something that was obstructing them and did the absolute minimum thing to make it go away.
woofie11 2021-08-18 02:40:53 +0000 UTC [ - ]
Re-analyzing data 20 times, changing methodologies (median versus mean, handling of outliers, etc.), is typically enough to get an interesting result, and isn't enough to raise alarms. Most people are competent enough to do something like that.
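As a toy illustration of that point (made-up null data and a few arbitrary analysis variants, nothing to do with the studies discussed here): even when two groups come from the same distribution, running several "reasonable" analyses and keeping the best one pushes the false-positive rate above the nominal 5%, and the more variants you try, the worse it gets.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    def best_pvalue(a, b):
        """Try several 'reasonable' variants of the same comparison and keep
        the smallest p-value any of them produces."""
        trim = lambda x: np.clip(x, *np.percentile(x, [5, 95]))  # "handle outliers"
        return min(
            stats.ttest_ind(a, b).pvalue,                              # compare means
            stats.mannwhitneyu(a, b, alternative="two-sided").pvalue,  # compare ranks/medians
            stats.ttest_ind(trim(a), trim(b)).pvalue,                  # means after trimming
        )

    # Two groups drawn from the SAME distribution, so no real effect exists.
    hits = sum(best_pvalue(*rng.normal(size=(2, 100))) < 0.05 for _ in range(2000))
    print(f"'significant' in {100 * hits / 2000:.1f}% of null datasets")  # above the nominal 5%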
Credit theft is rampant at MIT as well. Financial schemes too. No one does a darned thing about it either.
ghostbrainalpha 2021-08-18 16:10:15 +0000 UTC [ - ]
Sort of like a scientific Snopes; is there anything like that already?
ms9 2021-08-19 04:41:08 +0000 UTC [ - ]
The bigger issue is that just about all these star researchers end up with MORE MORE MORE disease, which is taking over academia. People are working on more projects, giving more talks, and writing more papers with more studies with more moderators and more mediators, using more data sources and more research assistants and more postdocs, while also trying to write more books and give more talks to more audiences. Sorry, but a mess-up is inevitable with that many things going on. Still, I hope this does not turn into a witch hunt against one guy, because the truth is that mistakes like this likely happen to all the star academics who have overextended themselves. How could they not?
IncRnd 2021-08-17 16:07:28 +0000 UTC [ - ]
https://archive.is/zayEm