Whisper Essay 1 - Guilty as Hell
Whisper has exposed all user information by leaving their production database cluster open for the last several weeks.
- Story: This first post summarizes some of the less sensitive data exposed by the open cluster. The military data will be shown in Post 2 in a few hours, once the classified data has been removed.
- Authentication: All user passwords have been exposed via OAuth codes, tokens, and other login credentials.
- Revealed: Users' membership in sexual fetish groups, suicide groups, and hate groups can all be seen. Whether a user is flagged as a predator, whether they are banned from posting near high schools, and their private messages can all be viewed.
- Military: Detailed, personal, live data on service members at bases, missile silos, and US embassies throughout the world can be viewed.
- Counterclaim: Whisper.sh will contest these assertions and lie to protect the company. The ES cluster has been downloaded and preserved along with detailed network logs to prevent this.
- Contact: Both the FBI and the Washington Post were informed at various points during this investigation.
- Kik: Kik appears to have been partially compromised in this breach. This will be documented in a second or third essay.
- Five: Whisper.sh will argue that all Whispers are public anyway, so this open database doesn’t matter. What they mean, though, is that the Whisper Post with about 5 fields of metadata was available to all.
- Ninety: What we found, though, was the same Whisper Post with about 90 fields of metadata, ranging from last known geolocation to the actual password token.
Since 2014 there has been significant concern about the security, anonymity, and safety of the Whisper app. We think these concerns are not only valid, but that the situation is worse than was realized.
The Whisper database is 5 TB in size and stretches across 75 different servers. This is all text data, which is extraordinary to see. All user images and videos are accessible as well, but hosted elsewhere in cloud storage buckets.
A great deal of damaging material and information can be reverse engineered from the data made available. Most often that is not necessary, because people simply disclose their real names in posts or in private messages to other users, and this can be viewed in the database. Some examples of the compromising content people post on Whisper:
It is possible to link every post above back to the original user.
From there one has the geocoordinates of nearly every place they’ve visited, and the ability to log into their account with their password/credentials. Depending on when the account was created and how much the user engaged with the app, dozens and dozens of fields of metadata can be reviewed.
Yet from a legal theory standpoint that is a very “2013ish” strategy and it relies on the courts believing that average end-users can tell the effective privacy of an app or service. Even professionals struggle with this. It is hard to imagine most judges, particularly after the first few cases have been tried, adhering to this viewpoint.
If you create a user right now in Whisper, it will show up online in the database being described here. Our tests showed that within about 20-25 seconds, all of the information you entered would appear in the main database cluster. This user, “hackingelasticsearchlegally”, was created just a few minutes before this paragraph was typed:
All of the columns holding data are as follows:
Some of the top feeds/groups, which are used to collect blackmail material on members, particularly military members, are shown below. We go into this in a little more detail in Essay 2, and then finally we show, based on Gary King's research at Harvard, how this statistically and historically reflects the tactics of the MSS and its predecessor organizations.
As you can see, nearly every one of the top groups is dedicated to getting a member to perform an action that clearly could be used against them later. "Sexy Lady Selfies", "Sexual Confessions", and "Roleplay Only" are designed to trick users into totally destabilizing their lives.
Whisper’s database contains a significant amount of information on children throughout the United States.
It is possible for instance to find all of the sexual messages posted by the teen and pre-teen children of US congresspersons. For instance, if you take any user who has set foot in the Capitol, and also one of the deeply wealthy private schools that dot the Northern Virginia / DC capital area,
In 2014 Whisper endured a very large amount of criticism over its privacy practices. The Guardian wrote a particularly scathing article, a screenshot of which we can see above. Whisper then wrote a 5 page detailed response to this article. Because the database has data going back to 2014 we can analyze some of these claims. A few things jump out:
Whisper explicitly lied to The Guardian and their investors about their practices. These were not subtle lies but statements that claimed things when in fact Whisper was doing the exact opposite.
The data collected by Whisper since 2014 has grown more detailed and aggressive. Not only did they lie in 2014, they continued and accelerated the practices.
This database mostly goes back to the founding of Whisper in 2012. It is not clear whether all records have been kept, but an overwhelming majority certainly have. The database totals 5 TB, and given the data and marketing value the records could hold, it would not make sense for Whisper to have deleted any of them, so one can again conclude that most of the data since 2012 has been preserved.
One way to show this specifically is with timestamps. Computer systems commonly record time as Unix epoch time: the number of seconds since January 1st, 1970 (UTC). We find values fairly evenly distributed from about 1350000000 (late 2012) through 1550000000 (early 2019) and on up to today.
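To make the two boundary values concrete, here is a minimal sketch of the epoch-to-date conversion using only the Python standard library. The helper name `epoch_to_utc` is our own, chosen for illustration; it is not taken from the database or from any Whisper code.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime.

    Hypothetical helper for illustration only.
    """
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The two boundary values quoted in the text.
for ts in (1350000000, 1550000000):
    print(ts, "->", epoch_to_utc(ts).isoformat())

# 1350000000 -> 2012-10-12T00:00:00+00:00
# 1550000000 -> 2019-02-12T19:33:20+00:00
```

The first value falls in October 2012 and the second in February 2019, so a roughly even spread of timestamps between them covers more than six years of records.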
Whisper also seems to have gone to great lengths to keep and document content that should have been deleted. First, there is an S3 bucket called “whisper-deleted.s3.amazonaws.com”.
Second, there are many record types that show, for instance, all the groups you’ve *unsubscribed* from, or all the usernames you previously had. There is something of a spirit of “German record keeping” in the diligence to maintain so much for so long. We do not think it is a stretch to conclude that, for whatever reason, right or wrong, MediaLab considers this data very valuable and worth maintaining, expanding, and keeping.
All user passwords/login credentials have been exposed. It is possible to log in as any user anywhere. Here is an example of the exact shell code you would run if you wanted to see how everything worked at a very low level for one particularly important service. We simply send the key and token, Base64-encoded together, to the message API endpoint, and receive a 2XX HTTP status code. From there we have many options, as documented in the TigerText API.
The Whisper app rates users on how likely they are to be a sexual predator. Currently it has rated 9,000 users with a probability of 100%. Another 10,000 have a probability of 50%, as seen below.
The last time the predator probability score was updated is also made available, as seen here:
Would it be possible to learn a little more about this user? We can see that they likely are involved in Spokane Community College.
Additionally, if a user has been banned for soliciting a minor, that will come up as well.
A sample of 10,000 records yields the following distribution. We can see that illegal or inappropriate sexual content or behavior seems to account for 80% of bans. Spamming and being underage make up the remaining 20%.
Misc Security Observations:
- Whisper appears to have abandoned its Bug Bounty program in 2015 for unknown reasons. They paid out a modest amount in bounties before leaving the HackerOne platform entirely.
- The copyright date of Whisper.sh still reads “2017”. A small detail like this suggests that the site was not a focus of the engineering team, and that many things were left as is and never fixed.