Whisper Essay 1 - Guilty as Hell

Whisper has exposed all user information by leaving their production database cluster open for the last several weeks.

Ghost

Mar 4, 2020 • 8 min read

Summary:

Story: Whisper has exposed all user information by leaving their production database cluster open for the last several weeks. This first post will generally summarize some of the less sensitive data that was exposed. The military data will be shown in Post 2 in a few hours as the classified data is removed.
Authentication: All user login credentials have been exposed, including OAuth codes, tokens, and other credentials.
Revealed: Users' membership in sexual fetish groups, suicide groups, and hate groups can all be seen. Whether a user is rated as a predator, whether they are banned from posting near high schools, and their private messages can all be viewed.
Military: Detailed, personal, live data can be viewed on service members at bases, missile silos, and US embassies throughout the world.

Furthermore:

Counterclaim: Whisper.sh will contest these assertions and lie to protect the company. The ES cluster has been downloaded and preserved, along with detailed network logs, to prevent this.
Contact: Both the FBI and the Washington Post were informed at various points during this investigation.
Kik: Appears to have been partially compromised in this breach. This will be documented in a second or third essay.

Lastly:

Five: Whisper.sh will argue that all Whispers are public anyway, so this open database doesn’t matter. What they mean, though, is that the Whisper Post with about 5 fields of metadata was available to all.
Ninety: What we found, though, was the same Whisper Post with about 90 fields of metadata, ranging from last known geolocation to the actual password token.

Beginning:

Since 2014 there has been significant concern about the security, anonymity, and safety of the Whisper app. We think these concerns are not only valid but worse than was realized.

The Whisper database is 5 TB in size and stretches across 75 different servers. This is all text data, which is extraordinary to see. All user images and videos are accessible as well, but hosted elsewhere in cloud storage buckets.

A great deal of damaging material and information can be reverse engineered from the data made available. Most often that is not necessary, because people simply disclose their real names in posts or in private messages to other users, and these can be viewed in the database. Some examples of the compromising content people post on Whisper:

It is possible to link every post above back to the original user.

From there one has the geocoordinates of nearly every place they’ve visited, and the ability to log into their account with their password/credentials. Depending on when the account was created and how much the user engaged with the app, dozens and dozens of fields of metadata can be reviewed.

We think Whisper has tried to protect itself against this by disclosing all of it, bit by bit, in its privacy policy. It should be noted that companies are now almost shockingly open in their policies, nearly to the point of self-incrimination. From time to time it is good to actually read them, just to see what we mean here. Additionally, any time a change-of-privacy-policy email is sent out, it can effectively be used to infer a breach.

Yet from a legal theory standpoint that is a very “2013ish” strategy and it relies on the courts believing that average end-users can tell the effective privacy of an app or service. Even professionals struggle with this. It is hard to imagine most judges, particularly after the first few cases have been tried, adhering to this viewpoint.

User Data:

If you create a user right now in Whisper, it will show up online in the database being described here. Our tests showed that within about 20 to 25 seconds, all of the information you entered would appear in the main database cluster. This user, “hackingelasticsearchlegally”, was created just a few minutes before this paragraph was typed out:

All of the columns holding data are as follows:

admin_delete
admin_email
admin_id
age
apns_token
apns_token_regular
apns_token_urban
app_id
approved_whisper_count
appsflyer_id
banned
banned_from_feeds
banned_from_high_schools
banned_from_list
banned_from_messaging
banned_from_messaging_ts
bot_message_count
campaign
chat_profile_updated_ts
chat_rating
chat_rating_count
client_version
conversation_counters
conversation_migration_version
conversations_received_count
conversations_started_count
country
created_by_admin
nickname
nickname_history
nickname_ts
crossed_paths_count_version
crossed_paths_unlock_ts
crossed_paths_unlocked
crosspath_unlocked_push_sent
crossroads_bucket
deleted_count
deleted_flag_count
deleted_user_count
device_id
disabled
feed_ids
feed_ids_v2
feed_types
feeds
feeds_migration_version
first_whisper_created
flagged_count
flagged_count_since_trusted
gender
geo_lat
geo_lon
geo_title
geo_title_update_ts
geohash
good_creator
has_datametrical_profile
has_mixpanel_profile
has_mixpanel_profile_ts
has_received_sme_message
hearts_per_whisper
ifa
input_language
interested_in
intersection_count
intersection_creator_count
ip
is_inside_walled_garden
key_fingerprint
language
last_conversation_time
last_crosspath
last_crosspath_activity_feed_update
last_crosspath_avg
last_crosspath_push_number
last_current_poi_create
last_current_poi_post
last_feed_unlock
last_heart
last_hyper_local_nearby_whisper
last_intersection_count_update_ts
last_location_update_ts
last_login
last_my_feed_lookup_time
last_my_feed_read_time
last_nearby_user_update_ts
last_nearby_whisper
last_new_feed_post
last_reply
last_significant_feeds
last_updated
last_updated_token_ts
last_whisper_of_the_day
last_whisper_text
last_whisper_ts
last_wid_ts
limited_ad_tracking
locale
location (nested: lat, lon)
location_meta
location_permission_level
me2_count
me2s_migration_version
migration_version
mixpanel_ab_cohort
most_active_hour
new_pin
nux
only_nearby_conversations
osm_ids (list; sample values: n150940434, r114690, r148838, r253556, r4468307, r4468409, w43356824)
pin
pin_enabled
pin_selected
post_create_view
predator_probability
predator_probability_update_ts
public_key
public_uid
puid
push_comment_reply
push_crosspath
push_current_poi_create
push_current_poi_post
push_feed_unlock
push_geo
push_heart
push_new_feed_post
push_popular_story
push_reply
push_significant_feeds
push_wotd
regenerate_keys
region_validation
registered
reply_whisper_count
shared_secret
sme
state
suspected
suspected_date
system_locale
testing_features
timezone
token_type
top_level_whisper_count
trusted
ts
tt_key
tt_migration
tt_secret
tt_token
uid
unread_notifications
unsubscribed_school
update_last_nearby_on_login
updated_from
urban
urban_lock_screen
version
walled_garden_reason
walled_garden_ts
whisper_count
whispers_approved_since_untrusted
whispers_deleted_by_admin_since_trusted
whispers_deleted_by_flag_since_trusted
whispers_forbidden_since_trusted
whispers_to_be_approved

Some of the top feeds/groups, which are used to collect blackmail material on members, particularly military members, are shown below. We go into this in a little more detail in Essay 2, and then finally we show, based on Gary King's research at Harvard, how this statistically and historically reflects the tactics of the MSS and its predecessor organizations.

As you can see, nearly every one of the top groups is dedicated to getting a member to perform an action that clearly could be used against them later. "Sexy Lady Selfies", "Sexual Confessions", and "Roleplay Only" are designed to trick users into totally destabilizing their lives.

Children:

Whisper’s database contains a significant amount of information on children throughout the United States.

Here are the exact Whisper coordinates for someone who claimed to be 15 and was posting from a middle school.

Here is that user's record in more detail.

It is possible, for instance, to find all of the sexual messages posted by the teen and pre-teen children of US congresspersons. For instance, if you take any user who has set foot in the Capitol, and also one of the very wealthy private schools that dot the Northern Virginia / DC capital area,

The Guardian:

In 2014 Whisper endured a great deal of criticism over its privacy practices. The Guardian wrote a particularly scathing article, a screenshot of which we can see above. Whisper then wrote a detailed five-page response to this article. Because the database has data going back to 2014, we can analyze some of these claims. A few things jump out:

Whisper explicitly lied to The Guardian and their investors about their practices. These were not subtle lies but statements that claimed things when in fact Whisper was doing the exact opposite.

The data collected by Whisper since 2014 has grown more detailed and aggressive. Not only did they lie in 2014, they continued and accelerated the practices.

Time:

This database mostly goes back to the founding of Whisper in 2012. It is not clear whether all records have been kept, but the overwhelming majority of them certainly have. The total size of the database is 5 TB, and it would not make sense for Whisper to have deleted records given the data and marketing value they could hold, which again suggests that most of the data since 2012 has been preserved.

One way to show this specifically is with timestamps. Computers commonly record time as Unix epoch time, the number of seconds since January 1st, 1970 (UTC). We find values fairly evenly distributed from about 1350000000 (October 2012) to 1550000000 (February 2019), and onward all the way up to today.
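As a quick check on the two boundary values quoted above, here is a minimal sketch that converts epoch seconds to calendar dates. The helper name `epoch_to_utc` is ours, chosen for illustration, and is not part of any Whisper tooling; the only inputs are the two numbers from the text.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The two boundary timestamps mentioned in the text.
for ts in (1350000000, 1550000000):
    print(ts, epoch_to_utc(ts).isoformat())

# 1350000000 2012-10-12T00:00:00+00:00
# 1550000000 2019-02-12T19:33:20+00:00
```

The first value lands in October 2012 and the second in February 2019, so a roughly even spread of values between and beyond them covers most of the app's lifetime.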

Whisper also seems to have gone to great lengths to keep and document content that should have been deleted. First, there is an S3 bucket called “whisper-deleted.s3.amazonaws.com”.

Second, there are many record types that show, for instance, all the groups you’ve *unsubscribed* from, or all the usernames you previously had. There is something of a spirit of “German record keeping” in the diligence to maintain so much for so long. We do not think it is a stretch to conclude that, for whatever reason, right or wrong, MediaLab considers this data very valuable and worth maintaining, increasing, and keeping.

Tokens:

All user passwords/login credentials have been exposed. It is possible to log in as any user anywhere. Here is an example of the exact shell command you would run if you wanted to see how everything worked at a very low level for one particularly important service. We simply send the key and token, which are Base64-encoded together, to the message API endpoint, and we then receive a 2xx HTTP status code. From there we have lots of options, as documented in the TigerText API.

Predator:

The Whisper app rates users on how likely they are to be a sexual predator. Currently it has rated 9,000 users with a probability of 100%. Another 10,000 have a probability of 50%, as seen below.

The last time the predator probability score was updated is also made available, as seen here:

Would it be possible to learn a little more about this user? We can see that they are likely affiliated with Spokane Community College.

Additionally, if a user has been banned for soliciting a minor, that will come up as well.

A sample of 10,000 records yields the following distribution. We can see that illegal or inappropriate sexual content or behavior seems to account for 80% of bans. Spamming and being underage make up the remaining 20%.

Misc Security Observations:

Whisper appears to have abandoned its bug bounty program in 2015 for unknown reasons. They paid out a modest amount in bounties before leaving the HackerOne platform entirely.

The copyright date of Whisper.sh still reads “2017”. Something small like this suggests that the site did not really have the engineering team's focus, and that many things were simply left as is and never fixed.