Part 1 - The Business of Business in China

Several books have viewed both the medieval Catholic Church and the USSR as giant corporate firms.

It is through this same lens that we should view China today.

There is no separation between government and business in China. The businesses are separate from each other to some extent, but only in the way that departments within a company like Amazon are separate.

This concept is not particularly novel. The earliest corporations, for instance, existed only as brief, limited ventures granted at the whim of the state in the 17th century. To imagine that in this day and age all of the major corporations of a country operate at the behest and under the direction of the government is not radical. Indeed, the idea that a corporation doesn’t operate at least somewhat at the behest of the state is likely the more radical proposition.

So not only do all of these firms in China operate under the same “Board of Directors,” so to speak, there are also significant redundancies. Multiple firms don’t exist for competition like they do in the US; they exist to give the illusion of such competition. It is likely that if the PRC had its way, there would only be a single corporation for each function of the government, all on a neat flow chart gently rising to the very top committees of the CCP.

The purpose of this introduction is to lead into the claim that Xiaomi, Mi, Hualai, and Wyze are all, in fact, the same company. Not only is this claim not that outlandish, but it is probable we are mistaken and have missed more companies that form this single, giant corporation.

First, Xiaomi is pretty much upfront about also being Mi. In fact the website for Xiaomi is Mi.com. While one could forgive them for this confusing marketing, it does not help that in the description of their offerings they state they are in the business of “smartphones, mobile apps, laptops, bags, trimmers, earphones, MI Television, shoes, fitness bands, and many other products.”  

Second, Wyze cameras are little more than rebranded Xiaomi cameras. This point is made somewhat humorously and directly by a Redditor here. While white labeling is nothing new in the US, we find it amusing that Wyze seems to take this to an extraordinary level by having extensive Mi customer data in its possession. We will show evidence for that particular claim later.

Finally, we will conclude this section with one last remark. At first Xiaomi reminds one of a conglomerate from the 1960s (RCA in the US at the time was jokingly called Radios, Chickens, and Automobiles for how many acquisitions of disparate nature it had made). Yet digging further one comes to a better, more modern comparison. In 2009 Rolling Stone called Goldman Sachs a “great vampire squid wrapped around the face of humanity, relentlessly jamming its blood funnel into anything that smells like money.” This is not quite right for our purposes, though, as this octopus, a group of secretive Chinese companies that includes not only Wyze and Xiaomi but many more, does quite a bit more. It jams its tentacles into your eyes to see what you see, into your veins to know how you feel, and finally crawls into your brain to be aware of everything you hear and ultimately to infer whatever you think, could consider, or have done.

Part 2 - Wyze is Mi is Xiaomi is Hualai is Kingsoft

2a - The Git Server

A self-hosted GitLab server was left open in China. We know that it belongs to Wyze because of the X.509 certificate on the server. We want to emphasize the mathematically irrefutable nature of all this. The screenshots below, of both the certificate and the source code, show that this domain belonged to Wyze. The nonces and fingerprints can be viewed here, confirming not only the location in China but also that Wyze knowingly took out a certificate in advance for https://gitlab.wyzelabs.com.cn. There is a second Git server at https://git.wyzecam.com that the first partially mirrors to and from. Both servers are hosted in China, as can be seen in the third screenshot.

This site shows gitlab.wyzelabs.com.cn
This site shows gitlab.wyzelabs.com.cn
git.wyzecam.com hosting information
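
For readers who want to check a certificate fingerprint themselves, here is a minimal Python sketch using only the standard library. It assumes SHA-256 as the fingerprint algorithm and colon-separated hex as the output format. The function names are ours, chosen for illustration, and are not part of any Wyze tooling. `fetch_fingerprint` needs network access and a live server, so we make no promise that it still returns anything for the hosts above.

```python
import hashlib
import ssl


def sha256_fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint of a PEM-encoded certificate.

    The fingerprint is computed over the DER (binary) form of the
    certificate and formatted as colon-separated uppercase hex.
    """
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_fingerprint(host: str, port: int = 443) -> str:
    """Download a server's leaf certificate and fingerprint it.

    ssl.get_server_certificate does not validate the certificate by
    default; it only retrieves what the server presents.
    """
    pem = ssl.get_server_certificate((host, port))
    return sha256_fingerprint(pem)
```

Comparing the value printed by `fetch_fingerprint` against the fingerprint shown in a certificate search engine is one way to confirm that two sightings refer to the same certificate.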

This source code revealed a few things. First, Wyze intends shortly to collect the height, weight, gender, and BMI of its users. We reviewed data which shows they have already collected this type of biometric health information on several hundred users in the US and Europe. We also found comments referring to daily protein intake, bone density, and several other medical fields that their current equipment does not appear to collect. Whether they plan to collect this through cameras and AI Vision alone, or perhaps a Wyze scale or a Wyze wrist glucose monitor, remains to be seen.

Second, the source code has the addresses of multiple databases hardcoded into it. This is significant because these servers were left open on the public internet (with no firewall restricting access to, say, just a few IP addresses from which logins were allowed). So the address was all you needed to start brute-forcing a password for the database. This was neither needed nor done, because the credentials for accessing these databases were hardcoded into the source code AS WELL. Yet even this was unnecessary, because if one simply logged in without any password...the database immediately granted access.
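
Hardcoded credentials like these are something a team can catch before code ever reaches a Git server. The sketch below is a deliberately small Python example of that kind of check. The two regular expressions are our own illustrative assumptions, not rules taken from any real scanner, and the sample variable names are hypothetical. Production secret scanners use far larger rule sets.

```python
import re

# Illustrative patterns only (assumptions, not a complete rule set):
#  1. an assignment of a quoted literal to a name containing a secret-like word
#  2. a database URL that embeds user:password before the host
SECRET_PATTERNS = [
    re.compile(
        r"""(password|passwd|pwd|secret|api_key|token)\w*\s*[:=]\s*['"]([^'"]+)['"]""",
        re.IGNORECASE,
    ),
    re.compile(
        r"\b(mysql|postgres(?:ql)?|mongodb)://[^\s:/@]+:[^\s@]+@[^\s/]+",
        re.IGNORECASE,
    ),
]


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            findings.append((lineno, line.strip()))
    return findings
```

Running something like this in a pre-commit hook or CI job does not replace a firewall or real database passwords, but it would flag the most obvious cases described above.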

I will stop at this point to note that only the databases hosted in mainland China appeared to be protected; the US databases did not appear to have passwords (although they might very well have on other interfaces we did not see). It’s strange that they seem to be protecting data in China more than data in the United States, and this should be noted.

Also, it is interesting where the second Git server (https://git.wyzecam.com) was hosted. Not only was it in mainland China, but it was in an IP space that is not usually seen hosting “civilian” applications. It was out of Shanghai, where the most notorious of the Chinese Advanced Persistent Threat groups operate. Shanghai is a large city, but what matters more is that the telecom hosting the service reports only a location of Shanghai rather than the actual coordinates. As you can observe from the screenshot above, the server resides in an IP space owned by Beijing Kingsoft Cloud Internet Technology, or Kingsoft, in ASN 59019.

A few more things of note:

  1. The former CEO of Kingsoft, Jun Lei, who led their IPO, is currently the founder and CEO of Xiaomi. Xiaomi, in turn, seems to act as a simple white label company for Wyze.
  2. Kingsoft has known ties to APT groups, partly because of the ubiquity of their WPS software in China (equivalent to MS Office in the US). Their founder attended the China National Defense University, somewhat equivalent to the US West Point or British Sandhurst, and graduated with a degree in Information Systems in 1984.
  3. Kingsoft changed its name to Cheetah Mobile in 2017. Its shares are listed on the NYSE under the stock ticker CMCM.
  4. Kingsoft/Cheetah Mobile has been criticized for having, at best, freeloaded from the Microsoft Virus Information Alliance, which they joined in 2009. It does not appear that they have ever contributed any information to the group, and others have complained on the record about a lack of participation and suggested that perhaps the company was simply looking to gain information on what viruses were currently being detected by the group. Kingsoft’s Office and other software has been flagged some 27 times (search "Kingsoft" & “Cheetahmobile” & “Cheetah Mobile”) by China’s CNNVD vulnerability database, a database which Recorded Future has said is routinely altered to conceal MSS influence. The same criticism has also been applied to CERT China, it should be noted.

So what is the conclusion here?

  • It appears that US Wyze data has been online and accessible over the internet since the source code was first uploaded, which was 11 months ago.
  • At least some source code was hosted in China, in an unusual and highly concerning network range, and it appears it contained the login credentials to the Wyze Master Data Warehouse in the United States.
  • Log files from all three of the open US-hosted databases show extensive access by multiple parties throughout the world. This data should be seen as irrefutable and at the 100% confidence level. Over 59,000,000 log files that recorded network access were used to come to this conclusion.
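
To give a sense of how access logs can be reduced to a list of outside parties, here is a minimal Python sketch. It assumes a log format in which the client IP address is the first whitespace-separated field on each line, as in common web server logs. That format is our assumption for illustration and is not necessarily the format of the logs we reviewed.

```python
import ipaddress
from collections import Counter


def count_external_clients(log_lines):
    """Count requests per client IP, keeping only globally routable addresses.

    Assumes the client IP is the first whitespace-separated field of each
    line. Blank lines and lines that do not start with a valid IP are skipped.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not an IP address, ignore the line
        if ip.is_global:  # drops private, loopback, and reserved ranges
            counts[str(ip)] += 1
    return counts
```

The resulting counter can then be joined against any IP-to-country or IP-to-ASN data set to see where the connections came from.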

2c - The Database Server

To recap the above, Wyze source code was found that led us back to what is now the third open Wyze production database in the United States: their data warehouse. Inside this database were four schemas (which MySQL confusingly refers to as “databases”). You can see these below:

The 4 major databases (sometimes called "schemas") of Wyze. The other four seen are default MySQL schemas you would see on any installation.
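
For anyone unfamiliar with MySQL, the four default schemas on a modern installation are `information_schema`, `mysql`, `performance_schema`, and `sys`. The short Python sketch below shows how the output of `SHOW DATABASES` could be filtered down to the application schemas. The schema names in the usage line are hypothetical placeholders, not the real Wyze names.

```python
# Schemas that ship with a stock MySQL 5.7+ installation.
SYSTEM_SCHEMAS = {"information_schema", "mysql", "performance_schema", "sys"}


def user_schemas(all_schemas):
    """Return the schemas that are not MySQL defaults, preserving order."""
    return [name for name in all_schemas if name.lower() not in SYSTEM_SCHEMAS]


# Usage with hypothetical names, as if taken from SHOW DATABASES:
# user_schemas(["information_schema", "app_one", "mysql", "sys"])
```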

We are only going to look at one schema for now, “db_GE”. It contains a significant amount of detailed information, including what appears to be:

  1. Access to the credentials of the 24,000 users who connected Alexa devices
  2. Information about any task set up via the IFTTT API, as well as credentials for administering this connection
  3. For some users, what appears to be detailed face and facial recognition data; for all users, at minimum the profile picture they uploaded. (See GE_Device_Reco_Person)
  4. More detailed internal networking information for each user’s home than was seen in our last post.
  5. Preferences, setups, triggers, and timers that each user had configured in the application, such as “wife home” or “kids home early from daycare.”
  6. The full certificate chain used to connect the devices to the cloud. At this time, it appears this would have allowed interception of most video feeds.
  7. Technical P2P information that indicates that the devices can communicate in hard-to-detect and highly exploitable ways. See this recent post for more information on this particular vulnerability / setup.

Above is the full set of tables from the GE_db. In the next post we will go into more detail on these and on many of the other topics discussed here, which we again want to emphasize.

2d - The Alibaba Connection

We plan to go into detail on the Alibaba connection in a post either early next week or the week following. For now, we want to share just the general hosting information of Wyze in China. First, Wyze actually operates under many domains, such as wyzecam.com, wyzelabs.com, wyze.com, wyzemanagement.com, and a few others we will see later on.

Second, focusing specifically on wyzecam.com, Shodan shows at least 6 production servers that Wyze operates, at a somewhat significant cost, in mainland China. Third, because Shodan has been around for some time and is trivial to block, it is better to use Censys or BinaryEdge when possible (Shodan can still be incredibly useful though). Data from Censys, found by searching on the cryptographic hash (fingerprint) of the X.509 certs, indicates a very large infrastructure presence and is more in line with what we would expect from a company processing the live video streaming data of millions of users.

A Shodan search that shows a very limited view of Wyze's internet-facing infrastructure. It is still enough, though, to identify that a significant portion of their business is hosted in mainland China.

Part 3 - Additional Screenshots of Evidence from Part 1

The first set of screenshots comes from the Machine Learning Data Visualization tool featured in some of the newer versions of Kibana. We find it helpful for analyzing data. Some of these machine learning jobs ran over statistical subsets of data (minimum 5,000 documents per shard and there is usually a minimum of 500 shards in the cluster). However, we have forced the task to run over every single document, which means at times over 300,000,000 documents were analyzed. If you see something like 901,000 documents, that means for that particular field, there were only that many documents to scan.
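
To put those numbers in perspective, the small Python calculation below compares the minimum sampled subset to a full scan. It uses only the figures quoted above (5,000 documents per shard, 500 shards, 300,000,000 documents) and treats them as round numbers. The function name is ours.

```python
def sample_coverage(docs_per_shard: int, shards: int, total_docs: int):
    """Return (documents sampled, fraction of the full data set covered)."""
    sampled = docs_per_shard * shards
    return sampled, sampled / total_docs


sampled, fraction = sample_coverage(5_000, 500, 300_000_000)
print(f"{sampled:,} sampled documents = {fraction:.2%} of a full scan")
```

In other words, a job at the minimum sample size would look at 2,500,000 documents, under 1% of what a forced full scan covers.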

This shows the breakdown of devices for Wyze. As you can see in the top left-hand corner, 2,822,000 devices reported back to this logging infrastructure. Only about 50% of these devices are in the US and Canada.
In the top row, middle square, you will see lots of writing in Chinese. If you wish to translate the text with Google manually, be sure to select Traditional, not Simplified, under Mandarin. This particular square is interesting because it is not actually user data of any kind but administrative data for the database server itself. That means whoever was administering this data could read, write, and type Chinese fluently enough that they preferred it over English. We believe Wyze has already said another employee set up this server, so that rules out the founder. The question is who?
This shows a few more fields of data gathered by Wyze. Any alert that the cameras or security systems have generated is recorded, along with the model of the phone, the version and type of phone the app is running on, and what state the device is in, such as power_on, power_off, alarm_on, etc.
The question dozens of people have asked me is whether you could intercept and/or look at the video of any user (remember there are close to 3 million of them). The answer is unfortunately yes for all Wyze devices.

In regard to the caption above, what we will say specifically on the record at this time is this: it is (or at least was) theoretically possible for any individual anywhere in the world to access the live video feeds of every single Wyze camera that was online. Two hypothetical scenarios for how this could have happened:

  • using the API tokens from the production logging server to login as someone else
  • using the private certificate files for the devices (every camera appears to have stored copies of its full certificate chain on the MySQL server) to intercept the data

The second method is a little more involved, and the first is what we would have done as an attacker. We plan to publish a full POC soon showing how this attack would have worked.

Part 4 - Temporary Highly Simplified Architecture of Wyze Infrastructure

As a personal aside from one of the co-authors of this piece: I began working as a Cloud Architect about a decade ago. To emphasize just how long ago that was, you can see here what the size of AWS was then and now. I don’t think I’ve ever drawn a diagram quite like this, but for our purposes this draw.io architecture will work fine. Later on, we will post the full diagrams of all 200+ of Wyze’s servers. Note that the network flow is not meant to be taken literally, although this diagram does indicate that information first flows to both Elasticsearch databases before being piped to MySQL. If you look at the list of database tables posted earlier, you will see one that contains the word “Kinesis”. This refers to a specific AWS service that integrates extensively with Elasticsearch, AWS Kinesis. We will return to this later.