PolicyTimeMachine

Stay informed. Stay protected.


Visualization-TikTok-Privacy Policy

The following graphs illustrate how TikTok’s privacy policy changed across versions over six years. The results show that TikTok kept adding specific data types related to user content and kept modifying controversial data categories such as location. Although the collection of users’ biometric data, such as keystroke patterns, became negatively associated with data leakage and tracking (Paul Mozur et al., 2022), TikTok did not change that clause and went on to add faceprints and voiceprints. In addition, the location and duration of data retention have only been reported in the last two years.

Main Changes in 6 years

The first graph shows the main modifications in each distinct version of TikTok’s privacy policy. Early on, the platform changed its statement about collecting users’ location from “they will” to “they may”. It also distinguished location info from GPS, stating that people can opt out of GPS collection but not mentioning an equivalent option for location info.

In 2019, the platform added content about children younger than 13 and a line stating that people can request deletion of their info. TikTok also stated that aggregated or de-identified info is not subject to the policy. Throughout, the policy says that data will be stored for as long as possible, subject to the law, and that TikTok can keep the data even after users stop using the services.

After 2020, TikTok added a lot of content about user content, such as “live”, “audio recording”, pre-loading content, and content without effect, as well as terms related to paying with digital coins and to virtual items. In 2021, TikTok finally added a statement about where data is retained: the location changed from the US to Singapore, and data may be kept outside the user’s country of residence.

What kind of data does TikTok collect?

Although TikTok updated its privacy policy many times, the main categories of data it collects changed only slightly. However, the number of specific data types mentioned grew considerably, and the data became more comprehensive. The following diagram shows the data categories listed in the current version of the privacy policy; the width of each color represents the number of data types under that category. After systematically reading all versions of the policy, I observed that the language used by the platform is ambiguous. For example, the earliest versions used the statement ‘platform will collect,’ while subsequent versions commonly use ‘platform may collect,’ without clearly informing users which specific data they must provide in order to obtain the corresponding services.

Instead, the platform puts the initiative in the hands of the user, labeling nearly half of the information as ‘User provide’. A large amount of data is also described as “automatically collected when users use the service”, and the platform states that some other data could be obtained from other sources, such as other social media applications and third parties. The data from the platform refers to data generated by users through publishing content and interacting with and within the platform.

With this wording, the platform seems to downplay its role as the primary actor in data collection. At the same time, the platform does not clearly explain which data is optional and which is required, although it does indicate some data categories that require user consent, as shown in the diagram above. Overall, the diagram indicates that most of the data with no opt-out option comes from platform usage and automatic collection.

There is a noticeable part of “other data”,

But What Is That Data? 

Some data could not be assigned to any of the data categories in the codebook, so I note them here.

In 2017, TikTok added that it would collect data from other social media: when users log in to TikTok with an account from another service, their user profile, email, and other usage data would be collected. In 2019, the platform stated that it might collect data about a specific user from other users, and it added keystroke patterns to its terms. Biometric identifiers such as faceprints and voiceprints were included in 2021. Some specific data, such as clipboard data and pre-loading data, were also added to the list of collected data.

Now, we have a sense of what kind of data TikTok would collect, but

who would they share that data with, and for what purpose?

Similarly, the platform only names the recipients, without saying exactly which data would be transferred to whom. There are 4 main recipients: Service Providers and Business Partners (light blue), affiliations (green), authority & law agencies (orange), and third parties (deep blue). The first, light blue area contains many sub-categories. The earliest version of the policy, however, listed the most third-party entities, such as advertisers and advertising networks; cloud storage providers; analytics and search engine providers; IT service providers; data centers; and the servers of its host providers.

The description of data usage is vague: the platform tends to just list purposes, so people cannot tell how each type of data mentioned above is actually used. The light green color marks the purposes that people can opt out of. People can also choose not to share GPS and not to receive advertising emails.