Trump’s trade war has overshadowed the other battle – supremacy in AI. After China’s State Council released a national strategy for AI development in July 2017, there was a race to understand what this would mean. Both China and AI are difficult enough to comprehend on their own. Many interpretations made at that time have become entrenched myths, such as:
- China’s AI outcomes will be a result of unified, cohesive, singular decision making
- China will “win” the technology race because of vast, unprotected, unified data
- China does not care about AI ethics or safety
Let’s look at these in turn.
Myth: China’s AI outcomes will be a result of unified, cohesive, singular decision making
One of the most persistent myths about the Chinese Communist Party is that it drives decision making via a set of coordinated plans, or what Jennifer Pan, Assistant Professor of Communication at Stanford University, calls “a hive mind” of party, companies and president.
As Pan pointed out in her presentation at the Stanford Human-centered AI symposium, the Chinese system is made up of different individuals motivated by different, often opposing, incentives. It is flawed, dysfunctional and filled with people pursuing different goals.
A report by the Future of Humanity Institute at the University of Oxford also busts this myth: while the central government plays an important guiding role, bureaucratic agencies, private companies, academic labs and subnational governments are all pursuing their own interests to stake out their claims to China’s AI future.
Many actors are involved, each shaping information to influence what the others know about what is going on. Decisions about technology and its use are therefore driven by internal politics, not made by a unified, monolithic group.
The role of China’s social credit system in underpinning AI advancement is also misunderstood, according to Pan. Outsiders can perceive the social credit system as the ultimate Black Mirror tool for AI development, but the truth is messier, especially when it comes to how it will drive the technology. The reality is that there are multiple social credit systems – provincial and state – and they are all competing over data. “Whoever comes out first has more influence, and more power, and more money,” says Pan.
Myth: China will “win” because of vast, unprotected, unified data
Scoring progress in AI is difficult. But fundamentally, AI comes down to hardware, data, research and algorithms, talent and the commercial AI ecosystem.
Analysts persistently state that the amount of data in China is its core advantage over the US. That is true to a certain extent: 1.4 billion people with deep smartphone penetration, combined with 24/7 online and offline data collection, add up to a staggering amount of data.
The reality of data is far more complex. Data are not a single-dimensional input to AI: they vary in quality, diversity, depth and access as well as quantity. And while data are critical for AI development today, much research effort aims to reduce the reliance on vast data scale, so it is important to understand which other attributes of data will matter for AI in the longer term.
Quantity: the size of China’s population and its smartphone penetration are misleading, according to analysts at Macropolo, a Chicago-based think-tank specializing in China’s economic arrival. The more important comparison is reach into global users, and here the US tech giants have been more successful. The real advantage for Chinese companies is faster scale-up through domestic users, while US companies may have the advantage of a higher ceiling of total users.
Depth: Chinese firms capture deeper data, covering more aspects of user behavior. The more types of user behavior an algorithm is trained on, the more sophisticated its recommendations or predictions can be for that user. China is the clear winner here because its leading tech companies capture more about users’ offline world. Real-world activities such as bike-sharing, appointments booked and meals ordered are captured more by Chinese tech companies such as Tencent and Alibaba than they are in the US by Google, Apple or Amazon, for example.
Quality: There are three dimensions to quality: accuracy, structure and storage of training data. In this, the US has the edge, according to Macropolo, because its data tend to be more reliable, and much more of its data have been digitized and stored in easily retrievable formats. This will be particularly true in healthcare.
Diversity: this measure refers to how heterogeneous the data are, which matters because it gives AI algorithms breadth. Without diverse data, algorithms can be accurate at narrow tasks but are easily stumped by something new. The US holds a clear advantage because of its diverse domestic population and the global user base of many Silicon Valley companies.
Access: When it comes to public spaces, China wins. Partnerships between China’s leading facial recognition startups and law enforcement are vacuuming up hundreds of millions of face scans and using them to stitch together a national surveillance system. With resistance to facial recognition building in the US, it’s easy to see China advancing on this front.
Figure: Dimensions of data – China versus the US
From a technical perspective, data are only one component; hardware, research and algorithms, and commercial AI sector activity are also important drivers. According to the Oxford University study, the data advantage would have to be over 4 times that of each of the three other drivers. Proxy measures of the other drivers are evolving all the time and include financing measures, market shares of semiconductor products, numbers of AI experts, conference presentations and numbers of startups. In 2018, the study’s AI Potential Index placed China’s capabilities at about half of the US’s.
Figure: Metrics for various drivers in China’s AI development
Beyond data and other technical factors comes talent. China has invested heavily in AI talent, but is having trouble retaining it. According to Macropolo, “China has been successful in producing AI talent, evidenced by the rapid growth of AI human capital over the last decade. But talent acquisition is only one part of the puzzle – equally important is retaining that talent so they contribute to China’s AI aspirations over the long term. On the retention front, however, China has not done nearly as well.”
While China has been highly successful at developing top AI talent, well over half of that talent eventually ended up in the US rather than being hired by domestic companies and institutions. Most of the government resources went into expanding the talent base rather than into creating incentives and an environment that would encourage that talent to stay.
Figure: Current location of China-educated AI scientists
AI is more than data. While China has some data advantage, the relationship between AI and data is more complex and sophisticated than quantity based on population size. “Winning” could also be defined in different ways and almost certainly will be messier and more difficult to define than our current proxies allow.
Myth: China does not care about AI ethics or safety
How China approaches regulation will play a vital part in navigating AI’s unique risks. China’s goal is to have constructed comprehensive AI laws and regulations – including safety, control, ethics and policy – by 2030. While there is low engagement with Western countries and institutions on discussions of AI safety, this doesn’t mean there is no discussion within China. A wide range of Chinese researchers are involved in translating IEEE’s Ethically Aligned Design report, and there are growing efforts by Chinese scholars to tackle the hard questions of AI governance.
An indicator of China’s ambitions is how it shapes international AI standards, including for ethics. An influential book by Tencent researchers, Artificial Intelligence: A National Strategic Initiative for Artificial Intelligence, calls for strong regulations and argues that “China should also actively construct the guidelines of AI ethics, play a leading role in promoting inclusive and beneficial development of AI. In addition, we should actively explore ways to go from being a follower to being a leader in areas such as AI legislation and regulation, education and personnel training, and responding to issues with AI.”
A pain point in the development of AI in China is data liberalization. In January 2018, the Chinese government released a new national standard on the protection of personal information, which contained more comprehensive and onerous requirements than even the EU’s GDPR. As Jeffrey Ding, author of the Oxford report, states, “this vigorous and unresolved debate over data privacy combats common misperceptions of China’s relatively lax privacy protections and is an important one to follow as China advances in AI.”
It’s challenging to understand the AI landscape in China, which shouldn’t come as a surprise. This makes it particularly difficult to get a handle on the so-called “AI arms race.” The reality is more complex, dynamic and political, and will undoubtedly become even more so as the trade-induced tensions between the US and China continue.