Some of the world’s most valuable firms use machine learning to create value for their stakeholders, be they owners, customers, suppliers, or employees. Notable examples include Google, Facebook, and Apple, which not only create services that are superior to those of competitors but also erect entry barriers in the process. The secret to their success is their activation of data network effects. We outline this crucial mechanism below.
Data network effects are best illustrated through an example. A driver asks Waze, the traffic navigation software for smartphones, for the fastest possible route from, say, Stockholm to Paris. This route may not be the shortest. Waze not only accesses GPS navigation and maps but also continuously collects real-time position and movement data from its users. It combines this information with other data such as accidents, major social events, weather forecasts, and current weather. Together with historical data, this information is analyzed continuously by machine learning algorithms. These algorithms identify patterns in the data, which are then used to derive predictions that translate into route recommendations. Because such recommendations are more precise than those of navigation systems that use only GPS and maps, Waze attracts a large user base.
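The difference between the fastest and the shortest route can be made concrete with a small sketch. The road graph, place names, distances, and travel times below are invented for illustration, and the routing method (Dijkstra's algorithm) is a textbook technique chosen for the example, not a description of how Waze actually computes routes.

```python
import heapq

# Hypothetical road graph: node -> list of (neighbor, distance_km, travel_time_min).
# All values are invented; travel times stand in for predictions based on live traffic.
ROADS = {
    "A": [("B", 10, 30), ("C", 15, 12)],
    "B": [("D", 10, 25)],
    "C": [("D", 15, 10)],
    "D": [],
}


def best_route(graph, start, goal, weight_index):
    """Dijkstra's algorithm. weight_index 0 minimizes distance, 1 minimizes travel time."""
    queue = [(0, start, [start])]  # (accumulated cost, current node, path so far)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, *weights in graph[node]:
            if neighbor not in visited:
                heapq.heappush(
                    queue, (cost + weights[weight_index], neighbor, path + [neighbor])
                )
    return None  # goal not reachable


shortest = best_route(ROADS, "A", "D", 0)  # minimize kilometres
fastest = best_route(ROADS, "A", "D", 1)   # minimize predicted minutes
print("shortest:", shortest)
print("fastest:", fastest)
```

In this toy graph the shortest route runs through B (20 km, 55 minutes), while the fastest runs through C (30 km, 22 minutes). The better the travel-time predictions attached to each road segment, the better the recommendation.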
A growing number of users generates an increasing volume of unique user data on actual traffic situations and drivers’ chosen routes. The data is continuously analyzed to offer updated predictions of future traffic situations. Waze thus acquires historical traffic data, which represents a unique asset that is central to making precise route recommendations. The more Waze is used, the more unique data is generated about actual travel routes. In turn, predictions improve, attracting new users and retaining existing ones. This recurring loop exemplifies a data network effect, where the scale of machine learning from a network of service users generates more value for each user because that service is customized to each user’s requirements. The loop also acts as a user lock-in mechanism by discouraging users from migrating to an alternative provider with inferior services. Activated data network effects may likewise act as an entry barrier against potential competitors, who lack historical user data and therefore cannot match the pioneering firm’s prediction accuracy. In this sense, Amazon’s customer product recommendations, Google’s search services, and Facebook’s matching of member profiles to ads have all activated data network effects that make it nearly impossible for newcomers to compete head-to-head. Waze itself was ultimately acquired by Google for nearly $1 billion.
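The loop described above can be sketched as a toy simulation. Every parameter here (the data generated per user, the saturation constant, the appeal and churn rates) is a hypothetical value chosen only to make the feedback visible; the model is an assumption for illustration, not an empirical description of Waze or any other service.

```python
def simulate_loop(users, periods, data_per_user=1.0, half_data=5000.0,
                  appeal=0.10, churn=0.03):
    """Toy feedback loop: users -> data -> accuracy -> users.

    All parameters are hypothetical. Returns a list of (users, accuracy)
    tuples, one per period.
    """
    data = 0.0
    history = []
    for _ in range(periods):
        data += users * data_per_user         # usage generates new data
        accuracy = data / (data + half_data)  # accuracy rises with data and saturates below 1
        # Better predictions attract and retain users; some users leave each period.
        users = users * (1 + appeal * accuracy - churn)
        history.append((users, accuracy))
    return history


pioneer = simulate_loop(users=1000, periods=12)  # has been collecting data for 12 periods
entrant = simulate_loop(users=1000, periods=6)   # same service, but only 6 periods of data

print("pioneer accuracy:", round(pioneer[-1][1], 3))
print("entrant accuracy:", round(entrant[-1][1], 3))
```

Under these assumptions the pioneer's accumulated data gives it a higher prediction accuracy than an otherwise identical entrant that started later, which is the entry barrier the text describes.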
The different kinds of network effects for value creation
Data network effects can be distinguished from more conventional network effects, where value comes solely from the size of a network. In a network of actors, the value of an offering may come from both the offering’s inherent properties and the fact that the offering has many other users. For instance, the more users a telephone network has, the more potential connections the telephone network offers to each user, and hence the greater the value that is offered to users. Facebook, Google, Uber, Airbnb, and many other tech firms exploit network effects, which make them successful and serve as entry barriers. Unlike network effects, however, data network effects generate value not only from the size of a network but also from the scale of machine learning to make predictions as accurate as possible.
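To see why size alone creates value in a conventional network, consider how quickly the number of potential connections grows with the number of users. The sketch below counts distinct user pairs, n(n-1)/2, a common simplification; equating value with the number of pairs is an assumption made for illustration rather than a claim from the research cited here.

```python
def potential_connections(n_users: int) -> int:
    """Number of distinct pairs of users in a network with n_users members."""
    return n_users * (n_users - 1) // 2


for n in (2, 10, 100, 1000):
    print(n, "users ->", potential_connections(n), "potential connections")
```

A tenfold increase in users yields roughly a hundredfold increase in potential connections. Data network effects add a second source of value on top of this: what the service learns from the data those users generate.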
Key success factors in activating data network effects
Research suggests a number of key factors in the successful activation of data network effects, namely in making predictions that generate perceived value for users [1]. The first two factors refer to machine learning capability in terms of the accuracy and speed of predictions in relation to a user’s task. For example, if a driver stopped at an intersection requests the fastest route from a navigation system, the driver expects the recommendation within seconds, not minutes or hours. Similarly, if the recommended route from Stockholm to Paris predicts 19 hours of non-stop driving, but the trip turns out to require 29 hours, then the user’s satisfaction will be low.
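The two capability factors can be expressed as simple checks. In the sketch below, the function names and the thresholds (a 10 percent error tolerance and a 5-second response limit) are hypothetical values chosen for illustration; what counts as acceptable depends on the application.

```python
def relative_error(predicted_hours: float, actual_hours: float) -> float:
    """Prediction error as a fraction of the actual duration."""
    return abs(actual_hours - predicted_hours) / actual_hours


def meets_expectations(predicted_hours: float, actual_hours: float,
                       latency_seconds: float,
                       max_error: float = 0.10,             # hypothetical tolerance
                       max_latency_seconds: float = 5.0) -> bool:  # hypothetical limit
    """True only if the prediction was both accurate enough and fast enough."""
    accurate = relative_error(predicted_hours, actual_hours) <= max_error
    fast = latency_seconds <= max_latency_seconds
    return accurate and fast


# The Stockholm-to-Paris example: 19 hours predicted, 29 hours actual.
error = relative_error(19, 29)
print(f"relative error: {error:.1%}")
print("acceptable:", meets_expectations(19, 29, latency_seconds=2.0))
```

For the Stockholm-to-Paris example the error is 10 of 29 hours, or about 34 percent, so the prediction fails the accuracy check even though it arrived within seconds.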
The next two success factors, data quantity and data quality, concern data stewardship. Depending on the user and the technology, large sets of data are required to detect the right patterns that can be translated into rules for predictions. If there is insufficient data, the algorithms will not be able to produce accurate predictions. However, the mere volume of data is not enough. The quality of the data is also important. Data quality refers both to the accuracy of data and to the range of cases covered. For example, vehicles operate in very different contextual conditions, say, from the Arctic to sub-Saharan Africa, but the gathered data may represent only central Europe. In such a case, the range of vehicle usage situations will not be represented richly enough, which may generate misleading patterns and thus poor predictions for a given vehicle.
The next two key success factors are performance expectancy and effort expectancy. These factors relate to the user-centricity of the service. Performance expectancy refers to the level of a user’s belief that using the service will help to complete a task, such as driving from Stockholm to Paris as quickly as possible. In short, the higher the performance expectancy of the service, the better. Effort expectancy, on the other hand, refers to a user’s belief that using the service will be free from effort, or easy. In short, the easier to use, the better. The importance of these factors lies in the fact that they promote or hinder users’ actual use of the service. The more the service is used, the more unique data is generated. Hence, the identified patterns are better, and the predictions and recommendations improve.
The final two key success factors relate to the legitimacy of the service. Legitimacy refers to a user’s belief that the service and its stakeholders (including owners) behave in conformity with legal requirements and socio-cultural norms. This legitimacy can be operationalized in terms of personal data use and prediction explainability. In the case of personal data use, users would probably be unhappy if a route recommendation service provided personal use data to, say, an intelligence agency without user consent. In the case of prediction explainability, if the recommended route is not the fastest possible one, but the app explains that the shortest route has been prioritized for emergency vehicles, then users would probably be satisfied.
As this discussion implies, these factors are highly sensitive to the actual application, be it a recommendation of a new book by Amazon or a recommendation for a lifesaving dose of a drug. Also, the listed factors interact, meaning that assessing one factor may require simultaneous assessment of another. For instance, when a higher quality of data is obtained, a lower data volume may be needed.
Strategic value of data network effects
The success factors discussed above concern the operational success of data network effects. They do not, however, address the strategic value of activating these effects. A firm can invest in numerous solutions for various uses of machine learning technologies, each of which can succeed or fail in activating data network effects. The strategic question is then as follows: If successfully activated, will such data network effects attract new users by creating a superior service and discourage existing users from migrating to competitors? If the answer is yes, then such an activated data network effect may lock in users and lock out emerging competitors. Hence, data network effect activation is an entry barrier and has strategic value for firms. The evidence suggests that it is notoriously challenging for firms to enter a market where one or two pioneers have activated data network effects, typically in tandem with size-based network effects. Our research suggests that, although it is very difficult, there are pathways to entry. One is the niche-centered approach, where an entrant firm identifies an underserved service space with an attractive target group. This approach is exemplified by the cases of LinkedIn and ResearchGate, which covered niches that the mass-market social media platform Facebook underserved. Another pathway to entry is to acquire legitimacy to access user data, either because regulators grant such access (as with banking) or because the new entrant somehow convinces the data owners to grant such access.
Recommended reading
[1] Gregory, R. W., Henfridsson, O., Kaganer, E., & Kyriakou, H. (2021). The role of artificial intelligence and data network effects for creating user value. Academy of Management Review, 46(3), 534–551.