Grey box testing for AI transparency

 

Without the user interface of the World Wide Web, the Internet would be much less usable by the general public. The Internet has become a global public good, deeply integrated with everyday life. Algorithms and web services, the underlying architecture of our user experience, are primarily owned by corporations that operate these systems under closed-source intellectual property protection. Intellectual property rights effectively turn every user into a black box tester: a tester who uses software but is not permitted to see the code's structure or design.

It does not take a computer scientist to be concerned that the systems that guide our shopping, our online searches, our information, and soon our cars, and maybe even our laws, are not rigorously scrutinized outside of these corporate silos. Governments, whose job is to protect citizens from potential exploitation by firms, struggle to define their role in managing these systems. They are even adopting them, despite an under-defined regulatory environment.

Today, using our data, the major technology companies curate our personal information and purchase history, facilitate the spread of news, support the identification of criminals, and even identify some cancers and other health ailments. Soon these systems might drive our cars, police our streets, flag potential criminals before they have committed a crime, and collect and store ever more data to feed these technologies.

Government occupies a unique data space, with access to some of the most personal and significant data in a citizen's life, such as tax information, incarceration history, and social security data. In our personal lives, to a certain extent, we control how much and what brands of technology we bring into our homes. In the public sphere, we relinquish some control to the government, which increasingly leans on technological efficiency to manage public life. What is the government's responsibility to scrutinize the (currently secret and proprietary) privately owned technologies when they are used as a form of public support and control?

How can algorithmic regulation address the government's responsibilities and challenges? One strategy lies in a form of software testing called grey box testing. Grey box testing protects some of the code while allowing researchers to test the overall architecture and function of the system.
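
As a rough illustration of the difference from a tester's point of view, the short Python sketch below contrasts a black box test, which can only observe inputs and outputs, with a grey box test that also draws on disclosed design details. Every name and the placeholder ranking logic are hypothetical, and the disclosure that only the "title" field may influence ranking is an assumption made for the example.

    # Stand-in for a proprietary ranking service; a real system's internals
    # would be hidden behind an API, and this logic is only a placeholder.
    def rank(query, items):
        return sorted(items, key=lambda item: abs(len(item["title"]) - len(query)))

    # Black box test: only inputs and outputs are visible to the tester.
    def test_black_box_no_items_dropped():
        items = [{"title": "red shoes"}, {"title": "sandals"}]
        assert len(rank("shoes", items)) == len(items)

    # Grey box test: the tester also knows, from disclosed design documents,
    # that only the "title" field may influence ranking (an assumed disclosure).
    def test_grey_box_undisclosed_fields_do_not_affect_order():
        base = [{"title": "red shoes", "owner_zip": "11111"},
                {"title": "sandals", "owner_zip": "11111"}]
        probe = [{"title": "red shoes", "owner_zip": "99999"},
                 {"title": "sandals", "owner_zip": "99999"}]
        titles = lambda ranked: [item["title"] for item in ranked]
        assert titles(rank("shoes", base)) == titles(rank("shoes", probe))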

This chart describes different types of goods and where they fall in public use.


The intersection of IP and control

The first person to come up with a new idea, say a recipe for a super-sugary, coca-filled carbonated drink, or an algorithm that lets users type a query about almost anything into a site and returns specific information related to that question, gets to keep that invention as IP. Specifically, when it is not publicly registered and disclosed as a patent, it is called a trade secret, proprietary to the company and protected by law.


These laws preclude someone from taking credit for an idea that they did not invent. If anyone could steal an idea and call it their own, there would be less incentive to innovate. IP protections allow technology companies to keep their code in their silo of protected information.

Whether for public or private use, the law treats algorithms as intellectual property in the same way. Publicly funded machine learning for police facial recognition software and a private company's free services, like Google search and Facebook's content curation algorithm, are all protected.

 
 

The distinction to be made here is that if technology executes laws, such as in the police example, it is drawing from a power previously monopolized by the state; only the state has the authority to arrest wrongdoers. Placing that control in the hands of algorithms, which are private companies' intellectual property, outsources some of the government's power and responsibility to firms. 

In a democracy, the public endows the government with a mandate to rule, entrusting to it the legitimate use of force to create and uphold standards for citizen rights, among other things. In modern capitalist democracies, firms are often thought to be more efficient at developing and deploying technology and services. Efficiency, however, is not the only metric important to the public good. It would be a mistake to assume firms' incentives to develop and implement technology mirror those of the government.

Increasing reliance on private software, which governments are often not legally permitted to evaluate, diminishes the government's ability to ensure these services are distributed equitably, with citizen privacy at the forefront. The challenge is to review the code and build privacy rights that do not diminish the intellectual property rights that spur innovation.

Using white box testing to regulate public-good algorithms

In the current regulatory framework, much of the code included in public systems is private, exclusive, and secret. If a local municipality insisted that a company operating traffic cameras add anti-bias functions into those cameras, that code would still be the firm's property. Firms are not obligated to share code with the local government even if it determines who receives parking tickets or enforces speed laws.

If governments seek to retain control, one method is to curate carefully restricted access to some of the software's design: white box testing. Google's search algorithm will likely never move into the open-source realm, and doing so might disincentivize innovation. However, suppose Google were to design a public system. In that case, the government could create a white box testing task force, allowing it to test the software in a highly protected and secure environment to ensure that it adheres to equality standards and other laws.

White box testing allows a programmer to see the internal design and infrastructure of software for evaluation purposes. It is a highly technical form of testing, requiring the tester to know the source code's inner workings to evaluate its purpose and function in a real-world context.
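
For illustration, here is a minimal Python sketch of what a white box review could look like. All names, the scoring logic, and the list of protected attributes are hypothetical; the point is only that a tester with source access can exercise every internal branch and inspect the code itself, rather than just observing outputs.

    import inspect

    # Assumed policy list for this sketch, not a real regulatory standard.
    PROTECTED_ATTRIBUTES = {"race", "gender", "zip_code"}

    def risk_score(applicant: dict) -> int:
        """Toy stand-in for a proprietary scoring function under review."""
        score = 0
        if applicant.get("prior_incidents", 0) > 3:  # branch 1
            score += 5
        if applicant.get("age", 99) < 25:            # branch 2
            score += 2
        return score

    def test_every_branch_is_exercised():
        # With source access, the tester can construct inputs that hit each branch.
        assert risk_score({"prior_incidents": 5, "age": 40}) == 5
        assert risk_score({"prior_incidents": 0, "age": 20}) == 2
        assert risk_score({"prior_incidents": 5, "age": 20}) == 7

    def test_no_protected_attribute_is_read():
        # A crude source inspection: the function's code must not mention any
        # field the policy forbids (a substring check, purely illustrative).
        source = inspect.getsource(risk_score)
        assert not any(attr in source for attr in PROTECTED_ATTRIBUTES)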

 
 

In this case, white box testing positions the tester, the government, to define the level of bias that is acceptable (acknowledging that no system is wholly free from bias), test for it, and mandate modifications to the software if it is not performing to the accepted level. In some cases, governments could even develop a patch or additional code requirements for technology in public use, standardizing anti-bias implementation across government.
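
The sketch below shows one way such a mandate could be written down as an automated acceptance test. Everything in it is an assumption made for illustration: the five-percentage-point tolerance, the group labels, and the outcome data are invented, and demographic parity is only one of many bias measures a regulator might choose.

    # Assumed regulatory tolerance for this sketch, not a real standard.
    MAX_ALLOWED_GAP = 0.05

    def positive_rate(outcomes):
        return sum(outcomes) / len(outcomes)

    def demographic_parity_gap(outcomes_by_group):
        # Gap between the highest and lowest favourable-outcome rates.
        rates = [positive_rate(o) for o in outcomes_by_group.values()]
        return max(rates) - min(rates)

    def test_system_meets_mandated_bias_tolerance():
        # Decisions (1 = favourable) recorded from the system under test,
        # grouped by a protected attribute; the numbers are illustrative only.
        outcomes_by_group = {
            "group_a": [1, 0, 1, 1, 0, 1, 1, 0],
            "group_b": [1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
        }
        assert demographic_parity_gap(outcomes_by_group) <= MAX_ALLOWED_GAP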

Government-led white box testing does not fully solve the intellectual property debate. Depending on the software, governments may need access to some of the most lucrative and secret algorithms and code in the world, which some firms may not be willing to provide.

In the current setup, governments relinquish power over citizen rights and privacy to private firms. This shift in the locus of power may diminish trust in digital governance and fundamentally change the power structure of modern democracy. In the public sphere, transparent standards for technology are a government responsibility.

Balancing the encouragement of innovation, particularly innovation in governance, with the protection of citizens from firms' interests admits no one-size-fits-all solution. A secure white box testing arrangement such as the one proposed in this article begins to set up a framework for increased governmental oversight of the digital public services that may soon be integrated into our daily lives.

 
Jordan Shapiro